Sensorimotor Control for Locomotion
Background
Biological locomotion is the movement of an animal from one location to another through periodic changes in the shape of its body, combined with interaction with the environment. The periodic shape motion, which is the building block of locomotion, is called the locomotion gait. Examples of gaits include the stepping of legs in walking, the flapping of wings in flight, and the wavelike body motion of fish in swimming.
We construct a bio-inspired central pattern generator (CPG)-type architecture for learning optimal maneuvering control of the periodic locomotion gait. The architecture is presented here with the aid of two examples involving planar locomotion of coupled rigid body systems.
Modeling
Example 1 is a system of two planar rigid bodies, the head body and the tail body, connected by a single-degree-of-freedom pin joint. The shape variable is the relative orientation between the two bodies, and the group variable is the absolute orientation of a frame rigidly affixed to the head body. No external force is applied to the system, so the total angular momentum is conserved.
Example 2 is a snake system modeled by n (n>2) planar rigid links connected by n-1 single-degree-of-freedom joints. The shape variable is the vector of relative angles between adjacent links. The group variable consists of the average (global) orientation of the system and the position of its center of mass. The system is subject to friction with the ground, which allows the snake to move forward or backward.
Both systems are driven by an open-loop periodic torque input at each joint. The sole purpose of this input is to make the shape variable oscillate periodically.
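As a small illustration of such an open-loop input, the sketch below generates a sinusoidal torque for each joint. The sinusoidal waveform, the function name joint_torques, and the parameters amplitude, omega and phase_lag are assumptions made for this example; the text above only states that the input is periodic.

```python
import numpy as np

def joint_torques(t, n_joints, amplitude=1.0, omega=2 * np.pi, phase_lag=0.5):
    """Open-loop torque at each joint at time t (hypothetical waveform).

    Joint j receives amplitude * sin(omega * t - phase_lag * j), so every
    joint sees the same periodic signal shifted by a fixed phase lag.
    """
    j = np.arange(n_joints)
    return amplitude * np.sin(omega * t - phase_lag * j)

# Two-body system: one joint. Snake with n = 5 links: n - 1 = 4 joints.
tau_two_body = joint_torques(0.3, n_joints=1)
tau_snake = joint_torques(0.3, n_joints=4)
```

The input depends only on time, not on any measurement, which is what makes it open loop.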
Noisy observations
Each joint of the system is equipped with a sensor that provides noisy measurements of the shape variable. Taking the snake system as an example, the model for the sensor at the j-th joint is
Control Objective
For the two-body system, the control input is the length of the tail body, and the objective is to turn the head body clockwise.
For the snake system, the control input enters through changes in the friction coefficients. The objective is for the snake to change its heading while moving forward.
Step 1 - Phase Reduction
Under the periodic torque input, we assume that the trajectories of the shape variable and its time derivative at each joint form an isolated asymptotically stable periodic orbit (limit cycle). The partially observed shape variable can now be reduced to a phase variable.
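To illustrate the limit-cycle assumption and the reduction to a phase, here is a minimal sketch that uses a Hopf normal-form oscillator as a stand-in for the (shape variable, time derivative) dynamics at one joint. The oscillator, the function names, and the parameter values are assumptions chosen for this example and are not the models of the two systems above. For this particular oscillator the phase is simply the polar angle, and it advances at the constant rate omega.

```python
import numpy as np

def hopf_rhs(state, omega):
    """Hopf normal form: a stable limit cycle of radius 1, angular rate omega."""
    x, y = state
    growth = 1.0 - (x * x + y * y)  # positive inside the cycle, negative outside
    return np.array([growth * x - omega * y, growth * y + omega * x])

def simulate(state0, omega=2 * np.pi, dt=0.01, n_steps=1000):
    """Integrate the oscillator with a classical fourth-order Runge-Kutta scheme."""
    traj = np.empty((n_steps + 1, 2))
    traj[0] = state0
    for k in range(n_steps):
        s = traj[k]
        k1 = hopf_rhs(s, omega)
        k2 = hopf_rhs(s + 0.5 * dt * k1, omega)
        k3 = hopf_rhs(s + 0.5 * dt * k2, omega)
        k4 = hopf_rhs(s + dt * k3, omega)
        traj[k + 1] = s + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return traj

def phase_of(traj):
    """Phase as the unwrapped polar angle of the state."""
    return np.unwrap(np.arctan2(traj[:, 1], traj[:, 0]))

traj = simulate([0.2, 0.0])                  # start well inside the cycle
radius = np.hypot(traj[:, 0], traj[:, 1])    # approaches 1 as time grows
phase = phase_of(traj)                       # grows linearly in time
```

Once the trajectory has settled on the limit cycle, the two-dimensional state is determined by the single scalar phase, which is the sense in which the shape variable is reduced to a phase variable.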
Two-body system
Snake system
Step 2 - Feedback Particle Filter (FPF)
For the two-body system, there is a single feedback particle filter, driven by the sensor at the joint. For the snake system, we assume that each joint has its own sensor and that the sensors are independent of one another, so the FPF algorithm decomposes into n-1 independent filters. The evolution of particles for the j-th filter is
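As a minimal sketch of one such filter, the code below applies the standard FPF update to a phase variable on the circle, with the gain function approximated by a Galerkin projection onto the basis {cos, sin}. Several things here are assumptions made for illustration: the observation function h(theta) = cos(theta), the noise levels sigma_w and sigma_b, the constant phase rate omega, the function names, and the plain Euler step, which omits the correction term needed to discretize the Stratonovich form exactly.

```python
import numpy as np

def galerkin_gain(theta, h_vals, sigma_w):
    """Approximate the FPF gain K(theta_i) using the basis {cos, sin}."""
    n = theta.size
    psi = np.stack([np.cos(theta), np.sin(theta)])    # basis functions, (2, N)
    dpsi = np.stack([-np.sin(theta), np.cos(theta)])  # their derivatives, (2, N)
    A = dpsi @ dpsi.T / n                             # A_kl = E[psi_k' psi_l']
    b = psi @ (h_vals - h_vals.mean()) / (n * sigma_w**2)
    kappa = np.linalg.lstsq(A, b, rcond=1e-8)[0]      # tolerant of a near-singular A
    return kappa @ dpsi                               # K = sum_k kappa_k psi_k'

def fpf_step(theta, dz, dt, omega, sigma_w, sigma_b, h, rng):
    """One Euler step of the particle update for a single joint's filter."""
    h_vals = h(theta)
    gain = galerkin_gain(theta, h_vals, sigma_w)
    innovation = dz - 0.5 * (h_vals + h_vals.mean()) * dt
    noise = sigma_b * np.sqrt(dt) * rng.standard_normal(theta.size)
    return np.mod(theta + omega * dt + noise + gain * innovation, 2 * np.pi)

def run_filter(n_steps=5000, dt=1e-3, omega=2 * np.pi, sigma_w=0.3,
               sigma_b=0.1, n_particles=200, seed=0):
    """Simulate a true phase, noisy observation increments, and the filter."""
    rng = np.random.default_rng(seed)
    theta_true = 1.0
    particles = rng.uniform(0.0, 2 * np.pi, n_particles)
    estimates = np.empty(n_steps)
    for k in range(n_steps):
        dz = np.cos(theta_true) * dt + sigma_w * np.sqrt(dt) * rng.standard_normal()
        particles = fpf_step(particles, dz, dt, omega, sigma_w, sigma_b, np.cos, rng)
        theta_true = (theta_true + omega * dt) % (2 * np.pi)
        estimates[k] = np.angle(np.mean(np.exp(1j * particles)))  # circular mean
    return estimates, theta_true

estimates, theta_true = run_filter()
```

For the snake system, n-1 copies of this loop would run side by side, one per joint, each fed only by its own sensor. With uniformly spread particles and h = cos, the Galerkin gain reduces to -sin(theta) / sigma_w**2, which gives a quick sanity check on the sign conventions.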
Step 3 - Q-Learning
The analogue of the Q-function for a continuous-time system is the Hamiltonian function. The Q-learning problem is to solve the associated fixed-point equation. The Hamiltonian is approximated using linear function approximation, that is, as a weighted sum of basis functions. We define the point-wise Bellman error and use a gradient descent algorithm on this error to learn the weights of the approximated Hamiltonian.
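The idea of learning weights by gradient descent on a point-wise Bellman error can be sketched in a simplified discrete-time, discounted setting. This is a stand-in and not the continuous-time Hamiltonian formulation described above. The phase grid, the two-valued action, the cost function, the Fourier features, the discount factor and the step size are all hypothetical choices for this example.

```python
import numpy as np

ACTIONS = (0, 1)

def features(theta, u):
    """Fourier features of the phase, one block of three weights per action."""
    phi = np.zeros(3 * len(ACTIONS))
    phi[3 * u:3 * u + 3] = [1.0, np.cos(theta), np.sin(theta)]
    return phi

def cost(theta, u):
    """Hypothetical running cost: the action is cheap where cos(theta) < 0."""
    return 1.0 + u * np.cos(theta)

def train(n_bins=32, gamma=0.9, step=0.05, n_iters=2000):
    thetas = 2 * np.pi * np.arange(n_bins) / n_bins
    next_thetas = np.roll(thetas, -1)  # the phase advances one bin per step
    # One row per (phase bin, action) pair, with the action index varying fastest.
    phi = np.array([features(th, u) for th in thetas for u in ACTIONS])
    c = np.array([cost(th, u) for th in thetas for u in ACTIONS])
    phi_next = np.array([[features(thn, v) for v in ACTIONS]
                         for thn in next_thetas for _ in ACTIONS])
    w = np.zeros(phi.shape[1])
    rows = np.arange(len(c))
    history = np.empty(n_iters)
    for it in range(n_iters):
        q_next = phi_next @ w                    # shape (pairs, actions)
        best = q_next.argmin(axis=1)             # greedy (cost-minimizing) action
        err = c + gamma * q_next.min(axis=1) - phi @ w  # point-wise Bellman error
        history[it] = np.mean(err**2)
        grad_err = gamma * phi_next[rows, best] - phi   # derivative of err in w
        w -= step * (err[:, None] * grad_err).mean(axis=0)
    return w, history

w, history = train()
```

After training, a control for a given phase can be read off as the action that minimizes w @ features(theta, u). The array history records the mean squared Bellman error at each iteration, which is the quantity the descent drives down.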
Simulation Results
Publications
T. Wang, A. Taghvaei, and P. G. Mehta. Bio-inspired learning of sensorimotor control for locomotion. arXiv preprint arXiv:1910.02556, 2019.
T. Wang, A. Taghvaei, and P. G. Mehta. Q-learning for POMDP: An application to locomotion gaits. In 58th IEEE Conference on Decision and Control (CDC). IEEE, 2019.
Acknowledgements
Financial support from the NSF grant CMMI-1462773, the ONR MURI grant N00014-19-1-2373 and the ARO grant W911NF1810334 is gratefully acknowledged.