

Learning Control Policies from Optimal Trajectories

Christoph Zelch; Jan Peters; Oskar von Stryk
In: Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA 2020), Paris, France, May 31 - August 31, 2020, pp. 2529-2535. IEEE, 2020.


The ability to optimally control robotic systems offers significant advantages for their performance. While time-dependent optimal trajectories can be computed numerically for high-dimensional nonlinear system dynamics models, constraints, and objectives, finding optimal feedback control policies for such systems is hard. This is unfortunate, as without a policy, the control of real-world systems requires frequent correction or replanning to compensate for disturbances and model errors. In this paper, a feedback control policy is learned from a set of optimal reference trajectories using Gaussian processes. Information from existing trajectories and the current policy is used to find promising start points for the computation of further optimal trajectories. This aspect is important as it avoids exhaustive sampling of the complete state space, which is impractical due to its high dimensionality, and focuses the computation on the relevant region. The presented method has been applied in simulation to a swing-up problem of an underactuated pendulum and to an energy-minimal point-to-point movement of a 3-DOF industrial robot.
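The core idea of regressing a state-to-control mapping from optimal trajectory samples can be sketched with plain Gaussian process regression. The snippet below is a minimal illustration, not the authors' implementation: the `GPPolicy` class, the squared-exponential kernel choice, its hyperparameters, and the toy linear-feedback trajectory are all assumptions made for the example.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=0.5, variance=1.0):
    # Squared-exponential kernel between the row vectors of A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

class GPPolicy:
    """Hypothetical GP regressor mapping states to controls,
    fitted on samples collected along optimal trajectories."""

    def __init__(self, noise=1e-4):
        self.noise = noise  # observation-noise variance (assumed)

    def fit(self, states, controls):
        # Precompute K^{-1} u via a linear solve on the training states.
        self.X = states
        K = rbf_kernel(states, states) + self.noise * np.eye(len(states))
        self.alpha = np.linalg.solve(K, controls)
        return self

    def predict(self, states):
        # GP posterior mean: k(x*, X) K^{-1} u
        return rbf_kernel(states, self.X) @ self.alpha

# Toy demo: one "optimal trajectory" of a 1-D system whose optimal
# feedback happens to be u = -2 x (chosen only for illustration).
xs = np.linspace(-1.0, 1.0, 20)[:, None]   # visited states
us = -2.0 * xs                             # optimal controls along the trajectory
policy = GPPolicy().fit(xs, us)
u_query = policy.predict(np.array([[0.25]]))  # feedback control at a new state
```

Near the training trajectory the GP mean recovers the optimal feedback (here roughly -0.5 at x = 0.25); far from any trajectory its predictions degrade, which is why the paper's start-point selection for computing further optimal trajectories matters.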
