Publication
Safe Reinforcement Learning on the Constraint Manifold: Theory and Applications
Puze Liu; Haitham Bou-Ammar; Jan Peters; Davide Tateo
In: Computing Research Repository eprint Journal (CoRR), Vol. abs/2404.09080, Pages 1-20, arXiv, 2024.
Abstract
Integrating learning-based techniques, especially reinforcement
learning, into robotics is promising for solving
complex problems in unstructured environments. However, most
existing approaches are trained in well-tuned simulators and
subsequently deployed on real robots without online fine-tuning.
In this setting, extensive engineering is required to mitigate
the sim-to-real gap, which can be challenging for complex
systems. Instead, learning with real-world interaction data offers
a promising alternative: it not only eliminates the need for a
fine-tuned simulator but also applies to a broader range of tasks where
accurate modeling is unfeasible. One major problem for
on-robot reinforcement learning is ensuring safety, as uncontrolled
exploration can cause catastrophic damage to the robot or
the environment. Indeed, safety specifications, often represented
as constraints, can be complex and non-linear, making safety
challenging to guarantee in learning systems. In this paper,
we show how we can impose complex safety constraints on
learning-based robotics systems in a principled manner, both
from theoretical and practical points of view. Our approach is
based on the concept of the Constraint Manifold, representing the
set of safe robot configurations. Exploiting differential geometry
techniques, i.e., the tangent space, we can construct a safe action
space, allowing learning agents to sample arbitrary actions while
ensuring safety. We demonstrate the method’s effectiveness in a
real-world Robot Air Hockey task, showing that our method can
handle high-dimensional tasks with complex constraints. Videos
of the real robot experiments are available publicly.
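
The tangent-space idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the unit-circle constraint, the function names (constraint, constraint_jacobian, tangent_basis, safe_velocity), and the tolerance are all assumptions chosen for illustration. The sketch only shows that any action mapped through a null-space basis of the constraint Jacobian yields a velocity that leaves the constraint unchanged to first order.

```python
import numpy as np

def constraint(q):
    # Hypothetical toy equality constraint c(q) = 0: the unit circle.
    return np.array([q[0] ** 2 + q[1] ** 2 - 1.0])

def constraint_jacobian(q):
    # Jacobian of c with respect to q, shape (1, 2).
    return np.array([[2.0 * q[0], 2.0 * q[1]]])

def tangent_basis(jac, tol=1e-10):
    # Orthonormal basis of the null space of jac, taken from the SVD.
    _, s, vt = np.linalg.svd(jac)
    rank = int(np.sum(s > tol))
    return vt[rank:].T  # shape (n, n - rank)

def safe_velocity(q, action):
    # Map an unconstrained action onto the tangent space at q.
    basis = tangent_basis(constraint_jacobian(q))
    return basis @ np.atleast_1d(action)

q = np.array([1.0, 0.0])                 # a point on the constraint manifold
q_dot = safe_velocity(q, action=0.7)     # arbitrary action chosen by an agent
residual = constraint_jacobian(q) @ q_dot  # first-order constraint change
```

Here residual is zero up to floating-point error for any value of action, so the agent can sample freely in the reduced action space. The sketch covers only a single equality constraint and an instantaneous velocity; integrating over a finite time step would drift off the manifold unless a correction is applied, and inequality constraints are not handled in this toy setting.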
