

Hierarchical Policy Blending As Optimal Transport

An T. Le; Kay Hansel; Jan Peters; Georgia Chalvatzaki
In: Nikolai Matni; Manfred Morari; George J. Pappas (eds.). Proceedings of The 5th Annual Learning for Dynamics and Control Conference. Learning for Dynamics and Control Conference (L4DC-2023), June 14-16, Philadelphia, PA, USA, Pages 797-812, Proceedings of Machine Learning Research (PMLR), Vol. 211, PMLR, 2023.


We present hierarchical policy blending as optimal transport (HiPBOT). HiPBOT hierarchically adjusts the weights of low-level reactive expert policies of different agents by adding a look-ahead planning layer on the parameter space. The high-level planner renders policy blending as unbalanced optimal transport consolidating the scaling of the underlying Riemannian motion policies. As a result, HiPBOT effectively decides the priorities between expert policies and agents, ensuring the task's success and guaranteeing safety. Experimental results in several application scenarios, from low-dimensional navigation to high-dimensional whole-body control, show the efficacy and efficiency of HiPBOT. Our method outperforms state-of-the-art baselines -- either adopting probabilistic inference or defining a tree structure of experts -- paving the way for new applications of optimal transport to robot control. More material at
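As a rough illustration of the idea of reading blending weights off an unbalanced optimal transport plan, the sketch below runs generic entropic unbalanced Sinkhorn scaling iterations (KL-relaxed marginals) in NumPy. It is not the authors' implementation: the function names `unbalanced_sinkhorn` and `blend_experts`, the cost matrix, the parameters `eps` and `rho`, and the rule of taking normalized row masses as expert weights are all assumptions made for illustration.

```python
import numpy as np


def unbalanced_sinkhorn(a, b, C, eps=0.5, rho=1.0, n_iters=200):
    """Generic entropic unbalanced OT with KL marginal penalties.

    a, b : nonnegative source/target masses (they need not sum to the same value)
    C    : cost matrix of shape (len(a), len(b))
    eps  : entropic regularization strength (assumed value)
    rho  : marginal relaxation strength (assumed value)
    """
    K = np.exp(-C / eps)            # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    power = rho / (rho + eps)       # damping exponent from the KL relaxation
    for _ in range(n_iters):
        u = (a / (K @ v)) ** power
        v = (b / (K.T @ u)) ** power
    return u[:, None] * K * v[None, :]   # transport plan


def blend_experts(actions, plan):
    """Hypothetical blending rule: weight each expert by its transported mass."""
    weights = plan.sum(axis=1)
    weights = weights / weights.sum()
    return weights, weights @ actions


# Toy setup (all numbers invented): 3 expert policies, 2 candidate targets.
# Rows of C hold each expert's cost; lower cost should earn a larger weight.
C = np.array([[0.1, 0.2],
              [1.0, 1.1],
              [2.0, 2.2]])
a = np.full(3, 1.0 / 3.0)           # uniform prior mass over experts
b = np.full(2, 0.5)                 # uniform mass over targets
actions = np.array([[1.0, 0.0],     # each expert proposes a 2D action
                    [0.0, 1.0],
                    [-1.0, 0.0]])

plan = unbalanced_sinkhorn(a, b, C)
weights, blended = blend_experts(actions, plan)
print("weights:", weights)
print("blended action:", blended)
```

Because the marginal constraints are only penalized rather than enforced, experts with high cost can shed mass instead of being forced to carry their full prior share, which is what lets the weights express priorities.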
