Publication

Approximate Value Iteration Based on Numerical Quadrature

Julia Vinogradska; Bastian Bischoff; Jan Peters

In: IEEE Robotics and Automation Letters (RA-L), Vol. 3, No. 2, Pages 1330-1337, IEEE, 2018.

Abstract

Learning control policies has become an appealing alternative to deriving control laws from classical control theory. Value iteration approaches have proven outstandingly flexible while maintaining high data efficiency when combined with probabilistic models to eliminate model bias. However, a major difficulty for these methods is that the state and action spaces must typically be discretized, and the value function update is often analytically intractable. In this letter, we propose a projection-based approximate value iteration approach that employs numerical quadrature for the value function update step. It can handle continuous state and action spaces and noisy measurements of the system dynamics while learning globally optimal control from scratch. In addition, the proposed approximation technique allows for upper bounds on the approximation error, which can be used to guarantee convergence of the proposed approach to an optimal policy under some assumptions. Empirical evaluations on the mountain car benchmark problem show the efficiency of the proposed approach and support our theoretical results.
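To make the idea of a quadrature-based value function update concrete, the following is a minimal sketch and not the method from the letter. It assumes a one-dimensional state, additive Gaussian noise on the dynamics, a value function stored on a grid with linear interpolation (a simple stand-in for the projection step), and Gauss-Hermite quadrature for the expectation over next states. All function names, the toy dynamics, and the reward are hypothetical choices made for illustration.

```python
import numpy as np


def gauss_hermite_expectation(g, mean, std, n_nodes=10):
    """Approximate E[g(X)] for X ~ N(mean, std^2) by Gauss-Hermite quadrature."""
    # hermgauss returns nodes and weights for the weight function exp(-x^2).
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    # Change of variables x -> mean + sqrt(2) * std * x maps to the Gaussian.
    samples = mean + np.sqrt(2.0) * std * nodes
    return np.sum(weights * g(samples)) / np.sqrt(np.pi)


def value_iteration(grid, actions, dynamics, reward, noise_std,
                    gamma=0.95, n_iters=200):
    """Approximate value iteration on a 1-D grid (illustrative assumption).

    The expectation over the noisy next state is replaced by a quadrature
    sum, and the value function is represented by linear interpolation
    between grid points (values are clamped outside the grid).
    """
    values = np.zeros_like(grid, dtype=float)
    for _ in range(n_iters):
        old_values = values

        def v_interp(s, old_values=old_values):
            return np.interp(s, grid, old_values)

        new_values = np.empty_like(old_values)
        for i, s in enumerate(grid):
            # Bellman backup: maximize reward plus discounted expected value.
            q_values = [
                reward(s, a) + gamma * gauss_hermite_expectation(
                    v_interp, dynamics(s, a), noise_std)
                for a in actions
            ]
            new_values[i] = max(q_values)
        values = new_values
    return values


if __name__ == "__main__":
    # Hypothetical toy problem: drive the state toward zero.
    grid = np.linspace(-2.0, 2.0, 41)
    actions = [-1.0, 0.0, 1.0]
    values = value_iteration(
        grid, actions,
        dynamics=lambda s, a: 0.9 * s + 0.1 * a,   # assumed mean dynamics
        reward=lambda s, a: -s**2 - 0.01 * a**2,   # assumed quadratic cost
        noise_std=0.05, gamma=0.9, n_iters=50)
    print(values[len(grid) // 2])
```

The grid and the finite action set in this sketch are simplifications for readability; the abstract states that the proposed approach handles continuous state and action spaces, which this toy example does not attempt to reproduce.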
