Model-based Direct Policy Search for Skill Learning in Continuous Domains

Jan Hendrik Metzen

In: Proceedings of the 10th European Workshop on Reinforcement Learning (EWRL-12), June 30-July 1, Edinburgh, United Kingdom, 6/2012.


Real-world robotic control applications are an interesting problem domain for reinforcement learning (RL). These applications can be modeled as (potentially partially observable or noisy) Markov Decision Processes with both continuous state and action spaces (cMDPs). Several authors (Togelius et al., 2009; Kalyanakrishnan and Stone, 2009) argue that for such continuous and noisy domains, direct policy search (DPS) methods may outperform value-function-based RL.

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence