Local-utopia policy selection for multi-objective reinforcement learning
Simone Parisi; Alexander Blank; Tobias Viernickel; Jan Peters
In: 2016 IEEE Symposium Series on Computational Intelligence (SSCI-2016), December 6-9, Athens, Greece, pages 1-7, IEEE, 2016.
Many real-world applications are characterized by multiple conflicting objectives. In such problems, optimality is replaced by Pareto optimality, and the goal is to find the Pareto frontier, a set of solutions representing different compromises among the objectives. Despite recent advances in multi-objective optimization, selecting a Pareto-optimal policy from a given Pareto frontier remains an important open problem, prominent in practical applications such as economics and robotics. In this paper, we present a versatile approach for selecting a policy from the Pareto frontier according to user-defined preferences. Exploiting a novel scalarization function and heuristics, our approach provides an easy-to-use and effective method for Pareto-optimal policy selection. Furthermore, the scalarization is applicable in multiple-policy learning strategies for approximating Pareto frontiers. To show the simplicity and effectiveness of our algorithm, we evaluate it on two problems and compare it to classical multi-objective reinforcement learning approaches.
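The paper's exact scalarization is not reproduced in this listing. As a rough illustration of the general idea behind utopia-based policy selection, the sketch below picks, from a set of candidate policies with known per-objective returns, the one whose return vector lies closest to the utopia point (the per-objective optimum), weighted by user preferences. The function name, the choice of a weighted Euclidean distance, and the preference-weighting scheme are all assumptions for illustration, not the authors' method.

```python
import numpy as np

def select_policy(returns, preferences):
    """Pick the policy whose return vector is nearest the utopia point.

    returns: (n_policies, n_objectives) array of expected returns
             (higher is better) for policies on an approximate frontier.
    preferences: (n_objectives,) user-defined weights.

    NOTE: generic utopia-distance sketch, not the paper's scalarization.
    """
    returns = np.asarray(returns, dtype=float)
    w = np.asarray(preferences, dtype=float)
    # Utopia point: the best attainable value of each objective in isolation.
    utopia = returns.max(axis=0)
    # Preference-weighted Euclidean distance from each policy to the utopia point.
    dist = np.sqrt(((w * (utopia - returns)) ** 2).sum(axis=1))
    return int(np.argmin(dist))
```

With balanced preferences this favors the compromise policy; skewing the weights toward one objective selects the policy that excels at it.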