Publication
Peer Learning: Learning Complex Policies in Groups from Scratch via Action Recommendations
C. Derstroff; J. Brugger; M. Cerrato; Jan Peters; S. Kramer
In: The Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI-24). AAAI Conference on Artificial Intelligence (AAAI-2024), AAAI Press, Washington, DC, USA, 2024.
Abstract
Peer learning is a novel high-level reinforcement learning framework for agents learning in groups. While standard reinforcement learning trains an individual agent on its own in a trial-and-error fashion, peer learning addresses a related setting in which a group of agents, i.e., peers, learns to master a task together from scratch. Peers are allowed to communicate only about their own states and the actions recommended by others: "What would you do in my situation?" Our motivation is to study the learning behavior of these agents. We formalize the teacher selection process in the action advice setting as a multi-armed bandit problem and thereby highlight the need for exploration. We then analyze the learning behavior of the peers and observe their ability to rank the agents' performance within the study group and to understand which agents give reliable advice. Further, we compare peer learning with single-agent learning and a state-of-the-art action advice baseline. We show that peer learning is able to outperform single-agent learning and the baseline in several challenging discrete and continuous OpenAI Gym domains. In doing so, we also show that, within such a framework, complex policies can evolve from action recommendations beyond discrete action spaces.
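
The bandit view of teacher selection mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the epsilon-greedy rule, the class `TeacherBandit`, the helper `simulate`, and the fixed "advice quality" values are all assumptions made for illustration. Each peer is treated as one arm, and the reward is whatever return the advised agent observes after following that peer's recommendation.

```python
import random


class TeacherBandit:
    """Hypothetical epsilon-greedy bandit: each arm is one peer that can give advice."""

    def __init__(self, n_peers, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.values = [0.0] * n_peers  # running mean reward per peer
        self.counts = [0] * n_peers    # how often each peer was asked
        self.rng = random.Random(seed)

    def select(self):
        # Explore: occasionally ask a random peer so every estimate keeps improving.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        # Exploit: ask the peer whose advice currently looks most valuable.
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, peer, reward):
        # Incremental mean of the reward observed after following this peer's advice.
        self.counts[peer] += 1
        self.values[peer] += (reward - self.values[peer]) / self.counts[peer]


def simulate(true_quality, steps=2000, epsilon=0.1, seed=0):
    """Toy loop: advice from peer i yields a noisy reward around true_quality[i] (assumed)."""
    bandit = TeacherBandit(len(true_quality), epsilon, seed)
    noise = random.Random(seed + 1)
    for _ in range(steps):
        peer = bandit.select()
        reward = true_quality[peer] + noise.gauss(0.0, 0.1)
        bandit.update(peer, reward)
    return bandit


if __name__ == "__main__":
    bandit = simulate([0.2, 0.5, 0.9])
    ranking = sorted(range(3), key=lambda i: bandit.values[i], reverse=True)
    print("estimated ranking of peers:", ranking)
    print("times each peer was asked:", bandit.counts)
```

Without the exploration step, this agent would keep asking whichever peer happened to give useful advice first and would never learn that another peer is better. This is the sense in which the bandit formulation makes the need for exploration explicit. In the actual peer learning setting the peers themselves are still learning, so advice quality changes over time; the fixed values here are a simplification.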