A Comprehensive Survey of Multiagent Reinforcement Learning
Lucian Busoniu, Robert Babuska, B. De Schutter
- 01 Mar 2008
- Vol. 38, Iss: 2, pp 156-172
TL;DR: The benefits and challenges of MARL are described along with some of the problem domains where the MARL techniques have been applied, and an outlook for the field is provided.
Abstract: Multiagent systems are rapidly finding applications in a variety of domains, including robotics, distributed control, telecommunications, and economics. The complexity of many tasks arising in these domains makes them difficult to solve with preprogrammed agent behaviors. The agents must, instead, discover a solution on their own, using learning. A significant part of the research on multiagent learning concerns reinforcement learning techniques. This paper provides a comprehensive survey of multiagent reinforcement learning (MARL). A central issue in the field is the formal statement of the multiagent learning goal. Different viewpoints on this issue have led to the proposal of many different goals, among which two focal points can be distinguished: stability of the agents' learning dynamics, and adaptation to the changing behavior of the other agents. The MARL algorithms described in the literature aim, either explicitly or implicitly, at one of these two goals or at a combination of both, in a fully cooperative, fully competitive, or more general setting. A representative selection of these algorithms is discussed in detail in this paper, together with the specific issues that arise in each category. Additionally, the benefits and challenges of MARL are described along with some of the problem domains where the MARL techniques have been applied. Finally, an outlook for the field is provided.
Figures

Fig. 1. Breakdown of MARL algorithms by the type of task they address.
Fig. 2. MARL encompasses temporal-difference reinforcement learning, game theory, and direct policy search techniques.
Fig. 4. (Left) An agent (◦) attempting to reach a goal (×) while avoiding capture by another agent (•). (Right) The Q-values of agent 1 for the state depicted to the left (Q2 = −Q1).
Fig. 5. (Left) Two cleaning robots negotiating their assignment to different wings of a building. Both robots prefer to clean the smaller left wing. (Right) The Q-values of the two robots for the state depicted to the left.
Table I. Stability and adaptation in MARL.
Table II. Breakdown of MARL algorithms by task type and degree of agent awareness.