A Comprehensive Survey of Multiagent Reinforcement Learning

doi:10.1109/TSMCC.2007.913919

Open Access • Journal Article

Lucian Busoniu, +2 more

- 01 Mar 2008

- Vol. 38, Iss: 2, pp 156-172

2.2K

TL;DR: The benefits and challenges of MARL are described along with some of the problem domains where the MARL techniques have been applied, and an outlook for the field is provided.

Abstract: Multiagent systems are rapidly finding applications in a variety of domains, including robotics, distributed control, telecommunications, and economics. The complexity of many tasks arising in these domains makes them difficult to solve with preprogrammed agent behaviors. The agents must, instead, discover a solution on their own, using learning. A significant part of the research on multiagent learning concerns reinforcement learning techniques. This paper provides a comprehensive survey of multiagent reinforcement learning (MARL). A central issue in the field is the formal statement of the multiagent learning goal. Different viewpoints on this issue have led to the proposal of many different goals, among which two focal points can be distinguished: stability of the agents' learning dynamics, and adaptation to the changing behavior of the other agents. The MARL algorithms described in the literature aim, either explicitly or implicitly, at one of these two goals or at a combination of both, in a fully cooperative, fully competitive, or more general setting. A representative selection of these algorithms is discussed in detail in this paper, together with the specific issues that arise in each category. Additionally, the benefits and challenges of MARL are described along with some of the problem domains where the MARL techniques have been applied. Finally, an outlook for the field is provided.
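The abstract's split into fully cooperative, fully competitive, and more general settings can be illustrated with a two-agent stage game. The sketch below is not taken from the paper: the function name `classify_stage_game` and all reward values are hypothetical. It assumes the usual convention that a task is fully cooperative when both agents receive identical rewards, fully competitive when the rewards are exact opposites (zero-sum), and mixed otherwise.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def classify_stage_game(r1: Matrix, r2: Matrix, tol: float = 1e-9) -> str:
    """Classify a two-agent stage game by its reward structure.

    r1[i][j] and r2[i][j] are the rewards of agents 1 and 2 when
    agent 1 plays action i and agent 2 plays action j.
    (Hypothetical helper, written for illustration only.)
    """
    # Pair up the two agents' rewards for every joint action.
    pairs = [(a, b) for row1, row2 in zip(r1, r2) for a, b in zip(row1, row2)]
    if all(abs(a - b) <= tol for a, b in pairs):
        return "fully cooperative"   # identical rewards
    if all(abs(a + b) <= tol for a, b in pairs):
        return "fully competitive"   # zero-sum: r2 = -r1
    return "mixed"


# Illustrative reward matrices (made-up values).
coordination = [[10, 0], [0, 10]]
pennies_1 = [[1, -1], [-1, 1]]
pennies_2 = [[-1, 1], [1, -1]]
wings_1 = [[0, 3], [2, 0]]   # agent 1 prefers one outcome
wings_2 = [[0, 2], [3, 0]]   # agent 2 prefers the other

print(classify_stage_game(coordination, coordination))
print(classify_stage_game(pennies_1, pennies_2))
print(classify_stage_game(wings_1, wings_2))
```

The three calls cover one game of each type. The last pair of matrices is loosely modelled on a negotiation in which the agents' preferences partly conflict, so neither the identical-reward nor the zero-sum test holds.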

Figures

Fig. 4. (Left) An agent (◦) attempting to reach a goal (×) while avoiding capture by another agent (•). (Right) The Q-values of agent 1 for the state depicted to the left (Q2 = −Q1).

Fig. 1. Breakdown of MARL algorithms by the type of task they address.

TABLE I STABILITY AND ADAPTATION IN MARL

Fig. 2. MARL encompasses temporal-difference reinforcement learning, game theory, and direct policy search techniques.

Fig. 5. (Left) Two cleaning robots negotiating their assignment to different wings of a building. Both robots prefer to clean the smaller left wing. (Right) The Q-values of the two robots for the state depicted to the left.

TABLE II BREAKDOWN OF MARL ALGORITHMS BY TASK TYPE AND DEGREE OF AGENT AWARENESS

Citations

•Journal Article•10.1109/MSP.2017.2743240

Deep Reinforcement Learning: A Brief Survey

Kai Arulkumaran, +3 more

- 09 Nov 2017

- IEEE Signal Processing Magazine

TL;DR: Deep reinforcement learning (DRL) is poised to revolutionize the field of artificial intelligence (AI) and represents a step toward building autonomous systems with a higher-level understanding of the visual world as discussed by the authors.

3.1K

•Posted Content

Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments

Ryan Lowe, +5 more

- 07 Jun 2017

- arXiv: Learning

TL;DR: An adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multi-agent coordination is presented.

2.9K

•Journal Article•10.1109/MSP.2017.2743240

A brief survey of deep reinforcement learning

Kai Arulkumaran, +3 more

- 09 Nov 2017

- arXiv: Learning

TL;DR: This survey will cover central algorithms in deep RL, including the deep Q-network (DQN), trust region policy optimization (TRPO), and asynchronous advantage actor critic, and highlight the unique advantages of deep neural networks, focusing on visual understanding via RL.

2.6K

•Proceedings Article•10.1109/CVPR.2017.233

DESIRE: Distant Future Prediction in Dynamic Scenes with Interacting Agents

Namhoon Lee, +5 more

- 14 Apr 2017

TL;DR: The proposed Deep Stochastic IOC RNN Encoder-decoder framework, DESIRE, for the task of future predictions of multiple interacting agents in dynamic scenes significantly improves the prediction accuracy compared to other baseline methods.

1.3K

•Posted Content

Deep Reinforcement Learning: An Overview

Yuxi Li

- 25 Jan 2017

- arXiv: Learning

TL;DR: This work discusses core RL elements, including value function, in particular, Deep Q-Network (DQN), policy, reward, model, planning, and exploration, and important mechanisms for RL, including attention and memory, unsupervised learning, transfer learning, multi-agent RL, hierarchical RL, and learning to learn.

1.2K

References

•Proceedings Article•10.5555/777092.777145

Reinforcement learning of coordination in cooperative multi-agent systems

Spiros Kapetanakis, +1 more

- 28 Jul 2002

TL;DR: This investigation of reinforcement learning techniques for the learning of coordination in cooperative multi-agent systems focuses on a novel action selection strategy for Q-learning (Watkins 1989), and demonstrates empirically that this extension causes the agents to converge almost always to the optimal joint action even in these difficult cases.

236

•Book

Industrial and practical applications of DAI

H. Van Dyke Parunak

- 01 Jan 1999

188

•Report•10.21236/ADA440418

Hierarchical MultiAgent Reinforcement Learning

Mohammad Ghavamzadeh, +1 more

- 25 Jan 2004

TL;DR: The use of hierarchical reinforcement learning (HRL) to speed up the acquisition of cooperative multiagent tasks is investigated and a hierarchical multiagent RL algorithm called Cooperative HRL is proposed.

186

•Journal Article•10.1023/A:1015504423309

Pricing in Agent Economies Using Multi-Agent Q-Learning

Gerald Tesauro, +1 more

- 01 Sep 2002

- Autonomous Agents and Multi-Agent System...

TL;DR: This paper studies simultaneous Q-learning by two competing seller agents in three moderately realistic economic models and finds that, despite the lack of theoretical guarantees, simultaneous convergence to self-consistent optimal solutions is obtained in each model, at least for small values of the discount parameter.

174

•Journal Article•10.1007/S10458-006-7035-4

Hierarchical multi-agent reinforcement learning

Mohammad Ghavamzadeh, +2 more

- 01 Sep 2006

- Autonomous Agents and Multi-Agent System...

TL;DR: The multi-agent HRL framework is extended to include communication decisions and a cooperative multi-agent HRL algorithm called COM-Cooperative HRL is proposed, which allows agents to learn coordination faster by sharing information at the level of cooperative subtasks.

173