Top 21 papers published in the topic of Multi-agent planning in 2018

Showing papers on "Multi-agent planning" published in 2018

Journal Article•10.1177/0278364918774135•

Simultaneous task allocation and planning for temporal logic goals in heterogeneous multi-robot systems

Philipp Schillinger¹,², Mathias Bürger², Dimos V. Dimarogonas¹•Institutions (2)

23 May 2018-The International Journal of Robotics Research

TL;DR: The proposed framework avoids the need to compute a combinatorial number of possible assignment costs, where each computation itself requires solving a complex planning problem, and can improve computational efficiency compared with classical assignment solutions, in particular for on-demand missions where task costs are unknown in advance.

Abstract: This paper describes a framework for automatically generating optimal action-level behavior for a team of robots based on temporal logic mission specifications under resource constraints. The propo...

160 citations

Book Chapter•10.1007/978-3-319-73008-0_18•

Decomposition of Finite LTL Specifications for Efficient Multi-agent Planning

Philipp Schillinger¹,², Mathias Bürger², Dimos V. Dimarogonas¹•Institutions (2)

Royal Institute of Technology¹, Bosch²

1 Jan 2018

TL;DR: This work proposes an automata-based approach to automatically identify possible decompositions of the LTL specification into sets of independently executable task specifications, which leads directly to the construction of a team model with significantly lower complexity than other representations constructed with conventional methods.

Abstract: Generating verifiably correct execution strategies from Linear Temporal Logic (LTL) mission specifications avoids the need for manually designed robot behaviors. However, when incorporating a team of robot agents, the additional model complexity becomes a critical issue. Given a single finite LTL mission and a team of robots, we propose an automata-based approach to automatically identify possible decompositions of the LTL specification into sets of independently executable task specifications. Our approach leads directly to the construction of a team model with significantly lower complexity than other representations constructed with conventional methods. Thus, it enables efficient search for an optimal decomposition and allocation of tasks to the robot agents.

64 citations

Journal Article•10.1145/3133326•

Quantifying Privacy Leakage in Multi-Agent Planning

Michal Štolba¹, Jan Tožička¹, Antonín Komenda¹•Institutions (1)

Czech Technical University in Prague¹

05 Feb 2018-ACM Transactions on Internet Technology

TL;DR: This article expands on a privacy measure based on information leakage introduced in previous work, and presents a general approach to computing privacy leakage of search-based multi-agent planners by utilizing search-tree reconstruction and classification of leaked superfluous information about the applicability of actions.

Abstract: Multi-agent planning using MA-STRIPS–related models is often motivated by the preservation of private information. Such a motivation is not only natural for multi-agent systems but also is one of the main reasons multi-agent planning problems cannot be solved with a centralized approach. Although the motivation is common in the literature, the formal treatment of privacy is often missing. In this article, we expand on a privacy measure based on information leakage introduced in previous work, where the leaked information is measured in terms of transition systems represented by the public part of the problem with regard to the information obtained during the planning process. Moreover, we present a general approach to computing privacy leakage of search-based multi-agent planners by utilizing search-tree reconstruction and classification of leaked superfluous information about the applicability of actions. Finally, we present an analysis of the privacy leakage of two well-known algorithms—multi-agent forward search (MAFS) and Secure-MAFS—both in general and on a particular example. The results of the analysis show that Secure-MAFS leaks less information than MAFS.

18 citations

Proceedings Article•10.1109/IECON.2018.8591390•

Intelligent Mechatronic System with Decentralised Control and Multi-Agent Planning

Andrei Kalachev¹, Gulnara Zhabelova¹, Valeriy Vyatkin¹, Dennis Jarvis², Cheng Pang•Institutions (2)

Luleå University of Technology¹, Central Queensland University²

26 Dec 2018

TL;DR: The presented study explores an approach where a MAS realizes high-level coordination tasks while IMCs provide embedded services; the MAS is realized in the GORITE goal-oriented team programming framework, deployed as a web service in the cloud.

Abstract: Flexibility and reconfigurability of production systems require intelligent devices and products that enable easy integration and reconfiguration, eliminating the need to explicitly program the functionality of the resulting system. This led to the development of concepts such as the Intelligent Mechatronic Component (IMC). However, coordinating such distributed self-contained components into the desired logic of operation is a challenging task. A multi-agent system (MAS) architecture provides the necessary features for seamless integration of the individual functionalities of agents into the system's behaviour by self-configuration. The presented study explores an approach where the MAS realizes high-level coordination tasks while IMCs provide embedded services. The MAS is realized in the GORITE goal-oriented team programming framework, deployed as a web service in the cloud. Low-level control of IMCs is developed in IEC 61499. The paper presents a case study of a Pick and Place manipulator composed of intelligent cylinders.

11 citations

Journal Article•10.1007/S10458-017-9372-X•

Severity-sensitive norm-governed multi-agent planning

Luca Gasparini¹, Timothy J. Norman², Martin J. Kollingbaum¹•Institutions (2)

University of Aberdeen¹, University of Southampton²

01 Jan 2018-Autonomous Agents and Multi-Agent Systems

TL;DR: This paper models the normative requirements of a system through contrary-to-duty obligations and violation severity levels, and proposes a novel multi-agent planning mechanism based on Decentralised POMDPs that uses a qualitative reward function to capture levels of compliance: N-Dec-POMDPs.

Abstract: In making practical decisions, agents are expected to comply with ideals of behaviour, or norms. In reality, it may not be possible for an individual, or a team of agents, to be fully compliant—actual behaviour often differs from the ideal. The question we address in this paper is how we can design agents that act in such a way that they select collective strategies to avoid more critical failures (norm violations), and mitigate the effects of violations that do occur. We model the normative requirements of a system through contrary-to-duty obligations and violation severity levels, and propose a novel multi-agent planning mechanism based on Decentralised POMDPs that uses a qualitative reward function to capture levels of compliance: N-Dec-POMDPs. We develop mechanisms for solving this type of multi-agent planning problem and show, through empirical analysis, that joint policies generated are equally as good as those produced through existing methods but with significant reductions in execution time.

11 citations

Journal Article•10.1080/0952813X.2018.1456786•

A privacy-preserving model for multi-agent propositional planning

Andrea Bonisoli¹, Alfonso Gerevini¹, Alessandro Saetti¹, Ivan Serina¹•Institutions (1)

University of Brescia¹

14 Apr 2018-Journal of Experimental and Theoretical Artificial Intelligence

TL;DR: This paper proposes a model of the MA-planning tasks that preserves the privacy of the involved agents when this happens, and investigates an algorithm based on best first search for a model that uses some new heuristics providing a trade-off between accuracy and agents’ privacy.

Abstract: Over the last years, the planning community has formalised several models and approaches to multi-agent (MA) propositional planning. One of the main motivations in MA planning is that some or all a...

8 citations

Book Chapter•10.1007/978-3-030-00111-7_8•

Efficient Auction Based Coordination for Distributed Multi-agent Planning in Temporal Domains Using Resource Abstraction

Andreas Hertle¹, Bernhard Nebel¹•Institutions (1)

University of Freiburg¹

24 Sep 2018

TL;DR: This work demonstrates how out of the box temporal planning systems can be employed to increase plan quality for temporal multi-robot tasks and evaluates the approach on two planning domains and finds significant improvements in solution coverage and plan quality.

Abstract: Recent advances in mobile robotics and AI promise to revolutionize industrial production. As autonomous robots are able to solve more complex tasks, the difficulty of integrating various robot skills and coordinating groups of robots increases dramatically. Domain independent planning promises a possible solution. For single robot systems a number of successful demonstrations can be found in scientific literature. However our experiences at the RoboCup Logistics League in 2017 highlighted a severe lack in plan quality when coordinating multiple robots. In this work we demonstrate how out of the box temporal planning systems can be employed to increase plan quality for temporal multi-robot tasks. An abstract plan is generated first and sub-tasks in the plan are auctioned off to robots, which in turn employ planning to solve these tasks and compute bids. We evaluate our approach on two planning domains and find significant improvements in solution coverage and plan quality.
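The auction step described in this abstract (sub-tasks of an abstract plan are offered to robots, which compute bids) can be illustrated with a minimal sketch. It is not the authors' implementation: the `Robot` class, the 1-D positions, and the bid formula (earliest completion time at unit speed) are all assumptions standing in for a real temporal planner call.

```python
from dataclasses import dataclass, field


@dataclass
class Robot:
    name: str
    position: float               # hypothetical 1-D location of the robot
    busy_until: float = 0.0       # time at which the robot becomes free
    assigned: list = field(default_factory=list)

    def bid(self, location: float, duration: float) -> float:
        """Earliest completion time this robot can offer for a sub-task.

        Assumption: stands in for a planner call; travel at unit speed.
        """
        travel = abs(self.position - location)
        return self.busy_until + travel + duration


def auction(subtasks, robots):
    """Sequential single-item auction: each sub-task goes to the lowest bid."""
    allocation = {}
    for task_name, location, duration in subtasks:
        winner = min(robots, key=lambda r: r.bid(location, duration))
        finish = winner.bid(location, duration)
        # The winner commits to the task, which changes its future bids.
        winner.busy_until = finish
        winner.position = location
        winner.assigned.append(task_name)
        allocation[task_name] = (winner.name, finish)
    return allocation


if __name__ == "__main__":
    robots = [Robot("r1", 0.0), Robot("r2", 10.0)]
    subtasks = [("fetch", 2.0, 1.0), ("deliver", 9.0, 1.0), ("store", 3.0, 2.0)]
    for task, (robot, finish) in auction(subtasks, robots).items():
        print(task, robot, finish)
```

Because each winner's state is updated before the next sub-task is offered, later bids reflect earlier commitments; a real system would replace `bid` with a planner query, as the abstract describes.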

6 citations

Journal Article•10.1016/J.KNOSYS.2018.01.013•

FMAP: A platform for the development of distributed multi-agent planning systems

Alejandro Torreño¹, Oscar Sapena¹, Eva Onaindia¹•Institutions (1)

Polytechnic University of Valencia¹

01 Apr 2018-Knowledge Based Systems

TL;DR: FMAP is presented, a platform aimed at developing distributed MAP solvers such as MAP-POP, FMAP and MH-FMAP, among others, that make use of multi-agent communication protocols.

Abstract: This work is supported by the Spanish MINECO under projects TIN2014-55637-C2-2-R and TIN2017-88476-C2-1-R. The first author was funded by the Spanish SEPE.

5 citations

Journal Article•10.1007/S10458-018-9394-Z•

Action dependencies in privacy-preserving multi-agent planning

Shlomi Maliah¹, Guy Shani¹, Roni Stern¹•Institutions (1)

Ben-Gurion University of the Negev¹

07 Aug 2018-Autonomous Agents and Multi-Agent Systems

TL;DR: A novel form of strong privacy, called object-cardinality privacy and motivated by real-world requirements, is introduced, and the proposed dependency-preserving projection is able to solve more benchmark problems than any other state-of-the-art privacy-preserving planner.

Abstract: Collaborative privacy-preserving planning (CPPP) is a multi-agent planning task in which agents need to achieve a common set of goals without revealing certain private information. In many CPPP algorithms, the individual agents reason about a projection of the multi-agent problem onto a single-agent classical planning problem. For example, an agent can plan as if it controls the public actions of other agents, ignoring any private preconditions and effects these actions may have, and use the cost of this plan as a heuristic estimate of the cost of the full, multi-agent plan. Using such a projection, however, ignores some dependencies between agents’ public actions. In particular, it does not contain dependencies between public actions of other agents caused by their private facts. We propose a projection in which these private dependencies are maintained. The benefit of our dependency-preserving projection is demonstrated by using it to produce high-level plans in a new privacy-preserving planner, and as a heuristic for guiding forward search privacy-preserving algorithms. Both are able to solve more benchmark problems than any other state-of-the-art privacy-preserving planner. This more informed projection does not explicitly expose any private fact, action, or precondition. In addition, we show that even if an adversary agent knows that an agent has some private objects of a given type (e.g., trucks), it cannot infer the number of such private objects that the agent controls. This introduces a novel form of strong privacy, which we call object-cardinality privacy, that is motivated by real-world requirements.

5 citations

Proceedings Article•10.1145/3287921.3287947•

Formal Verification of ALICA Multi-agent Plans Using Model Checking

Thao Nguyen Van¹, Nugroho Fredivianus¹, Huu Tam Tran¹, Kurt Geihs¹, Thi Thanh Binh Huynh²•Institutions (2)

University of Kassel¹, Hanoi University of Science and Technology²

6 Dec 2018

TL;DR: This work verifies plans composed in a language called ALICA (A Language for Interactive Cooperative Agents) that controls the agents' behavior by creating a translation tool that implements an algorithm for translating ALICA plans into the format used by the real-time model checker UPPAAL.

Abstract: In multi-agent systems (MAS), plans consisting of sequences of actions are used to accomplish the team task. A critical issue for this approach is avoiding problems such as deadlocks and safety violations. Our recent work addresses that matter by verifying plans composed in a language called ALICA (A Language for Interactive Cooperative Agents) that controls the agents' behavior. The investigation is conducted by creating a translation tool that implements an algorithm for translating ALICA plans into the format used by the real-time model checker UPPAAL. We tested our concept using several cases, and the results are promising for gaining further insight into multi-agent model checking.

4 citations

Dissertation•

A decentralised online multi-agent planning framework for multi-agent systems

Rafael Cauê Cardoso

27 Mar 2018

TL;DR: Experiments with three loosely-coupled planning domains show that DOMAP outperforms four other state-of-the-art multi agent planners with regards to both planning and execution time, particularly in the most difficult problems.

Abstract: Multi-agent systems often contain dynamic and complex environments where agents’ course of action (plans) can fail at any moment during execution of the system. Furthermore, new goals can emerge for which there is no known plan available in any of the agents’ plan library. Automated planning techniques are well suited to tackle both of these issues. Extensive research has been done in centralised planning for single agents, however, so far multi-agent planning has not been fully explored in practice. Multi-agent platforms typically provide various mechanisms for runtime coordination, which are often required in online planning (i.e., planning during runtime). In this context, decentralised multi-agent planning can be efficient as well as effective, especially in loosely-coupled domains, besides also ensuring important properties in agent systems such as privacy and autonomy. We address this issue by putting forward an approach to online multi-agent planning that combines goal allocation, individual Hierarchical Task Network (HTN) planning, and coordination during runtime in order to support the achievement of social goals in multi-agent systems. In particular, we present a planning and execution framework called Decentralised Online Multi-Agent Planning (DOMAP). Experiments with three loosely-coupled planning domains show that DOMAP outperforms four other state-of-the-art multi-agent planners with regards to both planning and execution time, particularly in the most difficult problems.

Proceedings Article•10.1109/DYSPAN.2018.8610414•

Multi-Agent Planning with Cardinality: Towards Autonomous Enforcement of Spectrum Policies

Maqsood Ahamed Abdul Careem¹, Aveek Dutta¹, Weifu Wang¹•Institutions (1)

University at Albany, SUNY¹

1 Oct 2018

TL;DR: By estimating spatial orientation of the agents with single antenna, the accuracy is improved by 96% over crowdsourcing only and the scheduling problem is solved with a 3-approximation ratio in polynomial time that exhibits statistically similar performance under variety of urban locale across multiple continents.

Abstract: The distributed nature of policy violations in spectrum sharing necessitates the use of mobile autonomous agents (e.g., UAVs, self-driving cars, crowdsourcing) to implement cost-effective enforcement systems. We define this problem as Multi-agent Planning with Cardinality (MPC), where Cardinality represents multiple, unique agents visiting each infraction location to collectively improve the accuracy of the enforcement tasks. Designed as a practical and deployable system, our solution leverages crowdsourced information to determine the optimum Cardinality and provide a routing schedule for the agents to achieve the desired level of accuracy of detection and localization at minimum possible cost. We show that by estimating spatial orientation of the agents with a single antenna, the accuracy is improved by 96% over crowdsourcing only. Using geographical maps as the basis, we solve the scheduling problem with a 3-approximation ratio in polynomial time that exhibits statistically similar performance under a variety of urban locales across multiple continents. The longest path traversed by an agent on average is 1.2 km per unit diagonal length of a rectangular geographic area, even when there are twice as many infractions as agents.

Proceedings Article•10.1109/CEC.2018.8477856•

A Multi-agent Planning Model Applied to Teamwork Management

Leonardo Henrique Moreira¹, Célia Ghedini Ralha¹•Institutions (1)

University of Brasília¹

1 Jul 2018

TL;DR: This work selected and evaluated the application of the Lightweight Coordination Multi-Agent Planning (LCMAP) model to aid teamwork management and highlighted that LCMAP can be applied as a solution for teamwork management.

Abstract: In multi-agent systems, the decision making process is composed of five stages defined as teamwork phases: potential recognition, team formation, plan formation, team action, and reconfiguration. Furthermore, the management of available resources in order to achieve synergy may require huge efforts from authorities, especially in an environment where resources are heterogeneous and their relationships are complex. Therefore, solutions were prospected in Artificial Intelligence research areas, such as multi-agent planning, in order to find possible solutions for this issue. After a systematic mapping study, some related work was highlighted and checked about the adequacy of supporting teamwork phases. Thus, this work selected and evaluated the application of the Lightweight Coordination Multi-Agent Planning (LCMAP) model to aid teamwork management. The LCMAP characteristics were detailed and discussed mapping them to teamwork phases. LCMAP evaluation was carried out using the wumpus world adapted to represent military teamwork operations. Results highlighted that LCMAP can be applied as a solution for teamwork management.

Proceedings Article•10.1109/ICARSC.2018.8374186•

Heterogeneous multi-agent planning using actuation maps

Tiago Pereira¹, Nerea Luis², António Paulo Moreira¹, Daniel Borrajo², Manuela Veloso³, Susana Fernández²•Institutions (3)

University of Porto¹, Charles III University of Madrid², Carnegie Mellon University³

25 Apr 2018

TL;DR: Experiments show that when information extracted from AMs is provided to the MultiAgent planner, goal assignment is significantly faster, speeding-up the planning process considerably, and this approach greatly outperforms classical centralized planning.

Abstract: Many real-world robotic scenarios require performing task planning to decide courses of actions to be executed by (possibly heterogeneous) robots. A classical centralized planning approach that considers in the same search space all combinations of robots and goals could lead to inefficient solutions that do not scale well. Multi-Agent Planning (MAP) provides a good framework to solve this kind of tasks efficiently. Some MAP techniques have proposed to previously assign goals to agents (robots) so that the planning effort decreases. However, these techniques do not scale when the number of agents and goals grow, as in most real world scenarios with big maps or goals that cannot be reached by subsets of robots. In this paper we propose to help the computation of which goals should be assigned to each agent by using Actuation Maps (AMs). Given a map, AMs can determine the regions each agent can actuate on. They help on alleviating the effort of MAP techniques knowing which goals can be tackled by each agent, as well as cheaply estimating the cost of using each agent to achieve every goal. Experiments show that when information extracted from AMs is provided to the MultiAgent planner, goal assignment is significantly faster, speeding-up the planning process considerably. Experiments also show that this approach greatly outperforms classical centralized planning.
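The idea of using a per-robot map of reachable regions to decide which goals each agent can tackle, and at what estimated cost, can be sketched as follows. This is only an illustration under stated assumptions, not the paper's Actuation Map construction: the clearance grid, the robot radii, and the breadth-first cost estimate are all hypothetical simplifications.

```python
from collections import deque


def reachable_costs(clearance, start, radius):
    """Breadth-first search over cells wide enough for the robot.

    Assumption: a cell is traversable when its clearance value is at least
    the robot's radius. Returns {cell: number of steps from start}.
    """
    rows, cols = len(clearance), len(clearance[0])
    costs = {start: 0}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in costs
                    and clearance[nr][nc] >= radius):
                costs[(nr, nc)] = costs[(r, c)] + 1
                queue.append((nr, nc))
    return costs


def assign_goals(clearance, robots, goals):
    """Give each goal to the cheapest robot that can reach it, else None."""
    maps = {name: reachable_costs(clearance, start, radius)
            for name, (start, radius) in robots.items()}
    assignment = {}
    for goal in goals:
        candidates = [(maps[name][goal], name) for name in maps if goal in maps[name]]
        assignment[goal] = min(candidates)[1] if candidates else None
    return assignment


if __name__ == "__main__":
    # 0 marks an obstacle; column 2 is a corridor too narrow for the big robot.
    clearance = [
        [2, 2, 1, 2],
        [2, 0, 1, 2],
        [2, 2, 1, 2],
    ]
    robots = {"big": ((0, 0), 2), "small": ((0, 3), 1)}
    print(assign_goals(clearance, robots, [(2, 1), (2, 3), (1, 1)]))
```

Goals that no robot can reach are flagged before any planner runs, which is the kind of pruning the abstract credits for faster goal assignment.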

Proceedings Article•

COORDINATION OF SELF-OPTIMIZING MECHATRONIC SYSTEMS - A New Application for Multi-Agent Planning

Benjamin Klöpper¹, Wilhelm Dangelmaier²•Institutions (2)

National Institute of Informatics¹, University of Paderborn²

8 Sep 2018

TL;DR: This paper introduces the application area self-optimizing mechatronic systems and identifies the arising coordination problems and shows that multi-agent technology and in particular multi-agent planning can be applied to solve both coordination scenarios.

Abstract: The paradigm of self-optimization introduces flexible and highly adaptive mechatronic systems. During the exploitation of this flexibility, new problems arise. One of these problems is the coordination of mechatronic systems and subsystems. This paper introduces the application area self-optimizing mechatronic systems and identifies the arising coordination problems. Two main scenarios are identified: coordination of autonomous mechatronic systems and coordination of several subsystems within an autonomous mechatronic system. We will show that multi-agent technology and in particular multi-agent planning can be applied to solve both coordination scenarios.

Posted Content•

Privacy Preserving Multi-Agent Planning with Provable Guarantees.

Amos Beimel, Ronen I. Brafman

31 Oct 2018-arXiv: Artificial Intelligence

TL;DR: A precise notion of secure computation for search-based algorithms is formulated and it is proved that Secure MAFS has this property in all domains.

Abstract: In privacy-preserving multi-agent planning, a group of agents attempt to cooperatively solve a multi-agent planning problem while keeping their data and actions private. Although much work was carried out in this area in past years, its theoretical foundations have not been fully worked out. Specifically, although algorithms with precise privacy guarantees exist, even their most efficient implementations are not fast enough on realistic instances, whereas for practical algorithms no meaningful privacy guarantees exist. Secure-MAFS, a variant of the multi-agent forward search algorithm (MAFS), is the only practical algorithm to attempt to offer more precise guarantees, but only in very limited settings and with proof sketches only. In this paper we formulate a precise notion of secure computation for search-based algorithms and prove that Secure MAFS has this property in all domains.

Proceedings Article•

Human-UAV Teaming in Dynamic and Uncertain Environments

Alper T. Alan¹, Chang Liu¹, Elliot Salisbury¹, Stephen D. Prior¹, Sarvapali D. Ramchurn¹, Feng Wu², Kerry Tatlock³, Gareth Rees³•Institutions (3)

University of Southampton¹, University of Science and Technology of China², MBDA³

9 Jul 2018

TL;DR: In this demonstrator, an algorithm developed for human-agent coordination can be used to coordinate human actors on the ground and unmanned aerial vehicles in a rescue mission.

Abstract: In this demonstrator we show how an algorithm developed for human-agent coordination can be used to coordinate human actors on the ground and unmanned aerial vehicles in a rescue mission. A video can be found here: http://goo.gl/QLQD7q.

Book Chapter•10.1007/978-981-10-8642-7_2•

Agents and Multi-agent Coordination

Pratyusha Rakshit¹, Amit Konar¹•Institutions (1)

Jadavpur University¹

1 Jan 2018

TL;DR: This chapter introduces the basic concepts of cooperative and competitive multi-agent coordination, and begins with formal definitions of agency and elaborately discusses the perceptual and learning capability of an agent based on its architecture.

Abstract: An intelligent agent is an entity that performs its task in a given environment by exploiting the knowledge acquired from its interaction with the environment during problem-solving process. Over the past two decades, multi-agent systems have emerged as a new methodology to address the issue of organizing a large-scale system by assembling and coordinating individual agents to achieve a goal jointly. Remarkable features of multi-agent systems resulting in their immense real-world applications include low implementation cost, adaptability with dynamicity of environment, enhanced flexibility, great robustness, and ease of maintenance. A multi-agent system is primarily characterized by goal-oriented coordination among its agents, both in cooperative and in competitive circumstances. This chapter introduces the basic concepts of cooperative and competitive multi-agent coordination. It begins with formal definitions of agency and elaborately discusses the perceptual and learning capability of an agent based on its architecture. Gradually, the chapter unveils the emergence of multi-agent coordination due to handshaking of distributed artificial intelligence and machine intelligence. The chapter then highlights the significance of planning and learning in multi-agent coordination to solve real-world problems. The chapter next demonstrates the scope of evolutionary optimization algorithms to maximize coordination efficiency in multi-agent robotics by optimal utilization of system resources. The chapter ends with a discussion on enhancing the performance of traditional evolutionary optimization algorithms to handle measurement noise in real-world multi-robot coordination problems.

Proceedings Article•

A Hybrid Approach to Planning and Execution in Dynamic Environments Through Hierarchical Task Networks and Behavior Trees

Xenija Neufeld¹, Sanaz Mostaghim¹, Sandy Brand•Institutions (1)

Otto-von-Guericke University Magdeburg¹

25 Sep 2018

TL;DR: This work proposes a hybrid approach combining a Hierarchical Task Network planner for high-level planning while delegating low-level decision making and acting to Behavior Trees, and compares this approach with a pure planner in a multi-agent environment.

Abstract: Intelligent autonomous agents that are acting in dynamic environments in real-time are often required to follow long-term strategies while also remaining reactive and being able to act deliberately. In order to create intelligent behaviors for video game characters, there are two common approaches – planners are used for long-term strategical planning, whereas Behavior Trees allow for reactive acting. Although both methodologies have their advantages, when used on their own, they fail to fully achieve both requirements described above. In this work, we propose a hybrid approach combining a Hierarchical Task Network planner for high-level planning while delegating low-level decision making and acting to Behavior Trees. Furthermore, we compare this approach with a pure planner in a multi-agent environment.

Proceedings Article•

Leveraging Statistical Multi-Agent Online Planning with Emergent Value Function Approximation

Thomy Phan¹, Lenz Belzner¹, Thomas Gabor¹, Kyrill Schmid¹•Institutions (1)

Ludwig Maximilian University of Munich¹

9 Jul 2018

TL;DR: This paper proposes Emergent Value function Approximation for Distributed Environments (EVADE), an approach to integrate global experience into multi-agent online planning in stochastic domains to consider global effects during local planning.

Abstract: Making decisions is a great challenge in distributed autonomous environments due to enormous state spaces and uncertainty. Many online planning algorithms rely on statistical sampling to avoid searching the whole state space, while still being able to make acceptable decisions. However, planning often has to be performed under strict computational constraints making online planning in multi-agent systems highly limited, which could lead to poor system performance, especially in stochastic domains. In this paper, we propose Emergent Value function Approximation for Distributed Environments (EVADE), an approach to integrate global experience into multi-agent online planning in stochastic domains to consider global effects during local planning. For this purpose, a value function is approximated online based on the emergent system behaviour by using methods of reinforcement learning. We empirically evaluated EVADE with two statistical multi-agent online planning algorithms in a highly complex and stochastic smart factory environment, where multiple agents need to process various items at a shared set of machines. Our experiments show that EVADE can effectively improve the performance of multi-agent online planning while offering efficiency w.r.t. the breadth and depth of the planning process.

Proceedings Article•

Credit Assignment For Collective Multiagent RL With Global Rewards

Duc Thien Nguyen¹, Akshat Kumar¹, Hoong Chuin Lau¹•Institutions (1)

Singapore Management University¹

1 Jan 2018

TL;DR: This work develops collective actor-critic RL approaches for this setting, and addresses the problem of multiagent credit assignment, and computing low variance policy gradient estimates that result in faster convergence to high quality solutions.

Abstract: Scaling decision theoretic planning to large multiagent systems is challenging due to uncertainty and partial observability in the environment. We focus on a multiagent planning model subclass, relevant to urban settings, where agent interactions are dependent on their "collective influence" on each other, rather than their identities. Unlike previous work, we address a general setting where system reward is not decomposable among agents. We develop collective actor-critic RL approaches for this setting, and address the problem of multiagent credit assignment, and computing low variance policy gradient estimates that result in faster convergence to high quality solutions. We also develop difference rewards based credit assignment methods for the collective setting. Empirically our new approaches provide significantly better solutions than previous methods in the presence of global rewards on two real world problems modeling taxi fleet optimization and multiagent patrolling, and a synthetic grid navigation domain.
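For readers unfamiliar with difference rewards, the generic idea (not this paper's collective variant) is to credit each agent with the change in the global reward when that agent is removed. The sketch below is a toy illustration only; the zone names, the demand figures, and the "served demand" global reward are assumptions loosely inspired by the taxi setting mentioned in the abstract.

```python
from collections import Counter


def global_reward(choices, demand):
    """Served demand: each zone serves at most as many requests as it has."""
    counts = Counter(choices)
    return sum(min(counts[zone], demand[zone]) for zone in demand)


def difference_rewards(choices, demand):
    """Credit for agent i: global reward minus the reward without agent i."""
    total = global_reward(choices, demand)
    return [
        total - global_reward(choices[:i] + choices[i + 1:], demand)
        for i in range(len(choices))
    ]


if __name__ == "__main__":
    demand = {"A": 2, "B": 3}            # hypothetical requests per zone
    choices = ["A", "A", "A", "B"]       # zone chosen by each of four agents
    print(global_reward(choices, demand))
    print(difference_rewards(choices, demand))
```

In this toy case zone A is oversubscribed, so removing any single agent from A leaves the global reward unchanged and those agents receive zero credit, while the lone agent in B receives the full marginal gain.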
