Scispace (Formerly Typeset)
Showing papers on "Multi-agent planning" published in 2018
Journal Article•10.1177/0278364918774135•
Simultaneous task allocation and planning for temporal logic goals in heterogeneous multi-robot systems

Philipp Schillinger1,2, Mathias Bürger2, Dimos V. Dimarogonas1•
Royal Institute of Technology1, Bosch2
23 May 2018-The International Journal of Robotics Research
TL;DR: The proposed framework avoids the need to compute a combinatorial number of possible assignment costs, where each computation itself requires solving a complex planning problem, and can improve computational efficiency compared with classical assignment solutions, in particular for on-demand missions where task costs are unknown in advance.
Abstract: This paper describes a framework for automatically generating optimal action-level behavior for a team of robots based on temporal logic mission specifications under resource constraints. The propo...

160 citations

Book Chapter•10.1007/978-3-319-73008-0_18•
Decomposition of Finite LTL Specifications for Efficient Multi-agent Planning

Philipp Schillinger1,2, Mathias Bürger2, Dimos V. Dimarogonas1•
Royal Institute of Technology1, Bosch2
1 Jan 2018
TL;DR: This work proposes an automata-based approach to automatically identify possible decompositions of the LTL specification into sets of independently executable task specifications, which leads directly to the construction of a team model with significantly lower complexity than other representations constructed with conventional methods.
Abstract: Generating verifiably correct execution strategies from Linear Temporal Logic (LTL) mission specifications avoids the need for manually designed robot behaviors. However, when incorporating a team of robot agents, the additional model complexity becomes a critical issue. Given a single finite LTL mission and a team of robots, we propose an automata-based approach to automatically identify possible decompositions of the LTL specification into sets of independently executable task specifications. Our approach leads directly to the construction of a team model with significantly lower complexity than other representations constructed with conventional methods. Thus, it enables efficient search for an optimal decomposition and allocation of tasks to the robot agents.

64 citations

Journal Article•10.1145/3133326•
Quantifying Privacy Leakage in Multi-Agent Planning

Michal Štolba1, Jan Tožička1, Antonín Komenda1•
Czech Technical University in Prague1
05 Feb 2018-ACM Transactions on Internet Technology
TL;DR: This article expands on a privacy measure based on information leakage introduced in previous work, and presents a general approach to computing privacy leakage of search-based multi-agent planners by utilizing search-tree reconstruction and classification of leaked superfluous information about the applicability of actions.
Abstract: Multi-agent planning using MA-STRIPS–related models is often motivated by the preservation of private information. Such a motivation is not only natural for multi-agent systems but also is one of the main reasons multi-agent planning problems cannot be solved with a centralized approach. Although the motivation is common in the literature, the formal treatment of privacy is often missing. In this article, we expand on a privacy measure based on information leakage introduced in previous work, where the leaked information is measured in terms of transition systems represented by the public part of the problem with regard to the information obtained during the planning process. Moreover, we present a general approach to computing privacy leakage of search-based multi-agent planners by utilizing search-tree reconstruction and classification of leaked superfluous information about the applicability of actions. Finally, we present an analysis of the privacy leakage of two well-known algorithms, multi-agent forward search (MAFS) and Secure-MAFS, both in general and on a particular example. The results of the analysis show that Secure-MAFS leaks less information than MAFS.

18 citations

Proceedings Article•10.1109/IECON.2018.8591390•
Intelligent Mechatronic System with Decentralised Control and Multi-Agent Planning

Andrei Kalachev1, Gulnara Zhabelova1, Valeriy Vyatkin1, Dennis Jarvis2, Cheng Pang •
Luleå University of Technology1, Central Queensland University2
26 Dec 2018
TL;DR: The presented study explores an approach in which a MAS realizes high-level coordination tasks while IMCs provide embedded services; the MAS is realized in the GORITE goal-oriented team programming framework, deployed as a web service in the cloud.
Abstract: Flexibility and reconfigurability of production systems require intelligent devices and products that enable easy integration and reconfiguration, eliminating the need to explicitly program the functionality of the resulting system. This led to the development of concepts such as the Intelligent Mechatronic Component (IMC). However, coordinating such distributed self-contained components into the desired logic of operation is a challenging task. A multi-agent system (MAS) architecture provides the necessary features for seamless integration of the individual functionalities of agents into the system's behaviour by self-configuration. The presented study explores an approach where the MAS realizes high-level coordination tasks while IMCs provide embedded services. The MAS is realized in the GORITE goal-oriented team programming framework, deployed as a web service in the cloud. Low-level control of IMCs is developed in IEC 61499. The paper presents a case study of a Pick and Place manipulator composed of intelligent cylinders.

11 citations

Journal Article•10.1007/S10458-017-9372-X•
Severity-sensitive norm-governed multi-agent planning

Luca Gasparini1, Timothy J. Norman2, Martin J. Kollingbaum1•
University of Aberdeen1, University of Southampton2
01 Jan 2018-Autonomous Agents and Multi-Agent Systems
TL;DR: This paper models the normative requirements of a system through contrary-to-duty obligations and violation severity levels, and proposes a novel multi-agent planning mechanism based on Decentralised POMDPs that uses a qualitative reward function to capture levels of compliance: N-Dec-POMDPs.
Abstract: In making practical decisions, agents are expected to comply with ideals of behaviour, or norms. In reality, it may not be possible for an individual, or a team of agents, to be fully compliant—actual behaviour often differs from the ideal. The question we address in this paper is how we can design agents that act in such a way that they select collective strategies to avoid more critical failures (norm violations), and mitigate the effects of violations that do occur. We model the normative requirements of a system through contrary-to-duty obligations and violation severity levels, and propose a novel multi-agent planning mechanism based on Decentralised POMDPs that uses a qualitative reward function to capture levels of compliance: N-Dec-POMDPs. We develop mechanisms for solving this type of multi-agent planning problem and show, through empirical analysis, that joint policies generated are equally as good as those produced through existing methods but with significant reductions in execution time.

11 citations

Journal Article•10.1080/0952813X.2018.1456786•
A privacy-preserving model for multi-agent propositional planning

Andrea Bonisoli1, Alfonso Gerevini1, Alessandro Saetti1, Ivan Serina1•
University of Brescia1
14 Apr 2018-Journal of Experimental and Theoretical Artificial Intelligence
TL;DR: This paper proposes a model of the MA-planning tasks that preserves the privacy of the involved agents when this happens, and investigates an algorithm based on best first search for a model that uses some new heuristics providing a trade-off between accuracy and agents’ privacy.
Abstract: Over the last years, the planning community has formalised several models and approaches to multi-agent (MA) propositional planning. One of the main motivations in MA planning is that some or all a...

8 citations

Book Chapter•10.1007/978-3-030-00111-7_8•
Efficient Auction Based Coordination for Distributed Multi-agent Planning in Temporal Domains Using Resource Abstraction

Andreas Hertle1, Bernhard Nebel1•
University of Freiburg1
24 Sep 2018
TL;DR: This work demonstrates how out of the box temporal planning systems can be employed to increase plan quality for temporal multi-robot tasks and evaluates the approach on two planning domains and finds significant improvements in solution coverage and plan quality.
Abstract: Recent advances in mobile robotics and AI promise to revolutionize industrial production. As autonomous robots are able to solve more complex tasks, the difficulty of integrating various robot skills and coordinating groups of robots increases dramatically. Domain independent planning promises a possible solution. For single robot systems a number of successful demonstrations can be found in scientific literature. However our experiences at the RoboCup Logistics League in 2017 highlighted a severe lack in plan quality when coordinating multiple robots. In this work we demonstrate how out of the box temporal planning systems can be employed to increase plan quality for temporal multi-robot tasks. An abstract plan is generated first and sub-tasks in the plan are auctioned off to robots, which in turn employ planning to solve these tasks and compute bids. We evaluate our approach on two planning domains and find significant improvements in solution coverage and plan quality.

6 citations
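
The auction step described in the Hertle and Nebel abstract above (sub-tasks of an abstract plan are auctioned off, and each robot computes a bid) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Robot` class, the `auction` function, and the Manhattan-distance bid are hypothetical stand-ins. In the paper each robot runs a temporal planner to compute its bid; here a simple travel-time estimate takes the planner's place.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Location = Tuple[int, int]


@dataclass
class Robot:
    """Hypothetical robot model; not taken from the paper."""
    name: str
    position: Location
    busy_until: int = 0                      # time at which current work ends
    assigned: List[str] = field(default_factory=list)

    def bid(self, target: Location) -> int:
        # Stand-in for a planner call: estimated completion time is the
        # robot's current finish time plus Manhattan travel distance.
        travel = abs(self.position[0] - target[0]) + abs(self.position[1] - target[1])
        return self.busy_until + travel


def auction(tasks: List[Tuple[str, Location]], robots: List[Robot]) -> Dict[str, List[str]]:
    """Auction sub-tasks one at a time; the lowest bid wins each round."""
    for task_name, location in tasks:
        bids = {robot.name: robot.bid(location) for robot in robots}
        # Ties are broken by robot name so the outcome is deterministic.
        winner = min(robots, key=lambda robot: (bids[robot.name], robot.name))
        winner.busy_until = bids[winner.name]
        winner.position = location
        winner.assigned.append(task_name)
    return {robot.name: robot.assigned for robot in robots}
```

Because the winner's position and finish time are updated after every round, later bids reflect work already assigned, which is one simple way to keep the resulting schedule balanced.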

Journal Article•10.1016/J.KNOSYS.2018.01.013•
FMAP: A platform for the development of distributed multi-agent planning systems

Alejandro Torreño1, Oscar Sapena1, Eva Onaindia1•
Polytechnic University of Valencia1
01 Apr 2018-Knowledge Based Systems
TL;DR: FMAP is presented, a platform aimed at developing distributed MAP solvers such as MAP-POP, FMAP and MH-FMAP, among others, that make use of multi-agent communication protocols.
Abstract: This work is supported by the Spanish MINECO under projects TIN2014-55637-C2-2-R and TIN2017-88476-C2-1-R. The first author was funded by the Spanish SEPE.

5 citations

Journal Article•10.1007/S10458-018-9394-Z•
Action dependencies in privacy-preserving multi-agent planning

Shlomi Maliah1, Guy Shani1, Roni Stern1•
Ben-Gurion University of the Negev1
07 Aug 2018-Autonomous Agents and Multi-Agent Systems
TL;DR: A dependency-preserving projection is proposed that solves more benchmark problems than any other state-of-the-art privacy-preserving planner, and a novel form of strong privacy, called object-cardinality privacy, is introduced, motivated by real-world requirements.
Abstract: Collaborative privacy-preserving planning (CPPP) is a multi-agent planning task in which agents need to achieve a common set of goals without revealing certain private information. In many CPPP algorithms, the individual agents reason about a projection of the multi-agent problem onto a single-agent classical planning problem. For example, an agent can plan as if it controls the public actions of other agents, ignoring any private preconditions and effects these actions may have, and use the cost of this plan as a heuristic estimate of the cost of the full, multi-agent plan. Using such a projection, however, ignores some dependencies between agents’ public actions. In particular, it does not contain dependencies between public actions of other agents caused by their private facts. We propose a projection in which these private dependencies are maintained. The benefit of our dependency-preserving projection is demonstrated by using it to produce high-level plans in a new privacy-preserving planner, and as a heuristic for guiding forward search privacy-preserving algorithms. Both are able to solve more benchmark problems than any other state-of-the-art privacy-preserving planner. This more informed projection does not explicitly expose any private fact, action, or precondition. In addition, we show that even if an adversary agent knows that an agent has some private objects of a given type (e.g., trucks), it cannot infer the number of such private objects that the agent controls. This introduces a novel form of strong privacy, which we call object-cardinality privacy, that is motivated by real-world requirements.

5 citations

Proceedings Article•10.1145/3287921.3287947•
Formal Verification of ALICA Multi-agent Plans Using Model Checking

Thao Nguyen Van1, Nugroho Fredivianus1, Huu Tam Tran1, Kurt Geihs1, Thi Thanh Binh Huynh2 •
University of Kassel1, Hanoi University of Science and Technology2
6 Dec 2018
TL;DR: This work verifies plans composed in a language called ALICA (A Language for Interactive Cooperative Agents) that controls the agents' behavior by creating a translation tool that implements an algorithm for translating ALICA plans into the format used by the real-time model checker UPPAAL.
Abstract: In multi-agent systems (MAS), plans consisting of sequences of actions are used to accomplish the team task. A critical issue for this approach is avoiding problems such as deadlocks and safety violations. Our recent work addresses that matter by verifying plans composed in a language called ALICA (A Language for Interactive Cooperative Agents) that controls the agents' behavior. The investigation is conducted by creating a translation tool that implements an algorithm for translating ALICA plans into the format used by the real-time model checker UPPAAL. We tested our concept on several cases, and the results are promising for gaining further insight into multi-agent model checking.

4 citations

Dissertation•
A decentralised online multi-agent planning framework for multi-agent systems

Rafael Cauê Cardoso
27 Mar 2018
TL;DR: Experiments with three loosely-coupled planning domains show that DOMAP outperforms four other state-of-the-art multi-agent planners with regard to both planning and execution time, particularly in the most difficult problems.
Abstract: Multi-agent systems often contain dynamic and complex environments where agents’ courses of action (plans) can fail at any moment during execution of the system. Furthermore, new goals can emerge for which there is no known plan available in any of the agents’ plan libraries. Automated planning techniques are well suited to tackle both of these issues. Extensive research has been done in centralised planning for single agents; however, so far multi-agent planning has not been fully explored in practice. Multi-agent platforms typically provide various mechanisms for runtime coordination, which are often required in online planning (i.e., planning during runtime). In this context, decentralised multi-agent planning can be efficient as well as effective, especially in loosely-coupled domains, besides also ensuring important properties in agent systems such as privacy and autonomy. We address this issue by putting forward an approach to online multi-agent planning that combines goal allocation, individual Hierarchical Task Network (HTN) planning, and coordination during runtime in order to support the achievement of social goals in multi-agent systems. In particular, we present a planning and execution framework called Decentralised Online Multi-Agent Planning (DOMAP). Experiments with three loosely-coupled planning domains show that DOMAP outperforms four other state-of-the-art multi-agent planners with regard to both planning and execution time, particularly in the most difficult problems.
Proceedings Article•10.1109/DYSPAN.2018.8610414•
Multi-Agent Planning with Cardinality: Towards Autonomous Enforcement of Spectrum Policies

Maqsood Ahamed Abdul Careem1, Aveek Dutta1, Weifu Wang1•
University at Albany, SUNY1
1 Oct 2018
TL;DR: By estimating the spatial orientation of the agents with a single antenna, the accuracy is improved by 96% over crowdsourcing only, and the scheduling problem is solved with a 3-approximation ratio in polynomial time that exhibits statistically similar performance across a variety of urban locales on multiple continents.
Abstract: The distributed nature of policy violations in spectrum sharing necessitates the use of mobile autonomous agents (e.g., UAVs, self-driving cars, crowdsourcing) to implement cost-effective enforcement systems. We define this problem as Multi-agent Planning with Cardinality (MPC), where Cardinality represents multiple, unique agents visiting each infraction location to collectively improve the accuracy of the enforcement tasks. Designed as a practical and deployable system, our solution leverages crowdsourced information to determine the optimum Cardinality and provide a routing schedule for the agents to achieve the desired level of accuracy of detection and localization at minimum possible cost. We show that by estimating the spatial orientation of the agents with a single antenna, the accuracy is improved by 96% over crowdsourcing only. Using geographical maps as the basis, we solve the scheduling problem with a 3-approximation ratio in polynomial time that exhibits statistically similar performance across a variety of urban locales on multiple continents. The longest path traversed by an agent on average is 1.2km per unit diagonal length of a rectangular geographic area, even when there are twice as many infractions as agents.
Proceedings Article•10.1109/CEC.2018.8477856•
A Multi-agent Planning Model Applied to Teamwork Management

Leonardo Henrique Moreira1, Célia Ghedini Ralha1•
University of Brasília1
1 Jul 2018
TL;DR: This work selected and evaluated the application of the Lightweight Coordination Multi-Agent Planning (LCMAP) model to aid teamwork management and highlighted that LCMAP can be applied as a solution for teamwork management.
Abstract: In multi-agent systems, the decision making process is composed of five stages defined as teamwork phases: potential recognition, team formation, plan formation, team action, and reconfiguration. Furthermore, the management of available resources in order to achieve synergy may require huge efforts from authorities, especially in an environment where resources are heterogeneous and their relationships are complex. Therefore, solutions were prospected in Artificial Intelligence research areas, such as multi-agent planning, in order to find possible solutions for this issue. After a systematic mapping study, some related work was highlighted and checked about the adequacy of supporting teamwork phases. Thus, this work selected and evaluated the application of the Lightweight Coordination Multi-Agent Planning (LCMAP) model to aid teamwork management. The LCMAP characteristics were detailed and discussed mapping them to teamwork phases. LCMAP evaluation was carried out using the wumpus world adapted to represent military teamwork operations. Results highlighted that LCMAP can be applied as a solution for teamwork management.
Proceedings Article•10.1109/ICARSC.2018.8374186•
Heterogeneous multi-agent planning using actuation maps

Tiago Pereira1, Nerea Luis2, António Paulo Moreira1, Daniel Borrajo2, Manuela Veloso3, Susana Fernández2 •
University of Porto1, Charles III University of Madrid2, Carnegie Mellon University3
25 Apr 2018
TL;DR: Experiments show that when information extracted from AMs is provided to the Multi-Agent planner, goal assignment is significantly faster, speeding up the planning process considerably, and this approach greatly outperforms classical centralized planning.
Abstract: Many real-world robotic scenarios require performing task planning to decide courses of actions to be executed by (possibly heterogeneous) robots. A classical centralized planning approach that considers in the same search space all combinations of robots and goals could lead to inefficient solutions that do not scale well. Multi-Agent Planning (MAP) provides a good framework to solve this kind of task efficiently. Some MAP techniques have proposed assigning goals to agents (robots) beforehand so that the planning effort decreases. However, these techniques do not scale when the number of agents and goals grows, as in most real-world scenarios with big maps or goals that cannot be reached by subsets of robots. In this paper we propose to help the computation of which goals should be assigned to each agent by using Actuation Maps (AMs). Given a map, AMs can determine the regions each agent can actuate on. They help alleviate the effort of MAP techniques by identifying which goals can be tackled by each agent, as well as cheaply estimating the cost of using each agent to achieve every goal. Experiments show that when information extracted from AMs is provided to the Multi-Agent planner, goal assignment is significantly faster, speeding up the planning process considerably. Experiments also show that this approach greatly outperforms classical centralized planning.
Proceedings Article•
COORDINATION OF SELF-OPTIMIZING MECHATRONIC SYSTEMS - A New Application for Multi-Agent Planning

Benjamin Klöpper1, Wilhelm Dangelmaier2•
National Institute of Informatics1, University of Paderborn2
8 Sep 2018
TL;DR: This paper introduces the application area of self-optimizing mechatronic systems, identifies the arising coordination problems, and shows that multi-agent technology, and in particular multi-agent planning, can be applied to solve both coordination scenarios.
Abstract: The paradigm of self-optimization introduces flexible and highly adaptive mechatronic systems. During the exploitation of this flexibility, new problems arise. One of these problems is the coordination of mechatronic systems and subsystems. This paper introduces the application area of self-optimizing mechatronic systems and identifies the arising coordination problems. Two main scenarios are identified: coordination of autonomous mechatronic systems and coordination of several subsystems within an autonomous mechatronic system. We will show that multi-agent technology and in particular multi-agent planning can be applied to solve both coordination scenarios.
Posted Content•
Privacy Preserving Multi-Agent Planning with Provable Guarantees.

Amos Beimel, Ronen I. Brafman
31 Oct 2018-arXiv: Artificial Intelligence
TL;DR: A precise notion of secure computation for search-based algorithms is formulated and it is proved that Secure MAFS has this property in all domains.
Abstract: In privacy-preserving multi-agent planning, a group of agents attempt to cooperatively solve a multi-agent planning problem while maintaining private their data and actions. Although much work was carried out in this area in past years, its theoretical foundations have not been fully worked out. Specifically, although algorithms with precise privacy guarantees exist, even their most efficient implementations are not fast enough on realistic instances, whereas for practical algorithms no meaningful privacy guarantees exist. Secure-MAFS, a variant of the multi-agent forward search algorithm (MAFS) is the only practical algorithm to attempt to offer more precise guarantees, but only in very limited settings and with proof sketches only. In this paper we formulate a precise notion of secure computation for search-based algorithms and prove that Secure MAFS has this property in all domains.
Proceedings Article•
Human-UAV Teaming in Dynamic and Uncertain Environments

Alper T. Alan1, Chang Liu1, Elliot Salisbury1, Stephen D. Prior1, Sarvapali D. Ramchurn1, Feng Wu2, Kerry Tatlock3, Gareth Rees3 •
University of Southampton1, University of Science and Technology of China2, MBDA3
9 Jul 2018
TL;DR: In this demonstrator, an algorithm developed for human-agent coordination can be used to coordinate human actors on the ground and unmanned aerial vehicles in a rescue mission.
Abstract: In this demonstrator we show how an algorithm developed for human-agent coordination can be used to coordinate human actors on the ground and unmanned aerial vehicles in a rescue mission. A video can be found here: http://goo.gl/QLQD7q.
Book Chapter•10.1007/978-981-10-8642-7_2•
Agents and Multi-agent Coordination

Pratyusha Rakshit1, Amit Konar1•
Jadavpur University1
1 Jan 2018
TL;DR: This chapter introduces the basic concepts of cooperative and competitive multi-agent coordination, and begins with formal definitions of agency and elaborately discusses the perceptual and learning capability of an agent based on its architecture.
Abstract: An intelligent agent is an entity that performs its task in a given environment by exploiting the knowledge acquired from its interaction with the environment during problem-solving process. Over the past two decades, multi-agent systems have emerged as a new methodology to address the issue of organizing a large-scale system by assembling and coordinating individual agents to achieve a goal jointly. Remarkable features of multi-agent systems resulting in their immense real-world applications include low implementation cost, adaptability with dynamicity of environment, enhanced flexibility, great robustness, and ease of maintenance. A multi-agent system is primarily characterized by goal-oriented coordination among its agents, both in cooperative and in competitive circumstances. This chapter introduces the basic concepts of cooperative and competitive multi-agent coordination. It begins with formal definitions of agency and elaborately discusses the perceptual and learning capability of an agent based on its architecture. Gradually, the chapter unveils the emergence of multi-agent coordination due to handshaking of distributed artificial intelligence and machine intelligence. The chapter then highlights the significance of planning and learning in multi-agent coordination to solve real-world problems. The chapter next demonstrates the scope of evolutionary optimization algorithms to maximize coordination efficiency in multi-agent robotics by optimal utilization of system resources. The chapter ends with a discussion on enhancing the performance of traditional evolutionary optimization algorithms to handle measurement noise in real-world multi-robot coordination problems.
Proceedings Article•
A Hybrid Approach to Planning and Execution in Dynamic Environments Through Hierarchical Task Networks and Behavior Trees

Xenija Neufeld1, Sanaz Mostaghim1, Sandy Brand•
Otto-von-Guericke University Magdeburg1
25 Sep 2018
TL;DR: This work proposes a hybrid approach combining a Hierarchical Task Network planner for high-level planning while delegating low-level decision making and acting to Behavior Trees, and compares this approach with a pure planner in a multi-agent environment.
Abstract: Intelligent autonomous agents that are acting in dynamic environments in real-time are often required to follow long-term strategies while also remaining reactive and being able to act deliberately. In order to create intelligent behaviors for video game characters, there are two common approaches: planners are used for long-term strategical planning, whereas Behavior Trees allow for reactive acting. Although both methodologies have their advantages, when used on their own, they fail to fully achieve both requirements described above. In this work, we propose a hybrid approach combining a Hierarchical Task Network planner for high-level planning while delegating low-level decision making and acting to Behavior Trees. Furthermore, we compare this approach with a pure planner in a multi-agent environment.
Proceedings Article•
Leveraging Statistical Multi-Agent Online Planning with Emergent Value Function Approximation

Thomy Phan1, Lenz Belzner1, Thomas Gabor1, Kyrill Schmid1•
Ludwig Maximilian University of Munich1
9 Jul 2018
TL;DR: This paper proposes Emergent Value function Approximation for Distributed Environments (EVADE), an approach to integrate global experience into multi-agent online planning in stochastic domains to consider global effects during local planning.
Abstract: Making decisions is a great challenge in distributed autonomous environments due to enormous state spaces and uncertainty. Many online planning algorithms rely on statistical sampling to avoid searching the whole state space, while still being able to make acceptable decisions. However, planning often has to be performed under strict computational constraints making online planning in multi-agent systems highly limited, which could lead to poor system performance, especially in stochastic domains. In this paper, we propose Emergent Value function Approximation for Distributed Environments (EVADE), an approach to integrate global experience into multi-agent online planning in stochastic domains to consider global effects during local planning. For this purpose, a value function is approximated online based on the emergent system behaviour by using methods of reinforcement learning. We empirically evaluated EVADE with two statistical multi-agent online planning algorithms in a highly complex and stochastic smart factory environment, where multiple agents need to process various items at a shared set of machines. Our experiments show that EVADE can effectively improve the performance of multi-agent online planning while offering efficiency w.r.t. the breadth and depth of the planning process.
Proceedings Article•
Credit Assignment For Collective Multiagent RL With Global Rewards

Duc Thien Nguyen1, Akshat Kumar1, Hoong Chuin Lau1•
Singapore Management University1
1 Jan 2018
TL;DR: This work develops collective actor-critic RL approaches for this setting, and addresses the problem of multiagent credit assignment, and computing low variance policy gradient estimates that result in faster convergence to high quality solutions.
Abstract: Scaling decision theoretic planning to large multiagent systems is challenging due to uncertainty and partial observability in the environment. We focus on a multiagent planning model subclass, relevant to urban settings, where agent interactions are dependent on their "collective influence" on each other, rather than their identities. Unlike previous work, we address a general setting where system reward is not decomposable among agents. We develop collective actor-critic RL approaches for this setting, and address the problem of multiagent credit assignment, and computing low variance policy gradient estimates that result in faster convergence to high quality solutions. We also develop difference rewards based credit assignment methods for the collective setting. Empirically our new approaches provide significantly better solutions than previous methods in the presence of global rewards on two real world problems modeling taxi fleet optimization and multiagent patrolling, and a synthetic grid navigation domain.
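
The Nguyen, Kumar and Lau abstract above mentions difference-rewards-based credit assignment. As a rough illustration of the general idea only (not the collective actor-critic method of the paper), the classic difference reward credits agent i with the global reward minus the global reward obtained when agent i is removed. The zone-capacity reward and the function names below are hypothetical and chosen so that the reward depends only on how many agents pick each zone, not on who they are.

```python
from collections import Counter
from typing import Dict, List


def global_reward(actions: List[str], capacity: Dict[str, int]) -> int:
    """Hypothetical global reward: each zone serves at most `capacity` agents.

    The reward depends only on the count of agents per zone, not on agent
    identities, and it does not split into independent per-agent terms.
    """
    counts = Counter(actions)
    return sum(min(n, capacity.get(zone, 0)) for zone, n in counts.items())


def difference_rewards(actions: List[str], capacity: Dict[str, int]) -> List[int]:
    """Credit each agent with G(all agents) - G(all agents except that one)."""
    total = global_reward(actions, capacity)
    rewards = []
    for i in range(len(actions)):
        without_i = actions[:i] + actions[i + 1:]   # counterfactual: agent i absent
        rewards.append(total - global_reward(without_i, capacity))
    return rewards
```

With two agents crowding a zone of capacity one, removing either of them leaves the global reward unchanged, so both receive zero credit, while an agent alone in its zone receives the full unit it contributes.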
