Collaborative Evolutionary Reinforcement Learning

Open Access • Posted Content

- 02 May 2019

39

TL;DR: In this paper, the authors introduce Collaborative Evolutionary Reinforcement Learning (CERL), a scalable framework that comprises a portfolio of policies that simultaneously explore and exploit diverse regions of the solution space.

Abstract: Deep reinforcement learning algorithms have been successfully applied to a range of challenging control tasks. However, these methods typically struggle to explore effectively and are extremely sensitive to the choice of hyperparameters. One reason is that most approaches explore with a noisy version of their operating policy, which limits the range of exploration. In this paper, we introduce Collaborative Evolutionary Reinforcement Learning (CERL), a scalable framework that comprises a portfolio of policies that simultaneously explore and exploit diverse regions of the solution space. A collection of learners, typically proven algorithms like TD3, optimize over varying time horizons, which produces this diverse portfolio. All learners contribute to and use a shared replay buffer to achieve greater sample efficiency. Computational resources are dynamically distributed to favor the best learners as a form of online algorithm selection. Neuroevolution binds this entire process to generate a single emergent learner that exceeds the capabilities of any individual learner. Experiments on a range of continuous control benchmarks demonstrate that the emergent learner significantly outperforms its composite learners while remaining more sample-efficient overall, notably solving the MuJoCo Humanoid benchmark, where all of its composite learners (TD3) fail entirely in isolation.
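
The abstract says that computational resources are dynamically distributed to favor the best learners, but it does not state the allocation rule. The sketch below assumes an upper-confidence-bound (UCB) bandit, a common choice for online algorithm selection, purely for illustration. The function names `ucb_scores` and `allocate_workers`, the exploration constant `c`, and the example numbers are all hypothetical and are not taken from the paper.

```python
import math


def ucb_scores(mean_returns, visit_counts, c=0.9):
    """Score each learner: normalized mean return plus an exploration bonus.

    Assumes every learner has been tried at least once (visit_counts >= 1).
    """
    total_visits = sum(visit_counts)
    lo, hi = min(mean_returns), max(mean_returns)
    span = (hi - lo) or 1.0  # avoid division by zero when all returns are equal
    scores = []
    for mean, n in zip(mean_returns, visit_counts):
        exploit = (mean - lo) / span  # rescale returns to [0, 1]
        explore = c * math.sqrt(math.log(total_visits) / n)
        scores.append(exploit + explore)
    return scores


def allocate_workers(scores, num_workers):
    """Split rollout workers proportionally to scores (largest-remainder rounding)."""
    total = sum(scores)
    if total <= 0:
        scores = [1.0] * len(scores)  # fall back to an equal split
        total = float(len(scores))
    quotas = [s / total * num_workers for s in scores]
    alloc = [int(q) for q in quotas]
    leftovers = num_workers - sum(alloc)
    # Hand the remaining workers to the learners with the largest fractional parts.
    order = sorted(range(len(scores)), key=lambda i: quotas[i] - alloc[i], reverse=True)
    for i in order[:leftovers]:
        alloc[i] += 1
    return alloc


if __name__ == "__main__":
    # Hypothetical statistics for three learners (e.g. TD3 with different discount rates).
    scores = ucb_scores(mean_returns=[100.0, 300.0, 200.0], visit_counts=[10, 10, 10])
    print(allocate_workers(scores, num_workers=10))
```

Under this assumed rule, learners with higher recent returns receive more rollout workers, while the exploration bonus keeps rarely sampled learners from being starved entirely.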

Citations

•Journal Article•10.1177/0278364920987859

How to train your robot with deep reinforcement learning: lessons we have learned

Julian Ibarz, +7 more

- 31 Jan 2021

- The International Journal of Robotics Re...

TL;DR: Deep reinforcement learning (RL) has emerged as a promising approach for autonomously acquiring complex behaviors from low-level sensor observations.

518

•Journal Article•10.1177/0278364920987859

How to Train Your Robot with Deep Reinforcement Learning; Lessons We've Learned.

Julian Ibarz, +5 more

- 04 Feb 2021

- arXiv: Robotics

TL;DR: In this paper, the authors present a number of case studies involving robotic deep reinforcement learning in the real world and discuss commonly perceived challenges in deep RL and how they have been addressed in these works.

275

•Journal Article•10.1109/TSG.2020.2969650

Enhanced Coordinated Operations of Electric Power and Transportation Networks via EV Charging Services

Tao Qian, +4 more

- 28 Jan 2020

- IEEE Transactions on Smart Grid

TL;DR: A holistic framework to enhance the operation of coordinated electric power distribution network (PDN) and urban transportation network (UTN) via EV charging services is proposed and a deep reinforcement learning (DRL)-based solution framework is developed to decouple and approximately solve the stochastic bi-level problem.

160

•Journal Article•10.1109/TEVC.2021.3079985

A Survey on Evolutionary Construction of Deep Neural Networks

Xun Zhou, +3 more

- 13 May 2021

- IEEE Transactions on Evolutionary Comput...

TL;DR: An insight is provided into the automated DNN construction process by formulating it as a multi-level, multi-objective, large-scale optimization problem with constraints, where the non-convex, non-differentiable, and black-box nature of this problem makes evolutionary algorithms (EAs) stand out as a promising solver.

91

•Journal Article•10.1109/TNNLS.2019.2959129

Reducing Estimation Bias via Triplet-Average Deep Deterministic Policy Gradient

Dongming Wu, +3 more

- 30 Oct 2020

- IEEE Transactions on Neural Networks and Learning Systems

TL;DR: This article investigates the underestimation phenomenon in the recent twin delayed deep deterministic actor-critic algorithm, theoretically demonstrates its existence, and proposes a novel triplet-average deep deterministic policy gradient algorithm that takes the weighted action value of three target critics to reduce the estimation bias.

80
