Open Access • Posted Content
Collaborative Evolutionary Reinforcement Learning
Shauharda Khadka, Somdeb Majumdar, Tarek Nassar, Zach Dwiel, Evren Tumer, Santiago Miret, Yinyin Liu, Kagan Tumer
TL;DR: In this paper, the authors introduce Collaborative Evolutionary Reinforcement Learning (CERL), a scalable framework that comprises a portfolio of policies that simultaneously explore and exploit diverse regions of the solution space.
Abstract: Deep reinforcement learning algorithms have been successfully applied to a range of challenging control tasks. However, these methods typically struggle to explore effectively and are extremely sensitive to the choice of hyperparameters. One reason is that most approaches explore using a noisy version of their operating policy, which limits the range of exploration. In this paper, we introduce Collaborative Evolutionary Reinforcement Learning (CERL), a scalable framework that comprises a portfolio of policies that simultaneously explore and exploit diverse regions of the solution space. A collection of learners, typically proven algorithms like TD3, optimize over varying time horizons, leading to this diverse portfolio. All learners contribute to and use a shared replay buffer to achieve greater sample efficiency. Computational resources are dynamically distributed to favor the best learners as a form of online algorithm selection. Neuroevolution binds this entire process to generate a single emergent learner that exceeds the capabilities of any individual learner. Experiments in a range of continuous control benchmarks demonstrate that the emergent learner significantly outperforms its composite learners while remaining overall more sample-efficient, notably solving the MuJoCo Humanoid benchmark where all of its composite learners (TD3) fail entirely in isolation.
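The abstract's idea of distributing compute to favor the best learners can be illustrated with a small sketch. It assumes an upper-confidence-bound (UCB) score per learner and a proportional split of rollout workers; the abstract itself does not name the selection rule, so the scoring formula, the exploration constant `c`, and the function names `ucb_scores` and `allocate_rollouts` are all hypothetical choices made for illustration, not the authors' implementation.

```python
import math


def ucb_scores(mean_returns, visit_counts, c=0.9):
    """Score each learner: normalized mean return plus an exploration bonus.

    Assumption: a standard UCB1-style bonus; every visit count must be >= 1.
    """
    total_visits = sum(visit_counts)
    lo, hi = min(mean_returns), max(mean_returns)
    span = (hi - lo) or 1.0  # avoid dividing by zero when all returns are equal
    scores = []
    for mean, n in zip(mean_returns, visit_counts):
        exploit = (mean - lo) / span  # rescaled to [0, 1]
        explore = c * math.sqrt(math.log(total_visits) / n)
        scores.append(exploit + explore)
    return scores


def allocate_rollouts(scores, num_workers):
    """Split num_workers among learners in proportion to their scores.

    Uses the largest-remainder method so the counts sum to num_workers.
    """
    if sum(scores) <= 0:
        scores = [1.0] * len(scores)  # fall back to a uniform split
    total = sum(scores)
    quotas = [s / total * num_workers for s in scores]
    alloc = [int(q) for q in quotas]
    leftovers = num_workers - sum(alloc)
    # Hand the remaining workers to the learners with the largest fractional part.
    order = sorted(range(len(scores)), key=lambda i: quotas[i] - alloc[i], reverse=True)
    for i in order[:leftovers]:
        alloc[i] += 1
    return alloc


# Hypothetical portfolio: four learners that differ only in time horizon.
mean_returns = [100.0, 300.0, 200.0, 50.0]
visit_counts = [10, 10, 10, 10]
print(allocate_rollouts(ucb_scores(mean_returns, visit_counts), num_workers=10))
```

In this sketch every learner keeps some share of the workers because of the exploration bonus, which matches the abstract's framing of resource distribution as online algorithm selection rather than a hard winner-take-all choice.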
Citations
How to train your robot with deep reinforcement learning: lessons we have learned
Julian Ibarz, Jie Tan, Chelsea Finn, Mrinal Kalakrishnan, Peter Pastor, Sergey Levine
TL;DR: Deep reinforcement learning (RL) has emerged as a promising approach for autonomously acquiring complex behaviors from low-level sensor observations.
How to Train Your Robot with Deep Reinforcement Learning; Lessons We've Learned.
TL;DR: In this paper, the authors present a number of case studies involving robotic deep reinforcement learning in the real world and discuss commonly perceived challenges in deep RL and how they have been addressed in these works.
275
Enhanced Coordinated Operations of Electric Power and Transportation Networks via EV Charging Services
TL;DR: A holistic framework is proposed to enhance the coordinated operation of the electric power distribution network (PDN) and the urban transportation network (UTN) via EV charging services, and a deep reinforcement learning (DRL)-based solution framework is developed to decouple and approximately solve the stochastic bi-level problem.
160
A Survey on Evolutionary Construction of Deep Neural Networks
TL;DR: An insight is provided into the automated DNN construction process by formulating it as a multi-level, multi-objective, large-scale optimization problem with constraints, where the non-convex, non-differentiable and black-box nature of this problem makes evolutionary algorithms (EAs) stand out as a promising solver.
91
Reducing Estimation Bias via Triplet-Average Deep Deterministic Policy Gradient
TL;DR: This article investigates the underestimation phenomenon in the recent twin delayed deep deterministic actor-critic algorithm, theoretically demonstrates its existence, and proposes a novel triplet-average deep deterministic policy gradient algorithm that takes the weighted action value of three target critics to reduce the estimation bias.
80
References
•Book
Reinforcement Learning: An Introduction
Richard S. Sutton, Andrew G. Barto
- 01 Jan 1988
TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
Human-level control through deep reinforcement learning
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis
TL;DR: This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
Mastering the game of Go with deep neural networks and tree search
David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy P. Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel, Demis Hassabis
TL;DR: Using this search algorithm, the program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0, the first time that a computer program has defeated a human professional player in the full-sized game of Go.
•Posted Content
Proximal Policy Optimization Algorithms
TL;DR: A new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a "surrogate" objective function using stochastic gradient ascent, are proposed.
18K
•Proceedings Article
Asynchronous methods for deep reinforcement learning
Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Tim Harley, Timothy P. Lillicrap, David Silver, Koray Kavukcuoglu
- 19 Jun 2016
TL;DR: A conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers and shows that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.