Journal Article, DOI: 10.1109/tits.2023.3316285
Imagination-Augmented Reinforcement Learning Framework for Variable Speed Limit Control
Duo Li, L. Lasenby
4
TL;DR: The proposed Imagination-Augmented Agent (I2A) consists of an imagination path and a model-free path that work together to generate appropriate control actions; in simulation it outperforms the other tested Reinforcement Learning (RL) agents in terms of Total Time Spent and bottleneck volume.
Abstract: Variable Speed Limit (VSL) is a commonly applied active traffic management measure for urban motorways. In recent years, model-based and model-free approaches have been extensively adopted to solve VSL optimization problems. However, the success of model-based VSL relies heavily on the nature of the environmental model adopted (e.g., traffic flow model). Implicit environment models may result in inappropriate control actions. Although model-free approaches are able to directly map raw measurements to control actions without a need for an environment model, they usually require large amounts of training data. To address these issues, we propose an Imagination-Augmented Agent (I2A) for VSL control. The I2A consists of an imagination path and a model-free path, which work together to generate appropriate control actions. The simulation results show that the proposed I2A agent outperforms other tested Reinforcement Learning (RL) agents in terms of Total Time Spent and bottleneck volume.
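As a rough illustration of how two such paths might be combined, the sketch below follows the general imagination-augmented pattern: an imagination path rolls a learned environment model forward once per candidate speed limit and summarizes the imagined outcome, a model-free path encodes the current measurements directly, and both feature sets feed a single policy head. Everything in it is an assumption made for illustration (the linear `env_model`, the candidate `SPEED_LIMITS`, the state size, the random weights, the density-based reward); it is not the architecture, the traffic model, or the parameters used in the paper.

```python
import math
import random

random.seed(0)

SPEED_LIMITS = [60, 80, 100]  # hypothetical candidate VSL actions (km/h)
STATE_DIM = 4                 # hypothetical: e.g. densities on 4 road segments
ROLLOUT_LEN = 3               # imagined steps per candidate action
N_FREE = 5                    # hypothetical size of the model-free feature vector

# Hypothetical "learned" environment model: next = DECAY * state + BIAS[action]
DECAY = 0.9
BIAS = [[random.gauss(0, 0.1) for _ in range(STATE_DIM)] for _ in SPEED_LIMITS]

def env_model(state, action_idx):
    """Predict the next state and a reward (negative total density as a proxy)."""
    next_state = [DECAY * x + BIAS[action_idx][i] for i, x in enumerate(state)]
    return next_state, -sum(next_state)

def imagination_path(state):
    """Roll the model forward for each candidate action; return imagined returns."""
    returns = []
    for a in range(len(SPEED_LIMITS)):
        s, total = list(state), 0.0
        for _ in range(ROLLOUT_LEN):
            s, r = env_model(s, a)  # keep applying the same candidate action
            total += r
        returns.append(total)
    return returns

W_FREE = [[random.gauss(0, 1) for _ in range(STATE_DIM)] for _ in range(N_FREE)]

def model_free_path(state):
    """Encode the raw measurements directly, without using the model."""
    return [math.tanh(sum(w * x for w, x in zip(row, state))) for row in W_FREE]

W_POLICY = [[random.gauss(0, 1) for _ in range(len(SPEED_LIMITS) + N_FREE)]
            for _ in SPEED_LIMITS]

def act(state):
    """Concatenate both paths, apply a softmax policy head, pick the top action."""
    joint = imagination_path(state) + model_free_path(state)
    logits = [sum(w * x for w, x in zip(row, joint)) for row in W_POLICY]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return SPEED_LIMITS[best], probs
```

Calling `act([1.0, 0.5, 0.2, 0.8])` returns one of the candidate speed limits together with the policy's probability for each candidate. In a trained agent the weights and the environment model would be learned from data rather than drawn at random.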
Citations
Imagination-augmented Hierarchical Reinforcement Learning for Safe and Interactive Autonomous Driving in Urban Environments
TL;DR: IAHRL efficiently integrates imagination into HRL to enable an agent to learn safe and interactive behaviors in real-world navigation tasks and introduces a new attention mechanism that allows the high-level policy to be permutation-invariant to the order of surrounding objects and to prioritize the authors' agent over them.
2
Exploring mechanisms of integrating global perception prediction for connected vehicles with lane-specific reinforcement learning-based variable speed limits
Li Song, Shijie Li, Guojun Chen, Xin Zhao, Nengchao Lyu, Wei David Fan
Dynamic Pricing for Wireless Charging Lane Management Based on Deep Reinforcement Learning
Fan Liu, Zhen Tan, Hing Kai Chan
Abstract: We consider a dynamic pricing problem in a double-lane system consisting of one general purpose lane and one wireless charging lane (WCL). The electricity price is dynamically adjusted to affect the lane-choice behaviors of incoming electric vehicles (EVs), thereby regulating the traffic assignment between the two lanes with both traffic operation efficiency and charging service efficiency considered in the control objective. We first establish an agent-based dynamic double-lane traffic system model, whereby each EV acts as an agent with distinct behavioral and operational characteristics. Then, a deep Q-learning algorithm is proposed to derive the optimal pricing decisions. A regression tree (CART) algorithm is also designed for benchmarking. The simulation results reveal that the deep Q-learning algorithm demonstrates superior capability in optimizing dynamic pricing strategies compared to CART by more effectively leveraging system dynamics and future traffic demand information, and both outperform the static pricing strategy. This study serves as a pioneering work to explore dynamic pricing issues for WCLs.
Leveraging CAVs to Improve Traffic Efficiency: An MARL-Based Approach
Weizhen Han, Bingyi Liu, Zhi Liu, Xun Shao, Libing Wu, Jianping Wang
- 23 Jul 2024
TL;DR: This paper proposes an MARL-based approach, MACA, for collaborative path planning of CAVs and CVs to reduce traffic congestion and improve efficiency in urban scenarios, achieving up to 10.9% travel time reduction for CVs and 6.5% queue length reduction.
References
Attention Is All You Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin
- 01 Jan 2017
Abstract: The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
51.8K
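The abstract above does not spell out the attention operation itself. As a minimal sketch, the scaled dot-product form defined in the Transformer paper, softmax(QK^T / sqrt(d_k))V, can be written in plain Python as follows. The function names and the list-of-lists representation are choices made here for illustration; real implementations use batched tensor operations and multiple heads.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(queries, keys, values):
    """Return one output row per query: a weighted average of the value rows.

    queries: list of vectors of length d_k
    keys:    list of vectors of length d_k (one per value row)
    values:  list of vectors of length d_v
    """
    d_k = len(keys[0])
    d_v = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        # Mix the value rows using the attention weights.
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(d_v)])
    return outputs
```

A query that is orthogonal to every key scores all keys equally, so its output is the plain average of the value rows.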
Human-level control through deep reinforcement learning
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis
TL;DR: This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
Posted Content
Proximal Policy Optimization Algorithms
TL;DR: A new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a "surrogate" objective function using stochastic gradient ascent, are proposed.
18K
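The "surrogate" objective mentioned in the summary above is most often used in its clipped form, which limits how far one update can move the policy away from the one that collected the data. The sketch below computes that clipped objective for a batch of samples. The function name, the plain-list inputs, and the default `eps=0.2` are illustrative assumptions; a practical implementation would compute this on tensors and maximize it with stochastic gradient ascent.

```python
import math

def clipped_surrogate(new_logp, old_logp, advantages, eps=0.2):
    """Mean of min(r * A, clip(r, 1 - eps, 1 + eps) * A) over the batch.

    new_logp:   log-probabilities of the taken actions under the current policy
    old_logp:   log-probabilities of the same actions under the sampling policy
    advantages: advantage estimates for those actions
    """
    total = 0.0
    for nl, ol, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(nl - ol)                       # probability ratio r
        clipped = max(1.0 - eps, min(1.0 + eps, ratio))  # clip(r, 1-eps, 1+eps)
        total += min(ratio * adv, clipped * adv)         # pessimistic bound
    return total / len(advantages)
```

When the new and old policies agree, every ratio is 1 and the objective reduces to the mean advantage. When a ratio drifts outside the clipping interval in the direction that would increase the objective, the clipped term takes over and removes the incentive to move further.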
Technical Note: Q-Learning
Chris Watkins, Peter Dayan
TL;DR: This paper presents and proves in detail a convergence theorem for Q-learning based on that outlined in Watkins (1989), showing that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action-values are represented discretely.
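The two conditions named in that summary, a discrete table of action-values and repeated sampling of every action in every state, can be seen in a small tabular example. The sketch below runs the standard Q-learning update on a hypothetical 4-state chain in which action 1 moves right, action 0 moves left, and reaching the last state pays a reward of 1. The environment, the uniform-random behavior policy, and the hyperparameters are assumptions chosen for illustration and do not come from the paper.

```python
import random

def q_learning(n_states=4, n_actions=2, episodes=2000,
               alpha=0.1, gamma=0.9, seed=0):
    """Tabular Q-learning on a deterministic chain with a terminal goal state."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]  # discrete action-value table
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Uniform exploration keeps sampling every action in every state.
            a = rng.randrange(n_actions)
            s_next = min(s + 1, goal) if a == 1 else max(s - 1, 0)
            r = 1.0 if s_next == goal else 0.0
            # Q-learning target: reward plus discounted best next value.
            # The goal row is never updated, so its value stays 0 (terminal).
            target = r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q
```

On this chain the optimal value of moving right from the state next to the goal is 1, and each step further from the goal multiplies that value by `gamma`, so the learned table can be checked against values worked out by hand.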
Proceedings Article
Policy Gradient Methods for Reinforcement Learning with Function Approximation
Richard S. Sutton, David McAllester, Satinder Singh, Yishay Mansour
- 29 Nov 1999
TL;DR: This paper proves for the first time that a version of policy iteration with arbitrary differentiable function approximation is convergent to a locally optimal policy.