Deep Reinforcement Learning with Multitask Episodic Memory Based on Task-Conditioned Hypernetwork

doi:10.48550/arXiv.2306.10698

Journal Article•10.48550/arXiv.2306.10698


Chenxu Wang, +4 more

- 19 Jun 2023

- arXiv.org

- Vol. abs/2306.10698

2

TL;DR: A retrieval network based on a task-conditioned hypernetwork is proposed; the hypernetwork adapts the retrieval network's parameters to the task, and a dynamic modification mechanism enhances collaboration between the retrieval and decision networks.

Abstract: Deep reinforcement learning algorithms are usually impeded by sampling inefficiency, heavily depending on multiple interactions with the environment to acquire accurate decision-making capabilities. In contrast, humans seem to rely on their hippocampus to retrieve relevant information from past experiences of relevant tasks, which guides their decision-making when learning a new task, rather than exclusively depending on environmental interactions. Nevertheless, designing a hippocampus-like module for an agent to incorporate past experiences into established reinforcement learning algorithms presents two challenges. The first challenge involves selecting the most relevant past experiences for the current task, and the second is integrating such experiences into the decision network. To address these challenges, we propose a novel algorithm that utilizes a retrieval network based on a task-conditioned hypernetwork, which adapts the retrieval network's parameters depending on the task. At the same time, a dynamic modification mechanism enhances the collaborative efforts between the retrieval and decision networks. We evaluate the proposed algorithm on the challenging MiniGrid environment. The experimental results demonstrate that our proposed method significantly outperforms strong baselines.
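The core idea in the abstract, a hypernetwork that produces the retrieval network's parameters from a task representation, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the class `HyperRetriever`, the single linear hypernetwork, the cosine-similarity top-k lookup, and the `(key, value)` memory layout are all assumptions made for the example.

```python
import math
import random


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


class HyperRetriever:
    """Hypothetical retrieval module whose weights come from a hypernetwork."""

    def __init__(self, task_dim, obs_dim, key_dim, seed=0):
        rng = random.Random(seed)
        self.obs_dim = obs_dim
        self.key_dim = key_dim
        # Assumed hypernetwork: one linear map from the task embedding to the
        # flattened (key_dim x obs_dim) projection matrix of the retriever.
        self.hyper = [
            [rng.gauss(0.0, 0.5) for _ in range(task_dim)]
            for _ in range(key_dim * obs_dim)
        ]

    def generate_weights(self, task_emb):
        """Produce task-specific retrieval weights from the task embedding."""
        flat = matvec(self.hyper, task_emb)
        return [
            flat[i * self.obs_dim:(i + 1) * self.obs_dim]
            for i in range(self.key_dim)
        ]

    def retrieve(self, task_emb, obs, memory, k=2):
        """Return the values of the k memory entries closest to the query."""
        weights = self.generate_weights(task_emb)
        query = matvec(weights, obs)
        ranked = sorted(memory, key=lambda kv: cosine(query, kv[0]), reverse=True)
        return [value for _, value in ranked[:k]]
```

Because the projection weights are a function of the task embedding, the same observation yields different queries, and potentially different retrieved experiences, under different tasks. How the retrieved experiences are then integrated into the decision network is not specified in the abstract and is left out of this sketch.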


Citations

Journal Article•10.48550/arxiv.2402.04154

Read to Play (R2-Play): Decision Transformer with Multimodal Game Instruction

Yonggang Jin, +10 more

- 06 Feb 2024

- arXiv.org

TL;DR: Enhanced forms of task guidance for agents are explored, enabling them to comprehend gameplay instructions and thereby facilitating a "read-to-play" capability; incorporating multimodal game instructions is shown to significantly enhance the decision transformer's multitasking and generalization capabilities.


2

Journal Article•10.48550/arxiv.2311.11385

Multi-Task Reinforcement Learning with Mixture of Orthogonal Experts

Ahmed Hendawy, +2 more

- 19 Nov 2023

- arXiv.org

TL;DR: This paper introduces a novel approach for representation learning in MTRL that encapsulates common structures among the tasks using orthogonal representations to promote diversity and leverages a Gram-Schmidt process to shape a shared subspace of representations generated by a mixture of experts.
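The Gram-Schmidt step mentioned in the TL;DR above can be illustrated in isolation. The sketch below orthonormalizes a list of vectors, standing in for the representations produced by a set of experts; the function name `gram_schmidt`, the plain-list representation, and the `eps` threshold for dropping near-dependent vectors are assumptions for this example, not details taken from the cited paper.

```python
import math


def gram_schmidt(vectors, eps=1e-10):
    """Orthonormalize vectors with the modified Gram-Schmidt procedure.

    Vectors that are (numerically) linear combinations of earlier ones
    are dropped, so the result may be shorter than the input.
    """
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            # Remove the component of w that lies along basis vector b.
            proj = sum(x * y for x, y in zip(w, b))
            w = [x - proj * y for x, y in zip(w, b)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm > eps:
            basis.append([x / norm for x in w])
    return basis
```

Applied to expert outputs, each returned vector has unit length and is orthogonal to the others, which is the property the TL;DR describes as promoting diversity among representations.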


References

•Proceedings Article

Attention is All you Need

Ashish Vaswani, +7 more

- 12 Jun 2017

TL;DR: This paper proposed a simple network architecture based solely on an attention mechanism, dispensing with recurrence and convolutions entirely and achieved state-of-the-art performance on English-to-French translation.


94.2K

•Posted Content

Proximal Policy Optimization Algorithms

John Schulman, +4 more

- 20 Jul 2017

- arXiv: Learning

TL;DR: A new family of policy gradient methods for reinforcement learning is proposed; the methods alternate between sampling data through interaction with the environment and optimizing a "surrogate" objective function using stochastic gradient ascent.


18K
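The "surrogate" objective in the Proximal Policy Optimization entry above is most commonly used in its clipped form, which averages min(r·A, clip(r, 1−ε, 1+ε)·A) over samples, where r is the probability ratio between the new and old policy and A is the advantage estimate. The sketch below computes that quantity for plain Python lists; the function name `clipped_surrogate` and the default `clip_eps=0.2` are choices made for this example.

```python
def clipped_surrogate(ratios, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch of samples.

    ratios:     new-policy / old-policy action probabilities, one per sample
    advantages: advantage estimates, one per sample
    """
    terms = []
    for r, a in zip(ratios, advantages):
        # Restrict the ratio to the interval [1 - clip_eps, 1 + clip_eps].
        clipped_r = max(1.0 - clip_eps, min(r, 1.0 + clip_eps))
        # Taking the minimum gives a pessimistic bound on the improvement.
        terms.append(min(r * a, clipped_r * a))
    return sum(terms) / len(terms)
```

The objective is maximized (gradient ascent), so a ratio that moves far from 1 in the direction favored by the advantage earns no extra credit once it leaves the clipping interval.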

•Posted Content

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Colin Raffel, +8 more

- 23 Oct 2019

- arXiv: Learning

TL;DR: This systematic study compares pre-training objectives, architectures, unlabeled datasets, transfer approaches, and other factors on dozens of language understanding tasks and achieves state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more.


12.9K

•Proceedings Article•10.18653/V1/2020.ACL-MAIN.703

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

Michael Lewis, +7 more

- 01 Jul 2020

TL;DR: BART is presented, a denoising autoencoder for pretraining sequence-to-sequence models, which matches the performance of RoBERTa on GLUE and SQuAD, and achieves new state-of-the-art results on a range of abstractive dialogue, question answering, and summarization tasks.


11.5K


Human-level control through deep reinforcement learning
