On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning

doi:10.48550/arxiv.2206.03271

Open Access•Posted Content•10.48550/arxiv.2206.03271

07 Jun 2022

32

TL;DR: The authors showed that multi-task pretraining with fine-tuning on new tasks performs as well as, or better than, meta-pretraining with meta test-time adaptation.

Abstract: Intelligent agents should have the ability to leverage knowledge from previously learned tasks in order to learn new ones quickly and efficiently. Meta-learning approaches have emerged as a popular solution to achieve this. However, meta-reinforcement learning (meta-RL) algorithms have thus far been restricted to simple environments with narrow task distributions. Moreover, the paradigm of pretraining followed by fine-tuning to adapt to new tasks has emerged as a simple yet effective solution in supervised and self-supervised learning. This calls into question the benefits of meta-learning approaches in reinforcement learning as well, which typically come at the cost of high complexity. We hence investigate meta-RL approaches in a variety of vision-based benchmarks, including Procgen, RLBench, and Atari, where evaluations are made on completely novel tasks. Our findings show that when meta-learning approaches are evaluated on different tasks (rather than different variations of the same task), multi-task pretraining with fine-tuning on new tasks performs as well as, or better than, meta-pretraining with meta test-time adaptation. This is encouraging for future research, as multi-task pretraining tends to be simpler and computationally cheaper than meta-RL. From these findings, we advocate for evaluating future meta-RL methods on more challenging tasks and including multi-task pretraining with fine-tuning as a simple, yet strong baseline.

Citations

Journal Article•10.48550/arxiv.2403.03890

Hierarchical Diffusion Policy for Kinematics-Aware Multi-Task Robotic Manipulation

Xiao Ma, +3 more

- 06 Mar 2024

- arXiv.org

TL;DR: HDP is a hierarchical agent for multi-task robotic manipulation that factorises a manipulation policy into a high-level task-planning agent and a low-level goal-conditioned diffusion policy. It generates context-aware motion trajectories while satisfying robot kinematics constraints.

18

Proceedings Article•10.1145/3568162.3576989

SIRL: Similarity-based Implicit Representation Learning

Andreea Bobu, +4 more

- 02 Jan 2023

TL;DR: In the contrastive learning approach described in this paper, the goal is to identify and isolate the causal features that people actually care about and use when they represent states and behaviors, and to learn representations that are more generalizable than self-supervised and task-input alternatives.

11

Journal Article•10.1016/j.eswa.2023.123074

A fast interpretable adaptive meta-learning enhanced deep learning framework for diagnosis of diabetic retinopathy

Maofa Wang, +8 more

- Expert Systems With Applications

TL;DR: This study introduces FIAML-LR, a novel meta-learning framework that balances interpretability and accuracy for few-shot classification tasks, particularly in diabetic retinopathy diagnosis, achieving a 14.28% accuracy boost with limited data.

5

Journal Article•10.3390/buildings14041106

Integrating Drone Imagery and AI for Improved Construction Site Management through Building Information Modeling

Wonjun Choi, +2 more

- 15 Apr 2024

- Buildings

TL;DR: This study explores the integration of drone imagery into the digital construction site management process, aiming to create BIM models with enhanced object recognition capabilities, underscoring the complexity of the task and laying the groundwork for future innovations in this area.

5

Journal Article•10.20517/ir.2022.20

Deep reinforcement learning for real-world quadrupedal locomotion: a comprehensive review

Hongyin Zhang, +2 more

- 01 Jan 2022

- Intelligence & robotics

TL;DR: This review article systematically organizes and summarizes the relevant literature, covering DRL algorithms from problem setting to advanced learning methods, and core components of algorithm design, such as state and action spaces, reward functions, and solutions to reality gap problems.

5

References

Posted Content

Proximal Policy Optimization Algorithms

John Schulman, +4 more

- 20 Jul 2017

- arXiv: Learning

TL;DR: A new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a "surrogate" objective function using stochastic gradient ascent, are proposed.

18K

Proceedings Article

Model-agnostic meta-learning for fast adaptation of deep networks

Chelsea Finn, +2 more

- 06 Aug 2017

TL;DR: An algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning problems, including classification, regression, and reinforcement learning is proposed.

11.3K

Proceedings Article

Optimization as a Model for Few-Shot Learning

Sachin Ravi, +1 more

- 24 Apr 2017

TL;DR: In this paper, an LSTM-based meta-learner model is proposed to learn the exact optimization algorithm used to train another learner neural network in the few-shot regime.

3.8K

Journal Article•10.1613/JAIR.3912

The arcade learning environment: an evaluation platform for general agents

Marc G. Bellemare, +3 more

- 01 May 2013

- Journal of Artificial Intelligence Resea...

TL;DR: The Arcade Learning Environment (ALE) is a platform for evaluating the development of general, domain-independent AI technology. It provides an interface to hundreds of Atari 2600 game environments, each one different, interesting, and designed to be a challenge for human players.

2.9K

Posted Content

Rainbow: Combining Improvements in Deep Reinforcement Learning

Matteo Hessel, +9 more

- 06 Oct 2017

- arXiv: Artificial Intelligence

TL;DR: This paper examines six extensions to the DQN algorithm and empirically studies their combination, showing that the combination provides state-of-the-art performance on the Atari 2600 benchmark, both in terms of data efficiency and final performance.

2.1K