Open Access • Proceedings Article
State Alignment-based Imitation Learning
Fangchen Liu, Zhan Ling, Tongzhou Mu, Hao Su
- 30 Apr 2020
TL;DR: This work proposes a state alignment-based imitation learning method that trains the imitator to follow the state sequences in the expert demonstrations as closely as possible, and combines local and global state alignment in a reinforcement learning framework through a regularized policy update objective.
Abstract: Consider an imitation learning problem in which the imitator and the expert have different dynamics models. Most existing imitation learning methods fail in this setting because they focus on imitating actions. We propose a novel state alignment-based imitation learning method that trains the imitator to follow the state sequences in the expert demonstrations as closely as possible. The state alignment comes from both local and global perspectives, and we combine the two into a reinforcement learning framework through a regularized policy update objective. We show the superiority of our method on standard imitation learning settings as well as in the challenging settings where the expert and the imitator have different dynamics models.
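The failure mode the abstract describes can be illustrated with a toy example. The sketch below is not the paper's method: it assumes a hypothetical 1-D system `s' = s + gain * a`, where the expert has `gain = 1.0` and the imitator has `gain = 0.5`, and it assumes the imitator knows its own gain. The function names `rollout` and `follow_states` are illustrative only. It compares replaying the expert's actions with choosing actions that reach the expert's next state.

```python
import numpy as np


def rollout(actions, gain, s0=0.0):
    """Roll out the toy dynamics s' = s + gain * a and return all visited states."""
    states = [s0]
    for a in actions:
        states.append(states[-1] + gain * a)
    return np.array(states)


def follow_states(target_states, gain):
    """Choose actions that land on each next target state (assumes the gain is known)."""
    s = target_states[0]
    actions = []
    for s_next in target_states[1:]:
        a = (s_next - s) / gain  # invert the imitator's own toy dynamics
        actions.append(a)
        s = s + gain * a
    return np.array(actions)


# Expert demonstration under expert dynamics (gain = 1.0).
expert_actions = np.array([1.0, 1.0, -0.5, 2.0])
expert_states = rollout(expert_actions, gain=1.0)

# Imitator with different dynamics (gain = 0.5).
copied_states = rollout(expert_actions, gain=0.5)         # imitate actions
aligned_actions = follow_states(expert_states, gain=0.5)  # imitate states
aligned_states = rollout(aligned_actions, gain=0.5)

action_copy_error = np.abs(copied_states - expert_states).max()
state_follow_error = np.abs(aligned_states - expert_states).max()
print("max state error when copying actions:", action_copy_error)
print("max state error when following states:", state_follow_error)
```

In this toy setting, copying actions drifts away from the expert trajectory, while following states reproduces it with different actions. The method described in the abstract does not assume known dynamics; it learns the alignment within a reinforcement learning framework.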
Citations
•Posted Content
Transfer Learning in Deep Reinforcement Learning: A Survey
TL;DR: This survey reviews transfer learning in the reinforcement learning setting and provides a systematic categorization of state-of-the-art techniques.
•Posted Content
State-Only Imitation Learning for Dexterous Manipulation
TL;DR: This paper trains an inverse dynamics model and uses it to predict actions for state-only demonstrations; the resulting method considerably outperforms RL alone and can learn from demonstrations with different dynamics, morphologies, and objects.
DexMV: Imitation Learning for Dexterous Manipulation from Human Videos
Yuzhe Qin, Yueh-Hua Wu, Shaowei Liu, Hanwen Jiang, Ruihan Yang, Yang Fu, Xiaolong Wang
- 01 Jan 2022
TL;DR: DexMV (Dexterous Manipulation from Videos) is a platform for imitation learning that pairs a simulation system for complex dexterous manipulation tasks with a multi-finger robot hand and a computer vision system that records large-scale demonstrations of a human hand performing the same tasks.
Learning From Imperfect Demonstrations From Agents With Varying Dynamics
Zhangjie Cao, Dorsa Sadigh
- 25 Mar 2021
TL;DR: In this article, the authors propose a metric composed of a feasibility score and an optimality score to measure how useful a demonstration is for imitation learning, which enables learning from more informative demonstrations and disregarding the less relevant demonstrations.
•Posted Content
ManiSkill: Generalizable Manipulation Skill Benchmark with Large-Scale Demonstrations
Tongzhou Mu, Zhan Ling, Fanbo Xiang, Derek Yang, Xuanlin Li, Stone Tao, Zhiao Huang, Zhiwei Jia, Hao Su
TL;DR: The SAPIEN Manipulation Skill Benchmark (ManiSkill) is a benchmark for 3D object manipulation in a full-physics simulator, with objects that show large intra-class topological and geometric variations.