Generalizing Robot Imitation Learning with Invariant Hidden Semi-Markov Models

doi:10.1007/978-3-030-44051-0_12

Open Access•Book Chapter•10.1007/978-3-030-44051-0_12

Ajay Kumar Tanwani, +8 more

- 09 Dec 2018

- pp 196-211

12

TL;DR: The proposed algorithm learns a joint probability density function of the demonstrations with invariant formulations of hidden semi-Markov models to extract invariant segments, then smoothly follows the generated sequence of states with a linear quadratic tracking controller. It allows a Baxter robot to learn a pick-and-place task while avoiding a movable obstacle from only 4 kinesthetic demonstrations.

Abstract: Generalizing manipulation skills to new situations requires extracting invariant patterns from demonstrations. For example, the robot needs to understand the demonstrations at a higher level while being invariant to the appearance of the objects, to geometric aspects of the objects such as their position, size, and orientation, and to the viewpoint of the observer in the demonstrations. In this paper, we propose an algorithm that learns a joint probability density function of the demonstrations with invariant formulations of hidden semi-Markov models to extract invariant segments (also called sub-goals or options), and smoothly follows the generated sequence of states with a linear quadratic tracking controller. The algorithm takes as input the demonstrations observed with respect to different coordinate systems describing virtual landmarks or objects of interest, and adapts the segments according to the environmental changes in a systematic manner. We present variants of this algorithm in latent space with low-rank covariance decompositions, semi-tied covariances, and non-parametric online estimation of model parameters under small variance asymptotics, yielding considerably low sample and model complexity for acquiring new manipulation skills. The algorithm allows a Baxter robot to learn a pick-and-place task while avoiding a movable obstacle based on only 4 kinesthetic demonstrations.
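
The abstract describes observing demonstrations from several coordinate systems and adapting the learned segments when the environment changes. One common way to do this in task-parameterized models is to map each frame's local Gaussian into the global frame and take the product of the resulting Gaussians. The sketch below illustrates that idea in 2D with plain Python; it is an assumption about the general technique, not the authors' implementation, and the helper names (`to_global`, `product_of_gaussians`) are hypothetical.

```python
# Illustrative sketch only: 2D Gaussians stored as (mean, covariance) with
# plain lists. All function names here are hypothetical.

def mat_mul(a, b):
    """Multiply two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_vec(a, v):
    """Multiply a 2x2 matrix by a 2-vector."""
    return [a[i][0] * v[0] + a[i][1] * v[1] for i in range(2)]

def transpose(a):
    return [[a[0][0], a[1][0]], [a[0][1], a[1][1]]]

def inverse(a):
    """Invert a 2x2 matrix (assumes a non-zero determinant)."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def to_global(rotation, origin, mean, cov):
    """Map a Gaussian expressed in a local frame into the global frame:
    mean -> R mean + b, cov -> R cov R^T."""
    g_mean = [m + o for m, o in zip(mat_vec(rotation, mean), origin)]
    g_cov = mat_mul(mat_mul(rotation, cov), transpose(rotation))
    return g_mean, g_cov

def product_of_gaussians(gaussians):
    """Fuse global-frame Gaussians: precisions add, and the fused mean is
    the precision-weighted combination of the individual means."""
    precision = [[0.0, 0.0], [0.0, 0.0]]
    info = [0.0, 0.0]
    for mean, cov in gaussians:
        lam = inverse(cov)
        precision = [[precision[i][j] + lam[i][j] for j in range(2)]
                     for i in range(2)]
        eta = mat_vec(lam, mean)
        info = [info[i] + eta[i] for i in range(2)]
    fused_cov = inverse(precision)
    return mat_vec(fused_cov, info), fused_cov

# Usage: two frames (e.g. two objects of interest) that each "vote" for a
# position; moving a frame origin moves the fused segment accordingly.
identity = [[1.0, 0.0], [0.0, 1.0]]
frame_a = to_global(identity, [0.0, 0.0], [0.0, 0.0], identity)
frame_b = to_global(identity, [2.0, 0.0], [0.0, 0.0], identity)
fused_mean, fused_cov = product_of_gaussians([frame_a, frame_b])
print(fused_mean, fused_cov)
```

With equal covariances the fused mean lies midway between the two frame origins; a frame with a tighter covariance along some direction would pull the result toward itself along that direction, which is how such a model can weigh the frames differently for each segment.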

Citations

Proceedings Article•10.1109/ICRA40945.2020.9197324

Motion2Vec: Semi-Supervised Representation Learning from Surgical Videos

Ajay Kumar Tanwani, +5 more

- 31 May 2020

TL;DR: This paper learns a motion-centric representation of surgical video demonstrations by grouping them into action segments/subgoals/options in a semi-supervised manner and demonstrates the use of this representation to imitate surgical suturing kinematic motions from publicly available videos of the JIGSAWS dataset.

45

Proceedings Article•10.1109/IROS45743.2020.9364328

Learning robust manipulation tasks involving contact using trajectory parameterized probabilistic principal component analysis

Cristian Alejandro Vergara Perico, +2 more

- 24 Oct 2020

TL;DR: In this article, Trajectory parameterized Probabilistic Principal Component Analysis (traPPCA) is introduced to learn manipulation tasks involving both motion and contact wrenches (forces and moments).

10

Journal Article•10.1177/02783649211032721

Sequential robot imitation learning from observations

Ajay Kumar Tanwani, +4 more

- 06 Aug 2021

- The International Journal of Robotics Re...

TL;DR: In this paper, a framework to learn the sequential structure in the demonstrations for robot imitation learning is presented. But this framework is not suitable for the task of human imitation learning, as shown in Figure 1.

9

Proceedings Article

Mitigating Network Latency in Cloud-Based Teleoperation using Motion Segmentation and Synthesis

Nan Tian, +3 more

- 01 Oct 2019

8

Proceedings Article•10.1109/IROS47612.2022.9981384

Optimizing Demonstrated Robot Manipulation Skills for Temporal Logic Constraints

Akshay Dhonthi, +3 more

- 07 Sep 2022

TL;DR: Signal Temporal Logic (STL), an expressive form of temporal properties of systems, is used to formulate task specifications, and black-box optimization (BBO) is used to adapt an LfD skill accordingly.

6

References

Journal Article•10.1109/5.18626

A tutorial on hidden Markov models and selected applications in speech recognition

Lawrence R. Rabiner

- 01 Feb 1989

TL;DR: In this paper, the authors provide an overview of the basic theory of hidden Markov models (HMMs) as originated by L.E. Baum and T. Petrie (1966) and give practical details on methods of implementation of the theory along with a description of selected applications of HMMs to distinct problems in speech recognition.

24.3K

Journal Article•10.1016/J.ROBOT.2008.10.024

A survey of robot learning from demonstration

Brenna D. Argall, +3 more

- 01 May 2009

- Robotics and Autonomous Systems

TL;DR: A comprehensive survey of robot Learning from Demonstration (LfD), a technique that develops policies from example state to action mappings, which analyzes and categorizes the multiple ways in which examples are gathered, as well as the various techniques for policy derivation.

4.2K

Journal Article•10.1198/016214506000000302

Hierarchical Dirichlet Processes

Yee Whye Teh, +3 more

- 01 Dec 2006

- Journal of the American Statistical Asso...

TL;DR: This work considers problems involving groups of data where each observation within a group is a draw from a mixture model and where it is desirable to share mixture components between groups, and considers a hierarchical model, specifically one in which the base measure for the child Dirichlet processes is itself distributed according to a Dirichlet process.

4.2K

Journal Article•10.1162/089976699300016728

Mixtures of probabilistic principal component analyzers

Michael E. Tipping, +1 more

- 01 Feb 1999

- Neural Computation

TL;DR: PCA is formulated within a maximum likelihood framework, based on a specific form of gaussian latent variable model, which leads to a well-defined mixture model for probabilistic principal component analyzers, whose parameters can be determined using an expectation-maximization algorithm.

2.1K

Posted Content

Generative Adversarial Imitation Learning

Jonathan Ho, +1 more

- 10 Jun 2016

- arXiv: Learning

TL;DR: A new general framework for directly extracting a policy from data, as if it were obtained by reinforcement learning following inverse reinforcement learning, is proposed and a certain instantiation of this framework draws an analogy between imitation learning and generative adversarial networks.

2K

Related Papers (5)

Learning position and orientation dynamics from demonstrations via contraction analysis

Complex Sequential Tasks Learning with Bayesian Inference and Gaussian Mixture Model

On improving the extrapolation capability of task-parameterized movement models

Learning object-manipulation verbs for human-robot communication

Parametric Hidden Markov Models for Recognition and Synthesis of Movements