Learning Parameterized Skills

doi:10.7275/6439644.0

Open Access • Proceedings Article • 10.7275/6439644.0

Bruno da Silva, +2 more

- 26 Jun 2012

- pp 1443-1450

109

TL;DR: A method for constructing skills capable of solving tasks drawn from a distribution of parameterized reinforcement learning problems by predicting policy parameters from task parameters is introduced.

Citations

•Proceedings Article•10.1109/ICRA.2018.8460487

End-to-End Driving Via Conditional Imitation Learning

Felipe Codevilla, +4 more

- 21 May 2018

TL;DR: This work evaluates different architectures for conditional imitation learning in vision-based driving and conducts experiments in realistic three-dimensional simulations of urban driving and on a 1/5 scale robotic truck that is trained to drive in a residential area.

1.3K

•Proceedings Article

Universal Value Function Approximators

Tom Schaul, +3 more

- 06 Jul 2015

TL;DR: An efficient technique is developed for supervised learning of universal value function approximators (UVFAs) V(s, g; θ) that generalise not just over states s but also over goals g, and it is demonstrated that a UVFA can successfully generalise to previously unseen goals.

1K

•Proceedings Article

Probabilistic Movement Primitives

Alexandros Paraschos, +3 more

- 05 Dec 2013

TL;DR: This work analytically derives a stochastic feedback controller that reproduces the given trajectory distribution for robot movement control, and presents a probabilistic formulation of the movement primitive (MP) concept that maintains a distribution over trajectories.

613

•Journal Article•10.1016/J.ROBOT.2012.05.008

Active learning of inverse models with intrinsically motivated goal exploration in robots

Adrien Baranes, +1 more

- 01 Jan 2013

- Robotics and Autonomous Systems

TL;DR: The Self-Adaptive Goal Generation Robust Intelligent Adaptive Curiosity (SAGG-RIAC) architecture is introduced as an intrinsically motivated goal exploration mechanism which allows active learning of inverse models in high-dimensional redundant robots.

582

•Proceedings Article•10.15607/RSS.2017.XIII.048

Preparing for the Unknown: Learning a Universal Policy with Online System Identification

Wenhao Yu, +3 more

- 12 Jul 2017

TL;DR: In this paper, the authors present a new method of learning control policies that successfully operate under unknown dynamic models by leveraging a large number of training examples that are generated using a physical simulator.

370

References

•Book

The Nature of Statistical Learning Theory

Vladimir Vapnik

- 01 Jan 1995

TL;DR: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?

46K

•Journal Article•10.1126/SCIENCE.290.5500.2319

A global geometric framework for nonlinear dimensionality reduction.

Joshua B. Tenenbaum, +2 more

- 22 Dec 2000

- Science

TL;DR: An approach to solving dimensionality reduction problems that uses easily measured local metric information to learn the underlying global geometry of a data set and efficiently computes a globally optimal solution, and is guaranteed to converge asymptotically to the true structure.

15.9K

•Journal Article•10.1016/S0004-3702(99)00052-1

Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning

Richard S. Sutton, +2 more

- 01 Aug 1999

- Artificial Intelligence

TL;DR: It is shown that options enable temporally abstract knowledge and action to be included in the reinforcement learning framework in a natural and general way and may be used interchangeably with primitive actions in planning methods such as dynamic programming and in learning methods such as Q-learning.

3.9K

•Journal Article

Large Margin Methods for Structured and Interdependent Output Variables

Ioannis Tsochantaridis, +3 more

- 01 Dec 2005

- Journal of Machine Learning Research

TL;DR: This paper proposes to appropriately generalize the well-known notion of a separation margin and derive a corresponding maximum-margin formulation and presents a cutting plane algorithm that solves the optimization problem in polynomial time for a large class of problems.

2.4K

•Proceedings Article•10.1109/ROBOT.2002.1014739

Movement imitation with nonlinear dynamical systems in humanoid robots

Auke Jan Ijspeert, +2 more

- 07 Aug 2002

TL;DR: The results demonstrate that multi-joint human movements can be encoded successfully by the control policies (CPs), that a learned movement policy can readily be reused to produce robust trajectories towards different targets, and that the parameter space which encodes a policy is suitable for measuring the extent to which two trajectories are qualitatively similar.

1K