Simon Schmitt
17 Papers
66 Citations
Simon Schmitt is an academic researcher from Google. The author has contributed to research on the topics of computer science and reinforcement learning, has an h-index of 8, and has co-authored 14 publications.
Papers
Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model
Julian Schrittwieser, Ioannis Antonoglou, Thomas Hubert, Karen Simonyan, Laurent Sifre, Simon Schmitt, Arthur Guez, Edward Lockhart, Demis Hassabis, Thore Graepel, Timothy P. Lillicrap, David Silver +11 more
TL;DR: The MuZero algorithm is presented, which, by combining a tree-based search with a learned model, achieves superhuman performance in a range of challenging and visually complex domains, without any knowledge of their underlying dynamics.
1.4K
Mastering Atari, Go, chess and shogi by planning with a learned model
Julian Schrittwieser, Ioannis Antonoglou, Thomas Hubert, Karen Simonyan, Laurent Sifre, Simon Schmitt, Arthur Guez, Edward Lockhart, Demis Hassabis, Thore Graepel, Timothy P. Lillicrap, David Silver +11 more
TL;DR: MuZero is a reinforcement learning algorithm that combines a tree-based search with a learned model to achieve state-of-the-art performance in both high-performance planning domains and visually complex domains.
1.2K
Multi-task Deep Reinforcement Learning with PopArt
Matteo Hessel, Hubert Soyer, Lasse Espeholt, Wojciech Marian Czarnecki, Simon Schmitt, Hado van Hasselt +5 more
17 Jul 2019
TL;DR: This work proposes to automatically adapt the contribution of each task to the agent’s updates, so that all tasks have a similar impact on the learning dynamics, and learns a single trained policy that exceeds median human performance on this multi-task domain.
•Posted Content
Kickstarting Deep Reinforcement Learning
Simon Schmitt, J. J. Hudson, Augustin Zidek, Simon Osindero, Carl Doersch, Wojciech Marian Czarnecki, Joel Z. Leibo, Heinrich Küttler, Andrew Zisserman, Karen Simonyan, S. M. Ali Eslami +10 more
TL;DR: It is shown that, on a challenging and computationally-intensive multi-task benchmark (DMLab-30), kickstarted training improves the data efficiency of new agents, making it significantly easier to iterate on their design.
120
•Posted Content
Muesli: Combining Improvements in Policy Optimization
Matteo Hessel, Ivo Danihelka, Fabio Viola, Arthur Guez, Simon Schmitt, Laurent Sifre, Theophane Weber, David Silver, Hado van Hasselt +8 more
TL;DR: A novel policy update is proposed that combines regularized policy optimization with model learning as an auxiliary loss. It does so without using deep search: it acts directly with a policy network and has computation speed comparable to model-free baselines.