Yi Wan
University of Alberta
9 Papers
25 Citations
Yi Wan is an academic researcher at the University of Alberta. The author has contributed to research on topics including reinforcement learning and Markov decision processes. The author has an h-index of 4 and has co-authored 9 publications.
Papers
•Posted Content
Planning with Expectation Models
TL;DR: It is shown that planning with an expectation model is equivalent to planning with a distribution model if the state value function is linear in state features, and two common parametrization choices for approximating the expectation are analyzed.
•Posted Content
Average-Reward Off-Policy Policy Evaluation with Function Approximation
TL;DR: In this article, the authors consider off-policy policy evaluation with function approximation (FA) in average-reward MDPs, where the goal is to estimate both the reward rate and the differential value function.
•Proceedings Article
Learning and Planning in Average-Reward Markov Decision Processes
Yi Wan, Abhishek Naik, Richard S. Sutton
- 18 Jul 2021
TL;DR: This paper introduces learning and planning algorithms for average-reward MDPs, including the first general proven-convergent off-policy model-free control algorithm that does not require reference states.
•Proceedings Article
Average-Reward Off-Policy Policy Evaluation with Function Approximation
Shangtong Zhang, Yi Wan, Richard S. Sutton, Shimon Whiteson
- 18 Jul 2021
TL;DR: In this paper, the authors consider off-policy policy evaluation with function approximation (FA) in average-reward MDPs, where the goal is to estimate both the reward rate and the differential value function.