Selective Dyna-style Planning Using Neural Network Models with Limited Capacity
-
- Author / Creator
- Zaheer, Muhammad
-
In model-based reinforcement learning, planning with an imperfect model of the environment has the potential to harm learning progress.
But even when a model is imperfect, it may still contain information that is useful for planning.
In this thesis, we investigate the idea of using an imperfect model selectively: the agent should plan in parts of the state space where the model would be helpful but refrain from using the model where it would be harmful.
An effective selective planning mechanism needs to account for at least three sources of model errors: stochastic dynamics of the environment, insufficient coverage of the state space, and limited capacity to model the dynamics.
Prior work has used parameter uncertainty for selective planning, where the estimated uncertainty signals the errors due to insufficient coverage.
In this work, we emphasize the importance of structural uncertainty that signals the errors due to limited capacity; we show that the learned input-dependent variance, under the standard Gaussian assumption, can be interpreted as an estimate of structural uncertainty.
We empirically evaluate the ability of the learned variance to help plan selectively under limited capacity.
The results show that selective planning with the learned variance can be useful, even when planning with the model non-selectively would cause catastrophic failure.
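To make the role of the learned variance concrete, the following is a minimal sketch and not the thesis's implementation. It relies on one standard fact: for a fixed predicted mean, the variance that minimizes the average Gaussian negative log-likelihood equals the mean squared error of that prediction, so a model that cannot fit a region well is pushed toward reporting a large variance there. The function names, the `threshold` parameter, and the variance-gating rule in `select_for_planning` are illustrative assumptions; the abstract does not specify how the variance is used to decide where to plan.

```python
import math


def gaussian_nll(target, mean, variance):
    """Negative log-likelihood of `target` under a Gaussian N(mean, variance)."""
    return 0.5 * (math.log(2.0 * math.pi * variance) + (target - mean) ** 2 / variance)


def average_nll(targets, mean, variance):
    """Average Gaussian NLL of several observed targets for one (mean, variance) pair."""
    return sum(gaussian_nll(t, mean, variance) for t in targets) / len(targets)


def optimal_variance(targets, mean):
    """Variance minimizing the average NLL for a fixed mean: the mean squared error."""
    return sum((t - mean) ** 2 for t in targets) / len(targets)


def select_for_planning(predictions, threshold):
    """Hypothetical gate: keep simulated transitions whose predicted variance
    is below `threshold`, and drop the rest from planning updates.

    `predictions` is a list of (state, predicted_mean, predicted_variance) tuples.
    """
    return [(state, mean) for state, mean, variance in predictions if variance < threshold]


# A region the model fits poorly: targets of 1.0 and 3.0 around a predicted mean of 2.0.
targets = [1.0, 3.0]
best = optimal_variance(targets, mean=2.0)

# Two model predictions: one confident, one with a large learned variance.
kept = select_for_planning([("s0", 0.1, 0.01), ("s1", 0.5, 2.0)], threshold=0.1)
```

Under these assumptions, `best` equals the squared error of the mean prediction, and only the low-variance transition for `s0` would be used for planning updates.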
- Subjects / Keywords
-
- Graduation date
- Spring 2020
-
- Type of Item
- Thesis
-
- Degree
- Master of Science
-
- License
- Permission is hereby granted to the University of Alberta Libraries to reproduce single copies of this thesis and to lend or sell such copies for private, scholarly or scientific research purposes only. Where the thesis is converted to, or otherwise made available in digital form, the University of Alberta will advise potential users of the thesis of these terms. The author reserves all other publication and other rights in association with the copyright in the thesis and, except as herein before provided, neither the thesis nor any substantial portion thereof may be printed or otherwise reproduced in any material form whatsoever without the author's prior written permission.