Faster Gradient-TD Algorithms
-
- Author / Creator
- Hackman, Leah M
-
Gradient-TD methods are a new family of learning algorithms that are stable and convergent under a wider range of conditions than previous reinforcement learning algorithms. In particular, gradient-TD algorithms enable learning on off-policy problems---problems where the distribution of the data is different from the distribution the learner seeks to learn about---while using function approximation in a data-efficient, online manner. Despite these positive features, previous empirical work, though limited, suggests that gradient-TD methods are slower than they could be. One example of this slowness arises in on-policy problems, where gradient-TD methods have been shown to be slower than conventional-TD methods in some cases (Maei, 2011). In this thesis, we examine this slowness through on- and off-policy experiments and introduce several variations of existing gradient-TD algorithms in search of "faster" gradient-TD methods. We then introduce hybrid gradient-TD methods, a class of algorithms unique in their ability to use conventional-TD and gradient-TD learning updates when appropriate. We introduce three algorithms, two of which are hybrid gradient-TD methods, and close with the first experimental results for these methods. In particular, we present promising results indicating that one of our new algorithms provides the benefits of a hybrid gradient-TD method while outperforming previous gradient-TD methods.
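The conventional-TD and gradient-TD updates the abstract contrasts can be sketched concretely. The following is a minimal illustration of standard linear TD(0) and GTD2 (Sutton et al., 2009) on a small two-state chain; it is not code from the thesis, and the chain MRP, step sizes, and function names are illustrative assumptions. GTD2 carries a second weight vector `w` that tracks the expected TD error given the features, which is what makes it stable off-policy at the cost of the extra learning process whose slowness the thesis investigates.

```python
import numpy as np

GAMMA = 0.9

# Illustrative MRP (an assumption, not from the thesis): a deterministic
# two-state chain s0 -(r=0)-> s1 -(r=1)-> terminal, with tabular (one-hot)
# features, so both methods share the same fixed point:
# v(s1) = 1, v(s0) = GAMMA * v(s1) = 0.9.
PHI = np.eye(2)
TRANSITIONS = [
    (PHI[0], 0.0, PHI[1]),       # s0 -> s1, reward 0
    (PHI[1], 1.0, np.zeros(2)),  # s1 -> terminal (terminal features are zero)
]

def td0_update(theta, phi, r, phi_next, alpha):
    """Conventional linear TD(0): theta += alpha * delta * phi."""
    delta = r + GAMMA * phi_next @ theta - phi @ theta
    return theta + alpha * delta * phi

def gtd2_update(theta, w, phi, r, phi_next, alpha, beta):
    """GTD2: a second weight vector w estimates the expected TD error
    given the features; theta descends the MSPBE gradient estimate."""
    delta = r + GAMMA * phi_next @ theta - phi @ theta
    theta = theta + alpha * (phi - GAMMA * phi_next) * (w @ phi)
    w = w + beta * (delta - w @ phi) * phi
    return theta, w

theta_td = np.zeros(2)
theta_gtd, w = np.zeros(2), np.zeros(2)
for _ in range(5000):
    for phi, r, phi_next in TRANSITIONS:
        theta_td = td0_update(theta_td, phi, r, phi_next, alpha=0.1)
        theta_gtd, w = gtd2_update(theta_gtd, w, phi, r, phi_next,
                                   alpha=0.05, beta=0.25)

# On-policy with full-rank features, both weight vectors approach [0.9, 1.0].
```

Because this tiny problem is on-policy with tabular features, both methods reach the same values; the point of the sketch is only the shape of the two updates. GTD2's secondary weights `w` must themselves be learned, which is one source of the empirical slowness the thesis examines.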
-
- Graduation date
- Spring 2013
-
- Type of Item
- Thesis
-
- Degree
- Master of Science
-
- License
- This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.