
Methodical Advice Collection and Reuse in Deep Reinforcement Learning

  • Author / Creator
    Sahir
  • Abstract
    Reinforcement learning (RL) has shown great success in solving many challenging tasks through the use of deep neural networks. Although deep learning brings immense representational power to RL, it also causes sample inefficiency: the algorithms are data-hungry and require millions of training samples to converge to an adequate policy. One way to combat this issue is action advising in a teacher-student framework, where a knowledgeable teacher provides action advice to a student. Despite promising results in the action advising literature, limitations remain, such as a limited advice budget and inflexibility in how advice collection and reuse are conducted. This thesis proposes using a single uncertainty (the student agent's) or dual uncertainties (the student's and that of a model of the teacher) to drive the advice collection and reuse process, giving the algorithms more flexibility to exploit a teacher's advice budget efficiently (see the sketch below). Additionally, this thesis introduces a new method to compute uncertainty for a deep RL agent using a secondary neural network. The results show that using two uncertainties to drive advice collection and reuse improves learning performance across several Atari games.
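    The following is a minimal sketch of the general idea of uncertainty-driven advice collection, not the thesis's actual method: the names (EnsembleStudent, teacher_policy), the ensemble-variance uncertainty measure, and all parameter values are illustrative assumptions.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    N_ACTIONS = 4
    N_FEATURES = 16

    class EnsembleStudent:
        """Toy student whose uncertainty is the disagreement (variance)
        across a small ensemble of random linear Q-value heads."""
        def __init__(self, n_heads=5):
            self.W = rng.normal(size=(n_heads, N_FEATURES, N_ACTIONS))

        def act_and_uncertainty(self, obs):
            q = obs @ self.W                    # (n_heads, N_ACTIONS)
            uncertainty = q.var(axis=0).mean()  # head disagreement
            return int(q.mean(axis=0).argmax()), float(uncertainty)

    def teacher_policy(obs):
        """Stand-in for a pretrained teacher; a fixed deterministic rule."""
        return int((abs(obs.sum()) * 100) % N_ACTIONS)

    def run_episode(student, budget=10, threshold=16.0, n_steps=50):
        """Collect teacher advice only when the student is uncertain
        and the advice budget is not yet exhausted."""
        advice_used = 0
        for _ in range(n_steps):
            obs = rng.normal(size=N_FEATURES)  # stand-in observation
            action, unc = student.act_and_uncertainty(obs)
            if advice_used < budget and unc > threshold:
                action = teacher_policy(obs)   # follow the teacher's advice
                advice_used += 1
            # ... environment step and student learning update omitted ...
        return advice_used

    student = EnsembleStudent()
    print("advice steps used:", run_episode(student))
    ```

    In the dual-uncertainty variant described in the abstract, a learned model of the teacher would additionally gate advice reuse, so collected advice can be replayed only in states where that model is itself confident; that second gate is omitted here for brevity.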

  • Subjects / Keywords
  • Graduation date
    Spring 2022
  • Type of Item
    Thesis
  • Degree
    Master of Science
  • DOI
    https://doi.org/10.7939/r3-r13e-n370
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.