An Empirical Study of Exploration Strategies for Model-Free Reinforcement Learning

Yasui, Nikolaus Winget

doi:10.7939/r3-8fvc-9g30

Communities and Collections

Graduate and Postdoctoral Studies (GPS), Faculty of / Theses and Dissertations

Author / Creator

Yasui, Nikolaus Winget
Reinforcement learning is a formalism for learning by trial and error. Unfortunately, trial and error can take a long time to find a solution if the agent does not efficiently explore the behaviours available to it. Moreover, how an agent ought to explore depends on the task that the agent is trying to learn. In this thesis, we study how an agent's exploration strategy affects how quickly it learns to solve different problems. In particular, we focus on model-free algorithms that learn value functions. We first examine the space of problems, or environments, that reinforcement learning agents are expected to solve. We identify six properties of environments that can make exploration difficult, and design a prototypical environment expressing each property. We also survey the exploration literature and categorize existing exploration methods by the heuristic that guides their behaviour. Lastly, we conduct an empirical study evaluating the performance of several exploration methods on our prototypical exploration environments. We found that only one method, Linear Upper Confidence Least Squares, performed consistently well in every environment. We also found that methods that add a bonus to their value function tended to explore much more effectively than methods that add a bonus to their rewards. Our investigation of value-based exploration provides a novel, systematic approach to understanding the strengths and weaknesses of exploration algorithms in reinforcement learning.
Graduation date

Spring 2020
Type of Item

Thesis
Degree

Master of Science
DOI

https://doi.org/10.7939/r3-8fvc-9g30
License

Permission is hereby granted to the University of Alberta Libraries to reproduce single copies of this thesis and to lend or sell such copies for private, scholarly or scientific research purposes only. Where the thesis is converted to, or otherwise made available in digital form, the University of Alberta will advise potential users of the thesis of these terms. The author reserves all other publication and other rights in association with the copyright in the thesis and, except as hereinbefore provided, neither the thesis nor any substantial portion thereof may be printed or otherwise reproduced in any material form whatsoever without the author's prior written permission.

Language

English
Institution

University of Alberta
Degree level

Master's
Department
- Department of Computing Science
Supervisor / co-supervisor and their department(s)
- White, Martha (Computing Science)
- White, Adam (Computing Science)