An Empirical Study of Model-Free Exploration for Deep Reinforcement Learning

  • Author / Creator
    Zhao, Xutong
  • Reinforcement learning (RL) is a learning paradigm concerned with how agents interact with an environment to maximize the cumulative reward signal emitted by that environment. The exploration-versus-exploitation trade-off is a central challenge in RL research: the agent must balance taking a sequence of actions known to be rewarding against exploring unknown actions that might be more rewarding. Exploration research in RL often uses algorithms and environments with many degrees of freedom, which can interfere with the interpretability of results. This thesis presents a systematic, yet simple, study of exploration methods for value-based control algorithms. We present a novel suite of small environments that each pose a distinct exploration challenge. Our environment designs allow us to observe the strengths and weaknesses of individual exploration methods, as well as trends across implementation details and conceptual approaches to exploration. We conduct a literature survey and categorize model-free exploration approaches by their underlying heuristics. We also empirically evaluate the performance of representative exploration methods on our exploration domains. Despite the simplicity of our environments, none of the tested exploration methods achieves good performance in all environments. However, some methods consistently improve upon the Q-learning baseline. Beyond our survey results, our suite of interpretable environments can be used as a sanity check to ensure that an exploration method behaves appropriately in simple situations.

  • Subjects / Keywords
  • Graduation date
    Fall 2021
  • Type of Item
  • Degree
    Master of Science
  • DOI
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.