K-percent Evaluation for Lifelong Reinforcement Learning

  • Author / Creator
    Mesbahi, Golnaz
  • Abstract
    If we aspire to design algorithms that can run for long periods,
    continually adapting to new, unexpected situations, then we must be
    willing to deploy our agents without tuning their hyperparameters over
    their entire lifetimes. The standard practice in deep RL, and even in
    continual RL, is to assume unfettered access to the deployment
    environment for the full lifetime of the agent. In this thesis, we
    propose a new approach for evaluating lifelong RL agents in which only
    k percent of the experiment data can be used for hyperparameter tuning
    (a minimal sketch of this protocol appears at the end of this record).
    We then conduct an empirical study of DQN and SAC across a variety of
    continuing and non-stationary domains. We find that agents generally
    perform poorly when restricted to k-percent tuning, whereas several
    algorithmic mitigations designed to maintain network plasticity improve
    performance. In addition, we explore the impact of the tuning budget k
    on algorithm performance and hyperparameter selection, and analyze the
    network properties of the various mitigation strategies to understand
    their behavior.

  • Subjects / Keywords
  • Graduation date
    Fall 2024
  • Type of Item
    Thesis
  • Degree
    Master of Science
  • DOI
    https://doi.org/10.7939/r3-xscc-9d48
  • License
    This thesis is made available by the University of Alberta Library with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
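
An illustrative Python sketch of the k-percent protocol described in the
abstract. This is not code from the thesis: run_agent, the configuration
format, and the use of average per-step reward as the selection score are
all assumptions made for illustration. The essential constraint is that
hyperparameters may be selected using only the first k percent of the
agent's lifetime and are then held fixed for the full run.

    import random

    def run_agent(config, num_steps, seed=0):
        # Stand-in for an actual training run (e.g., DQN or SAC);
        # returns a per-step reward trace. Hypothetical, for
        # illustration only.
        rng = random.Random(seed)
        return [rng.random() * config["step_size"] for _ in range(num_steps)]

    def k_percent_evaluation(configs, lifetime_steps, k):
        # Phase 1: rank configurations using only the first k% of the
        # lifetime, the entire tuning budget the protocol allows.
        tuning_steps = int(lifetime_steps * k / 100)

        def tuning_score(config):
            rewards = run_agent(config, num_steps=tuning_steps)
            return sum(rewards) / max(len(rewards), 1)

        best = max(configs, key=tuning_score)

        # Phase 2: deploy the selected configuration, unchanged, for
        # the full lifetime and report lifetime performance.
        rewards = run_agent(best, num_steps=lifetime_steps)
        return best, sum(rewards) / len(rewards)

    if __name__ == "__main__":
        configs = [{"step_size": s} for s in (0.1, 0.01, 0.001)]
        best, lifetime_return = k_percent_evaluation(
            configs, lifetime_steps=10_000, k=10)
        print("selected:", best, "lifetime average reward:", lifetime_return)

Note the contrast with standard practice: under the usual protocol, the
tuning loop in phase 1 would be allowed to observe the full lifetime
(k = 100), which is exactly the unfettered access the thesis argues
against.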