K-percent Evaluation for Lifelong Reinforcement Learning

- Author / Creator
- Mesbahi, Golnaz
If we aspire to design algorithms that can run for long periods, continually
adapting to new, unexpected situations, then we must be willing to deploy
our agents without tuning their hyperparameters over the agent’s entire lifetime.
The standard practice in deep RL—and even continual RL—is to assume
unfettered access to the deployment environment for the full lifetime of the
agent. In this thesis, we propose a new approach for evaluating lifelong RL
agents where only k percent of the experiment data can be used for hyperparameter
tuning. We then conduct an empirical study of DQN and SAC
across a variety of continuing and non-stationary domains. We find that agents
generally perform poorly when restricted to k-percent tuning, whereas several
algorithmic mitigations designed to maintain network plasticity improve
performance. In addition, we explore the impact of the tuning budget (k)
on algorithm performance and hyperparameter selection, and we examine the
network properties of the various mitigation strategies to analyze their behavior.
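
To make the evaluation protocol concrete, the sketch below shows one plausible reading of k-percent evaluation: hyperparameters are searched using only the first k percent of the lifetime's steps, and the selected configuration is then deployed for the full lifetime with no further tuning. The helper names (`make_agent`, `run_agent`, `hyper_grid`) and the toy demo are illustrative assumptions, not the thesis's actual code.

```python
from typing import Callable, Dict, List


def k_percent_evaluate(
    make_agent: Callable[[Dict], object],
    run_agent: Callable[[object, int], float],
    lifetime_steps: int,
    k: float,
    hyper_grid: List[Dict],
) -> float:
    """Select hyperparameters using only the first k% of the lifetime,
    then deploy the chosen configuration untouched for the full lifetime.
    A sketch of the protocol described in the abstract, not the thesis code."""
    tuning_steps = int(lifetime_steps * k / 100)

    # Phase 1: hyperparameter search restricted to the k% tuning budget.
    best_config, best_return = None, float("-inf")
    for config in hyper_grid:
        agent = make_agent(config)
        score = run_agent(agent, tuning_steps)
        if score > best_return:
            best_config, best_return = config, score

    # Phase 2: a fresh agent with the selected configuration runs for the
    # entire lifetime; only this deployment performance is reported.
    agent = make_agent(best_config)
    return run_agent(agent, lifetime_steps)


if __name__ == "__main__":
    # Toy stand-ins: the "return" is just step-size times steps, so the
    # search trivially picks the larger step-size. Purely illustrative.
    result = k_percent_evaluate(
        make_agent=lambda cfg: cfg,
        run_agent=lambda agent, steps: agent["step_size"] * steps,
        lifetime_steps=1_000_000,
        k=5,
        hyper_grid=[{"step_size": 1e-4}, {"step_size": 1e-3}],
    )
    print(f"deployment return: {result:.1f}")
```
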
- Graduation date
- Fall 2024

- Type of Item
- Thesis

- Degree
- Master of Science

- License
- This thesis is made available by the University of Alberta Library with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.