Accounting for Hyperparameter Tuning in Online Reinforcement Learning
-
- Author / Creator
- Hakhverdyan, Anna
-
Most work in online reinforcement learning (RL) tunes hyperparameters in an offline phase, without accounting for the environment interaction that tuning consumes. This empirical methodology is a reasonable way to assess how well algorithms can perform, but it is limited when evaluating algorithms for practical deployment in the real world. In many applications, the environment is incompatible with exhaustive hyperparameter searches, and typical evaluations do not characterize how much data such searches require. We investigate online tuning, where the agent must select hyperparameters during interaction: hyperparameter tuning becomes part of the agent rather than a separate hidden phase. We layer a Bayesian optimizer over standard RL algorithms and assess their behaviour when tuning hyperparameters online. We show the expected result: this strategy's success depends on both the environment and the algorithm. We then introduce a way of tuning that mitigates wasteful resetting and show that it can achieve performance comparable to, but not better than, the default hyperparameter values, highlighting the need for further development.
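
For illustration only, here is a minimal sketch of the naive layered setup the abstract describes, assuming a toy multi-armed bandit environment, an epsilon-greedy agent with a step-size hyperparameter, and scikit-optimize's ask/tell interface; the environment, agent, and library choice are assumptions for this sketch, not the thesis's actual implementation.

    import numpy as np
    from skopt import Optimizer
    from skopt.space import Real

    rng = np.random.default_rng(0)

    def run_trial(step_size, n_arms=10, n_steps=200):
        """Run one epsilon-greedy agent with the given step size on a
        freshly sampled bandit problem; return the total reward."""
        q_true = rng.normal(size=n_arms)        # true arm values
        q_est = np.zeros(n_arms)                # agent's value estimates
        total = 0.0
        for _ in range(n_steps):
            if rng.random() < 0.1:              # explore
                a = int(rng.integers(n_arms))
            else:                               # exploit
                a = int(np.argmax(q_est))
            r = q_true[a] + rng.normal(scale=0.5)
            q_est[a] += step_size * (r - q_est[a])  # constant step-size update
            total += r
        return total

    # Bayesian optimizer proposing step sizes on a log scale. Every proposal
    # spends real interaction and restarts the agent from scratch: the
    # "wasteful resetting" the abstract refers to.
    opt = Optimizer([Real(1e-3, 1.0, prior="log-uniform")])
    for _ in range(15):
        (alpha,) = opt.ask()       # optimizer proposes a hyperparameter
        ret = run_trial(alpha)     # interaction consumed by this trial
        opt.tell([alpha], -ret)    # skopt minimizes, so negate the return

    best_y, best_x = min(zip(opt.yi, opt.Xi))
    print(f"best step size: {best_x[0]:.4f}, return: {-best_y:.1f}")

Note that each proposed hyperparameter restarts the agent from scratch, which is precisely the wasteful resetting the thesis's proposed tuning method aims to mitigate.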
-
- Graduation date
- Fall 2024
-
- Type of Item
- Thesis
-
- Degree
- Master of Science
-
- License
- This thesis is made available by the University of Alberta Library with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.