Advances in Simulation-Based Search and Batch Reinforcement Learning

Xiao, Chenjun

doi:10.7939/r3-hnkc-0844

Reinforcement learning (RL) defines a general computational problem where the learner must learn to make good decisions through interactive experience. To be effective in solving this problem, the learner must be able to explore the environment, make accurate predictions about the future, and compute strategic plans. These joint challenges distinguish RL from other machine learning problems. This dissertation considers two sub-topics of RL: Planning and Batch RL.

For planning, we contribute two novel techniques to improve the efficiency of Monte Carlo Tree Search (MCTS): 1) memory-augmented MCTS, which incorporates a memory structure into MCTS to generate an approximate value estimate that combines the estimates of similar states; and 2) a new MCTS algorithm that applies maximum entropy policy optimization to general sequential decision-making.
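
As a rough illustration of the first idea, a value estimate for a state can be formed by weighting the stored estimates of similar states. The sketch below is not taken from the thesis: the feature-vector representation of states, the squared-distance similarity, the softmax weighting, the temperature parameter, and the function name are all assumptions made for illustration only.

```python
import math


def memory_value_estimate(query, memory, temperature=1.0):
    """Similarity-weighted average of stored value estimates.

    query:  feature vector of the state being evaluated (hypothetical representation)
    memory: list of (feature_vector, value_estimate) pairs for previously seen states
    temperature: assumed smoothing parameter; smaller values favor the nearest entries
    """
    if not memory:
        raise ValueError("memory is empty")
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Assumed similarity: negative squared Euclidean distance, scaled by temperature.
    scores = [
        -sum((q - f) ** 2 for q, f in zip(query, features)) / temperature
        for features, _ in memory
    ]

    # Softmax over similarity scores, shifted by the maximum for numerical stability.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)

    # Combine the stored estimates using the normalized weights.
    return sum(w * value for w, (_, value) in zip(weights, memory)) / total
```

With two stored states equally distant from the query, the result is the plain average of their values; as the temperature shrinks, the estimate approaches the value of the nearest stored state.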

For batch RL, we offer three analyses towards a better understanding of the theoretical foundations of batch RL: 1) a minimax and instance-dependent analysis of batch policy optimization algorithms; 2) a characterization of the curse of passive data collection in batch RL; and 3) a theoretical analysis of convergence and generalization properties of value prediction algorithms with overparameterized models.
Graduation date

Spring 2023
Type of Item

Thesis
Degree

Doctor of Philosophy
DOI

https://doi.org/10.7939/r3-hnkc-0844
License

This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.

Language

English
Institution

University of Alberta
Degree level

Doctoral
Department
- Department of Computing Science
Supervisor / co-supervisor and their department(s)
- Müller, Martin (Computing Science)
- Schuurmans, Dale (Computing Science)