On Efficient Planning in Large Action Spaces with Applications to Cooperative Multi-Agent Reinforcement Learning

Tkachuk, Volodymyr

doi:doi:10.7939/r3-hj98-1s59

This decommissioned ERA site remains active temporarily to support our final migration steps to https://ualberta.scholaris.ca, ERA's new home. All new collections and items, including Spring 2025 theses, are at that site. For assistance, please contact erahelp@ualberta.ca.

View

Download

Communities and Collections

Graduate and Postdoctoral Studies (GPS), Faculty of / Theses and Dissertations

Usage

171 views
206 downloads

On Efficient Planning in Large Action Spaces with Applications to Cooperative Multi-Agent Reinforcement Learning

Author / Creator

Tkachuk, Volodymyr
A practical challenge in reinforcement learning is large action spaces that make planning computationally demanding. For example, in cooperative multi-agent reinforcement learning, a potentially large number of agents jointly optimize a global reward function, which leads to a blow-up in the action space as the number of agents increases. Building on recent work in planning with local access to a simulator and linear function approximation, we propose efficient algorithms for this setting that lead to polynomial compute and query complexity in all relevant problem parameters. As a minimal requirement, we assume access to an argmax oracle that allows to efficiently compute the greedy policy for any q-function in the model class. For the special case where the feature decomposition is additive, we further improve the bounds if the dimension of the feature space is large relative to the number of additive terms in the feature decomposition. To the best of our knowledge, this work provides the first computationally efficient algorithms with theoretical guarantees for the reinforcement learning problem, when the action space is large and we have access to a simulator.
Subjects / Keywords
Graduation date

Fall 2023
Type of Item

Thesis
Degree

Master of Science
DOI

https://doi.org/10.7939/r3-hj98-1s59
License

This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.

Language

English
Institution

University of Alberta
Degree level

Master's
Department
- Department of Computing Science
Supervisor / co-supervisor and their department(s)
- Szepesvári, Csaba