On the benefits of sparsity in value function approximators for Reinforcement Learning

Davelouis Gallardo, Fatima D

doi:10.7939/r3-63xz-v628


Communities and Collections

Graduate and Postdoctoral Studies (GPS), Faculty of / Theses and Dissertations


Author / Creator

Davelouis Gallardo, Fatima D
Abstract

In machine learning, sparse neural networks are more computationally efficient than fully-connected networks and, in some cases, perform just as well. In the online, incremental reinforcement learning (RL) setting, the Prediction Adapted Networks (PANs) algorithm (Martin and Modayil, 2021) adapts the sparse connectivity of a shallow value network with random hidden-layer weights. Martin and Modayil evaluated PANs in the RL prediction setting and showed promising results, suggesting that multiple online predictions of input signals can be used to discover high-performing sparse neural network topologies with no a priori inductive biases. However, several questions about this algorithm remain open. Do the statistical benefits of PANs carry over to reinforcement learning control in multiple environments? Do PANs provide performance gains when the sparse value network's weights are learned end-to-end in both the prediction and control settings? How does predictive sparsity compare against sparse network structures learned end-to-end? The contributions of this work are twofold. First, we investigate the above questions and provide answers. Second, we devise a methodology that encodes sparse value network structures as binary masks and systematically evaluates their performance. In one RL control environment, we find that predictive sparsity performs on par with both a fully-connected architecture and a sparse network induced by L1 regularization. However, in another domain, PANs do not generate a sparse structure that can outperform even random sparsity. Surprisingly, in the same RL prediction environment used in the original PANs work, we find that learning the hidden-layer weights does not lead to better performance, suggesting there may be unidentified properties of environments for which PANs are best suited.
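The idea of encoding a sparse value network structure as a binary mask can be illustrated with a minimal sketch. This is not the thesis's implementation: the names `random_mask` and `MaskedValueNetwork`, the tanh activation, the fixed number of connections per hidden unit, and the TD(0) update on the output weights are all assumptions chosen for illustration. The random mask stands in for any structure (predictive, L1-induced, or random) that one might want to compare.

```python
import numpy as np


def random_mask(n_hidden, n_inputs, density, rng):
    """Hypothetical helper: binary mask giving each hidden unit the same
    number of randomly chosen input connections."""
    k = max(1, int(round(density * n_inputs)))
    mask = np.zeros((n_hidden, n_inputs))
    for i in range(n_hidden):
        mask[i, rng.choice(n_inputs, size=k, replace=False)] = 1.0
    return mask


class MaskedValueNetwork:
    """Shallow value network whose input-to-hidden connectivity is fixed
    by a binary mask (illustrative sketch, assumed design)."""

    def __init__(self, n_inputs, n_hidden, mask, rng):
        self.mask = mask
        # Random hidden-layer weights, kept fixed in this sketch.
        self.W = rng.normal(0.0, 1.0 / np.sqrt(n_inputs),
                            size=(n_hidden, n_inputs))
        # Output weights are the only learned parameters here.
        self.w_out = np.zeros(n_hidden)

    def features(self, x):
        # Element-wise masking removes the pruned connections.
        return np.tanh((self.W * self.mask) @ x)

    def value(self, x):
        return self.w_out @ self.features(x)

    def td0_update(self, x, reward, x_next, gamma, step_size, terminal=False):
        """One semi-gradient TD(0) step on the output weights."""
        phi = self.features(x)
        v_next = 0.0 if terminal else self.value(x_next)
        delta = reward + gamma * v_next - self.w_out @ phi
        self.w_out += step_size * delta * phi
        return delta


rng = np.random.default_rng(0)
mask = random_mask(n_hidden=8, n_inputs=10, density=0.3, rng=rng)
net = MaskedValueNetwork(n_inputs=10, n_hidden=8, mask=mask, rng=rng)
x, x_next = rng.normal(size=10), rng.normal(size=10)
delta = net.td0_update(x, reward=1.0, x_next=x_next, gamma=0.9, step_size=0.1)
```

Because the structure lives entirely in `mask`, different sparsity patterns can be evaluated by swapping the mask while holding the rest of the learner fixed, which is the kind of controlled comparison the abstract describes.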
Subjects / Keywords
- Reinforcement Learning
- Neural Network Sparsity
Graduation date

Spring 2024
Type of Item

Thesis
Degree

Master of Science
DOI

https://doi.org/10.7939/r3-63xz-v628
License

This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.

Language

English
Institution

University of Alberta
Degree level

Master's
Department
- Department of Computing Science
Supervisor / co-supervisor and their department(s)
- Bowling, Michael
- Martin, John D