-
Fall 2021
Structural credit assignment in neural networks is a long-standing problem, with a variety of alternatives to backpropagation proposed to allow for local training of nodes. One of the early strategies was to treat each node as an agent and use a reinforcement learning method called REINFORCE to...
-
Spring 2024
In reinforcement learning, agents solve problems through interactions with the environment. However, when faced with intricate environmental dynamics, learning can become challenging, resulting in sub-optimal policies. A potential remedy to this situation lies in the transfer of knowledge from...
-
Spring 2023
AlphaZero is a self-play reinforcement learning algorithm that achieves superhuman play in the games of chess, shogi, and Go via policy iteration. To be an effective policy improvement operator, AlphaZero’s search needs to have accurate value estimates for the states that appear in its search...
-
Toward Practical Reinforcement Learning Algorithms: Classification Based Policy Iteration and Model-Based Learning
Spring 2017
In this dissertation, we advance the theoretical understanding of two families of Reinforcement Learning (RL) methods: Classification-based policy iteration (CBPI) and model-based reinforcement learning (MBRL) with factored semi-linear models. In contrast to generalized policy iteration, CBPI...
-
Fall 2019
Policy evaluation, learning value functions, is an integral part of the reinforcement learning problem. In this thesis, I propose a neural network architecture, the Two-Timescale Network (TTN), for value function approximation which utilizes linear function approximation for the value function...
-
Spring 2019
Juan Fernando Hernandez Garcia
Unifying seemingly disparate algorithmic ideas to produce better performing algorithms has been a longstanding goal in reinforcement learning. As a primary example, the TD(λ) algorithm elegantly unifies temporal difference (TD) methods with Monte Carlo methods through the use of eligibility...
-
Spring 2020
Reinforcement learning (RL) is a powerful learning paradigm in which agents can learn to maximize sparse and delayed reward signals. Although RL has had many impressive successes in complex domains, learning can take hours, days, or even years of training data. A major challenge of contemporary...
-
Spring 2021
This dissertation demonstrates how to utilize data collected previously from different sources to facilitate learning and inference for a target task. Learning from scratch for a target task or environment can be expensive and time-consuming. To address this problem, we make three contributions...
-
Spring 2024
Optimistic value estimates provide one mechanism for directed exploration in reinforcement learning (RL). The agent acts greedily with respect to an estimate of the value plus what can be seen as a value bonus. The value bonus can be learned by estimating a value function on reward bonuses,...