This decommissioned ERA site remains active temporarily to support our final migration steps to https://ualberta.scholaris.ca, ERA's new home. All new collections and items, including Spring 2025 theses, are at that site. For assistance, please contact erahelp@ualberta.ca.

Search

Filter

Subject / Keyword

Show 4 more ...

Departments

76Department of Computing Science

Author / Creator / Contributor

Show 4 more ...

Year

Collections

Languages

76English

Item type

76Thesis

Supervisors

Show 4 more ...

Two-Timescale Networks for Nonlinear Value Function Approximation
Download

Fall 2019

Chung, Wesley

Policy evaluation, learning value functions, is an integral part of the reinforcement learning problem. In this thesis, I propose a neural network architecture, the Two-Timescale Network (TTN), for value function approximation which utilizes linear function approximation for the value function...
Unifying n-Step Temporal-Difference Action-Value Methods
Download

Spring 2019

Juan Fernando Hernandez Garcia

Unifying seemingly disparate algorithmic ideas to produce better performing algorithms has been a longstanding goal in reinforcement learning. As a primary example, the TD(λ) algorithm elegantly unifies temporal difference (TD) methods with Monte Carlo methods through the use of eligibility...
Useful Policy Invariant Shaping from Arbitrary Advice
Download

Spring 2020

Behboudian, Paniz

Reinforcement learning (RL) is a powerful learning paradigm in which agents can learn to maximize sparse and delayed reward signals. Although RL has had many impressive successes in complex domains, learning can take hours, days, or even years of training data. A major challenge of contemporary...
Using Prior Data to Facilitate Learning and Inference in New Environments
Download

Spring 2021

Wen, Junfeng

This dissertation demonstrates how to utilize data collected previously from different sources to facilitate learning and inference for a target task. Learning from scratch for a target task or environment can be expensive and time-consuming. To address this problem, we make three contributions...
Value Bonuses Using Ensemble Errors For Exploration in Reinforcement Learning
Download

Spring 2024

Wahab, Abdul

Optimistic value estimates provide one mechanism for directed exploration in reinforcement learning (RL). The agent acts greedily with respect to an estimate of the value plus what can be seen as a value bonus. The value bonus can be learned by estimating a value function on reward bonuses,...
Vector Step-size Adaptation for Continual, Online Prediction
Download

Fall 2019

Jacobsen, Andrew

In this thesis, we investigate different vector step-size adaptation approaches for continual, online prediction problems. Vanilla stochastic gradient descent can be considerably improved by scaling the update with a vector of appropriately chosen step-sizes. Many methods, including AdaGrad,...