Search

Filter

Supervisors

Show 4 more ...

Author / Creator / Contributor

Show 4 more ...

Subject / Keyword

Show 4 more ...

Year

Collections

Languages

25English

Item type

25Thesis

Departments

25Department of Computing Science

A Distribution Dependent Analysis of Meta-Learning
Download

Spring 2022

Konobeev, Mikhail

A key problem in the theory of meta-learning is to understand how the task distributions influence transfer risk, the expected error of a meta-learner on a new task drawn from the unknown task distribution. In this work, focusing on fixed design linear regression with Gaussian noise and a...
An Empirical Study of Exploration Strategies for Model-Free Reinforcement Learning
Download

Spring 2020

Yasui, Nikolaus Winget

Reinforcement Learning is a formalism for learning by trial and error. Unfortunately, trial and error can take a long time to find a solution if the agent does not efficiently explore the behaviours available to it. Moreover, how an agent ought to explore depends on the task that the agent is...
Analysis of an Alternate Policy Gradient Estimator for Softmax Policies
Download

Spring 2022

Garg, Shivam

Policy gradient (PG) estimators are ineffective in dealing with softmax policies that are sub-optimally saturated, which refers to the situation when the policy concentrates its probability mass on sub-optimal actions. Sub-optimal policy saturation may arise from a bad policy initialization or a...
Chasing Hallucinated Value: A Pitfall of Dyna Style Algorithms with Imperfect Environment Models
Download

Spring 2020

Jafferjee, Taher

In Dyna style algorithms, reinforcement learning (RL) agents use a model of the environment to generate simulated experience. By updating on this simulated experience, Dyna style algorithms allow agents to potentially learn control policies in fewer environment interactions than agents that use...
Continuous Multilevel Actions in Reinforcement Learning
Download

Fall 2023

Mitchell, Daniel

Multilevel action selection is a reinforcement learning technique in which an action is broken into two parts, the type and the parameters. When using multilevel action selection in reinforcement learning, one must break the action space into multiple subsets. These subsets are typically disjoint...
Directly Learning Predictors on Missing Data with Neural Networks
Download

Fall 2023

Awwal, Alvina

The problem of missing data is omnipresent in a wide range of real-world datasets. When learning and predicting on this data with neural networks, the typical strategy is to fill-in or complete these missing values in the dataset, called impute-then-regress. Much less common is to attempt to...
Distributional Losses for Regression
Download

Spring 2019

Imani, Ehsan

In this thesis we introduce a new loss for regression, the Histogram Loss. There is some evidence that, in the problem of sequential decision making, estimating the full distribution of return offers a considerable gain in performance, even though only the mean of that distribution is used in...
Feature Generalization in Deep Reinforcement Learning: An Investigation into Representation Properties
Download

Fall 2022

Miahi, Erfan

In this thesis, we investigate the connection between the properties and the generalization performance of representations learned by deep reinforcement learning algorithms. Much of the earlier work on representation learning for reinforcement learning focused on designing fixed-basis...
Goal-Space Planning with Subgoal Models
Download

Fall 2022

Lo, Chunlok

This thesis investigates a new approach to model-based reinforcement learning using background planning: mixing (approximate) dynamic programming updates and model-free updates, similar to the Dyna architecture. Background planning with learned models is often worse than model-free alternatives,...
Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences
Download

Fall 2020

Chan, Alan

Policy gradient methods typically estimate both explicit policy and value functions. The long-extant view of policy gradient methods as approximate policy iteration---alternating between policy evaluation and policy improvement by greedification---is a helpful framework to elucidate algorithmic...