A Framework for Safe Evaluation of Offline Learning

Spring 2022

Radi, Hager

The world offers unprecedented amounts of data in real-world domains, from which we can develop successful decision-making systems. It is possible for reinforcement learning (RL) to learn control policies offline from such data but challenging to deploy an agent during learning in safety-critical...
Actor-Expert: A Framework for using Q-learning in Continuous Action Spaces

Fall 2019

Lim, Sungsu

Q-learning can be difficult to use in continuous action spaces, because a difficult optimization has to be solved to find the maximal action. Some common strategies have been to discretize the action space, solve the maximization with a powerful optimizer at each step, restrict the functional...
Adapting Behaviour via Intrinsic Reward

Spring 2021

Linke, Cameron

Learning about many things can provide numerous benefits to a reinforcement learning system. For example, learning many auxiliary value functions, in addition to optimizing the environmental reward, appears to improve both exploration and representation learning. The question we tackle in this...
Adaptive Representation for Policy Gradient

Spring 2015

Das Gupta, Ujjwal

Much of the focus on finding good representations in reinforcement learning has been on learning complex non-linear predictors of value. Methods like policy gradient, which do not learn a value function and instead represent the policy directly, often need fewer parameters to learn good policies....
Adaptive Search Control through Meta-Gradient Reinforcement Learning

Spring 2024

Burega, Bradley Thomas

In model-based reinforcement learning, an agent can improve its policy by planning: learning from experience generated by a model. Search control is the problem of determining which starting state should be used to generate this experience. Given a limited planning budget, an agent should be...
Advances in Simulation-Based Search and Batch Reinforcement Learning

Spring 2023

Xiao, Chenjun

Reinforcement learning (RL) defines a general computational problem where the learner must learn to make good decisions through interactive experience. To be effective in solving this problem, the learner must be able to explore the environment, make accurate predictions about the future, and...
An Empirical Study of Experience Replay for Control in Continuous State Domains

Fall 2022

Li, Xin

In this thesis, we investigate the empirical performance of several experience replay techniques. Efficient experience replay plays an important role in model-free reinforcement learning by improving sample efficiency through reusing past experience. However, replay-based methods were largely...
An Investigation into Reinforcement Learning in FPGA Placement Optimization

Fall 2023

Chen, Ruichen

With the increasing complexity and capacity of modern Field-Programmable Gate Arrays (FPGAs), there is a growing demand for efficient FPGA computer-aided design (CAD) tools, particularly at the placement stage. While some previous works, such as RLPlace, have explored the efficacy of single-state...
Analysis of an Alternate Policy Gradient Estimator for Softmax Policies

Spring 2022

Garg, Shivam

Policy gradient (PG) estimators are ineffective in dealing with softmax policies that are sub-optimally saturated, which refers to the situation when the policy concentrates its probability mass on sub-optimal actions. Sub-optimal policy saturation may arise from a bad policy initialization or a...
Characterizing Discrete Representations for Reinforcement Learning

Fall 2023

Meyer, Edan J

In reinforcement learning (RL), agents learn to maximize a reward signal using nothing but observations from the environment as input to their decision-making processes. Whether the agent is simple, consisting of only a policy that maps observations to actions, or complex, containing auxiliary...