-
Spring 2020
The predictive representations hypothesis is that representing the state of the world in terms of predictions about the future will result in good generalization. In this thesis, good generalization is specifically quantified by good learning performance in both accuracy and speed when predicting...
-
Spring 2021
Temporal difference (TD) methods provide a powerful means of learning to make predictions in an online, model-free, and highly scalable manner. In the reinforcement learning (RL) framework, we formalize these prediction targets in terms of a (possibly discounted) sum of rewards, called the...
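The prediction target described here, a (possibly discounted) sum of rewards, can be sketched concretely. A minimal Python illustration, with function names and the discount value chosen for this example only:

```python
def discounted_return(rewards, gamma=0.9):
    # G_t = r_{t+1} + gamma * r_{t+2} + gamma^2 * r_{t+3} + ...
    # Accumulate from the last reward backward, so each pass applies one discount.
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```

A TD method learns to predict this quantity online, from one transition at a time, rather than waiting for the full sum to be observed.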
-
Spring 2011
Off-policy reinforcement learning is useful in many contexts. Maei, Sutton, Szepesvari, and others have recently introduced a new class of algorithms, the most advanced of which is GQ(lambda), for off-policy reinforcement learning. These algorithms are the first stable methods for general...
-
Extending the Sliding-step Technique of Stochastic Gradient Descent to Temporal Difference Learning
Fall 2018
Stochastic gradient descent is at the heart of many recent advances in machine learning. In each of a series of steps, stochastic gradient descent processes an example and adjusts the weight vector in the direction that would most reduce the error for that example. A step-size parameter is used...
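The per-example update described in this abstract can be sketched for a linear predictor with squared error; the names and the default step size here are illustrative, not taken from the thesis:

```python
def sgd_step(w, x, y, alpha=0.01):
    # Prediction error for this example under squared loss, err^2 / 2
    error = sum(wi * xi for wi, xi in zip(w, x)) - y
    # Adjust each weight opposite the gradient, scaled by the step-size alpha
    return [wi - alpha * error * xi for wi, xi in zip(w, x)]
```

The step-size parameter alpha controls how far each example moves the weight vector.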
-
Spring 2013
Gradient-TD methods are a new family of learning algorithms that are stable and convergent under a wider range of conditions than previous reinforcement learning algorithms. In particular, gradient-TD algorithms enable off-policy problems---problems where the distribution of the data is different...
-
Fall 2023
Of all the capabilities of natural intelligence, one of the most exceptional is the ability to expand upon and refine knowledge of the world through subjective experience. Therefore, a longstanding goal of Artificial Intelligence has been to replicate this success: to enable artificial agents to...
-
Spring 2022
In this dissertation, we study online off-policy temporal-difference learning algorithms, a class of reinforcement learning algorithms that can learn predictions in an efficient and scalable manner. The contributions of this dissertation are of two kinds: (1) empirically studying existing...
-
Spring 2015
This thesis consists of two independent projects, each contributing to a central goal of artificial intelligence research: to build computer systems that are capable of performing tasks and solving problems without problem-specific direction from us, their designers. I focus on two formal...
-
Fall 2009
Learning and planning are two fundamental problems in artificial intelligence. The learning problem can be tackled by reinforcement learning methods, such as temporal-difference learning, which update a value function from real experience, and use function approximation to generalise across...
-
Strengths, Weaknesses, and Combinations of Model-based and Model-free Reinforcement Learning
Spring 2016
Reinforcement learning algorithms are conventionally divided into two approaches: a model-based approach that builds a model of the environment and then computes a value function from the model, and a model-free approach that directly estimates the value function. The first contribution of this...
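The two approaches contrasted here can be illustrated side by side. A minimal tabular sketch with hypothetical names, where P and R stand for a (learned) transition and reward model:

```python
def model_based_backup(P, R, v, s, gamma=0.9):
    # Compute a value from the model: expected reward plus
    # the discounted expected value of the next state.
    return R[s] + gamma * sum(p * v_next for p, v_next in zip(P[s], v))

def model_free_backup(v_s, r, v_next, alpha=0.1, gamma=0.9):
    # Estimate the value directly from one sampled transition, with no model.
    return v_s + alpha * (r + gamma * v_next - v_s)
```

The difference is only where each update gets its target: from the model's expectation, or from a single sampled reward and next-state value.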