Dynamic Tuning of PI-Controllers based on Model-free Reinforcement Learning Methods

Spring 2010

Abbasi Brujeni, Lena

In this thesis, a Reinforcement Learning (RL) method called Sarsa is used to dynamically tune a PI-controller for a Continuous Stirred Tank Heater (CSTH) experimental setup. The proposed approach uses an approximate model to train the RL agent in the simulation environment before implementation...
Effective Real-time Reinforcement Learning for Vision-Based Robotic Tasks

Spring 2023

Wang, Yan

Vision is one of the essential means for humans to perceive the world. Similarly, today's intelligent robot agents rely on camera images to perform complex tasks in the real world. Due to the ever-changing nature of the real world, intelligent robot agents must continually learn from...
Efficient Exploration in Reinforcement Learning through Time-Based Representations

Spring 2019

Cholodovskis Machado, Marlos

In the reinforcement learning (RL) problem an agent must learn how to act optimally through trial-and-error interactions with a complex, unknown, stochastic environment. The actions taken by the agent influence not just the immediate reward it observes but also the future states and rewards it...
Experiments in off-policy reinforcement learning with the GQ(lambda) algorithm

Spring 2011

Delp, Michael

Off-policy reinforcement learning is useful in many contexts. Maei, Sutton, Szepesvari, and others have recently introduced a new class of algorithms, the most advanced of which is GQ(lambda), for off-policy reinforcement learning. These algorithms are the first stable methods for general...
Experiments with Hex in OpenSpiel and AlphaZero

Fall 2022

Daliri, Mohammadreza

OpenSpiel is an open-source software system for implementing high-performance software players for many different computer games. Hex is a two-player game of perfect information used in a variety of computer games research projects. The OpenSpiel project has implemented a version of the AlphaZero...
Feature Generalization in Deep Reinforcement Learning: An Investigation into Representation Properties

Fall 2022

Miahi, Erfan

In this thesis, we investigate the connection between the properties and the generalization performance of representations learned by deep reinforcement learning algorithms. Much of the earlier work on representation learning for reinforcement learning focused on designing fixed-basis...
Game-independent AI agents for playing Atari 2600 console games

Spring 2010

Naddaf, Yavar

This research focuses on developing AI agents that play arbitrary Atari 2600 console games without having any game-specific assumptions or prior knowledge. Two main approaches are considered: reinforcement learning based methods and search based methods. The RL-based methods use feature vectors...
Goal-Space Planning with Subgoal Models

Fall 2022

Lo, Chunlok

This thesis investigates a new approach to model-based reinforcement learning using background planning: mixing (approximate) dynamic programming updates and model-free updates, similar to the Dyna architecture. Background planning with learned models is often worse than model-free alternatives,...
Gradient Temporal-Difference Learning Algorithms

Fall 2011

Maei, Hamid Reza

We present a new family of gradient temporal-difference (TD) learning methods with function approximation whose complexity, both in terms of memory and per-time-step computation, scales linearly with the number of learning parameters. TD methods are powerful prediction techniques, and with...
Improving the reliability of reinforcement learning algorithms through biconjugate Bellman errors

Spring 2024

Patterson, Andrew

In this thesis, we seek to improve the reliability of reinforcement learning algorithms for nonlinear function approximation. Semi-gradient temporal difference (TD) update rules form the basis of most state-of-the-art value function learning systems despite clear counterexamples proving their...