Languages: English (4)

Item type: Thesis (4)

Departments: Department of Computing Science (4)

Analysis of an Alternate Policy Gradient Estimator for Softmax Policies

Spring 2022

Garg, Shivam

Policy gradient (PG) estimators are ineffective in dealing with softmax policies that are sub-optimally saturated, which refers to the situation when the policy concentrates its probability mass on sub-optimal actions. Sub-optimal policy saturation may arise from a bad policy initialization or a...
Consistent Emphatic Temporal-Difference Learning

Fall 2023

He, Jiamin

Off-policy policy evaluation has been a critical and challenging problem in reinforcement learning, and Temporal-Difference (TD) learning is one of the most important approaches for addressing it. There has been significant interest in searching for off-policy TD algorithms which find the same...
Effective Real-time Reinforcement Learning for Vision-Based Robotic Tasks

Spring 2023

Wang, Yan

Vision is one of the essential means for humans to perceive the world. Similarly, today's intelligent robot agents rely on camera images to perform complex tasks in the real world. Due to the ever-changing nature of the real world, intelligent robot agents must continually learn from...
Investigating Two Policy Gradient Methods Under Different Time Discretizations

Fall 2021

Farrahi, Homayoon

Continuous-time reinforcement learning tasks commonly use discrete time steps of fixed cycle times for actions. Choosing a small action-cycle time in such tasks allows reinforcement learning agents fast reaction and a more temporally detailed perception of the environment. The learning...