Subject / Keyword

Ensemble methods
Exploration
Prediction errors
RQFs
Reinforcement Learning
Reward bonus
UCB
Uncertainty estimates
Value bonuses

Item type

Thesis

Author / Creator / Contributor

Wahab, Abdul

Collections

Graduate and Postdoctoral Studies (GPS), Faculty of
Graduate and Postdoctoral Studies (GPS), Faculty of/Theses and Dissertations

Languages

English

Departments

Department of Computing Science

Supervisors

White, Martha (Computing Science)

Value Bonuses Using Ensemble Errors For Exploration in Reinforcement Learning

Spring 2024

Wahab, Abdul

Optimistic value estimates provide one mechanism for directed exploration in reinforcement learning (RL). The agent acts greedily with respect to an estimate of the value plus what can be seen as a value bonus. The value bonus can be learned by estimating a value function on reward bonuses,...
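
The mechanism described in the abstract, acting greedily on a value estimate plus a separately learned value bonus, can be illustrated with a minimal tabular sketch. This is not the thesis's method: the thesis derives its bonuses from ensemble errors, whereas the sketch below assumes a simple count-based reward bonus of 1/sqrt(n), an optimistic initial value bonus, and a small deterministic chain environment. The function name `run_chain` and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def run_chain(n_states=6, episodes=200, horizon=20, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning on a chain, with a value bonus learned from reward bonuses.

    Assumptions (not from the thesis): count-based reward bonus 1/sqrt(n),
    optimistic initialisation of the value bonus, deterministic chain dynamics.
    """
    rng = np.random.default_rng(seed)
    n_actions = 2                                   # 0 = left, 1 = right
    q = np.zeros((n_states, n_actions))             # value estimate for the task reward
    # Value bonus, started at the largest discounted sum of bonuses (each bonus <= 1).
    b = np.full((n_states, n_actions), 1.0 / (1.0 - gamma))
    counts = np.zeros((n_states, n_actions))
    goal = n_states - 1

    for _ in range(episodes):
        s = 0
        for _ in range(horizon):
            # Act greedily with respect to value plus value bonus; break ties at random.
            scores = q[s] + b[s]
            best = np.flatnonzero(scores == scores.max())
            a = int(rng.choice(best))

            s_next = min(s + 1, goal) if a == 1 else max(s - 1, 0)
            done = s_next == goal
            reward = 1.0 if done else 0.0

            counts[s, a] += 1
            reward_bonus = 1.0 / np.sqrt(counts[s, a])   # assumed bonus, shrinks with visits

            # Both tables bootstrap from the same optimistic greedy next action.
            a_next = int(np.argmax(q[s_next] + b[s_next]))
            boot = 0.0 if done else gamma
            q[s, a] += alpha * (reward + boot * q[s_next, a_next] - q[s, a])
            b[s, a] += alpha * (reward_bonus + boot * b[s_next, a_next] - b[s, a])

            s = s_next
            if done:
                break
    return q, b, counts
```

Calling `q, b, counts = run_chain()` returns the learned value table, the learned value-bonus table, and the visit counts. Keeping `q` and `b` in separate tables means the task value stays unbiased by the bonus, while the bonus only influences action selection.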
