Reinforcement Learning Algorithms for MDPs
-
- Author(s) / Creator(s)
-
Technical report TR09-13. This article presents a survey of reinforcement learning algorithms for Markov Decision Processes (MDPs). The first half of the article considers the problem of value estimation: we begin by describing the ideas of bootstrapping and temporal-difference learning, then compare incremental and batch algorithmic variants and discuss how the choice of function-approximation method affects the success of learning. The second half describes methods that target the problem of learning to control an MDP: online and active learning are discussed first, followed by a description of direct and actor-critic methods. | TRID-ID TR09-13
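As an illustrative aside (not part of the report itself), the bootstrapping idea mentioned in the abstract can be shown with a minimal tabular TD(0) sketch on the classic five-state random walk; all state/reward conventions and constants below are assumptions chosen for the demo, not taken from the report.

```python
import random

N_STATES = 5   # non-terminal states 0..4; -1 and 5 act as terminal states
ALPHA = 0.1    # step size
GAMMA = 1.0    # undiscounted episodic task

def step(s):
    """Uniform random policy: move left or right with equal probability.
    Reward +1 is received only on exiting into the right terminal state."""
    s_next = s + random.choice((-1, 1))
    if s_next == N_STATES:
        return s_next, 1.0, True
    if s_next == -1:
        return s_next, 0.0, True
    return s_next, 0.0, False

V = [0.0] * N_STATES  # value estimates for the non-terminal states

for episode in range(10_000):
    s = N_STATES // 2  # every episode starts in the middle state
    done = False
    while not done:
        s_next, r, done = step(s)
        # Bootstrapping: the target uses the current estimate V(s')
        target = r + (0.0 if done else GAMMA * V[s_next])
        V[s] += ALPHA * (target - V[s])  # TD(0) update
        s = s_next

# Estimates approach the true values (i + 1) / 6 for states i = 0..4
print([round(v, 2) for v in V])
```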
-
- Date created
- 2009
-
- Subjects / Keywords
-
- Artificial Intelligence
- Monte-Carlo methods
- Reinforcement learning
- Actor-critic methods
- Stochastic approximation
- Markov decision processes
- Active learning
- Overfitting
- Least-squares methods
- Temporal difference learning
- Simulations
- Policy gradient
- Two-timescale stochastic approximation
- Q-learning
- Online learning
- Function approximation
- Natural gradient
- PAC-learning
- Machine Learning
- Planning
- Stochastic gradient methods
- Simulation optimization
- Bias-variance tradeoff
-
- Type of Item
- Report
-
- License
- Attribution 3.0 Unported