The Theoretical Foundation for Incremental Least-Squares Temporal Difference Learning

Zinkevich, Martin

doi:10.7939/R3PR7MW1W

Communities and Collections

Computing Science, Department of / Technical Reports (Computing Science)

Author(s) / Creator(s)
- Zinkevich, Martin

Technical report TR06-25. In this paper we present a mathematical foundation for Incremental Least-Squares Temporal Difference Learning (iLSTD) for policy evaluation in reinforcement learning with linear function approximation. iLSTD is an incremental method for achieving results similar to LSTD, the data-efficient, least-squares version of temporal difference learning, without incurring the full cost of the LSTD computation. Here, we give a technical foundation for the asymptotic properties of iLSTD.

Date created

2006
Subjects / Keywords
- Incremental Least-Squares Temporal Difference Learning
Type of Item

Report
DOI

https://doi.org/10.7939/R3PR7MW1W
License

Attribution 3.0 International

Language
- English