Linear Least-squares Dyna-style Planning
- Author(s) / Creator(s)
Technical report TR11-04. A world model is central to model-based reinforcement learning. For example, Dyna uses a model both in learning steps, to select actions, and in planning steps, to project sampled states or features. In this paper we propose the least-squares Dyna (LS-Dyna) algorithm to improve the accuracy of the world model and provide better planning. LS-Dyna is a special Dyna architecture in that it estimates the world model by a least-squares method. LS-Dyna is more data efficient than existing linear Dyna, which estimates the world model by gradient descent, yet it has the same complexity. Furthermore, the least-squares model is computed in an online, recursive fashion and does not require recording historical experience or tuning a step-size. Experimental results on a 98-state Boyan chain example and a Mountain-car problem show that LS-Dyna performs significantly better than TD/Q-learning and the gradient-descent linear Dyna algorithm.
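
To make the idea of an online, recursive least-squares world model concrete, here is a minimal sketch of a generic recursive least-squares (RLS) estimator for a linear model. It is not taken from the report: the abstract does not give the update equations, so the class name `RecursiveLinearModel`, the `ridge` initialization, the model form (next features approximated by `F @ phi`, reward by `b @ phi`), and the synthetic data in `demo` are all assumptions made for illustration.

```python
import numpy as np


class RecursiveLinearModel:
    """Hypothetical linear world model: phi_next ~ F @ phi, reward ~ b @ phi."""

    def __init__(self, n_features, ridge=1e-6):
        self.F = np.zeros((n_features, n_features))  # feature-transition estimate
        self.b = np.zeros(n_features)                # reward-model estimate
        # P tracks the inverse of (ridge * I + sum of phi phi^T); the ridge
        # term is an assumed regularizer that makes the initial inverse exist.
        self.P = np.eye(n_features) / ridge

    def update(self, phi, reward, phi_next):
        """Fold in one transition without storing past data or a step-size."""
        P_phi = self.P @ phi
        gain = P_phi / (1.0 + phi @ P_phi)
        # Correct both models along the gain direction using prediction errors.
        self.F += np.outer(phi_next - self.F @ phi, gain)
        self.b += (reward - self.b @ phi) * gain
        # Sherman-Morrison rank-one update of the inverse (P is symmetric).
        self.P -= np.outer(gain, P_phi)

    def predict(self, phi):
        """Project a feature vector one step ahead, as a planning step might."""
        return self.F @ phi, self.b @ phi


def demo(n_features=3, n_samples=200, seed=0):
    """Fit the model to noiseless synthetic transitions from a known linear system."""
    rng = np.random.default_rng(seed)
    F_true = 0.5 * rng.standard_normal((n_features, n_features))
    b_true = rng.standard_normal(n_features)
    model = RecursiveLinearModel(n_features)
    for _ in range(n_samples):
        phi = rng.standard_normal(n_features)
        model.update(phi, b_true @ phi, F_true @ phi)
    return model, F_true, b_true
```

Each update costs on the order of the square of the number of features and uses only the current transition, which is the property the abstract highlights. In a Dyna-style loop, the output of `predict` would presumably feed a TD-style update on projected features; that wiring is left out here because the abstract does not specify it.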
- Date created
- 2011
- Type of Item
- Report
- License
- Attribution 3.0 International