-
2007
Wang, Tao, Bowling, Michael, Lizotte, Daniel, Schuurmans, Dale
Technical report TR07-10. We propose to use a new dual approach to dynamic programming. The idea is to maintain an explicit representation of stationary distributions as opposed to value functions. A significant advantage of the dual approach is that it allows one to exploit well-developed...
-
2008
Lizotte, Daniel, Wang, Tao, Bowling, Michael, Schuurmans, Dale
Technical report TR08-16. We propose a dual approach to dynamic programming and reinforcement learning based on maintaining an explicit representation of visit distributions as opposed to value functions. An advantage of working in the dual is that it allows one to exploit techniques for...
-
2006
Wang, Tao, Schuurmans, Dale, Bowling, Michael
Technical report TR06-26. We investigate the dual approach to dynamic programming and reinforcement learning, based on maintaining an explicit representation of stationary distributions as opposed to value functions. A significant advantage of the dual approach is that it allows one to exploit...
-
2007
Wang, Tao, Schuurmans, Dale, Bowling, Michael, Lizotte, Daniel
Technical report TR07-05. We investigate novel, dual algorithms for dynamic programming and reinforcement learning, based on maintaining explicit representations of stationary distributions instead of value functions. In particular, we investigate the convergence properties of standard dynamic...