-
1995
de Bruin, Arie, Plaat, Aske, Schaeffer, Jonathan, Pijls, Wim
Technical report TR95-15. This paper has three main contributions to our understanding of fixed-depth minimax search: (A) A new formulation for Stockman's SSS* algorithm, based on Alpha-Beta, is presented. It solves all the perceived drawbacks of SSS*, finally transforming it into a practical...
-
1992
Technical report TR92-02. This paper presents some experimental results and analyses of the gene invariant genetic algorithm (GIGA). Although a subclass of the class of genetic algorithms, this algorithm and its variations represent a unique approach with many interesting results. The primary...
-
2009
Bhatnagar, Shalabh, Sutton, Richard, Ghavamzadeh, Mohammad, Lee, Mark
Technical report TR09-10. We present four new reinforcement learning algorithms based on actor-critic, function approximation, and natural gradient ideas, and we provide their convergence proofs. Actor-critic reinforcement learning methods are online approximations to policy iteration in which...