Improving Local Search for Resource-Constrained Planning

2010

Mueller, Martin; Hoffman, Joerg; Nakhost, Hootan

Technical report TR10-02. A ubiquitous feature of planning problems -- problems involving the automatic generation of action sequences for attaining a given goal -- is the need to economize limited resources such as fuel or money. While heuristic search, mostly based on standard algorithms such...
Man Versus Machine: The Silicon Graphics World Checkers Championship

1992

Schaeffer, Jonathan

Technical report TR92-19. In August 1992, the first man versus machine world championship took place. The champion, Dr. Marion Tinsley, is arguably the greatest checkers player that ever lived. The challenger was the computer checkers program Chinook, a 3-year team effort from the University of...
Measuring the Size of Large No-Limit Poker Games

2013

Johanson, Michael

In the field of computational game theory, games are often compared in terms of their size. This can be measured in several ways, including the number of unique game states, the number of decision points, and the total number of legal actions over all decision points. These numbers are either...
Measuring the Size of Large No-Limit Poker Games

2013-02-26

Johanson, Michael

In the field of computational game theory, games are often compared in terms of their size. This can be measured in several ways, including the number of unique game states, the number of decision points, and the total number of legal actions over all decision points. These numbers are either...
Natural Actor-Critic Algorithms

2009

Bhatnagar, Shalabh; Sutton, Richard; Ghavamzadeh, Mohammad; Lee, Mark

Technical report TR09-10. We present four new reinforcement learning algorithms based on actor-critic, function approximation, and natural gradient ideas, and we provide their convergence proofs. Actor-critic reinforcement learning methods are online approximations to policy iteration in which...
On Local Regret

2012

Bowling, Michael; Zinkevich, Martin

Online learning aims to perform nearly as well as the best hypothesis in hindsight. For some hypothesis classes, though, even finding the best hypothesis offline is challenging. In such offline cases, local search techniques are often employed and only local optimality guaranteed. For online...
Proceedings of Quantum Computing Summer School

2002

Fortin, David; Antoniu, Angela; Sardarli, Arzu; Rezania, Vahid; Levner, Ilya; Bulitko, Vadim

Technical report TR02-14. The 2002 Quantum Computing Summer School (QCSS'02) at the University of Alberta was organized as a learning and discussion forum for researchers in Artificial Intelligence, Computer Science, Physics, Mathematics, and Engineering. The short-term objective was to introduce...
Reinforcement Learning Algorithms for MDPs

2009

Szepesvari, Csaba

Technical report TR09-13. This article presents a survey of reinforcement learning algorithms for Markov Decision Processes (MDP). In the first half of the article, the problem of value estimation is considered. Here we start by describing the idea of bootstrapping and temporal difference...
Structure Learning of Causal Bayesian Networks: A Survey

2011

Mahmood, Ashique

Technical report TR11-01. Causality is a fundamental concept in reasoning. The effectiveness of many reasoning tasks depends on the understanding of the underlying cause-effect relationships. Therefore, the notion of causality has been explored in a wide range of disciplines. Causal discovery,...