Computing Robust Counter-Strategies

2007

Johanson, Michael, Bowling, Michael, Zinkevich, Martin

Technical report TR07-15. Adaptation to other initially unknown agents often requires computing an effective counter-strategy. In the Bayesian paradigm, one must find a good counter-strategy to the inferred posterior of the other agents' behavior. In the experts paradigm, one may want to choose...
Convergence and No-Regret in Multiagent Learning

2004

Bowling, Michael

Technical report TR04-11. Learning in a multiagent system is a challenging problem due to two key factors. First, if other agents are simultaneously learning then the environment is no longer stationary, thus undermining convergence guarantees. Second, learning is often susceptible to...
Dual Representations for Dynamic Programming

2008

Lizotte, Daniel, Wang, Tao, Bowling, Michael, Schuurmans, Dale

Technical report TR08-16. We propose a dual approach to dynamic programming and reinforcement learning based on maintaining an explicit representation of visit distributions as opposed to value functions. An advantage of working in the dual is that it allows one to exploit techniques for...
Dual Representations for Dynamic Programming

2007

Wang, Tao, Bowling, Michael, Lizotte, Daniel, Schuurmans, Dale

Technical report TR07-10. We propose to use a new dual approach to dynamic programming. The idea is to maintain an explicit representation of stationary distributions as opposed to value functions. A significant advantage of the dual approach is that it allows one to exploit well developed...
Dual Representations for Dynamic Programming and Reinforcement Learning

2006

Wang, Tao, Schuurmans, Dale, Bowling, Michael

Technical report TR06-26. We investigate the dual approach to dynamic programming and reinforcement learning, based on maintaining an explicit representation of stationary distributions as opposed to value functions. A significant advantage of the dual approach is that it allows one to exploit...
Generalized Sampling and Variance in Counterfactual Regret Minimization

2012

Lanctot, Marc, Gibson, Richard, Burch, Neil, Szafron, Duane

In large extensive form games with imperfect information, Counterfactual Regret Minimization (CFR) is a popular, iterative algorithm for computing approximate Nash equilibria. While the base algorithm performs a full tree traversal on each iteration, Monte Carlo CFR (MCCFR) reduces the per...
Monte Carlo Sampling and Regret Minimization for Equilibrium Computation and Decision-Making in Large Extensive Form Games

Spring 2013

Lanctot, Marc

In this thesis, we investigate the problem of decision-making in large two-player zero-sum games using Monte Carlo sampling and regret minimization methods. We demonstrate four major contributions. The first is Monte Carlo Counterfactual Regret Minimization (MCCFR): a generic family of...
Monte Carlo Sampling for Regret Minimization in Extensive Games

2009

Bowling, Michael, Zinkevich, Martin, Waugh, Kevin, Lanctot, Marc

Technical report TR09-15. Sequential decision-making with multiple agents and imperfect information is commonly modeled as an extensive game. One efficient method for computing Nash equilibria in large, zero-sum, imperfect information games is counterfactual regret minimization (CFR). In the...
On Local Regret

2012

Bowling, Michael, Zinkevich, Martin

Online learning aims to perform nearly as well as the best hypothesis in hindsight. For some hypothesis classes, though, even finding the best hypothesis offline is challenging. In such offline cases, local search techniques are often employed and only local optimality guaranteed. For online...
Regret Minimization in Games with Incomplete Information

2007

Bowling, Michael, Johanson, Michael, Zinkevich, Martin, Piccione, Carmelo

Technical report TR07-14. Extensive games are a powerful model of multiagent decision-making scenarios with incomplete information. Finding a Nash equilibrium for very large instances of these games has received a great deal of recent attention. In this paper, we describe a new technique for...