Download the full-sized PDF of Regret Minimization in Games and the Development of Champion Multiplayer Computer Poker-Playing AgentsDownload the full-sized PDF



Permanent link (DOI):


Export to: EndNote  |  Zotero  |  Mendeley


This file is in the following communities:

Graduate Studies and Research, Faculty of


This file is in the following collections:

Theses and Dissertations

Regret Minimization in Games and the Development of Champion Multiplayer Computer Poker-Playing Agents Open Access


Other title
regret minimization
multiagent systems
artificial intelligence
game theory
Type of item
Degree grantor
University of Alberta
Author or creator
Gibson, Richard G
Supervisor and department
Szafron, Duane (Computing Science)
Examining committee member and department
Larson, Kate (Cheriton School of Computer Science, University of Waterloo)
Holte, Robert (Computing Science)
Mueller, Martin (Computing Science)
Bowling, Michael (Computing Science)
Szafron, Duane (Computing Science)
Department of Computing Science

Date accepted
Graduation date
Doctor of Philosophy
Degree level
Recently, poker has emerged as a popular domain for investigating decision problems under conditions of uncertainty. Unlike traditional games such as checkers and chess, poker exhibits imperfect information, varying utilities, and stochastic events. Because of these complications, decisions at the poker table are more analogous to the decisions faced by humans in everyday life. In this dissertation, we investigate regret minimization in extensive-form games and apply our work in developing champion computer poker agents. Counterfactual Regret Minimization (CFR) is the current state-of-the-art approach to computing capable strategy profiles for large extensive-form games. Our primary focus is to advance our understanding and application of CFR in domains with more than two players. We present four major contributions. First, we provide the first set of theoretical guarantees for CFR when applied to games that are not two-player zero-sum. We prove that in such domains, CFR eliminates strictly dominated plays. In addition, we provide a modification of CFR that is both more efficient and can lead to stronger strategies than were previously possible. Second, we provide new regret bounds for CFR, present three new CFR sampling variants, and demonstrate their efficiency in several different domains. Third, we prove the first set of sufficient conditions that guarantee CFR will minimize regret in games with imperfect recall. Fourth, we generalize three previous game tree decomposition methods, present a new decomposition method, and demonstrate their improvement empirically over standard techniques. Finally, we apply the work in this thesis to construct three-player Texas hold'em agents and enter them into the Annual Computer Poker Competition. Our agents won six out of the seven three-player events that we entered from the 2010, 2011, 2012, and 2013 computer poker competitions.
Permission is hereby granted to the University of Alberta Libraries to reproduce single copies of this thesis and to lend or sell such copies for private, scholarly or scientific research purposes only. Where the thesis is converted to, or otherwise made available in digital form, the University of Alberta will advise potential users of the thesis of these terms. The author reserves all other publication and other rights in association with the copyright in the thesis and, except as herein before provided, neither the thesis nor any substantial portion thereof may be printed or otherwise reproduced in any material form whatsoever without the author's prior written permission.
Citation for previous publication
Richard Gibson and Duane Szafron. Regret minimization in multiplayer extensive games. In Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence (IJCAI), pages 2802–2803, 2011.Richard G. Gibson and Duane Szafron. On strategy stitching in large extensive form multiplayer games. In Advances in Neural Information Processing Systems (NIPS) 24, pages 100–108, 2011.Richard Gibson, Marc Lanctot, Neil Burch, Duane Szafron, and Michael Bowling. Generalized sampling and variance in counterfactual regret minimization. In Proceedings of the Twenty-Sixth Conference on Artificial Intelligence (AAAI), pages 1355–1361, 2012.Marc Lanctot, Richard Gibson, Neil Burch, and Michael Bowling. No-regret learning in extensive-form games with imperfect recall. In Proceedings of the Twenty-Ninth International Conference on Machine Learning (ICML), pages 65–72, 2012.Richard Gibson, Neil Burch, Marc Lanctot, and Duane Szafron. Efficient Monte Carlo counterfactual regret minimization in games with many player actions. In Advances in Neural Information Processing Systems 25 (NIPS), pages 1889–1897, 2012.

File Details

Date Uploaded
Date Modified
Audit Status
Audits have not yet been run on this file.
File format: pdf (Portable Document Format)
Mime type: application/pdf
File size: 1526296
Last modified: 2015:10:12 14:24:06-06:00
Filename: Gibson_Richard_Spring 2014.pdf
Original checksum: 5d7b802e6f92908befd243f0fec2fe02
Well formed: false
Valid: false
Status message: Lexical error offset=1501115
Page count: 159
Activity of users you follow
User Activity Date