Usage
  • 40 views
  • 23 downloads

Time and Space: Why Imperfect Information Games are Hard

  • Author / Creator
    Burch, Neil
  • Decision-making problems with two agents can be modeled as two player games, and a Nash equilibrium is the basic solution concept describing good play in adversarial games. Computing this equilibrium solution for imperfect information games, where players have private, hidden information, is harder than solving perfect information games. Both domains may technically be classified as easy, with algorithms that require polynomial time and space, but the difference in degree means that for extremely large games, both computation time and storage requirements may exceed available resources. In this thesis, I present four main contributions towards fast and memory efficient techniques for solving extensive-form games, which describe a class of imperfect information games with sequential decision making. First, the thesis introduces an analysis of counterfactual regret minimisation (CFR), an algorithm for solving extensive-form games, and presents tighter regret bounds that describe the rate of progress. CFR is a popular, state-of-the-art algorithm, and the improved bounds give some indication as to why the algorithm has good practical performance. Second, I describe and analyse a game-solving algorithm, CFR+ , which has faster empirical performance than CFR, and compare it to a number of theoretically faster algorithms. We wrote and released an efficient, distributed implementation of CFR+ to solve heads-up limit Texas hold'em, making it the first competitively played imperfect information game to be solved. Third, the thesis presents a series of theoretical tools for using decomposition, and creating algorithms which operate on small portions of a game at a time. I describe an algorithm, CFR-D, which can solve games without ever storing a complete strategy, and an algorithm for re-solving a portion of a game to complete an incomplete equilibrium within the full game. Both algorithms have a formal guarantee of correctness. Finally, I describe continual resolving, an algorithm which is an imperfect information analogue to heuristic search in perfect information games. Continual re-solving uses decomposition to play games by doing depth-limited solving with a heuristic evaluation, with a theoretical bound on solution quality. We implemented continual re-solving in the game of no-limit heads-up no-limit Texas hold'em, and beat a group of 33 professional players in a human trial.

  • Subjects / Keywords
  • Graduation date
    2018-06
  • Type of Item
    Thesis
  • Degree
    Doctor of Philosophy
  • DOI
    https://doi.org/10.7939/R36W96Q7C
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
  • Language
    English
  • Institution
    University of Alberta
  • Degree level
    Doctoral
  • Department
  • Supervisor / co-supervisor and their department(s)
  • Examining committee members and their departments
    • Schapire, Robert (Computer Science, Princeton)
    • Friggstad, Zac (Computing Science)
    • Szafron, Duane (Computing Science)
    • Bowling, Michael (Computing Science)
    • Holte, Robert (Computing Science)