Dark Hex: A Large Scale Imperfect Information Game

Tapkan, Mustafa B

doi:doi:10.7939/r3-ch0a-z894

This decommissioned ERA site remains active temporarily to support our final migration steps to https://ualberta.scholaris.ca, ERA's new home. All new collections and items, including Spring 2025 theses, are at that site. For assistance, please contact erahelp@ualberta.ca.

View

Download

Communities and Collections

Graduate and Postdoctoral Studies (GPS), Faculty of / Theses and Dissertations

Usage

257 views
280 downloads

Dark Hex: A Large Scale Imperfect Information Game

Author / Creator

Tapkan, Mustafa B
Imperfect information games model many large-scale real-world problems. Hex is the classic two-player zero-sum no-draw connection game where each player wants to join their two sides. Dark Hex is an imperfect information version of Hex in which each player sees only their own moves. Finding Nash equilibrium for Dark Hex problems is computationally challenging, as the number of possible strategies for each player is large: for example, for the 4$\times$3 board, the number of pure strategies is around $10^{113}$. In this work, we use a variety of methods to find improved strategies for 4$\times$3 Dark Hex. To start, we discuss different versions of Dark Hex. After that we introduce a strategy for both players that improve the current results of the $\epsilon$-Nash equilibrium of Dark Hex on the 4$\times$3 board; previously it was known that the first player can win with a probability of $0.112$ and the second player with $0.732$ giving $\epsilon=0.156$. We first improve these bounds with new handcrafted strategies to $0.1428$ and $0.75$ respectively in the abstract game, $\epsilon=0.107$. We then further improve the strategies using a player, trained by Monte Carlo Counterfactual Regret Minimization using probability one pruning and \textit{imperfect recall} abstraction. We introduce a new simplification approach where we limit the actions based on their probabilities, the new strategies
introduced using simplification over the MCCFR strategies give us an improved $\epsilon=0.01$ in the abstract game.
We then introduce probability smoothing with extra parameters. The final results give use $0.791$ and $0.205$ for players respectively, giving us $\epsilon=0.002$, in the abstract game. We discuss the differences between handcrafted and computer-generated strategies.
Lastly, we compare players who were trained using Neural Fictitious Self-Play and Monte Carlo Counterfactual Regret Minimization in different abstraction and pruning settings, we fashion a round-robin tournament and report the results; comparing the strength of all the trained players.
Subjects / Keywords
Graduation date

Fall 2022
Type of Item

Thesis
Degree

Master of Science
DOI

https://doi.org/10.7939/r3-ch0a-z894
License

This thesis is made available by the University of Alberta Library with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.

Language

English
Institution

University of Alberta
Degree level

Master's
Department
- Department of Computing Science
Supervisor / co-supervisor and their department(s)
- Mueller, Martin (Computing Science)
- Hayward, Ryan (Computing Science)