Monte Carlo Sampling for Regret Minimization in Extensive Games

Bowling, Michael; Zinkevich, Martin; Waugh, Kevin; Lanctot, Marc

doi:10.7939/R3319S48Q

Communities and Collections

Computing Science, Department of / Technical Reports (Computing Science)

Description
Technical report TR09-15 (TRID-ID TR09-15). Sequential decision-making with multiple agents and imperfect information is commonly modeled as an extensive game. One efficient method for computing Nash equilibria in large, zero-sum, imperfect-information games is counterfactual regret minimization (CFR). In the domain of poker, CFR has proven effective, particularly when using a domain-specific augmentation involving chance outcome sampling. In this paper, we describe a general family of domain-independent, sample-based CFR algorithms called Monte Carlo counterfactual regret minimization (MCCFR), of which the original and poker-specific versions are special cases. We start by showing that MCCFR performs the same regret updates as CFR in expectation. Then, we introduce two sampling schemes, outcome sampling and external sampling, and show that both have bounded overall regret with high probability. Thus, they can compute an approximate equilibrium using self-play. Finally, we prove a new, tighter bound on the regret for the original CFR algorithm and relate this new bound to MCCFR's bounds. We show empirically that, although the sample-based algorithms require more iterations, their lower cost per iteration can lead to dramatically faster convergence in various games.
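The claim that sampled updates match full-traversal updates in expectation can be illustrated with a toy importance-weighting sketch. This is not the report's algorithm: it has no information sets or strategies, and the names `exact_value`, `sampled_value`, `expected_sampled_value`, and `monte_carlo_average`, along with all numbers, are hypothetical. It only assumes that a quantity summed over terminal outcomes is estimated by sampling one outcome `z` with probability `q[z]` and dividing its contribution by `q[z]`.

```python
import random


def exact_value(utilities, reach):
    """Full traversal: sum reach-weighted utility over every terminal outcome."""
    return sum(r * u for r, u in zip(reach, utilities))


def sampled_value(utilities, reach, q, rng):
    """Sample a single outcome z with probability q[z] and reweight by 1/q[z].

    Assumes q[z] > 0 for every outcome, so no contribution is ever missed.
    """
    z = rng.choices(range(len(q)), weights=q, k=1)[0]
    return reach[z] * utilities[z] / q[z]


def expected_sampled_value(utilities, reach, q):
    """Expectation of sampled_value, computed analytically over all outcomes."""
    return sum(q[z] * (reach[z] * utilities[z] / q[z]) for z in range(len(q)))


def monte_carlo_average(utilities, reach, q, n, seed=0):
    """Average n independent sampled estimates (seeded for repeatability)."""
    rng = random.Random(seed)
    return sum(sampled_value(utilities, reach, q, rng) for _ in range(n)) / n


if __name__ == "__main__":
    # Hypothetical toy numbers: three terminal outcomes.
    utilities = [1.0, -1.0, 2.0]
    reach = [0.2, 0.5, 0.3]
    q = [0.5, 0.25, 0.25]  # sampling distribution, deliberately different from reach
    print("exact:            ", exact_value(utilities, reach))
    print("expected sampled: ", expected_sampled_value(utilities, reach, q))
    print("average of 20000: ", monte_carlo_average(utilities, reach, q, 20000))
```

Each sampled estimate touches one outcome instead of all of them, which mirrors the trade-off the abstract describes: individual samples are noisy, so more iterations are needed, but each iteration is cheaper.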
Date created

2009
Type of Item

Report
DOI

https://doi.org/10.7939/R3319S48Q
License

Attribution 3.0 International

Language
- English