ERA

Download the full-sized PDF of Monte Carlo Sampling for Regret Minimization in Extensive GamesDownload the full-sized PDF

Analytics

Share

Permanent link (DOI): https://doi.org/10.7939/R3319S48Q

Download

Export to: EndNote  |  Zotero  |  Mendeley

Communities

This file is in the following communities:

Computing Science, Department of

Collections

This file is in the following collections:

Technical Reports (Computing Science)

Monte Carlo Sampling for Regret Minimization in Extensive Games Open Access

Descriptions

Author or creator
Lanctot, Marc
Waugh, Kevin
Zinkevich, Martin
Bowling, Michael
Additional contributors
Subject/Keyword
Computer Games
Extensive games
Game theory
Regret minimalization
Sampling
Type of item
Computing Science Technical Report
Computing science technical report ID
TR09-15
Language
English
Place
Time
Description
Technical report TR09-15. Sequential decision-making with multiple agents and imperfect information is commonly modeled as an extensive game. One efficient method for computing Nash equilibria in large, zero-sum, imperfect information games is counterfactual regret minimization (CFR). In the domain of poker, CFR has proven effective, particularly when using a domain-specific augmentation involving chance outcome sampling. In this paper, we describe a general family of domain-independent CFR sample-based algorithms called Monte Carlo counterfactual regret minimization (MCCFR) of which the original and poker-specific versions are special cases. We start by showing that MCCFR performs the same regret updates as CFR on expectation. Then, we introduce two sampling schemes: outcome sampling and external sampling, showing that both have bounded overall regret with high probability. Thus, they can compute an approximate equilibrium using self-play. Finally, we prove a new tighter bound on the regret for the original CFR algorithm and relate this new bound to MCCFR's bounds. We show empirically that, although the sample-based algorithms require more iterations, their lower cost per iteration can lead to dramatically faster convergence in various games.
Date created
2009
DOI
doi:10.7939/R3319S48Q
License information
Creative Commons Attribution 3.0 Unported
Rights

Citation for previous publication

Source
Link to related item

File Details

Date Uploaded
Date Modified
2014-04-25T00:38:32.983+00:00
Audit Status
Audits have not yet been run on this file.
Characterization
File format: pdf (Portable Document Format)
Mime type: application/pdf
File size: 302801
Last modified: 2015:10:12 16:46:16-06:00
Filename: TR09-15.pdf
Original checksum: 904d8691578c49c045ef93a03e6e292d
Well formed: false
Valid: false
Status message: Lexical error offset=298165
Page count: 22
Activity of users you follow
User Activity Date