ERA

Download the full-sized PDF of Generalized Sampling and Variance in Counterfactual Regret MinimizationDownload the full-sized PDF

Analytics

Share

Permanent link (DOI): https://doi.org/10.7939/R3M61BP9B

Download

Export to: EndNote  |  Zotero  |  Mendeley

Communities

This file is in the following communities:

Computing Science, Department of

Collections

This file is in the following collections:

Technical Reports (Computing Science)

Generalized Sampling and Variance in Counterfactual Regret Minimization Open Access

Descriptions

Author or creator
Gibson, Richard
Lanctot, Marc
Burch, Neil
Szafron, Duane
Additional contributors
Subject/Keyword
Game Theory
Sequential Decision Making
Artificial Intelligence
Computer Games
Type of item
Computing Science Technical Report
Computing science technical report ID
TR12-02
Language
English
Place
Time
Description
In large extensive form games with imperfect information, Counterfactual Regret Minimization (CFR) is a popular, iterative algorithm for computing approximate Nash equilibria. While the base algorithm performs a full tree traversal on each iteration, Monte Carlo CFR (MCCFR) reduces the per iteration time cost by traversing just a sampled portion of the tree. On the other hand, MCCFR's sampled values introduce variance, and the effects of this variance were previously unknown. In this paper, we generalize MCCFR by considering any generic estimator of the sought values. We show that any choice of an estimator can be used to probabilistically minimize regret, provided the estimator is bounded and unbiased. In addition, we relate the variance of the estimator to the convergence rate of an algorithm that calculates regret directly from the estimator. We demonstrate the application of our analysis by defining a new bounded, unbiased estimator with empirically lower variance than MCCFR estimates. Finally, we use this estimator in a new sampling algorithm to compute approximate equilibria in Goofspiel, Bluff, and Texas hold'em poker. Under each of our selected sampling schemes, our new algorithm converges faster than MCCFR.
Date created
2012
DOI
doi:10.7939/R3M61BP9B
License information
Creative Commons Attribution 3.0 Unported
Rights

Citation for previous publication

Source
Link to related item

File Details

Date Uploaded
Date Modified
2014-04-28T20:57:40.151+00:00
Audit Status
Audits have not yet been run on this file.
Characterization
File format: pdf (PDF/A)
Mime type: application/pdf
File size: 293425
Last modified: 2015:10:12 13:20:11-06:00
Filename: TR12-02.pdf
Original checksum: 5a00ff83ee5a245c42a461aa6bd5877b
Well formed: true
Valid: true
File title: CMSY10
Page count: 13
Activity of users you follow
User Activity Date