
Pure Exploration in Multi-Armed Bandits

  • Author / Creator
    Stephens, Connor J
  • Abstract
    Many practical problems in fields ranging from online advertising to genomics can be framed as the task of selecting the best option from among several choices based on a limited number of noisy evaluations of each choice's quality. Pure exploration in multi-armed bandits is an online-learning setting that captures the challenge of designing adaptive data-collection and terminal-selection procedures to optimize the quality of a final recommendation. In this thesis we provide a comprehensive overview of the sample complexities of several core pure exploration settings, together with algorithms that achieve them. We cover problems in both the fixed-confidence and fixed-budget frameworks and provide a number of novel contributions, including a new analysis of a well-known algorithm, Sequential Halving (sketched after this record). This result reveals new performance guarantees and contributes to a greater understanding of the sample complexity of making close-to-optimal selections in the fixed-budget setting, a highly practical problem setup that had remained largely overlooked in the literature until recently.

  • Subjects / Keywords
  • Graduation date
    Spring 2023
  • Type of Item
    Thesis
  • Degree
    Master of Science
  • DOI
    https://doi.org/10.7939/r3-bdxv-pp85
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
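
For readers unfamiliar with the algorithm named in the abstract, below is a minimal sketch of the standard Sequential Halving procedure (Karnin, Koren, and Somekh, 2013), the fixed-budget strategy the thesis re-analyzes. This is an illustrative sketch only, not the thesis's new analysis; the pull callback, arm count, and budget parameters are assumptions chosen for the example.

    import math
    import random

    def sequential_halving(pull, n_arms, budget):
        """Sequential Halving: a fixed-budget pure-exploration strategy.
        `pull(i)` returns one noisy reward sample for arm i;
        `budget` is the total number of pulls allowed."""
        survivors = list(range(n_arms))
        # One elimination round per halving of the arm set.
        n_rounds = max(1, math.ceil(math.log2(n_arms)))
        for _ in range(n_rounds):
            # Split the budget evenly across rounds, then across survivors.
            pulls_per_arm = budget // (len(survivors) * n_rounds)
            means = {
                i: sum(pull(i) for _ in range(pulls_per_arm)) / max(pulls_per_arm, 1)
                for i in survivors
            }
            # Keep the better half (rounded up) by empirical mean.
            survivors.sort(key=lambda i: means[i], reverse=True)
            survivors = survivors[: math.ceil(len(survivors) / 2)]
        return survivors[0]

    # Example: 8 Bernoulli arms with one clearly best arm (hypothetical data).
    if __name__ == "__main__":
        probs = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.9]
        best = sequential_halving(
            lambda i: 1.0 if random.random() < probs[i] else 0.0,
            n_arms=len(probs), budget=4000)
        print("recommended arm:", best, "(true best is 7)")

The design point the sketch illustrates is the budget split: each of the roughly log2(n) rounds receives an equal share of the total budget, so surviving arms are sampled increasingly often as weaker arms are eliminated.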