ERA

Download the full-sized PDF of Multi-Armed Bandit Problems under Delayed FeedbackDownload the full-sized PDF

Analytics

Share

Permanent link (DOI): https://doi.org/10.7939/R3BH85

Download

Export to: EndNote  |  Zotero  |  Mendeley

Communities

This file is in the following communities:

Graduate Studies and Research, Faculty of

Collections

This file is in the following collections:

Theses and Dissertations

Multi-Armed Bandit Problems under Delayed Feedback Open Access

Descriptions

Other title
Subject/Keyword
Online Learning
Multi-Armed Bandit
Delayed Feedback
Type of item
Thesis
Degree grantor
University of Alberta
Author or creator
Joulani, Pooria
Supervisor and department
Szepesvari, Csaba (Computing Science)
Examining committee member and department
Mueller, Martin (Computing Science)
Khabbazian, Majid (Electrical and Computer Engineering)
Department
Department of Computing Science
Specialization
Statistical Machine Learning
Date accepted
2012-09-26T13:14:35Z
Graduation date
2012-09
Degree
Master of Science
Degree level
Master's
Abstract
In this thesis, the multi-armed bandit (MAB) problem in online learning is studied, when the feedback information is not observed immediately but rather after arbitrary, unknown, random delays. In the ``stochastic" setting when the rewards come from a fixed distribution, an algorithm is given that uses a non-delayed MAB algorithm as a black-box. We also give a method to generalize the theoretical guarantees of non-delayed UCB-type algorithms to the delayed stochastic setting. Assuming the delays are independent of the rewards, we upper bound the penalty in the performance of these algorithms (measured by ``regret'') by an additive term depending on the delays. When the rewards are chosen in an adversarial manner, we give a black-box style algorithm using multiple instances of a non-delayed adversarial MAB algorithm. Assuming the delays depend only on time, we upper bound the performance penalty of the algorithm by a multiplicative factor depending on the delays.
Language
English
DOI
doi:10.7939/R3BH85
Rights
Permission is hereby granted to the University of Alberta Libraries to reproduce single copies of this thesis and to lend or sell such copies for private, scholarly or scientific research purposes only. Where the thesis is converted to, or otherwise made available in digital form, the University of Alberta will advise potential users of the thesis of these terms. The author reserves all other publication and other rights in association with the copyright in the thesis and, except as herein before provided, neither the thesis nor any substantial portion thereof may be printed or otherwise reproduced in any material form whatsoever without the author's prior written permission.
Citation for previous publication

File Details

Date Uploaded
Date Modified
2014-04-24T23:24:49.034+00:00
Audit Status
Audits have not yet been run on this file.
Characterization
File format: pdf (Portable Document Format)
Mime type: application/pdf
File size: 882386
Last modified: 2015:10:12 14:34:05-06:00
Filename: Joulani_Pooria_Fall 2012.pdf
Original checksum: 8b8bc8e1dc5558558323680b81264f57
Well formed: false
Valid: false
Status message: No document catalog dictionary offset=0
Activity of users you follow
User Activity Date