Download the full-sized PDF of Bandit Convex Optimization with Biased Noisy Gradient OraclesDownload the full-sized PDF



Permanent link (DOI):


Export to: EndNote  |  Zotero  |  Mendeley


This file is in the following communities:

Graduate Studies and Research, Faculty of


This file is in the following collections:

Theses and Dissertations

Bandit Convex Optimization with Biased Noisy Gradient Oracles Open Access


Other title
learning theory
machine learning
Type of item
Degree grantor
University of Alberta
Author or creator
Hu, Xiaowei
Supervisor and department
Gyorgy, Andras (Computing Science)
Szepesvari, Csaba (Computing Science)
Examining committee member and department
Friggstad, Zachary (Computing Science)
Department of Computing Science

Date accepted
Graduation date
2017-06:Spring 2017
Master of Science
Degree level
Optimizing an objective function over convex sets is a key problem in many different machine learning models. One of the various kinds of well studied objective functions is the convex function, where any local minimum must be the global mini- mum over the domain. To find the optimal point that minimize the objective convex function, a natural choice for the search direction is the negative gradient. The resulting algorithm, usually called the gradient descent method, is widely used to approach convex optimization problems. Given the entire objective function or the first-order information (gradient), it is straightforward to do the line search follow- ing the chosen direction. However, another scenario exists where the algorithm has no access to such entire function or the first-order information, except evaluation of some queried points. Then there comes the bandit optimization problem, which is also known as zeroth-order or derivative-free optimization. Algorithms for bandit convex optimization often rely on constructing noisy gradient estimates, which are then used in appropriately adjusted first-order algorithms, like gradient descent, replacing actual gradients. Depending on the properties of the function to be optimized and the nature of “noise” in the bandit feedback, the bias and variance of gradient estimates exhibit various tradeoffs. For example, the gradient estimate with a small bias tends to have a large variance, while the estimate with a small variance could have a large bias. Considering that both the bias and variance of the gradient estimate might jeopardize the optimization algorithm. It is worthwhile to measure their influences in a quantitative pattern, and study if the optimization error basing on a certain gradient estimate can be further improved. This thesis proposes a novel framework that replaces the specific gradient estimation methods with an abstract oracle. The oracle directly interacts with the algorithm, outputting biased, noisy gradient estimates satisfying some predefined properties. In this way, we abstract tradeoffs of the bias and variance, skipping the details of constructing gradient estimates. With the help of the new framework we unify previous works, reproducing their results in a clean and concise fashion, proving the upper bound of the optimization error with the Mirror Descent algorithm. Meanwhile, perhaps more importantly, the framework also allows us to show a lower bound of the optimization error, which can match the corresponding upper bound under certain conditions. This formally demonstrates that, to achieve the optimal root-n rate for the bandit convex optimization, either the algorithms that use existing gradient estimators, or the proof techniques used to analyze them, have to go beyond what exists today.
This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for the purpose of private, scholarly or scientific research. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
Citation for previous publication
Hu, Prashanth L.A., Gyorgy, and Szepesvari (2016).The Proceedings of the Nineteenth International Conference on Artificial Intelligence and Statistics (AISTATS), volume 51 of JMLR: W&CP, Cadiz, Spain, May 9–11. (Bandit) Convex Optimization with Biased Noisy Gradient Oracles

File Details

Date Uploaded
Date Modified
Audit Status
Audits have not yet been run on this file.
File format: pdf (Portable Document Format)
Mime type: application/pdf
File size: 1954212
Last modified: 2017:06:13 12:18:08-06:00
Filename: Hu_Xiaowei_201701_MSc.pdf
Original checksum: 8deea91bffec6b46f3d9ccddb72007bc
Well formed: false
Valid: false
Status message: Unexpected error in findFonts java.lang.ClassCastException: edu.harvard.hul.ois.jhove.module.pdf.PdfSimpleObject cannot be cast to edu.harvard.hul.ois.jhove.module.pdf.PdfDictionary offset=4956
Status message: Invalid name tree offset=1931866
Status message: Invalid name tree offset=1931866
Status message: Invalid name tree offset=1931866
Status message: Invalid name tree offset=1931866
Activity of users you follow
User Activity Date