-
Fall 2012
In this thesis, the multi-armed bandit (MAB) problem in online learning is studied, when the feedback information is not observed immediately but rather after arbitrary, unknown, random delays. In the "stochastic" setting, when the rewards come from a fixed distribution, an algorithm is given that...
-
Spring 2013
In a discrete-time online control problem, a learner makes an effort to control the state of an initially unknown environment so as to minimize the sum of the losses he suffers, where the losses are assumed to depend on the individual state-transitions. Various models of control problems have...
-
Spring 2013
This work introduces the “online probing” problem: In each round, the learner is able to purchase the values of a subset of features for the current instance. After the learner uses this information to produce a prediction for this instance, it then has the option of paying to see the full...
-
Fall 2019
We study three problems in the application, design, and analysis of online optimization algorithms for machine learning. First, we consider speeding up the common task of k-fold cross-validation of online algorithms, and provide TreeCV, an algorithm that reduces the time penalty of k-fold...