Results for "supervisors_tesim:"Szepesvari, Csaba (Computing Science)""
-
Fall 2023
A matroid bandit is the online version of combinatorial optimization on a matroid, in which the learner chooses $K$ actions from a set of $L$ actions that can form a matroid basis. Many real-world applications such as recommendation systems can be modeled as matroid bandits. In such learning...
-
Fall 2023
Many real-world tasks in fields such as robotics and control can be formulated as constrained Markov decision processes (CMDPs). In CMDPs, the objective is usually to optimize the return while ensuring that some constraints are satisfied at the same time. The primal-dual approach is a common...
-
Spring 2021
In batch policy evaluation the goal is to predict the value of a policy given some historical data. A specific example, which motivated the approach pursued in this thesis, is to predict the probability of putting a natural wildfire out given some specific configuration of dispatched resources,...
-
Fall 2021
This thesis proposes novel algorithmic ideas in reinforcement learning for regret minimization. These algorithmic ideas enjoy nice theoretical guarantees and are more practical in large problems than their alternatives. We focus on finite-horizon episodic RL. We propose model-based and model-free...
-
Fall 2019
In this thesis, we investigate different vector step-size adaptation approaches for continual, online prediction problems. Vanilla stochastic gradient descent can be considerably improved by scaling the update with a vector of appropriately chosen step-sizes. Many methods, including AdaGrad,...
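As a rough illustration of what "scaling the update with a vector of step-sizes" can mean, the sketch below applies the standard AdaGrad diagonal update to an online linear prediction problem. It is not the thesis's method, since the abstract is truncated here. The function names, the synthetic data stream, the `base_step` value, and the `eps` constant are all assumptions made for the example.

```python
import math
import random


def adagrad_online_regression(stream, dim, base_step=0.5, eps=1e-8):
    """Online linear prediction with one AdaGrad-style step size per coordinate."""
    w = [0.0] * dim
    sq_grad_sum = [0.0] * dim  # running sum of squared gradients, per coordinate
    losses = []
    for x, y in stream:
        pred = sum(wi * xi for wi, xi in zip(w, x))
        err = pred - y
        losses.append(0.5 * err * err)  # squared loss, recorded before the update
        for i in range(dim):
            g = err * x[i]  # gradient of the squared loss w.r.t. w[i]
            sq_grad_sum[i] += g * g
            # Coordinates with a large gradient history take smaller steps.
            w[i] -= base_step / (math.sqrt(sq_grad_sum[i]) + eps) * g
    return w, losses


def make_stream(n, seed=0):
    """Hypothetical noiseless stream whose two features differ in scale by 100x."""
    rng = random.Random(seed)
    true_w = [2.0, -1.0]
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0), 100.0 * rng.uniform(-1.0, 1.0)]
        yield x, sum(a * b for a, b in zip(true_w, x))


w, losses = adagrad_online_regression(make_stream(2000), dim=2)
```

Because the two features differ in scale, a single scalar step size would have to be small enough for the large feature and would then be slow on the small one; the per-coordinate vector avoids that trade-off in this toy setting.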
-
Fall 2019
We study three problems in the application, design, and analysis of online optimization algorithms for machine learning. First, we consider speeding up the common task of k-fold cross-validation of online algorithms, and provide TreeCV, an algorithm that reduces the time penalty of k-fold...
-
Spring 2017
Most machine learning problems can be posed as solving a mathematical program that describes the structure of the prediction problem, usually expressed in terms of carefully chosen losses and regularizers. However, many machine learning problems yield mathematical programs that are not convex in...
-
Spring 2017
Optimizing an objective function over convex sets is a key problem in many different machine learning models. One of the various kinds of well studied objective functions is the convex function, where any local minimum must be the global minimum over the domain. To find the optimal point that...
-
Fall 2017
On the one hand, theoretical analyses of machine learning algorithms are typically performed based on various probabilistic assumptions about the data. While these probabilistic assumptions are important in the analyses, it is debatable whether such assumptions actually hold in practice. Another...