Results for supervisor "Szepesvari, Csaba (Computing Science)"
In a discrete-time online control problem, a learner makes an effort to control the state of an initially unknown environment so as to minimize the sum of the losses he suffers, where the losses are assumed to depend on the individual state-transitions. Various models of control problems have...
In a partial-monitoring game a player has to make decisions in a sequential manner. In each round, the player suffers some loss that depends on his decision and an outcome chosen by an opponent, after which he receives "some" information about the outcome. The goal of the player is to keep the...
Optimal Mechanisms for Machine Learning: A Game-Theoretic Approach to Designing Machine Learning Competitions
In this thesis we consider problems where a self-interested entity, called the principal, has private access to some data that she wishes to use to solve a prediction problem by outsourcing the development of the predictor to some other parties. Assuming the principal, who needs the machine...
Sampling from a given probability distribution is a key problem in many different disciplines. Markov chain Monte Carlo (MCMC) algorithms approach this problem by constructing a random walk governed by a specially constructed transition probability distribution. As the random walk progresses, the...
In this thesis, the multi-armed bandit (MAB) problem in online learning is studied, when the feedback information is not observed immediately but rather after arbitrary, unknown, random delays. In the "stochastic" setting when the rewards come from a fixed distribution, an algorithm is given...
This thesis studies the reinforcement learning and planning problems that are modeled by a discounted Markov Decision Process (MDP) with a large state space and finite action space. We follow the value-based approach in which a function approximator is used to estimate the optimal value function....
This thesis explores theoretical, computational, and practical aspects of convex (shape-constrained) regression, providing new excess risk upper bounds, a comparison of convex regression techniques with theoretical guarantees, a novel heuristic training algorithm for max-affine representations,...
On the one hand, theoretical analyses of machine learning algorithms are typically performed based on various probabilistic assumptions about the data. While these probabilistic assumptions are important in the analyses, it is debatable whether such assumptions actually hold in practice. Another...
Optimizing an objective function over convex sets is a key problem in many different machine learning models. One of the various kinds of well-studied objective functions is the convex function, where any local minimum must be the global minimum over the domain. To find the optimal point that...
Most machine learning problems can be posed as solving a mathematical program that describes the structure of the prediction problem, usually expressed in terms of carefully chosen losses and regularizers. However, many machine learning problems yield mathematical programs that are not convex in...