Abbasi-Yadkori, Yasin (1)
Afkanpour, Arash (1)
Ajallooeian, Mohammad Mahdi (1)
Aslan, Ozlem (1)
Balazs, Gabor (1)
Bartók, Gábor (1)
Machine learning (4)
Online Learning (3)
Learning theory (2)
Machine Learning (2)
Online learning (2)
Reinforcement Learning (2)
Results for "supervisors_tesim:"Szepesvari, Csaba (Computing Science)""

2013-06
In a discrete-time online control problem, a learner makes an effort to control the state of an initially unknown environment so as to minimize the sum of the losses he suffers, where the losses are assumed to depend on the individual state transitions. Various models of control problems have...

2012-11
In a partial-monitoring game, a player has to make decisions in a sequential manner. In each round, the player suffers some loss that depends on his decision and an outcome chosen by an opponent, after which he receives "some" information about the outcome. The goal of the player is to keep the...

Optimal Mechanisms for Machine Learning: A Game-Theoretic Approach to Designing Machine Learning Competitions
2013-06
In this thesis we consider problems where a self-interested entity, called the principal, has private access to some data that she wishes to use to solve a prediction problem by outsourcing the development of the predictor to some other parties. Assuming the principal, who needs the machine...

2015-06
Sampling from a given probability distribution is a key problem in many different disciplines. Markov chain Monte Carlo (MCMC) algorithms approach this problem by constructing a random walk governed by a specially constructed transition probability distribution. As the random walk progresses, the...

2012-11
In this thesis, the multi-armed bandit (MAB) problem in online learning is studied, when the feedback information is not observed immediately but rather after arbitrary, unknown, random delays. In the "stochastic" setting when the rewards come from a fixed distribution, an algorithm is given...

2011-11
This thesis studies the reinforcement learning and planning problems that are modeled by a discounted Markov Decision Process (MDP) with a large state space and finite action space. We follow the value-based approach in which a function approximator is used to estimate the optimal value function....

2016-06
This thesis explores theoretical, computational, and practical aspects of convex (shape-constrained) regression, providing new excess risk upper bounds, a comparison of convex regression techniques with theoretical guarantee, a novel heuristic training algorithm for max-affine representations,...

2017-11
On the one hand, theoretical analyses of machine learning algorithms are typically performed based on various probabilistic assumptions about the data. While these probabilistic assumptions are important in the analyses, it is debatable whether such assumptions actually hold in practice. Another...

2017-06
Optimizing an objective function over convex sets is a key problem in many different machine learning models. One of the various kinds of well-studied objective functions is the convex function, where any local minimum must be the global minimum over the domain. To find the optimal point that...

2017-06
Most machine learning problems can be posed as solving a mathematical program that describes the structure of the prediction problem, usually expressed in terms of carefully chosen losses and regularizers. However, many machine learning problems yield mathematical programs that are not convex in...