 81 views
 469 downloads
Optimal Mechanisms for Machine Learning: A GameTheoretic Approach to Designing Machine Learning Competitions

 Author / Creator
 Ajallooeian, Mohammad Mahdi

In this thesis we consider problems where a selfinterested entity, called the principal, has private access to some data that she wishes to use to solve a prediction problem by outsourcing the development of the predictor to some other parties. Assuming the principal, who needs the machine learning solution, and the potential providers of the solution are two independent, selfinterested agents, which is the case for many realworld situations, this then becomes a gametheoretic problem. We propose mathematical models for variants of this problem by borrowing techniques from the literature of mechanism design and provide principled solutions. We consider experimental design when there are multiple selfinterested agents involved in developing a solution for a machine learning problem. A first case is when there is a public competition, each agent offers a single solution and solutions are available offtheshelf to the agents: there is no development cost included. The problem considered is to find a set of payment rules that guarantees to maximize the profit of the principal on expectation even when the developers are selfinterested. The solution depends on the distribution of the skilllevel of developers available, which is assumed to be known. To deal with our problem, the standard mechanism design techniques are revisited and extended in a number of ways. In particular, a general approach is given that allows the design of payment rules (more generally, mechanisms) when such payment rules must depend on some quantity that becomes known only after the mechanism is executed. This extension plays a key role in our solution to the machine learning paymentrule design problem, where data must be kept private (otherwise the developers could submit “overfitting” predictors), yet the principal’s profit (and thus the payment) should depend on the performance of the predictor chosen on theprivate data. Then, we address other interesting variants of the problem and provide solutions : when a single developer can submit multiple solutions, and when the solution is to be developed in multiple stages, or when the development cost is nonzero.

 Subjects / Keywords

 Graduation date
 201306

 Type of Item
 Thesis

 Degree
 Master of Science

 License
 This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for noncommercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.