Optimal Mechanisms for Machine Learning: A Game-Theoretic Approach to Designing Machine Learning Competitions

  • Author / Creator
    Ajallooeian, Mohammad Mahdi
  • In this thesis we consider problems where a self-interested entity, called the principal, has private access to some data that she wishes to use to solve a prediction problem by outsourcing the development of the predictor to some other parties. Assuming the principal, who needs the machine learning solution, and the potential providers of the solution are two independent, self-interested agents, which is the case for many real-world situations, this then becomes a game-theoretic problem. We propose mathematical models for variants of this problem by borrowing techniques from the literature of mechanism design and provide principled solutions. We consider experimental design when there are multiple self-interested agents involved in developing a solution for a machine learning problem. A first case is when there is a public competition, each agent offers a single solution and solutions are available off-the-shelf to the agents: there is no development cost included. The problem considered is to find a set of payment rules that guarantees to maximize the profit of the principal on expectation even when the developers are self-interested. The solution depends on the distribution of the skill-level of developers available, which is assumed to be known. To deal with our problem, the standard mechanism design techniques are revisited and extended in a number of ways. In particular, a general approach is given that allows the design of payment rules (more generally, mechanisms) when such payment rules must depend on some quantity that becomes known only after the mechanism is executed. This extension plays a key role in our solution to the machine learning payment-rule design problem, where data must be kept private (otherwise the developers could submit “overfitting” predictors), yet the principal’s profit (and thus the payment) should depend on the performance of the predictor chosen on theprivate data. Then, we address other interesting variants of the problem and provide solutions : when a single developer can submit multiple solutions, and when the solution is to be developed in multiple stages, or when the development cost is non-zero.

  • Subjects / Keywords
  • Graduation date
    Spring 2013
  • Type of Item
  • Degree
    Master of Science
  • DOI
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
  • Language
  • Institution
    University of Alberta
  • Degree level
  • Department
  • Supervisor / co-supervisor and their department(s)
  • Examining committee members and their departments
    • Schuurmans, Dale (Computing Science)
    • György, András (Computing Science)
    • Bulitko, Vadim (Computing Science)
    • Jinag, Hai (Electrical and Computer Engineering)