Solving Association Problems with Convex Co-embedding

  • Author / Creator
    Mirzazadeh, Farzaneh
  • Co-embedding is the process of mapping elements from multiple sets into a common latent space, which can be exploited to infer element-wise associations by considering the geometric proximity of their embeddings. Such an approach underlies the state of the art for link prediction, relation learning, multi-label tagging, relevance retrieval and ranking. This dissertation provides contributions to the study of co-embedding for solving association problems. First, a unifying view for solving association problems with co-embedding is presented, which covers both alignment-based and distance-based models. Although current approaches rely on local training methods applied to non-convex formulations, I demonstrate how general convex formulations can be achieved for co-embedding. I then empirically compare convex versus non-convex formulations of the training problem under an alignment model. Surprisingly, the empirical results reveal that, in most cases, the two are equivalent. Second, the connection between metric learning and co-embedding is investigated. I show that heterogeneous metric learning can be cast as distance-based co-embedding, and propose a scalable algorithm for solving the training problem globally. The co-embedding framework allows metric learning to be applied to a wide range of association problems---including link prediction, relation learning, multi-label tagging and ranking. I investigate the relation between the standard non-convex training formulation and the proposed convex reformulation of heterogeneous metric learning, both empirically and analytically. Again, it is discovered that under certain conditions, the objective values achieved by the two approaches are identical. I develop a formal characterization of the conditions under which this equality holds. Finally, a constrained form of co-embedding is proposed for structured output prediction. A key bottleneck in structured output prediction is the need for inference during training and testing, usually requiring some form of dynamic programming. Rather than using approximate inference or tailoring a specialized inference method for a particular structure I instead pre-compile prediction constraints directly into the learned representation. By eliminating the need for explicit inference a more scalable approach to structured output prediction can be achieved, particularly at test time. I demonstrate the idea for hierarchical multi-label prediction under subsumption and mutual exclusion constraints, where a relationship to maximum margin structured output prediction can be established. Experiments demonstrate that the benefits of structured output training can still be realized even after inference has been eliminated.

  • Subjects / Keywords
  • Graduation date
    2017-06:Spring 2017
  • Type of Item
  • Degree
    Doctor of Philosophy
  • DOI
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
  • Language
  • Institution
    University of Alberta
  • Degree level
  • Department
    • Department of Computing Science
  • Supervisor / co-supervisor and their department(s)
    • Greiner, Russell (Computing Science)
    • Schuurmans, Dale (Computing Science)
  • Examining committee members and their departments
    • Bowling, Michael (Computing Science)
    • Sander, Joerg (Computing Science)
    • Szepesvari, Csaba (Computing Science)
    • Zemel, Richard (Computer Science, University of Toronto)