ERA


Permanent link (DOI): https://doi.org/10.7939/R39G5GT4V


Communities

This file is in the following communities:

Faculty of Graduate Studies and Research

Collections

This file is in the following collections:

Theses and Dissertations

Instance-dependent analysis of learning algorithms (Open Access)

Descriptions

Subject/Keyword
independent component analysis
learning theory
online learning
machine learning
online linear optimization
partially linear model
Type of item
Thesis
Degree grantor
University of Alberta
Author or creator
Huang, Ruitong
Supervisor and department
Szepesvari, Csaba (Computing Science)
Schuurmans, Dale (Computing Science)
Examining committee member and department
Friggstad, Zachary (Computing Science)
Cesa-Bianchi, Nicolo (Computer Science)
Greiner, Russ (Computing Science)
Gyorgy, Andras (Electrical and Electronic Engineering)
Department
Department of Computing Science
Specialization
statistical machine learning
Date accepted
2017-09-28T13:15:39Z
Graduation date
Fall 2017
Degree
Doctor of Philosophy
Degree level
Doctoral
Abstract
Theoretical analyses of machine learning algorithms are typically performed under various probabilistic assumptions about the data. While these assumptions are central to the analyses, it is debatable whether they actually hold in practice, and whether they capture the essence of "learning" as has been implicitly assumed since the introduction of the PAC model in learning theory. When such assumptions are avoided, on the other hand, typical analyses follow a worst-case minimax approach, as in the adversarial online learning framework. The resulting guarantees often fail to capture and exploit the 'niceness' of the data that may help speed up learning, and it is equally debatable whether the data encountered in typical learning scenarios is ever truly adversarial. Motivated by these issues, this thesis proposes instance-dependent analysis of learning algorithms as a way to improve our understanding of learning, with special emphasis on characterizing the 'niceness' of data from the perspective of learning. We demonstrate this approach in three settings:
1. In the unsupervised learning setting, we redefine the problem of independent component analysis (ICA) to avoid stochastic assumptions of any kind, and develop (for the first time) a provably polynomial-time learning algorithm based on our deterministic analysis.
2. In the supervised learning setting, we start from a statistical framework: we analyze the finite-sample performance of the empirical risk minimization (ERM) algorithm for a generalized partially linear model under the random design setting. We identify a potential deficiency of the ERM algorithm, and further investigation leads to a high-probability instance-dependent generalization bound.
3. In the online learning setting, we provide a thorough analysis of the follow-the-leader (FTL) algorithm for the online linear prediction problem, and discover a broad range of previously unknown favourable conditions on the data under which FTL achieves a fast learning rate.
Our approach yields instance-dependent results that are more general, expressive, and meaningful, in the sense that they capture the important properties of the data on which the performance of a learning algorithm heavily depends.
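As context for the third contribution, the FTL strategy in online linear prediction can be sketched as follows. This is a generic textbook-style illustration on the Euclidean unit ball, not the thesis's analysis: at each round, FTL plays the point of the constraint set minimizing the cumulative linear loss so far, which on the unit ball is the negated, normalized sum of past loss vectors. The function name and setup are hypothetical, chosen only for this sketch.

```python
import numpy as np

def ftl_unit_ball(loss_vectors):
    """Follow the Leader for online linear prediction on the Euclidean unit ball.

    At round t, FTL plays w_t = argmin_{||w|| <= 1} <S_{t-1}, w> = -S_{t-1}/||S_{t-1}||,
    where S_{t-1} is the sum of past loss vectors. Returns the cumulative loss
    and the regret against the best fixed point in hindsight.
    """
    d = len(loss_vectors[0])
    cum = np.zeros(d)          # running sum S of past loss vectors
    total_loss = 0.0
    for f in loss_vectors:
        norm = np.linalg.norm(cum)
        if norm > 0:
            w = -cum / norm    # the "leader" on the unit ball
        else:
            w = np.zeros(d)    # arbitrary play when no history exists
        total_loss += float(np.dot(f, w))
        cum += f
    # best fixed point in hindsight on the unit ball has loss -||S_T||
    best_loss = -float(np.linalg.norm(cum))
    return total_loss, total_loss - best_loss
```

On a sequence of identical loss vectors (a very 'nice' instance), FTL locks onto the optimal direction after one round and incurs only constant regret, which is the kind of favourable-data behaviour the thesis characterizes in far greater generality (e.g. for curved constraint sets).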
Language
English
DOI
doi:10.7939/R39G5GT4V
Rights
This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for the purpose of private, scholarly or scientific research. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
Citation for previous publication
Huang, Ruitong, and Csaba Szepesvári. "A finite-sample generalization bound for semiparametric regression: Partially linear models." Artificial Intelligence and Statistics. 2014.
Huang, Ruitong, Andras Gyorgy, and Csaba Szepesvári. "Deterministic independent component analysis." Proceedings of the 32nd International Conference on Machine Learning (ICML-15). 2015.
Huang, Ruitong, et al. "Following the Leader and Fast Rates in Linear Prediction: Curved Constraint Sets and Other Regularities." Advances in Neural Information Processing Systems. 2016.

File Details

Date Modified
2017-09-28T19:15:40.683+00:00
Characterization
File format: pdf (Portable Document Format)
Mime type: application/pdf
File size: 2,963,422 bytes (~2.8 MB)
Last modified: 2017-11-08 18:19:35-07:00
Filename: Huang_Ruitong_201709_PhD.pdf
Original checksum: 10addd9d5fc7030a7d7daf111eeac50f
Well formed: false
Valid: false
Status message: Invalid page tree node offset=1684568
Status message: Invalid Resources Entry in document offset=48162
Status message: Unexpected error in findFonts java.lang.ClassCastException: edu.harvard.hul.ois.jhove.module.pdf.PdfSimpleObject cannot be cast to edu.harvard.hul.ois.jhove.module.pdf.PdfDictionary offset=7678
Status message: Invalid name tree offset=2931599