Parsimonious Contaminated Shifted Asymmetric Laplace Mixtures: Unsupervised Learning with Outlier Identification for Asymmetric Clusters in High Dimensions

  • Author / Creator
    McLaughlin, Paul
  • A family of parsimonious contaminated shifted asymmetric Laplace mixtures is developed for asymmetric clusters in the presence of outliers and noise (referred to herein as bad points). A series of constraints is applied to a modified factor analyzer structure of the scale matrix parameters, yielding the twelve models that make up the family. The modified factor analyzer structure and these parsimonious constraints reduce the number of free parameters to be estimated, which makes the models effective for analyzing high-dimensional data. Notably, the models are developed for an unsupervised setting and do not rely on any prior information about identified outliers or the underlying group structure of the data. A variant of the EM algorithm is developed for parameter estimation. Various implementation issues are discussed, and a series of analyses and comparisons with well-established clustering methods is conducted on real and simulated data.

  • Subjects / Keywords
  • Graduation date
    Fall 2021
  • Type of Item
  • Degree
    Master of Science
  • DOI
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.