ERA

Permanent link (DOI): https://doi.org/10.7939/R3K931F13

Communities

This file is in the following communities:

Graduate Studies and Research, Faculty of

Collections

This file is in the following collections:

Theses and Dissertations

Persistent Homology on Time series Open Access

Descriptions

Other title
Subject/Keyword
Persistent Homology
Time Series
Random Forest
Type of item
Thesis
Degree grantor
University of Alberta
Author or creator
Zhou, Yi
Supervisor and department
Giseon Heo (Dentistry)
Examining committee member and department
Bei Jiang (Mathematical and Statistical Sciences)
Ivan Mizera (Mathematical and Statistical Sciences)
Ivor Cribben (Alberta School of Business)
Giseon Heo (Dentistry)
Department
Department of Mathematical and Statistical Sciences
Specialization
Statistical Machine Learning
Date accepted
2016-09-29T09:23:45Z
Graduation date
2016-06:Fall 2016
Degree
Master of Science
Degree level
Master's
Abstract
Topology is a branch of mathematics that studies how objects are related to one another by investigating their qualitative structural properties, such as connectivity and shape. In this thesis, we apply topological data analysis (TDA) to sequence data and adopt the theory of persistent homology for time series, based on topological features computed from the persistence diagram. To analyze sequence data from diverse viewpoints, we investigate topological features, from a persistent homology perspective, of both traditional statistical tools (time series) and machine learning methods (random forests). Combining the advantages of these three ideas, we arrive at a way to solve a clustering problem (unsupervised learning) and a prediction problem (supervised learning) for our two datasets, respectively. The thesis makes two main contributions. In Chapter 2, we apply persistent homology to the cross-correlation matrices and partial correlation matrices of time series, and obtain topological features from the persistence diagrams and barcodes. With this information, we generate consistent clusters and loops from our data; this solution to unsupervised learning problems on unlabeled datasets constitutes the first contribution. The second contribution lies in treating the persistence landscape as an important covariate for supervised learning problems. In Chapter 3, we apply persistent homology to polysomnography (PSG) time series and take the integrals of the landscapes as covariates generated from the time series. A random forest model is built with these covariates to predict the Obstructive Apnea-Hypopnea (3% desaturation) Index of a new incoming patient.
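The thesis code is not part of this record. As a rough illustration of the Chapter 2 idea (persistent homology on a correlation matrix, read off as clusters), the following is a minimal sketch under stated assumptions: it uses the distance 1 - |r| between series, which is an assumed choice and not necessarily the one made in the thesis, and it computes only the 0-dimensional barcode (connected components), not loops. The function names `pearson` and `h0_barcode` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Sample Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def h0_barcode(series):
    """Return the 0-dimensional bars (birth, death) of the Vietoris-Rips
    filtration on the distance 1 - |correlation| (an assumed distance)."""
    n = len(series)
    # All pairwise edges, sorted by distance (Kruskal-style sweep).
    edges = sorted(
        (1.0 - abs(pearson(series[i], series[j])), i, j)
        for i, j in combinations(range(n), 2)
    )
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    bars = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Two components merge at scale d, so one bar born at 0 dies at d.
            parent[ri] = rj
            bars.append((0.0, d))
    # One component survives at every scale.
    bars.append((0.0, float("inf")))
    return bars
```

Long bars in the output indicate groups of series that stay separate over a wide range of correlation thresholds, which is one way to read clusters from a barcode. Detecting loops (1-dimensional features) requires a full persistent homology computation and is outside this sketch.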
Language
English
DOI
doi:10.7939/R3K931F13
Rights
This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for the purpose of private, scholarly or scientific research. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
Citation for previous publication

File Details

Date Uploaded
Date Modified
2016-09-29T15:23:46.801+00:00
Audit Status
Audits have not yet been run on this file.
Characterization
File format: pdf (PDF/A)
Mime type: application/pdf
File size: 3251342
Last modified: 2016:11:16 15:57:28-07:00
Filename: Zhou_Yi_201609_MSc .pdf
Original checksum: ddaf000ea115271116531f63743fd998