Locally Weighted Predictive Modeling for Regression and Classification of Process Data

Alireza Kheradmand

doi:10.7939/r3-08vf-0g98

Communities and Collections

Graduate and Postdoctoral Studies (GPS), Faculty of / Theses and Dissertations

Author / Creator

Alireza Kheradmand
Predictive modeling has proven to be a valuable tool in the process industry for estimating hard-to-measure variables that cannot be measured online. Such variables usually require laboratory analysis to be quantified, which is time-consuming and costly. Predictive modeling can be used for both regression and classification. Predictive models for regression are often used as soft sensors, or inferential sensors, and play an important role in process control.

Traditional methods for building predictive models usually take a global approach, in which a training set is used to build the model and the model's performance is validated on a testing set. However, the performance of the model usually degrades over time as a result of nonlinearity, time-varying behavior, or the curse of dimensionality, all of which point to the need for model updating and good maintenance techniques. A number of existing methods address model updating and maintenance. Nonetheless, if the operating mode keeps changing rapidly, those methods cannot adapt well to the process.

To address the above-mentioned issues, locally weighted modeling has been proposed, which strives to build as many local predictive models as necessary to handle highly nonlinear process modeling problems. Locally weighted modeling is also known as Just-In-Time learning, which builds a local model for each query sample that arrives. Just-In-Time learning relies on a similarity measurement step to highlight the samples most relevant to the query sample. The local model is built mostly from those samples. Traditional similarity metrics account only for the magnitude of samples and mostly disregard the effect of time sequence. A novel Just-In-Time learning method, "Trend-Based Just-In-Time", is proposed in this thesis; it accounts for the magnitude, direction, and trend of changes in the database with respect to the query sample. The proposed method takes advantage of Principal Component Analysis (PCA) and a moving horizon approach to evaluate similarity. Additionally, traditional Just-In-Time methods have limitations in dealing with missing data. By using Probabilistic PCA (PPCA), one is able to evaluate similarity in the presence of missing data. Similarity measurement is usually carried out in the input subspace of a data set. Commonly, including output space information in the similarity calculation can be expected to improve prediction. Therefore, the proposed Trend-Based Just-In-Time method is extended to similarity calculation in the output space as well.

As mentioned, predictive models are also used for classification (as predictive classifiers). As with regression, global models might not provide good classifications when the data set is not linearly separable. Another issue that may affect a data set is the curse of dimensionality, which calls for a variable selection technique. In this thesis, a novel approach for locally weighted classification of high-dimensional data with variable selection is proposed. The proposed approach addresses the curse of dimensionality by using Neighborhood Component Analysis (NCA) as a variable selection technique and Kernel PCA as a dimension reduction method. To address the high degree of nonlinearity, a locally weighted approach is employed, which performs similarity measurement between the query sample and the historical database in the latent space. Afterwards, using the weights obtained from the similarity calculation, a locally weighted Support Vector Machine (SVM) model is built. All proposed approaches have been implemented on various Near-Infrared (NIR) data sets as well as synthetic data sets.
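
The basic Just-In-Time idea described in the abstract (measure similarity to the query, weight the historical samples, fit a local model) can be illustrated with a minimal sketch. This is not the thesis's Trend-Based method: it assumes a single input variable, a traditional magnitude-only similarity (a Gaussian kernel on distance), and a weighted straight-line local model. The function name `jit_predict` and the `bandwidth` parameter are hypothetical and chosen only for illustration.

```python
import math


def jit_predict(x_hist, y_hist, x_query, bandwidth=1.0):
    """Predict the output at x_query with a locally weighted linear model.

    Illustrative assumptions: one input variable, Gaussian similarity on
    distance (magnitude only, no trend or time-sequence information).
    """
    # Similarity step: historical samples close to the query get large weights.
    weights = [
        math.exp(-((x - x_query) ** 2) / (2.0 * bandwidth ** 2))
        for x in x_hist
    ]
    total = sum(weights)

    # Weighted means of input and output.
    x_bar = sum(w * x for w, x in zip(weights, x_hist)) / total
    y_bar = sum(w * y for w, y in zip(weights, y_hist)) / total

    # Weighted least-squares slope of the local line.
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, x_hist))
    sxy = sum(
        w * (x - x_bar) * (y - y_bar)
        for w, x, y in zip(weights, x_hist, y_hist)
    )
    slope = sxy / sxx if sxx > 1e-12 else 0.0

    # The local model is evaluated only at the query, then discarded.
    return y_bar + slope * (x_query - x_bar)


# Synthetic nonlinear "process": y = x^2 sampled on a grid.
x_data = [i / 10.0 for i in range(-30, 31)]
y_data = [x ** 2 for x in x_data]
prediction = jit_predict(x_data, y_data, x_query=1.0, bandwidth=0.2)
```

A new local model is fitted for every query sample, so no global model has to be maintained. In this sketch the `bandwidth` value controls how strongly the fit concentrates on the samples most similar to the query.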
Graduation date

Spring 2019
Type of Item

Thesis
Degree

Master of Science
DOI

https://doi.org/10.7939/r3-08vf-0g98
License

Permission is hereby granted to the University of Alberta Libraries to reproduce single copies of this thesis and to lend or sell such copies for private, scholarly or scientific research purposes only. Where the thesis is converted to, or otherwise made available in digital form, the University of Alberta will advise potential users of the thesis of these terms. The author reserves all other publication and other rights in association with the copyright in the thesis and, except as herein before provided, neither the thesis nor any substantial portion thereof may be printed or otherwise reproduced in any material form whatsoever without the author's prior written permission.

Language

English
Institution

University of Alberta
Degree level

Master's
Department
- Department of Chemical and Materials Engineering
Specialization
- Process Control
Supervisor / co-supervisor and their department(s)
- Huang, Biao (Chemical and Materials Engineering)