Locally Weighted Predictive Modeling for Regression and Classification of Process Data

  • Author / Creator
    Alireza Kheradmand
  • Predictive modeling has proven to be a valuable tool in the process industries for estimating hard-to-measure variables that cannot be measured online. These variables usually require laboratory analysis to be quantified, which is time-consuming and costly. Predictive modeling can be used for both regression and classification. Predictive models for regression are often used as soft sensors (inferential sensors) and play an important role in process control.

    Traditional methods for building predictive models usually take a global approach: a training set is used to build the model, and model performance is validated on a testing set. However, the performance of such a model usually degrades over time as a result of nonlinearity, time-varying behavior, or the curse of dimensionality, all of which point to the need for model updating and good maintenance techniques. A number of existing methods address model updating and maintenance. Nonetheless, if the operating mode keeps changing rapidly, those methods cannot adapt well to the process.

    To address the above-mentioned issues, locally weighted modeling has been proposed, which builds as many local predictive models as necessary to handle highly nonlinear process modeling problems. Locally weighted modeling is also known as Just-In-Time learning, which builds a local model for each query sample that arrives. Just-In-Time learning relies on a similarity measurement step to identify the samples most relevant to the query sample, and the local model is built mostly from those samples. Traditional similarity metrics account only for the magnitude of samples and mostly disregard the effect of time sequence. A novel Just-In-Time learning method, "Trend-Based Just-In-Time", is proposed in this thesis; it accounts for the magnitude, direction, and trend of changes in the database with respect to the query sample. The proposed method takes advantage of Principal Component Analysis (PCA) and a moving horizon approach to evaluate similarity. Additionally, traditional Just-In-Time methods have limitations in dealing with missing data. By using Probabilistic PCA (PPCA), one is able to evaluate similarity in the presence of missing data. Similarity measurement is usually carried out in the input subspace of a data set. Commonly, including output space information in the similarity calculation can be expected to improve prediction. Therefore, the proposed Trend-Based Just-In-Time method is extended to similarity calculation in the output space as well.

    As mentioned, predictive models are also used for classification (predictive classifiers). As in regression, global models might not provide good classification when the data set is not linearly separable. Another issue that may affect a data set is the curse of dimensionality, which calls for a variable (or feature) selection technique. In this thesis, a novel approach for locally weighted classification of high-dimensional data with variable selection is proposed. The proposed approach addresses the curse of dimensionality by using Neighborhood Component Analysis (NCA) as a variable selection technique and Kernel PCA as a dimension reduction method. To address the high degree of nonlinearity, a locally weighted approach is employed, which performs similarity measurement between the query sample and the historical database in latent space. Afterwards, using the weights obtained from the similarity calculation, a locally weighted Support Vector Machine (SVM) model is built. All proposed approaches have been applied to various near-infrared (NIR) data sets as well as synthetic data sets.
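    The generic Just-In-Time idea described in the abstract (measure similarity to the query, weight the historical samples, fit a local model) can be illustrated with a minimal sketch. This is not the thesis's Trend-Based method: it assumes a plain Gaussian kernel on Euclidean distance, which captures magnitude only, and a local linear model fitted by weighted least squares. The function name `jit_predict`, the `bandwidth` parameter, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def jit_predict(X_hist, y_hist, x_query, bandwidth=1.0):
    """Predict the output for one query sample with a locally weighted linear model.

    Assumption: similarity is a Gaussian kernel on Euclidean distance
    (magnitude only), not the trend-based measure proposed in the thesis.
    """
    # Similarity step: samples close to the query receive larger weights.
    dist = np.linalg.norm(X_hist - x_query, axis=1)
    weights = np.exp(-(dist ** 2) / (2.0 * bandwidth ** 2))

    # Local model step: weighted least squares with an intercept column.
    A = np.hstack([np.ones((X_hist.shape[0], 1)), X_hist])
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y_hist * sw, rcond=None)

    # Evaluate the local model at the query; it is discarded afterwards
    # and a new one is built when the next query arrives.
    return float(np.concatenate(([1.0], x_query)) @ coef)


# Synthetic nonlinear "historical database" (illustrative data only).
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.05, size=200)

x_q = np.array([1.0])
y_hat = jit_predict(X, y, x_q, bandwidth=0.3)
print(f"prediction at x=1.0: {y_hat:.3f}, true value: {np.sin(1.0):.3f}")
```

    A smaller `bandwidth` makes the model more local, which reduces bias on strongly nonlinear data but leaves fewer effective samples per query. In this sketch, that trade-off is the main tuning decision.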

  • Subjects / Keywords
  • Graduation date
    Spring 2019
  • Type of Item
    Thesis
  • Degree
    Master of Science
  • DOI
    https://doi.org/10.7939/r3-08vf-0g98
  • License
    Permission is hereby granted to the University of Alberta Libraries to reproduce single copies of this thesis and to lend or sell such copies for private, scholarly or scientific research purposes only. Where the thesis is converted to, or otherwise made available in digital form, the University of Alberta will advise potential users of the thesis of these terms. The author reserves all other publication and other rights in association with the copyright in the thesis and, except as herein before provided, neither the thesis nor any substantial portion thereof may be printed or otherwise reproduced in any material form whatsoever without the author's prior written permission.