Extracting Dynamic Latent Feature with Bayesian Approaches for Process Data Analysis

  • Author / Creator
    Ma, Yanjun
  • Data-driven approaches have been extensively studied and successfully applied in the process
    industries, for example in the development of inferential sensors. Among the many modelling
    techniques, latent variable models are widely preferred because they can learn
    informative features from massive industrial data. To make latent variable models
    more practical for process data analytics, temporal correlations should be considered
    in feature extraction. Following the probabilistic modelling procedure, dynamic models are
    developed in this thesis to describe the latent features. In addition to several probabilistic
    models, novel inference algorithms are developed for different application scenarios.
    In most chemical processes, features with large inertia and slow variation are
    believed to be more informative. By imposing these modelling preferences as prior distributions
    on the model parameters, the first contribution of this thesis builds the dynamic latent
    features under a fully Bayesian framework. The preference for large inertia is implemented
    through a constraint and a prior distribution for the dynamic model of latent features,
    namely the transition function. Regularization is implemented through
    the generative model of raw process data, namely the observation function. Based on
    variational Bayesian inference, a novel learning method is developed to extract the slowly
    varying features and learn the model parameters.
    The second contribution of this thesis forms a transition function for the constrained
    latent features. As a hierarchical extension of the hidden Markov model, it describes a
    dynamic model for the probabilities of discrete variables. By using the Beta distribution
    to replace the Gaussian distribution, the novel transition function retains similar dynamic
    characteristics in the constrained domain. The preferred region of transition parameters
    can be determined for Bayesian inference. In this feature extraction model, a non-linear
    observation function is used to learn the constrained feature from unconstrained observations,
    for which novel smoothing and marginalization algorithms are developed.
    In the third contribution of this thesis, a more practical observation function is proposed
    to extract dynamic features from multiple operating regions and outlier-contaminated data.
    Specifically, multiple linear models are utilized to accommodate switching operating regions,
    and a heavy-tailed noise distribution is used to improve robustness. In order to integrate
    multiple observation models into the unified dynamic latent feature, a novel Bayesian state
    estimation algorithm is developed. In its online application, the proposed method is also
    extended to general multiple model state estimation.
    In the fourth contribution of this thesis, another observation function is proposed, which
    generalizes the ARMAX identification problem under the probabilistic framework. In this
    work, the dynamic latent feature is used to represent the random (time-variant) time delay,
    and the proposed Bayesian algorithm can solve the problem of parameter estimation and
    time delay estimation jointly. In particular, the random time delay is studied for three
    scenarios, where a static model, a hierarchical model, and a Markov model are developed.
    By accounting for temporal correlations, the Markov model provides better performance
    for system identification. In addition, the hierarchical model demonstrates its
    effectiveness in modelling sequentially independent time delays.
    The practicality of the proposed feature extraction models and inference algorithms
    is verified using numerical examples, benchmark simulations, and case studies on industrial
    data. Specifically, the applications include modelling the emulsion quality from the
    subsurface recovery process, modelling the steam quality from the steam generation process,
    and a target tracking problem with multiple models.
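
    The preference for large inertia in the first contribution can be illustrated with a
    minimal simulation. This sketch is not the variational Bayesian method of the thesis. It
    assumes, purely for illustration, a scalar first-order autoregressive transition function
    x_t = lam * x_{t-1} + w_t with unit stationary variance. The coefficient name lam and
    both helper functions are hypothetical. Under this assumption, a coefficient closer to
    one gives a feature that changes more slowly from one step to the next.

```python
import random


def simulate_latent_feature(lam, n_steps, seed=0):
    """Simulate x_t = lam * x_{t-1} + w_t with unit stationary variance.

    Assumed illustrative model, not the model defined in the thesis.
    """
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1) for a stable feature")
    rng = random.Random(seed)
    noise_std = (1.0 - lam ** 2) ** 0.5  # chosen so that Var(x_t) stays at 1
    x = rng.gauss(0.0, 1.0)  # initial state drawn from the stationary distribution
    path = [x]
    for _ in range(n_steps - 1):
        x = lam * x + rng.gauss(0.0, noise_std)
        path.append(x)
    return path


def mean_squared_velocity(path):
    """Average squared one-step change, used here as a simple slowness measure."""
    diffs = [b - a for a, b in zip(path, path[1:])]
    return sum(d * d for d in diffs) / len(diffs)


# Large inertia (lam near 1) versus small inertia (lam = 0.5).
slow = simulate_latent_feature(lam=0.99, n_steps=5000)
fast = simulate_latent_feature(lam=0.50, n_steps=5000)
print(mean_squared_velocity(slow), mean_squared_velocity(fast))
```

    For this assumed model the expected squared one-step change is 2 * (1 - lam), so the
    feature simulated with lam = 0.99 should show a much smaller value than the one with
    lam = 0.50.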

  • Subjects / Keywords
  • Graduation date
    Fall 2019
  • Type of Item
  • Degree
    Doctor of Philosophy
  • DOI
  • License
    Permission is hereby granted to the University of Alberta Libraries to reproduce single copies of this thesis and to lend or sell such copies for private, scholarly or scientific research purposes only. Where the thesis is converted to, or otherwise made available in digital form, the University of Alberta will advise potential users of the thesis of these terms. The author reserves all other publication and other rights in association with the copyright in the thesis and, except as herein before provided, neither the thesis nor any substantial portion thereof may be printed or otherwise reproduced in any material form whatsoever without the author's prior written permission.