Robust Generalized Weighted Probabilistic Principal Component Regression with Application in Data-driven Optimization

  • Author / Creator
    Memarian, Alireza
  • A plant's operation may deviate from its initial design because of uncertainties and changes in conditions such as market demand, operating conditions, and safety regulations over time. To maintain productivity, safety, and efficiency, operators should ensure that the plant operates around its optimal point. However, because the operating conditions change, the current optimal point may differ from the one obtained during the initial design. Besides finding the optimal point, it is essential to find the optimal path that steers the plant from the current operating conditions to the optimal operating point. Hence, automated self-optimization of plants is gaining popularity in academia and industry. One approach used in practice is to optimize the plant with the aid of a model. Thus, developing a model that mimics the plant as accurately as possible is important. However, because of possible differences between the developed model and the plant (model-plant mismatch), the optimal point obtained from the model may not be accurate. The main objective of this thesis is to develop a general framework for plant optimization that can handle the model-plant mismatch. A model-based optimization strategy is used to achieve this objective. To develop a model that is robust to outliers, can handle delays and missing data in the inputs and outputs, and is simple to use in plant optimization, two extensions of a generalized weighted probabilistic principal component regression method are proposed in this thesis. In addition, the proposed model can deal with high-dimensional plant datasets and the multi-modal and/or nonlinear nature of plants.
High dimensionality, the multi-modal nature of plants, missing data in input and output variables, and outliers are addressed simultaneously in Chapter 2 by the mixture robust semi-supervised probabilistic principal component regression model with missing input data. The main challenge with the model developed in Chapter 2 is determining the optimal number of mixture components. Chapter 3, entitled weighted semi-supervised probabilistic principal component regression with missing input and delayed output variables, addresses challenges such as the delay between each input and output variable and missing data. These extensions are developed under the expectation maximization (EM) framework because it can efficiently deal with hidden variables such as missing data, delays, and outliers. In these models, missing input data are handled by data imputation and missing output data by a semi-supervised framework. To deal with outliers, a combination of two Gaussian distributions is used as a prior for the noise, and a model-free distribution is considered for the delay variables. Finally, a strategy to update the range of delay in the variables is proposed to help speed up the convergence of the algorithm. A combination of these two proposed algorithms is capable of making the most of all available information and addressing uncertainties that may occur in plants. Therefore, by combining the proposed extensions of the probabilistic principal component regression (PPCR) model, a generalized weighted PPCR model is developed to describe the plant; it can deal with different types of uncertainties during plant optimization. To account for the mismatch between the generalized weighted PPCR model and the plant, and to steer the solution closer to the plant’s optimal point, a robust Gaussian process regression model is used.
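As a rough illustration of how a two-Gaussian noise prior can down-weight outliers inside an EM iteration, the sketch below computes the posterior probability that each residual was generated by the narrow (inlier) component rather than the wide (outlier) one. It assumes the two Gaussians form a zero-mean two-component mixture; the function names, variances, and mixing weight are hypothetical choices for illustration and are not taken from the thesis.

```python
import math

def gaussian_pdf(x, var):
    """Density of a zero-mean Gaussian with variance `var`, evaluated at x."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)

def inlier_responsibilities(residuals, var_in, var_out, p_in):
    """E-step style weights: P(residual came from the narrow component).

    var_in  : variance of the narrow (inlier) noise component (assumed value)
    var_out : variance of the wide (outlier) noise component (assumed value)
    p_in    : prior probability of the inlier component (assumed value)
    """
    weights = []
    for r in residuals:
        a = p_in * gaussian_pdf(r, var_in)
        b = (1.0 - p_in) * gaussian_pdf(r, var_out)
        weights.append(a / (a + b))
    return weights

# Three small residuals and one gross outlier (made-up numbers).
residuals = [0.1, -0.2, 0.05, 8.0]
weights = inlier_responsibilities(residuals, var_in=1.0, var_out=100.0, p_in=0.9)
```

In a weighted regression step, samples with weights near zero would contribute very little to the parameter update, which is one common way such a noise model produces robustness. A full EM implementation would also re-estimate the variances and the mixing weight, which this sketch leaves fixed.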
To increase the accuracy of the generalized weighted PPCR model, a nonlinearity index is proposed that defines the range of data to be used when developing a model. The proposed algorithm builds a local model around the current operating point, finds the model's optimal point by solving an optimization problem, and then steers the plant to the obtained optimal solution. By repeating these two steps, i.e., 1) building a local model and 2) steering the plant to the obtained optimal point, the algorithm gradually moves the plant from its initial operating point to the optimal point. Finally, the applicability and performance of all the proposed methods are tested and demonstrated through several numerical, simulation, experimental, and industrial examples.
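The two repeated steps can be sketched on a toy one-dimensional problem. This is only a minimal illustration under stated assumptions: `plant_cost` is a hypothetical stand-in for plant measurements, the local model is a quadratic fitted through three nearby samples rather than a PPCR model, and a fixed `radius` replaces the nonlinearity index that the thesis uses to choose the data range. The Gaussian process correction for model-plant mismatch is not included.

```python
def plant_cost(u):
    """Hypothetical plant cost with its optimum at u = 3 (not from the thesis)."""
    e = u - 3.0
    return e * e + 0.1 * e ** 4

def local_quadratic_step(cost, u, h, radius):
    """Step 1: fit a local quadratic around u and return the move to its optimum.

    The move is clipped to [-radius, radius] so the model is only trusted
    near the current operating point.
    """
    f_lo, f_mid, f_hi = cost(u - h), cost(u), cost(u + h)
    slope = (f_hi - f_lo) / (2.0 * h)
    curvature = (f_hi - 2.0 * f_mid + f_lo) / (h * h)
    if curvature > 0.0:
        step = -slope / curvature          # minimizer of the local quadratic
    else:
        step = -radius if slope > 0.0 else radius  # no interior minimum: go downhill
    return max(-radius, min(radius, step))

# Step 2: steer the plant to the local optimum, then repeat.
u = 0.0
history = [u]
for _ in range(10):
    u += local_quadratic_step(plant_cost, u, h=0.5, radius=1.0)
    history.append(u)
```

Because each move is limited to the local range, the operating point approaches the optimum gradually over several iterations instead of jumping there in one step, which mirrors the gradual steering described above.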

  • Subjects / Keywords
  • Graduation date
    Spring 2022
  • Type of Item
    Thesis
  • Degree
    Master of Science
  • DOI
    https://doi.org/10.7939/r3-ky91-n502
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.