This file is in the following communities:

Graduate Studies and Research, Faculty of


This file is in the following collections:

Theses and Dissertations

Data Mining and Knowledge Discovery for Process identification and multivariate monitoring using spectroscopy: application to low temperature bitumen visbreaking Open Access


Keywords: Data mining, Bayesian learning, multivariate
Degree grantor: University of Alberta
Author or creator: Tefera, Dereje Tamiru
Supervisor and department: Dr. Prasad and Dr. de Klerk
Examining committee member and department:
Dr. Robert Hays, Department of Chemical and Materials Engineering
Dr. Petr Nikrityuk, Department of Chemical and Materials Engineering
Dr. Arno de Klerk, Department of Chemical and Materials Engineering
Dr. Vinay Prasad, Department of Chemical and Materials Engineering
Department: Department of Chemical and Materials Engineering
Specialization: Process Control
Graduation date: 2016-06:Fall 2016
Degree: Master of Science
Abstract
Data mining and knowledge discovery is the systematic process of extracting useful information from a data set when little or nothing is known about the underlying process. In this study, data mining and other learning methods are combined to model a low temperature visbreaking process, which is under investigation for field upgrading of oil sands bitumen. The classical visbreaker is operated at a temperature in the range of 430 to 500 °C, which results in the formation of significant amounts of visbroken products that require subsequent hydrotreating. For this reason, several recent investigations have focused on finding an optimal operating condition that enables a significant reduction of viscosity and limits the formation of olefins. These studies indicate that operating a visbreaker at a temperature in the range of 150 to 400 °C could significantly decrease the viscosity of bitumen while limiting the formation of cracked products. However, the process is still at the investigation stage, and very little is known about the underlying reaction mechanism. Spectroscopy is well suited to identifying such a complex chemical process because it provides comprehensive information about the underlying chemical changes at a given operating condition. However, the large amount of useful information contained in spectroscopic data is often difficult to extract, because the absorption intensities of the individual chemical constituents of the sample overlap heavily, particularly for chemically complex systems such as heavy oils. The aim of this thesis is to develop data-driven models, based on Fourier transform infrared (FTIR) spectroscopy data, that describe visbreaking in the temperature range of 150 to 400 °C well and can ultimately be used for real-time analysis and optimization of the process.
The first part of the research focuses on thermal kinetic modeling from the spectroscopy data acquired in the experimental analysis of the process. Obtaining mechanistic and kinetic descriptions of the chemistry involved was a significant challenge because of the compositional complexity of bitumen and the associated analytical difficulties. Lumped kinetic models for heavy oil cracking are useful only for describing the process on a preconceived reaction network and are unsatisfactory for developing reaction networks. This study proposes a novel method to derive a reaction network for the thermal cracking of oil sands bitumen from FTIR spectroscopy data using data mining and other learning methods. Developing the kinetic network required several learning methods, including principal component analysis (PCA), data clustering and Bayesian learning. PCA was used for variable selection, and a Bayesian agglomerative hierarchical cluster analysis was employed to obtain groups of pseudo-species with similar spectroscopic properties. A Bayesian structure-learning algorithm was then used to develop the corresponding reaction network. The reaction network derived from the model was compared with the reaction network for thermal cracking of model alkyl aromatic compounds proposed in the literature, and the agreement was encouraging. One attractive feature of the model is that it can be embedded in the process control system to predict the reaction network in real time, and it needs limited or no prior description of the reaction network. The second part attempts to design a spectroscopy-based online monitoring method for the process under consideration. The designed algorithm predicts the chemical rank of the unknown chemical mixture, resolves the mixture spectra and evaluates the concentration profiles of the resolved components, so that the effect of different operating conditions can be analyzed in real time.
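As a rough illustration of the variable selection and clustering steps in the first part of the work, the sketch below ranks spectral channels by their PCA loadings and then groups the selected channels by how their intensities vary across samples. It is an illustration under stated assumptions, not the thesis implementation: the data are synthetic, the function names `select_variables_by_pca` and `cluster_channels` are hypothetical, loading-based ranking is only one of several ways to select variables with PCA, and SciPy's ordinary Ward linkage stands in for the Bayesian agglomerative hierarchical clustering named in the abstract. The Bayesian structure-learning step is not shown.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Synthetic stand-in for FTIR data (hypothetical): 30 samples x 200 channels,
# built from three Gaussian bands whose intensities follow different trends.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 5.0, 30)
channels = np.arange(200)
trends = np.column_stack([
    np.exp(-t),                                # band that decays
    np.exp(-0.5 * ((t - 2.5) / 0.8) ** 2),     # band that rises, then falls
    1.0 - np.exp(-t),                          # band that grows
])
bands = np.array([np.exp(-0.5 * ((channels - c) / 15.0) ** 2) for c in (50, 100, 150)])
X = trends @ bands + 0.005 * rng.standard_normal((30, 200))


def select_variables_by_pca(X, n_components=3, n_keep=30):
    """Keep the channels with the largest variance-weighted PCA loadings."""
    Xc = X - X.mean(axis=0)  # mean-centre each channel
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    weights = s[:n_components] ** 2  # variance captured by each component
    importance = (weights[:, None] * Vt[:n_components] ** 2).sum(axis=0)
    keep = np.argsort(importance)[::-1][:n_keep]
    return np.sort(keep)


def cluster_channels(X, columns, n_groups=3):
    """Group selected channels whose intensity profiles vary together."""
    profiles = X[:, columns].T  # one row per selected channel
    # Standardise each profile so grouping reflects shape, not magnitude.
    profiles = profiles - profiles.mean(axis=1, keepdims=True)
    profiles = profiles / profiles.std(axis=1, keepdims=True)
    tree = linkage(profiles, method="ward")  # agglomerative merge tree
    return fcluster(tree, t=n_groups, criterion="maxclust")


columns = select_variables_by_pca(X)
labels = cluster_channels(X, columns)
for group in np.unique(labels):
    print(f"group {group}: channels {columns[labels == group].tolist()}")
```

Each resulting group plays the role of a pseudo-species in this toy setting. The number of groups is fixed by hand here, whereas a Bayesian clustering method can infer it from the data.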
The model resolves mixture spectra in several steps. In the first step, it estimates the instrument noise using PCA and the chemical rank of the system using Malinowski’s error indicator function (IND). Once the chemical rank is determined, evolving factor analysis (EFA) is used to approximate the initial concentration profiles. The final resolution of the spectra is completed using multivariate curve resolution alternating least squares (MCR-ALS). The model results agreed well with available experimental data from 1H NMR characterization and other measurements such as microcarbon residue content. The model needs negligible computational effort, and its only required input is the FTIR spectra, which may make it suitable for real-time monitoring.
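The resolution steps listed above can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the thesis code: the spectra are synthetic (a hypothetical A -> B -> C reaction), the function names are hypothetical, non-negativity is imposed by simply clipping negative values (the abstract does not say which constraints were used), and the EFA step assumes components appear and disappear in the same order. For a data matrix with r rows and c columns (c being the smaller dimension) and eigenvalues λ_j of the covariance matrix, Malinowski's real error is RE(n) = sqrt(Σ_{j>n} λ_j / (r (c - n))) and the indicator is IND(n) = RE(n) / (c - n)^2; the chemical rank is taken as the n that minimizes IND.

```python
import numpy as np

# Synthetic stand-in for a spectral series (hypothetical): an A -> B -> C
# reaction sampled at 30 times, observed on 200 channels with small noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 5.0, 30)
c_a = np.exp(-t)
c_b = 2.0 * (np.exp(-0.5 * t) - np.exp(-t))  # first-order rates k1 = 1, k2 = 0.5
c_c = 1.0 - c_a - c_b
C_true = np.column_stack([c_a, c_b, c_c])
channels = np.arange(200)
S_true = np.array([np.exp(-0.5 * ((channels - c) / 15.0) ** 2) for c in (50, 100, 150)])
D = C_true @ S_true + 0.005 * rng.standard_normal((30, 200))


def malinowski_ind(D):
    """Return candidate ranks n with Malinowski's RE(n) and IND(n)."""
    r, c = max(D.shape), min(D.shape)
    eig = np.linalg.svd(D, compute_uv=False) ** 2  # eigenvalues of D^T D
    n = np.arange(1, c)
    residual = np.array([eig[k:].sum() for k in n])  # eigenvalues n+1 .. c
    re = np.sqrt(residual / (r * (c - n)))  # real error (noise estimate)
    return n, re, re / (c - n) ** 2


def efa_initial_profiles(D, rank):
    """Evolving factor analysis: forward and backward singular-value traces."""
    m = D.shape[0]
    fwd, bwd = np.zeros((m, rank)), np.zeros((m, rank))
    for i in range(m):
        sf = np.linalg.svd(D[: i + 1], compute_uv=False)[:rank]
        sb = np.linalg.svd(D[i:], compute_uv=False)[:rank]
        fwd[i, : sf.size] = sf
        bwd[i, : sb.size] = sb
    # Component j appears with forward factor j and vanishes with backward
    # factor rank-1-j (first-in, first-out assumption).
    return np.minimum(fwd, bwd[:, ::-1])


def mcr_als(D, C0, n_iter=200):
    """Alternate least-squares updates of S and C, clipping negatives to zero."""
    C = C0.copy()
    for _ in range(n_iter):
        S = np.clip(np.linalg.lstsq(C, D, rcond=None)[0], 0.0, None)
        C = np.clip(np.linalg.lstsq(S.T, D.T, rcond=None)[0].T, 0.0, None)
    return C, S


n, re, ind = malinowski_ind(D)
rank = int(n[np.argmin(ind)])
C, S = mcr_als(D, efa_initial_profiles(D, rank))
lack_of_fit = np.linalg.norm(D - C @ S) / np.linalg.norm(D)
print(f"estimated rank: {rank}, noise estimate: {re[rank - 1]:.4f}")
print(f"relative lack of fit: {lack_of_fit:.3%}")
```

The resolved profiles in `C` and spectra in `S` are determined only up to scaling and, in general, rotational ambiguity, so practical MCR-ALS work usually adds further constraints (closure, unimodality, known spectra) where they are chemically justified.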
This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for the purpose of private, scholarly or scientific research. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
File Details

File format: pdf (PDF/A)
Mime type: application/pdf
File size: 7131370 bytes
Last modified: 2016:11:16 15:21:27-07:00
Filename: Tefera_Dereje Tamiru_201609_MSC.pdf
Original checksum: 185db9faa8372ca9d2638b971cc48d87