Understanding the Chemistry of Conversion of Model Compounds and Biomass by Hydrous Pyrolysis Based on Spectroscopic Data Using Data Fusion, Data Mining, and Chemometrics

  • Author / Creator
  • Biomass conversion is a complex system that poses several major challenges: many species are present at once, the physical constituents of the products are difficult to characterize with analytical instruments, and causal relationships between groups of species are hard to establish or model. To address these challenges and gain insight into the potential chemistry and reaction mechanism of the hydrous pyrolysis (hydrothermal liquefaction, HTL) of biomass and its main components (cellulose and lignin), this research employed machine learning methods, namely data mining and data fusion techniques. Levoglucosan and 2-phenoxyethyl benzene (representing cellulose and lignin, respectively), a physical mixture of these model compounds, and Monterey pine whole biomass underwent 108 HTL reactions in the presence of hot water and catalysts under different conditions. The bio-oil produced was characterized with two spectroscopic techniques: Fourier transform infrared (FTIR) spectroscopy and proton nuclear magnetic resonance (1H NMR) spectroscopy. To discover the hidden patterns in the large data sets that these techniques provide, this research employed data fusion. The aim of data fusion is to develop experimentally and computationally sensible models from spectroscopic data by consistently combining the data sets provided by FTIR and 1H NMR (absorbance across wavenumbers as variables), with a demonstrable improvement in the resulting reaction network structure. The developed model reduces error when processing large-scale reactions and improves model performance by factoring in complementary information. The final fused data set was clustered using Bayesian hierarchical clustering (BHC).
With a large data set, traditional hierarchical clustering algorithms face the difficulty of choosing a distance metric, whereas BHC computes marginal likelihoods to decide which clusters to merge and to avoid overfitting. After the wavenumbers were grouped into clusters, Bayesian network (BN) learning was applied to develop the optimal reaction network. To identify the optimal network structure, three optimization approaches were applied: two greedy search-and-score algorithms, tabu search and hill climbing, and a hybrid algorithm, max-min hill climbing (MMHC).
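
The fusion step described above can be illustrated with a minimal Python sketch of low-level data fusion, in which each spectral block is autoscaled, weighted so that neither block dominates by variable count, and concatenated column-wise. This is only an assumption about how such a fusion might be done, not the procedure used in the thesis. The function names `autoscale` and `fuse_blocks`, the block weighting, the number of variables per block, and the random placeholder data are all hypothetical; only the count of 108 reactions comes from the abstract.

```python
import numpy as np

def autoscale(block):
    """Mean-center each column and scale it to unit sample variance."""
    mean = block.mean(axis=0)
    std = block.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # constant columns stay at zero after centering
    return (block - mean) / std

def fuse_blocks(blocks):
    """Low-level fusion: autoscale each block, weight it by 1/sqrt(n_vars),
    then concatenate the blocks column-wise (samples stay aligned by row)."""
    scaled = [autoscale(b) / np.sqrt(b.shape[1]) for b in blocks]
    return np.hstack(scaled)

# Placeholder data (hypothetical): 108 samples, 400 FTIR wavenumbers,
# 120 1H NMR chemical-shift bins. Real spectra would be loaded here instead.
rng = np.random.default_rng(0)
ftir = rng.random((108, 400))
nmr = rng.random((108, 120))

fused = fuse_blocks([ftir, nmr])
print(fused.shape)
```

The fused matrix has one row per reaction and one column per FTIR or 1H NMR variable, which is the kind of input a clustering method such as BHC would then operate on. The block weighting gives each technique the same total sum of squares, so the block with more variables does not dominate by size alone.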

    Although spectroscopic techniques such as FTIR provide useful information about the structure of a compound, they produce a high-dimensional space of wavenumbers that can be difficult to interpret or analyze. Chemometric methods address this issue by applying statistical or mathematical techniques to extract the required information about the objects of interest from the data. Self-modeling multivariate curve resolution (SMCR) is a popular chemometric technique. It was employed here to obtain a set of pseudo-components and their spectra, which were then used to develop the reaction network. SMCR is a very useful tool for elucidating multi-component phenomena in complex chemical systems such as biomass conversion. The developed algorithm can be applied to real-time analysis of many complex reacting systems and mixtures because it tracks changes in the process quantitatively and can be used for compositional control. It also serves as a screening method for proposing hypotheses about reaction mechanisms in complex reacting mixtures. Moreover, this research computed the concentrations of the pseudo-components across the samples. This concentration trace is useful for online monitoring of species conversion when it is integrated with a suitable control strategy that adjusts process conditions to maximize the yield of the desired product.
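
The curve-resolution idea above can be sketched as alternating least squares on the bilinear model D ≈ C S, where D holds the measured spectra, C the pseudo-component concentrations per sample, and S the pseudo-component spectra. The following Python sketch is a generic MCR-ALS loop with non-negativity imposed by clipping, which is one common way to carry out multivariate curve resolution. It is offered as an illustration under stated assumptions, not as the algorithm developed in the thesis. The function `mcr_als`, the synthetic Gaussian bands, and all sizes are hypothetical.

```python
import numpy as np

def mcr_als(D, n_components, n_iter=200, seed=0):
    """Factor D (samples x channels) into non-negative C @ S
    by alternating least squares with clipping."""
    rng = np.random.default_rng(seed)
    C = rng.random((D.shape[0], n_components))  # random initial concentrations
    for _ in range(n_iter):
        # Fix C, solve D ~ C @ S for the spectra S, then enforce S >= 0.
        S = np.linalg.lstsq(C, D, rcond=None)[0]
        S = np.clip(S, 0.0, None)
        # Fix S, solve D.T ~ S.T @ C.T for the concentrations, enforce C >= 0.
        C = np.linalg.lstsq(S.T, D.T, rcond=None)[0].T
        C = np.clip(C, 0.0, None)
    return C, S

# Synthetic data (hypothetical): three Gaussian bands mixed in 30 samples.
channels = np.arange(100)
centers = [25, 50, 75]
S_true = np.array([np.exp(-0.5 * ((channels - c) / 8.0) ** 2) for c in centers])
C_true = np.random.default_rng(1).random((30, 3))
D = C_true @ S_true

C_est, S_est = mcr_als(D, n_components=3)
rel_error = np.linalg.norm(D - C_est @ S_est) / np.linalg.norm(D)
print(f"relative reconstruction error: {rel_error:.3g}")
```

Plotting each column of `C_est` against the sample index gives the kind of concentration trace the abstract describes for monitoring. The recovered factors are determined only up to the scaling and ordering of the pseudo-components, so they should be compared with reference spectra before any chemical interpretation.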

  • Subjects / Keywords
  • Graduation date
    Fall 2019
  • Type of Item
  • Degree
    Doctor of Philosophy
  • DOI
  • License
    Permission is hereby granted to the University of Alberta Libraries to reproduce single copies of this thesis and to lend or sell such copies for private, scholarly or scientific research purposes only. Where the thesis is converted to, or otherwise made available in digital form, the University of Alberta will advise potential users of the thesis of these terms. The author reserves all other publication and other rights in association with the copyright in the thesis and, except as herein before provided, neither the thesis nor any substantial portion thereof may be printed or otherwise reproduced in any material form whatsoever without the author's prior written permission.