Developing bioinformatics tools for metabolomics

  • Author / Creator
    Xia, Jianguo
  • Abstract
    Metabolomics aims to study all small-molecule compounds (i.e. metabolites) in cells, tissues, or biofluids. These compounds provide a functional readout of the physiological, developmental, and pathological state of a biological system. The field of metabolomics has expanded rapidly over the last few years, with increasing applications to disease diagnosis, drug toxicity screening, nutritional studies, and many other areas of the life sciences. However, significant challenges remain in both collecting and understanding metabolomic data. The central objective of my thesis project is to develop novel bioinformatic tools to address some of the key computational challenges in metabolomic studies. In particular, my research is focused on three areas: (i) compound identification from complex biofluids, (ii) processing and statistical analysis of metabolomic data, and (iii) functional interpretation of metabolomic data. In addressing these issues I have developed a number of efficient and user-friendly software tools, including MetaboMiner, MetaboAnalyst, MSEA and MetPA. Each of these software packages has required the development of novel algorithms, novel interfaces, or the implementation of novel analytical concepts.

    MetaboMiner is a standalone Java application for compound identification from 2D NMR spectra of complex biofluids. Based on a novel adaptive search algorithm and specially constructed spectral libraries, MetaboMiner is able to automatically identify ~80% of metabolites from good-quality NMR spectra.

    MetaboAnalyst is a web-based pipeline for metabolomic data processing, normalization, and statistical analysis. This application is based on a novel framework that combines the statistical and visualization power offered by R with an enhanced graphical user interface enabled by JavaServer Faces technology. It is currently the most comprehensive and popular data analysis web service in metabolomics.

    MSEA, or metabolite set enrichment analysis, represents a novel application of the gene set enrichment analysis technique to metabolomics. In particular, MSEA is a web application for the identification of biologically meaningful patterns through enrichment analysis of quantitative metabolomic data. To create MSEA, I assembled a unique database of ~6300 groups of biologically related metabolites with association data on diseases, pathways, genetic traits, and cellular or organ localization.

    MetPA is a web-based tool for metabolic pathway analysis. It integrates functional enrichment analysis and pathway topology analysis through a novel Google Maps-style network visualization system. MetPA currently supports the analysis of ~1200 KEGG metabolic pathways for 15 model organisms.

  • Subjects / Keywords
  • Graduation date
    Fall 2011
  • Type of Item
  • Degree
    Doctor of Philosophy
  • DOI
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
  • Language
  • Institution
    University of Alberta
  • Degree level
  • Department
  • Supervisor / co-supervisor and their department(s)
  • Examining committee members and their departments
    • Li, Liang (Chemistry)
    • Gallin, Warren (Biological Sciences)
    • Greiner, Russ (Computing Science)
    • Pavlidis, Paul (Psychiatry, University of British Columbia)
    • Deyholos, Michael (Biological Sciences)