CHEMINFORMATICS TOOLS FOR ENABLING METABOLOMICS

  • Author / Creator
    DJOUMBOU FEUNANG, YANNICK
  • Abstract
    Metabolites are small molecules (<1500 Da) that are used in or produced during chemical reactions in cells, tissues, or organs. Upon absorption or biosynthesis in humans (or other organisms), they can be excreted back into the environment either in their original form or as a pool of degradation products. The outcome and effects of such interactions are a function of many variables, including the structure of the starting metabolite and the genetic disposition of the host organism. For these reasons, it is usually very difficult to identify the transformation products as well as their long-term effects in humans and the environment. This can be explained by several factors: (1) the relevant knowledge and data are for the most part not available in a public electronic format; (2) when available, they are often represented using formats, vocabularies, or schemes that vary from one source (or repository) to another. Assuming these issues were solved, detecting patterns that link the metabolome to a specific phenotype (e.g. a disease state) would still require that the metabolites from a biological sample be identified and quantified using metabolomic approaches. Unfortunately, the number of compounds with publicly available experimental data (~20,000) is still very small compared to the total number of expected compounds (up to a few million). For all these reasons, the development of cheminformatics tools for data organization and mapping, as well as for the prediction of biotransformations and spectra, is more crucial than ever. My PhD thesis focused on developing several cheminformatics tools that address these limitations. First, I developed ClassyFire and ChemOnt. ClassyFire is a publicly available software tool and web server that automatically and hierarchically classifies any given molecule based on its structure. It relies partly on ChemOnt, a comprehensive and comprehensible taxonomy that contains >4,800 chemical categories, as well as their textual descriptions and mappings to other ontologies. ClassyFire was used to classify and annotate >80 million compounds. The web server also integrates a text-based search engine. These features make ClassyFire unique in the sphere of publicly available computational tools. ClassyFire and ChemOnt are available at http://classyfire.wishartlab.com. Second, I developed BioTransformer and BioTransformerDB. BioTransformer is a software tool for the prediction of small molecule metabolism in mammals. It uses a hybrid approach that partly relies on BioTransformerDB, a unique database of biotransformations containing experimentally confirmed metabolic reactions that transform >1,000 drugs, pesticides, cosmetics, and food compounds, among others. The current version of BioTransformer, which is available at https://bitbucket.org/djoumbou/biotransformer, focuses on the human species, but is easily expandable to other species. Third, I developed CFM-ID 3.0, an extension of CFM-ID (1.0 and 2.0), originally developed by Felicity Allen et al. CFM-ID 3.0 is a software tool and web server for the prediction and annotation of MS spectra, as well as the identification of metabolites. With the integration of a rule-based fragmentation approach for spectra prediction, the development of new ranking functions, and the expansion of the spectral database, CFM-ID 3.0 showed a significant improvement in speed and accuracy compared to previous versions. CFM-ID 3.0 is currently available as a web server at http://cfmid-staging.wishartlab.com/. ClassyFire, BioTransformer, and CFM-ID have found applications in various fields including chemical information management, metabolomics, and exposomics, among others. Together, they form a cheminformatics platform that can enable metabolomics and contribute to the understanding of our environment as well as the advancement of science.

  • Subjects / Keywords
  • Graduation date
    2017-11:Fall 2017
  • Type of Item
    Thesis
  • Degree
    Doctor of Philosophy
  • DOI
    https://doi.org/10.7939/R3VD6PJ8B
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
  • Language
    English
  • Institution
    University of Alberta
  • Degree level
    Doctoral
  • Department
    • Department of Biological Sciences
  • Specialization
    • MICROBIOLOGY AND BIOTECHNOLOGY
  • Supervisor / co-supervisor and their department(s)
    • WISHART, DAVID S.
    • GALLIN, WARREN
    • GREINER, RUSSELL
  • Examining committee members and their departments
    • GREINER, RUSSELL (COMPUTING SCIENCE)
    • WISHART, DAVID S (BIOLOGICAL SCIENCES, COMPUTING SCIENCE)
    • MATTINGLY, CAROLYN (TOXICOLOGY, NORTH CAROLINA STATE UNIVERSITY)
    • STOTHARD, PAUL (AGRICULTURAL FOOD AND NUTRITIONAL SCIENCE)
    • GALLIN, WARREN (BIOLOGICAL SCIENCES)