

This file is in the following communities:

Graduate Studies and Research, Faculty of


This file is in the following collections:

Theses and Dissertations

Protein Structure Characterization From NMR Chemical Shifts Open Access


Subject/Keyword
NMR Chemical Shifts
Protein Structure Characterization
Machine-learning Algorithms
Secondary and Super-secondary Structure Identification
Accessible Surface Area Estimation
Fold Recognition
Chemical Shift Threading
Type of item
Degree grantor
University of Alberta
Author or creator
Hafsa, Noor E
Supervisor and department
Wishart, David (Computing Science & Biological Sciences)
Examining committee member and department
Lin, Guohui (Computing Science)
Wishart, David (Computing Science & Biological Sciences)
Greiner, Russell (Computing Science)
Spyracopoulos, Leo (Biochemistry)
McIntosh, Lawrence (Biochemistry & Molecular Biology)
Department of Computing Science

Date accepted
Graduation date
2016-06:Fall 2016
Doctor of Philosophy
Degree level
To understand the complex biological functions of proteins, highly detailed, atomic-resolution protein structures are needed. Experimental methods such as X-ray crystallography and NMR spectroscopy provide standard platforms for determining the atomic-resolution structures of proteins. However, a continuing bottleneck in conventional NOE-based NMR structure determination lies in the difficulty of measuring NOEs for medium-to-large proteins, along with the resulting time costs and the corresponding reduction in structure accuracy and precision. This has led to an increased interest in using other easily identifiable NMR parameters, such as chemical shifts, to facilitate protein structure determination by NMR. Chemical shifts, often considered the mileposts of NMR, have long been used to decipher the structures of small molecules. However, chemical shifts are much less frequently utilized for structural interpretation of larger macromolecules such as peptides and proteins. Most existing macromolecular methods use chemical shifts and various heuristic, rule-based algorithms to identify and determine a small number of structural parameters (such as secondary structure). Other methods, such as CS-Rosetta and CS23D, which attempt to determine 3D structures from chemical shifts alone, are only modestly successful (~50% success). So while good progress has been made, I believe that there is still substantial room for improvement and that the “Shift-to-Structure” problem has not yet been fully solved. My PhD project involves investigating innovative computational and machine-learning approaches to develop chemical shift-based prediction models that determine protein structures with high efficiency and high accuracy (>90%).
More specifically, my thesis consists of three major components: a) shift-based local protein structure prediction; b) prediction of protein local/non-local interactions from sequence and chemical shifts; and c) tertiary fold recognition from chemical shifts. Towards that goal, I have developed several chemical shift-based prediction models that exploit advanced computational and machine-learning algorithms. In particular, I developed a) CSI 2.0, a multi-class prediction method for protein local structure prediction from chemical shift data; b) CSI 3.0, a computational model that identifies detailed local structure and structural motifs in proteins using chemical shift data; c) ShiftASA, a boosted tree regression model for predicting accessible surface area from chemical shifts; and d) E-Thrifty, a protein fold recognition method that performs chemical shift threading to identify and generate the most probable fold or 3D structure that a query protein may have. Validation of these proposed methods was performed using several independent test sets, and the results indicate substantial improvements over other state-of-the-art methods. Given their superior performance, I believe that these methods will be useful contributions to the field of NMR-based protein structure determination and will be fundamental to the development of 3D structure determination protocols that use only chemical shift data.
This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for the purpose of private, scholarly or scientific research. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
Citation for previous publication
Hafsa, N. E., & Wishart, D. S. (2014). “CSI 2.0: a significantly improved version of the Chemical Shift Index”. Journal of Biomolecular NMR, 60(2-3), 131-146.
Hafsa, N. E., Arndt, D., & Wishart, D. S. (2015). “CSI 3.0: a web server for identifying secondary and super-secondary structure in proteins using NMR chemical shifts”. Nucleic Acids Research, 43, W370-377.
Hafsa, N. E., Arndt, D., & Wishart, D. S. (2015). “Accessible surface area from NMR chemical shifts”. Journal of Biomolecular NMR, 62(3), 387-401.

File Details

File format: pdf (PDF/A)
Mime type: application/pdf
File size: 7501154
Last modified: 2016:11:16 15:24:30-07:00
Filename: Hafsa_Noor_E_201609_PhD.pdf
Original checksum: 31fb19e2613c1b61c236ad1ca0556c08