Large-scale Characterization Of Intrinsic Disorder And High-throughput Prediction Of RNA, DNA and Protein Binding Mediated By Intrinsic Disorder

  • Author / Creator
    Peng, Zhenling
  • Intrinsically disordered proteins lack stable 3D structures in vivo, are functionally important, and are very common in nature. Over the past three decades, many studies have focused on predicting intrinsic disorder from protein sequence, estimating its abundance, and analyzing its functional roles. However, these studies were limited in scope; for example, they focused on only one of many functional and structural aspects. We performed a first-of-its-kind comprehensive and detailed analysis of the abundance, functional roles, and cellular localizations of intrinsic disorder in complete proteomes. We show that intrinsic disorder is abundant across all kingdoms of life including viruses, is involved in crucial cellular processes such as translation, transcription, metabolism, regulation, and signaling, and is preferentially located in the ribosome and nucleus. We also mapped intrinsic disorder into eukaryotic, bacterial, and archaeal cells. These observations motivated us to further analyze two protein families: ribosomal proteins and proteins involved in programmed cell death. We performed an analysis across multiple species, which shows that intrinsic disorder is enriched in ribosomal and cell death proteins and performs a variety of important cellular functions in them. These two studies reveal that intrinsic disorder is involved in interactions between proteins, RNA, and DNA. The prediction and characterization of these interactions for ordered proteins (i.e., proteins with stable 3D structures in vivo) has recently attracted significant attention. However, there are no methods that target these functions and interactions when they are mediated by intrinsic disorder. Development of such methods is now possible using the curated functional annotations of intrinsic disorder from the DisProt database.
Utilizing these data, we developed the first computational prediction method, DisoRDPbind, that predicts protein-protein, protein-RNA, and protein-DNA interactions mediated by intrinsic disorder. Our method uses a logistic regression algorithm and a custom-designed, empirically selected set of descriptors of the input protein sequence. Empirical assessment using two benchmark datasets and large-scale predictions on four eukaryotic proteomes suggests that DisoRDPbind provides good predictive quality, differs from methods that focus on predictions for ordered proteins, and is computationally efficient enough to annotate these interactions in whole proteomes.
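
The general idea of per-residue logistic regression over sequence-derived descriptors can be illustrated with a minimal sketch. This is not the DisoRDPbind implementation: the two window descriptors (fractions of charged and hydrophobic residues), the window size, the training routine, and the toy sequence and labels are all hypothetical and chosen only to keep the example self-contained.

```python
import math

# Hypothetical residue groups used for the illustrative descriptors only;
# the descriptors actually selected for DisoRDPbind are not given in the abstract.
CHARGED = set("DEKR")
HYDROPHOBIC = set("AILMFVWY")

def window_features(seq, i, half=3):
    """Descriptors for residue i: bias term plus the fractions of charged
    and hydrophobic residues in a window of up to 2*half+1 residues."""
    window = seq[max(0, i - half): i + half + 1]
    n = len(window)
    charged = sum(aa in CHARGED for aa in window) / n
    hydrophobic = sum(aa in HYDROPHOBIC for aa in window) / n
    return [1.0, charged, hydrophobic]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, labels, lr=0.5, epochs=500):
    """Fit logistic regression weights by stochastic gradient ascent
    on the log-likelihood."""
    w = [0.0] * len(samples[0])
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            for j, xj in enumerate(x):
                w[j] += lr * (y - p) * xj
    return w

def predict(seq, w, half=3):
    """Return one score in (0, 1) per residue of seq."""
    return [
        sigmoid(sum(wi * xi for wi, xi in zip(w, window_features(seq, i, half))))
        for i in range(len(seq))
    ]

# Toy data (made up): the first 8 residues are labelled as binding (1),
# the last 8 as non-binding (0).
toy_seq = "KRKRKRKRAAAAAAAA"
toy_labels = [1] * 8 + [0] * 8
toy_samples = [window_features(toy_seq, i) for i in range(len(toy_seq))]

weights = train(toy_samples, toy_labels)
scores = predict(toy_seq, weights)
```

Because each residue is scored with a fixed-size window and a single dot product, the cost of prediction grows linearly with sequence length, which is the kind of property that makes proteome-scale annotation feasible.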

  • Subjects / Keywords
  • Graduation date
    Fall 2014
  • Type of Item
  • Degree
    Doctor of Philosophy
  • DOI
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
  • Language
  • Institution
    University of Alberta
  • Degree level
  • Department
  • Specialization
    • Software Engineering and Intelligent Systems
  • Supervisor / co-supervisor and their department(s)
  • Examining committee members and their departments
    • Han, Jie (Electrical and Computer Engineering)
    • Babu, M. Madan (MRC Laboratory of Molecular Biology)
    • Kurgan, Lukasz (Electrical and Computer Engineering)
    • Dick, Scott (Electrical and Computer Engineering)
    • Reformat, Marek (Electrical and Computer Engineering)