This file is in the following communities:

Graduate Studies and Research, Faculty of


This file is in the following collections:

Theses and Dissertations

Computational support systems for prediction and characterization of protein crystallization outcomes Open Access


Other title
machine learning
statistical analysis
protein structure
X-ray crystallography
structural coverage
3D structure
Type of item
Degree grantor
University of Alberta
Author or creator
Mizianty, Marcin J
Supervisor and department
Kurgan, Lukasz (Electrical and Computer Engineering)
Examining committee member and department
Zemp, Roger (Electrical and Computer Engineering)
Godzik, Adam (Sanford-Burnham Institute, La Jolla, CA)
Michalak, Marek (Biochemistry)
Reformat, Marek (Electrical and Computer Engineering)
Dick, Scott (Electrical and Computer Engineering)
Kurgan, Lukasz (Electrical and Computer Engineering)
Department of Electrical and Computer Engineering
Software Engineering and Intelligent Systems
Date accepted
Graduation date
Doctor of Philosophy
Degree level
Analysis of protein structures may reveal their function, regulation and interactions. Almost 90% of the known protein structures were solved using X-ray crystallography; however, many more structures remain unsolved. The Protein Structure Initiative (PSI) project was created to speed up structure determination. PSI includes structural genomics (SG) centers that perform high-throughput crystallization, processing hundreds of proteins with standardized protocols. The large quantities of crystallization data generated by PSI fueled research into the protein properties associated with successful crystallization. In spite of this intense research, crystallization of proteins is still among the most complex and least understood problems in structural biology. Since SG centers do not focus on individual proteins, but rather on covering the protein structure space, they have some flexibility in the selection of targets. At the beginning of my PhD program we designed and assessed three accurate methods that predict crystallization propensity from the protein sequence. These methods could be used to prioritize targets based on their predicted propensity for successful structure determination. We observed that as crystallization protocols are updated, the predictors of crystallization propensity need to be correspondingly upgraded and enhanced. To this end, in the course of the thesis we developed an accurate predictor that generates crystallization propensity and indicates the causes of potential crystallization failure, which can occur at any of the three major steps in the protein crystallization protocol: production of protein material, purification, and production of crystals. Our predictors are empirically compared against the state of the art in the field and demonstrate favorable predictive performance.
Finally, we designed another accurate and runtime-efficient method, which we then used to perform a first-of-its-kind large-scale analysis of crystallization propensity for proteins encoded in 1,953 fully sequenced genomes. Analysis of these predictions shows that current X-ray crystallography combined with homology modeling could provide an average per-proteome structural coverage of 73%, with over 60% coverage for archaeal and bacterial proteomes and between 35 and 70% for eukaryotes. Moreover, our study revealed that the use of knowledge-based target selection increases coverage by a significant margin, which for the majority of organisms is between 25 and 40%.
Permission is hereby granted to the University of Alberta Libraries to reproduce single copies of this thesis and to lend or sell such copies for private, scholarly or scientific research purposes only. Where the thesis is converted to, or otherwise made available in digital form, the University of Alberta will advise potential users of the thesis of these terms. The author reserves all other publication and other rights in association with the copyright in the thesis and, except as herein before provided, neither the thesis nor any substantial portion thereof may be printed or otherwise reproduced in any material form whatsoever without the author's prior written permission.
Citation for previous publication
Kurgan, L.A., Razib, A.A., Aghakhani, S., Dick, S., Mizianty, M.J. & Jahandideh, S. (2009). CRYSTALP2: sequence-based protein crystallization propensity prediction. BMC Structural Biology 9 p. 50.
Kurgan, L.A. & Mizianty, M.J. (2009). Sequence-Based Protein Crystallization Propensity Prediction for Structural Genomics: Review and Comparative Analysis. Natural Science 1 (2) pp. 93–106.
Mizianty, M.J. & Kurgan, L.A. (2009). Meta prediction of protein crystallization propensity. Biochemical and Biophysical Research Communications 390 (1) pp. 10–15.
Mizianty, M.J. & Kurgan, L.A. (2011). Sequence-based prediction of protein crystallization, purification and production propensity. Bioinformatics 27 (13) pp. i24–i33.
Mizianty, M.J. & Kurgan, L.A. (2012). CRYSpred: Accurate Sequence-Based Protein Crystallization Propensity Prediction Using Sequence-Derived Structural Characteristics. Protein and Peptide Letters 19 (1) pp. 40–49.

File Details

File format: pdf (PDF/A)
Mime type: application/pdf
File size: 3595379 bytes
Last modified: 2015-10-12 15:20:55-06:00
Filename: Mizianty_Marcin_Fall 2013.pdf
Original checksum: e0f9195075bd3caa36286350b605c3a4