• Author / Creator
    Mirzazadeh, Farzaneh
  • Radiotherapy is often used to treat prostate cancer. While a high dose of radiation does kill cancer cells, it can cause toxicity in healthy tissues for some patients. It would be best to apply this treatment only to patients who are unlikely to suffer such toxicity. This requires a classifier that can predict, before treatment, which patients are likely to exhibit severe toxicity. Here, we explore ways to use certain genetic features, called Single Nucleotide Polymorphisms (SNPs), for this task. This thesis applies several machine learning methods to learn such classifiers. The problem is challenging because there are a large number of features (164,273 SNPs) but only 82 samples. We explore an ensemble classification method for this problem, called Mixture Using Variance (MUV), which first learns several different base probabilistic classifiers and then, for each query, combines the responses of the base classifiers based on their respective variances. The original MUV learns the individual classifiers using bootstrap sampling of the training data; we modify this by considering a different subset of the features for each classifier. We derive a new combination rule for base classifiers in the proposed setting and obtain some new theoretical results. Based on characteristics of our task, we propose an approach that first clusters the features and then selects only a subset of features from each cluster for each base classifier. Unfortunately, we were unable to predict radiation toxicity in prostate cancer patients using just the SNP values. However, our further experimental results reveal a strong relation between the correctness of a classifier's prediction and the variance of its response to the corresponding classification query, which suggests that the main idea is promising.
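
To illustrate the idea of combining base classifiers according to their variances, here is a minimal Python sketch. It assumes plain inverse-variance weighting, a standard technique; the abstract does not state the thesis's actual combination rule, so that rule may well differ. The function name `combine_by_variance`, the `eps` guard, and the example numbers are all hypothetical.

```python
def combine_by_variance(responses, eps=1e-12):
    """Combine (probability, variance) pairs by inverse-variance weighting.

    Assumption: this standard weighting stands in for the MUV combination
    rule, which the abstract does not spell out.
    """
    # A lower variance means a more confident response, so it gets more weight.
    # eps (hypothetical guard) avoids division by zero for a zero variance.
    weights = [1.0 / (var + eps) for _, var in responses]
    total = sum(weights)
    return sum(w * p for w, (p, _) in zip(weights, responses)) / total


# Made-up responses from three base classifiers for a single query:
# (estimated probability of severe toxicity, variance of that estimate).
responses = [(0.9, 0.01), (0.4, 0.09), (0.5, 0.09)]
combined = combine_by_variance(responses)
```

With these made-up inputs the combined estimate is about 0.82, much closer to the low-variance classifier's 0.9 than the unweighted mean of 0.6 would be.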

  • Subjects / Keywords
  • Graduation date
    Spring 2010
  • Type of Item
  • Degree
    Master of Science
  • DOI
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.