The Budgeted Biomarker Discovery Problem
-
- Author / Creator
- Khan, Sheehan Veikko
-
Researchers conduct association studies to discover biomarkers in order to gain new biological insight into complex diseases and phenotypes. Although most researchers have intuitions about what defines a biomarker and how to assess the results of an association study, there is neither a formal definition of what a biomarker is nor an objective goal for association studies. As a result, the literature is full of association studies with conflicting results, e.g., studies on the same phenotype that produce lists of biomarkers with little to no overlap.

This thesis presents the “Budgeted Biomarker Discovery (BBD) problem”, which clearly defines (1) what a biomarker is, and (2) rewards for correctly identifying biomarkers and penalties for incorrectly identifying them. Furthermore, the BBD problem allows researchers to use a mixture of high- and low-throughput technologies. In the context of discovering biomarkers from gene expression data, we show how future association studies can use both microarray and qPCR data to objectively find the genes that are biomarkers in a cost-efficient manner.

We present several algorithms for solving the BBD problem, and show that good algorithms must make use of both microarrays and qPCR, and must be able to adapt to the data as it is collected. For example, when solving a new BBD problem, we must begin by collecting microarrays because we do not yet know how many biomarkers we expect to identify, or which qPCR arrays would be most informative. Thus, we use the high-throughput microarrays to survey the problem until we can identify which specific low-throughput qPCR arrays to use for focusing on those genes that are potentially biomarkers. To identify when this transition should occur, we present the problem of estimating the density of univariate statistics in high-throughput data, and we present our Fused Density Estimation (FDE) algorithm as a solution. We use FDE as the backbone of our adaptive algorithms for solving BBD problems. In a series of experiments on real microarray data and realistic synthetic data, we show that our BBD1 algorithm is the most robust solution to the BBD problem among those considered.
-
- Subjects / Keywords
-
- Graduation date
- Fall 2015
-
- Type of Item
- Thesis
-
- Degree
- Doctor of Philosophy
-
- License
- This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.