Identifying expression quantitative trait loci in genome wide association studies

  • Author / Creator
    Moradi, Fahimeh
  • Introduction: Genome-wide association studies (GWAS) have been widely used in recent years to identify genetic variants associated with complex traits in many diseases. Advances in identifying single nucleotide polymorphisms (SNPs) have facilitated the study of the etiologies of common disorders, including inflammatory bowel disease (IBD) and cancers such as colorectal cancer. However, the known SNPs are not sufficient to explain the heritability of these traits. Variation in gene expression shows that the transcript levels of many RNAs behave as heritable quantitative traits, so studying the genetics of gene expression can provide additional power to interpret the roles of GWAS variants. Expression quantitative trait loci (eQTL) mapping links genome-wide SNPs with RNA expression. Objectives: The objective of this thesis is to identify an efficient, statistically sound and user-friendly method for the analysis of eQTL studies. Methods: We performed eQTL analysis using the Matrix eQTL R package, which implements matrix covariance calculation and efficiently runs linear regression analysis. The statistical test assesses the association between each SNP and gene expression, under the null hypothesis of no association between genotype and phenotype. In eQTL mapping, regulatory variants are classified as cis or trans according to the physical distance between the variant and the gene whose transcript it is tested against. A fixed genomic distance (e.g. 1 Mb) is defined as the maximum distance at which a cis regulatory variant can be located from the gene it regulates; variants beyond that distance are classified as trans. The false discovery rate (FDR) is used to correct for multiple testing when identifying significant cis and trans eQTLs. Results: We applied Matrix eQTL to a real data set consisting of 730,256 SNPs and 33,298 RNAs for 173 samples. SNPs with a minor allele frequency (MAF) less than 0.05 and those violating Hardy-Weinberg equilibrium (HWE) were excluded from the study. We identified 15,408 cis eQTLs and 27,562 trans eQTLs at an FDR less than 0.05, corresponding to p-value thresholds of 8e-5 and 1e-8, respectively. Conclusion: We found that Matrix eQTL is a computationally efficient and user-friendly method for the analysis of eQTL studies. The results provide insight into the genomic architecture of gene regulation in inflammatory bowel disease (IBD).
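
    The three steps named in the Methods (a per-pair regression test, the cis/trans distance rule, and FDR correction) can be illustrated with a minimal Python sketch. This is not the Matrix eQTL implementation, which is an R package: the function names, the gene-boundary distance rule, and the choice of the Benjamini-Hochberg procedure for FDR are assumptions made here for illustration only. The t statistic uses the standard identity for simple linear regression, t = r * sqrt((n - 2) / (1 - r^2)), where r is the Pearson correlation between genotype and expression.

    ```python
    import math

    CIS_WINDOW_BP = 1_000_000  # 1 Mb, the example distance given in the abstract


    def classify_pair(snp_chrom, snp_pos, gene_chrom, gene_start, gene_end,
                      window=CIS_WINDOW_BP):
        """Label a SNP-gene pair 'cis' if the SNP lies within `window` bp of the gene.

        Assumption: distance is measured to the nearest gene boundary, and
        pairs on different chromosomes are always 'trans'.
        """
        if snp_chrom != gene_chrom:
            return "trans"
        if snp_pos < gene_start:
            distance = gene_start - snp_pos
        elif snp_pos > gene_end:
            distance = snp_pos - gene_end
        else:
            distance = 0  # SNP lies inside the gene
        return "cis" if distance <= window else "trans"


    def slope_t_statistic(genotypes, expression):
        """t statistic for the slope of expression ~ genotype (one SNP, one gene).

        Undefined (division by zero) if either variable is constant or the
        correlation is exactly +1 or -1.
        """
        n = len(genotypes)
        mean_g = sum(genotypes) / n
        mean_e = sum(expression) / n
        sxy = sum((g - mean_g) * (e - mean_e)
                  for g, e in zip(genotypes, expression))
        sxx = sum((g - mean_g) ** 2 for g in genotypes)
        syy = sum((e - mean_e) ** 2 for e in expression)
        r = sxy / math.sqrt(sxx * syy)  # Pearson correlation
        return r * math.sqrt((n - 2) / (1 - r * r))


    def benjamini_hochberg(p_values):
        """Return Benjamini-Hochberg adjusted p-values in the input order."""
        m = len(p_values)
        order = sorted(range(m), key=lambda i: p_values[i])
        adjusted = [0.0] * m
        running_min = 1.0
        # Walk from the largest p-value down, enforcing monotonicity.
        for rank in range(m, 0, -1):
            i = order[rank - 1]
            running_min = min(running_min, p_values[i] * m / rank)
            adjusted[i] = running_min
        return adjusted
    ```

    In such a sketch, a pair would be reported as a significant eQTL when its adjusted p-value falls below the chosen FDR level (0.05 in this study), with cis and trans pairs corrected separately, since the two sets involve very different numbers of tests.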

  • Subjects / Keywords
  • Graduation date
    Spring 2017
  • Type of Item
  • Degree
    Master of Science
  • DOI
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
  • Language
  • Institution
    University of Alberta
  • Degree level
  • Department
  • Specialization
    • Epidemiology
  • Supervisor / co-supervisor and their department(s)
  • Examining committee members and their departments
    • Jiang, Bei (Mathematical and Statistical Sciences)
    • Menon, Devidas (School of Public Health)
    • Midodzi, William K (Department of Medicine, Memorial University)