Identification and Characterization of a Novel Premenopausal Breast Cancer Locus and Insights into Copy Number Variations for Disease Predisposition and Prognosis

  • Author / Creator
    Kumaran, Mahalakshmi
  • Breast cancer is a complex multifactorial disease in which genetic, environmental and lifestyle factors jointly contribute to disease risk. Twin studies estimate that ~30% of the risk is due to genetic factors. High- and moderate-penetrance mutations, together with low-penetrance variants, account for only a proportion of the total heritable risk; the remaining heritability is yet to be accounted for.
    My thesis is based on genome-wide analysis of both single nucleotide polymorphisms (SNPs) and copy number variations (CNVs) as genetic determinants of breast cancer risk.
    (i) Characterization of the SNP rs1429142 conferring premenopausal breast cancer risk
    I focused on the SNP rs1429142 (chromosome locus 4q31.22), which is associated with premenopausal breast cancer risk and was first reported in the literature by the Damaraju laboratory (Stages 1-3). In the current study, additional cases were genotyped (Stage 4). In the analysis of the combined samples (Stages 1-4; 4331 cases/4271 controls), the index SNP showed genome-wide significance (odds ratio (OR) 1.25, p-value 4.35x10^-8). Analysis of rs1429142 showed elevated risk in premenopausal women (n=1503 cases/4271 controls; OR 1.40, p-value 5.81x10^-10). Postmenopausal Caucasian women (n=2700 cases/4271 controls) showed modest risk (OR 1.17; p-value 7.81x10^-4), and this finding was confirmed in the postmenopausal cohort from the Cancer Genetic Markers of Susceptibility study (CGEMS, USA). SNP rs1429142 also showed an association among premenopausal women with African ancestry (OR for the minor allele 0.82; p-value 1.45x10^-2). Since the index SNP, rs1429142, lies in an intergenic region, I carried out fine-scale mapping of the locus 4q31.22, which revealed 135 SNPs associated with premenopausal risk. Conditional regression analysis did not reveal any additional peaks of association. Likelihood ratio analysis excluded five variants that were less likely to be causal than the most strongly associated SNP. I further refined the putative loci (130 SNPs) by linkage disequilibrium (LD) block mapping and compared LD patterns for Caucasian and African populations (HapMap data).
    I examined active enhancer functions based on chromatin state (histone marks, DNase hypersensitive sites) in human breast cell lines (HMEC, vHMEC) and breast myoepithelial primary cells using data from publicly available resources. I found evidence for the binding of transcription factors (C-FOS, STAT1/3, POL2/3) at SNP sites in the human breast cell line MCF10A-Er-Src. Three SNPs (rs1366691, rs1429139, rs7667633) were identified as potentially causal and appeared to be part of a predicted Topologically Associated Domain (TAD), which helps to explain short-range interactions and enhancer-promoter cross-talk.
    (ii) CNV association studies
    I studied CNVs, which are larger (>50 bp and up to 1 Mb) than the single-base changes of SNPs. CNVs harbor both coding and non-coding genes and may exert gene-dosage effects or regulatory functions. Whole-genome CNVs were captured in 422 cases and 348 controls using the Human Affymetrix SNP 6 array platform (discovery dataset). Whole-genome copy number estimation was performed, and CNVs with frequencies > 10% that overlapped protein-coding genes were considered further. Association analysis revealed a total of 200 CNVs or contiguous CNV regions (CNVRs) associated with breast cancer risk (q-value < 0.05).
    I investigated whether any of the breast cancer-associated CNVs show prognostic relevance, since SNP GWAS attempts to identify prognostic markers have thus far been unsuccessful. Among the 200 associated CNVs/CNVRs, 21 CNVRs (overlapping with 22 genes) showed association with overall survival (OS) and recurrence-free survival (RFS). CNVs were interrogated for gene-dosage effects by correlating copy number status with breast tumor tissue gene expression. I also interrogated the role of germline CNVs harboring small non-coding RNAs in conferring breast cancer risk. Further, I investigated the breast tissue-specific expression of CNV-embedded small non-coding RNAs (CNV-sncRNAs) to understand post-transcriptional gene regulatory mechanisms and how they might contribute to breast cancer. I used 495 samples (Affymetrix 6 array data) available in the TCGA as my validation set and identified 1812 breast cancer-associated CNVs harboring miRNA (n=38), piRNA (n=9865), snoRNA (n=71) and tRNA (n=12) genes. A subset of CNV-sncRNAs expressed in breast tissue (tumor and normal) in the TCGA dataset also showed correlation with germline copy numbers.
    In summary, I have fine-mapped a premenopausal breast cancer locus and identified potential causal variants that are predicted to have enhancer functions. Germline CNVs are also useful markers for breast cancer susceptibility and prognosis.
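
    For readers unfamiliar with the statistics quoted in the abstract, the following is a minimal illustrative sketch of how an allelic odds ratio and multiple-testing-adjusted q-values can be computed. It is not the thesis's actual pipeline: the function names and the toy allele counts are hypothetical, the odds ratio is taken from a simple 2x2 allele-count table with a Wald confidence interval (the thesis may well have used regression models), and the q-values are assumed here to be Benjamini-Hochberg adjusted p-values, which the abstract does not specify.

```python
import math


def odds_ratio_wald_ci(case_minor, case_major, ctrl_minor, ctrl_major, z=1.96):
    """Allelic odds ratio and Wald confidence interval from a 2x2 table.

    Hypothetical helper: counts are allele counts, all assumed > 0.
    z=1.96 gives an approximate 95% interval.
    """
    odds_ratio = (case_minor * ctrl_major) / (case_major * ctrl_minor)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(
        1 / case_minor + 1 / case_major + 1 / ctrl_minor + 1 / ctrl_major
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Assumption: 'q-value' is read as the BH-adjusted p-value.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Toy allele counts, invented for illustration only.
    print(odds_ratio_wald_ci(30, 70, 20, 80))
    # Toy p-values, invented for illustration only.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

    Under the Benjamini-Hochberg assumption, regions whose adjusted value falls below 0.05 would be the ones reported as associated at q-value < 0.05.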

  • Graduation date
    Fall 2018
  • Type of Item
    Thesis
  • Degree
    Doctor of Philosophy
  • DOI
    https://doi.org/10.7939/R32Z1355D
  • License
    Permission is hereby granted to the University of Alberta Libraries to reproduce single copies of this thesis and to lend or sell such copies for private, scholarly or scientific research purposes only. Where the thesis is converted to, or otherwise made available in digital form, the University of Alberta will advise potential users of the thesis of these terms. The author reserves all other publication and other rights in association with the copyright in the thesis and, except as herein before provided, neither the thesis nor any substantial portion thereof may be printed or otherwise reproduced in any material form whatsoever without the author's prior written permission.