Regions of linkage disequilibrium (LD) were defined for the three loci tagged by SNPs reaching genome-wide significance. rs17599026, associated with urinary frequency, tags a 106kb region of LD (base position 137,657,783–137,763,798; Fig. 2A) containing part of KDM3B (the promoter region through exon 20), the upstream FAM53C, and part of the upstream CDC25C (the promoter region through exon 6). rs7720298, associated with decreased urine stream, tags a 39kb region of LD (base position 13,858,328–13,897,362; Fig. 2B) that contains exons 16 through 30 of DNAH5. rs11230328 was not in strong LD with any other common SNPs found in the more recent release of the 1000 Genomes population data, and this locus may represent a spurious association. rs11230328 lies within a LINE element and may be difficult to map in the genome because it sits in a region of high homology. Imputation coverage, defined as the proportion of common SNPs (MAF≥0.05) in the 1000 Genomes European population that were successfully imputed in the study datasets, was high within the regions tagged (correlation r²≥0.5) by rs17599026 (173/183, 94.5%) and rs7720298 (130/136, 95.6%).
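The region sizes and imputation coverage quoted above follow directly from the reported base positions and SNP counts. The short Python sketch below reproduces that arithmetic; the dictionary layout and the function names (`region_span_kb`, `imputation_coverage`) are illustrative assumptions and are not part of the study's analysis pipeline.

```python
# Figures are copied from the text above; names and layout are illustrative only.
regions = {
    "rs17599026": {"start": 137_657_783, "end": 137_763_798, "imputed": 173, "common": 183},
    "rs7720298": {"start": 13_858_328, "end": 13_897_362, "imputed": 130, "common": 136},
}


def region_span_kb(start: int, end: int) -> float:
    """Length of the LD region in kilobases, from its reported base positions."""
    return (end - start) / 1000


def imputation_coverage(imputed: int, common: int) -> float:
    """Percentage of common SNPs in the region that were successfully imputed."""
    return 100 * imputed / common


summary = {}
for snp, r in regions.items():
    span = round(region_span_kb(r["start"], r["end"]))
    coverage = round(imputation_coverage(r["imputed"], r["common"]), 1)
    summary[snp] = (span, coverage)
    print(f"{snp}: ~{span} kb LD region, {coverage}% imputation coverage")
```

Running the sketch gives approximately 106 kb and 94.5% for rs17599026, and 39 kb and 95.6% for rs7720298, matching the values reported in the text.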
SNPs rs17599026 and rs7720298 are located in non-coding regions, as is common in GWAS, and the LD blocks tagged by these SNPs cover both coding and non-coding regions. rs17599026, associated with urinary frequency, lies in an intronic region 23bp downstream of exon 20 of KDM3B (MIM 609373; NM_016604.3:c.4753+23C>T), which encodes the lysine-specific demethylase 3B protein. KDM3B is highly expressed in testes in tissue data from the Human Protein Atlas project (Uhlen et al., 2015), suggesting that this gene could be involved in normal bladder function and potentially in dysfunction following damage from radiation exposure. rs17599026 itself does not lie in a site of known transcription factor binding or chromatin modification from the Encyclopedia of DNA Elements (ENCODE) catalog (ENCODE Project Consortium, 2012), and there are no significant expression quantitative trait loci (eQTLs) for this SNP in the Genotype-Tissue Expression (GTEx) project (GTEx Consortium, 2013). Given that rs17599026 is very close to exon 20, it could have an effect on splicing. However, no significant splicing motif alteration was detected using Human Splicing Finder (Desmet et al., 2009). ENCODE data show that the large LD block tagged by this SNP contains multiple transcription factor binding sites, DNase hypersensitive sites, histone methylation sites (tri-methylation of lysine 4 on histone 3, H3K4Me3, and mono-methylation of lysine 4 on histone 3, H3K4Me1), and histone acetylation sites (acetylation of lysine 27 on histone 3, H3K27Ac) that may affect regulation of the nearby genes. rs7720298, associated with decreased urine stream, lies in an intronic region just downstream of exon 30 of DNAH5 (MIM 603335; NM_001369.2:c.4950+1233G>C), which encodes the dynein, axonemal, heavy chain 5 protein that is part of a microtubule-associated motor protein complex.
Rare mutations in DNAH5 can result in the development of abnormal cilia and flagella, leading to primary ciliary dyskinesia, a disorder characterized in part by chronic respiratory tract infections (Escudier et al., 2009). In addition to playing an important role in the lung, DNAH5 is expressed in both kidney and bladder tissue, suggesting a biologic role in normal function of the urinary tract (Uhlen et al., 2015). This SNP does not lie in a site of known transcription factor binding or chromatin modification from the ENCODE catalog, nor is it an eQTL based on data from GTEx, but the region of LD tagged by rs7720298 contains several transcription factor binding sites and sites of DNase hypersensitivity measured in ENCODE cell lines, suggesting that it may tag a site of transcriptional regulation.
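The transcript-level coordinates given for the two intronic SNPs use HGVS-style coding DNA notation, in which `c.4753+23` denotes a position 23 bases into the intron after coding position 4753 (the last base of the preceding exon). As a minimal sketch of how such a description can be unpacked, the Python snippet below parses intronic substitutions only; the regular expression and the helper name `parse_intronic` are simplifying assumptions made for illustration and do not cover the full HGVS nomenclature.

```python
import re

# Simplified, illustrative pattern: <accession>:c.<exon position><+/-offset><ref>><alt>
# It handles intronic single-base substitutions only, not HGVS in general.
HGVS_INTRONIC = re.compile(
    r"^(?P<transcript>[A-Z]{2}_\d+\.\d+):c\."
    r"(?P<anchor>\d+)(?P<offset>[+-]\d+)"
    r"(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def parse_intronic(hgvs: str) -> dict:
    """Split an intronic substitution into transcript, exon anchor, offset, and alleles."""
    match = HGVS_INTRONIC.match(hgvs)
    if match is None:
        raise ValueError(f"not a simple intronic substitution: {hgvs!r}")
    return {
        "transcript": match["transcript"],
        "anchor": int(match["anchor"]),   # coding position of the nearest exon base
        "offset": int(match["offset"]),   # signed distance into the intron, in bases
        "ref": match["ref"],
        "alt": match["alt"],
    }


for description in ("NM_016604.3:c.4753+23C>T", "NM_001369.2:c.4950+1233G>C"):
    parsed = parse_intronic(description)
    print(parsed["transcript"], parsed["anchor"], parsed["offset"], parsed["ref"], parsed["alt"])
```

Under this reading, rs17599026 sits 23 bases past the end of its neighboring KDM3B exon, while rs7720298 sits 1233 bases into the DNAH5 intron, which is why only the former was considered a candidate for an effect on splicing.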
Discussion & Conclusions
This meta-analysis aimed to identify SNPs associated with late radiotherapy toxicity in a single tumor site. With a total sample size of >1500 men with prostate cancer in a single-stage analysis (versus ~600 in published studies), the study identified two risk loci. This study had ≥99% power to identify common SNPs (MAFs ≥10%) that confer a relatively large increased risk for developing late toxicity (OR ≥2.0). Identification of multiple loci in this single-stage meta-analysis, versus single loci in published GWAS that used a staged approach, is consistent with the increased power. The meta-analysis showed that heterogeneity in radiotherapy datasets is not a barrier for future multi-cohort radiogenomic studies. The absence of significant SNPs associated with rectal bleeding is consistent with the relatively limited statistical power. Most common variants identified via GWAS have more modest effects (ORs 1.15 to 1.5) than this study was powered to detect. Our group previously reported an excess of associations at the p<5×10 level for rectal bleeding, suggesting that many SNPs should be identified as sample sizes increase (Barnett et al., 2014). In addition, our current approach of dichotomizing toxicity at grade 0 versus grade 1 or worse and considering two years of follow-up may not be optimal for rectal bleeding; this could be explored as our radiogenomic cohorts increase.