Bioinformatic Method to Define Epigenetically Regulated Enhancer Elements Associated with Cancer
Sabedot TS, Cassel SH, Gao GF, Lareau CA, Cherniack A, Lazar A, Kadoch C, and Noushmehr H. Bioinformatic Method to Define Epigenetically Regulated Enhancer Elements Associated with Cancer. Cancer Res 2019; 79(13).
BACKGROUND: Several mechanisms involved in gene regulation are altered in cancer. Cataloging these alterations can lead to a better understanding of tumorigenesis. In addition, the alterations can be used to classify patients with similar clinical features and thereby lead to better targeted treatment. Epigenetic mechanisms such as DNA methylation are one way in which cells regulate gene expression, and aberrant DNA methylation patterns have been observed in many cancer types. Alterations in non-promoter (intergenic) regions have been shown to be tightly associated with functional genomic elements such as enhancers or transcription factor binding sites. In order to identify altered candidate functional elements associated with specific genes or pathways, we developed a method to integrate enhancer, DNA methylation and gene expression data, using tumor and non-tumor data. METHOD: Using an epigenome-wide platform (Illumina 850K), CpG probes were separated into promoter and intergenic regions. The intergenic CpGs were further filtered by overlap with a database of known functional enhancers compiled from multiple studies. The genes nearest to each enhancer CpG are further stratified based on differential gene expression. For each CpG/gene pair, every sample is classified as methylated or unmethylated using a 50% methylation cutoff. The mean expression of the methylated samples is calculated, and pairs in which this mean falls within the bottom 10% (more than 1.28 standard deviations below the mean) of expression in the unmethylated group of samples are selected. Finally, CpG/gene pairs in which at least 75% of the methylated samples have expression values lower than the mean expression in the unmethylated group of samples are classified as epigenetically silenced. CpG/gene pairs that are unmethylated and upregulated are called epigenetically active.
By separating samples into different epigenetically deregulated states (silenced or active), we can further characterize each sample by evaluating the association with, or enrichment for, specific clinical features such as outcome, treatment, and age at diagnosis. RESULTS: As a proof of concept, we applied our method across the TCGA PanCan cohort to identify potential enhancers regulating genes encoding subunits of the SWI/SNF protein complex. Our method was able to detect several deregulated enhancers associated with SWI/SNF genes specifically altered in each tumor type, independent of mutation. We validated the results using Hi-C data from primary cancer cell lines.
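The silencing call described in the METHOD can be illustrated for a single CpG/gene pair with the minimal Python sketch below. It is not the authors' implementation. The function name `call_epigenetically_silenced`, its parameters, and the toy values are hypothetical. It assumes that methylation is given as beta values in [0, 1], so the 50% cutoff becomes `beta >= 0.5`, and that the standard deviation in the 1.28 threshold is taken from the unmethylated group.

```python
import numpy as np


def call_epigenetically_silenced(beta, expr, beta_cutoff=0.5, z=1.28,
                                 min_fraction=0.75):
    """Return True if one CpG/gene pair looks epigenetically silenced.

    beta: per-sample methylation (beta values, assumed to lie in [0, 1]).
    expr: per-sample expression of the paired gene, in the same sample order.
    All names and defaults here are illustrative assumptions.
    """
    beta = np.asarray(beta, dtype=float)
    expr = np.asarray(expr, dtype=float)

    # Split samples at the 50% methylation cutoff.
    methylated = beta >= beta_cutoff
    unmethylated = ~methylated

    # Need at least one methylated sample and two unmethylated samples
    # (two are required for a sample standard deviation).
    if methylated.sum() == 0 or unmethylated.sum() < 2:
        return False

    unmeth_mean = expr[unmethylated].mean()
    unmeth_sd = expr[unmethylated].std(ddof=1)

    # Criterion 1: mean expression of the methylated group lies more than
    # 1.28 SD below the unmethylated mean (roughly the bottom 10% of a
    # normal distribution).
    mean_is_low = expr[methylated].mean() < unmeth_mean - z * unmeth_sd

    # Criterion 2: at least 75% of methylated samples fall below the
    # unmethylated mean.
    fraction_below = np.mean(expr[methylated] < unmeth_mean)

    return bool(mean_is_low and fraction_below >= min_fraction)


# Toy data: four methylated samples with low expression, four unmethylated
# samples with high expression.
toy_beta = [0.9, 0.8, 0.85, 0.7, 0.1, 0.2, 0.15, 0.05]
toy_expr = [1.0, 1.2, 0.8, 1.1, 8.0, 8.5, 7.5, 8.2]
silenced = call_epigenetically_silenced(toy_beta, toy_expr)
```

In a full analysis this check would be repeated for every enhancer CpG/gene pair, and the mirror-image rule (unmethylated and upregulated) would give the epigenetically active call.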