We selected 177 breast cancer samples of Caucasian ancestry from the TCGA database using EIGENSTRAT. The corresponding RNA-sequencing datasets are available in BAM format from the “Cancer Genomics Hub” (CGHub; https://cghub.ucsc.edu/). We marked all exonic SNP loci (marker SNPs; Table S6, related to Figure 2) that map to the three TF genes (ESR1, MYC and KLF4) based on NCBI dbSNP build 135. Then, for each sample j, we retrieved all RNA-sequencing reads mapped to marker SNP locus i and counted the occurrences of the reference (Aij) and alternative (Bij) alleles.
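The per-locus allele counting can be illustrated with a minimal sketch. It is not the pipeline used in this study: the function name `count_alleles`, the read representation (a 0-based start coordinate plus an ungapped read sequence), and the toy reads are all assumptions made for illustration. A real analysis would read the alignments from the BAM files and account for indels, clipping, and base or mapping quality.

```python
from typing import Iterable, Tuple


def count_alleles(
    reads: Iterable[Tuple[int, str]],
    snp_pos: int,
    ref: str,
    alt: str,
) -> Tuple[int, int]:
    """Count reference (A_ij) and alternative (B_ij) alleles at one SNP locus.

    Hypothetical simplification: each read is (start, sequence), aligned
    without gaps, with 0-based coordinates on the same chromosome as the SNP.
    """
    ref, alt = ref.upper(), alt.upper()
    a_count = 0  # reads carrying the reference allele
    b_count = 0  # reads carrying the alternative allele
    for start, seq in reads:
        offset = snp_pos - start
        if 0 <= offset < len(seq):  # the read overlaps the SNP locus
            base = seq[offset].upper()
            if base == ref:
                a_count += 1
            elif base == alt:
                b_count += 1
            # any other base (e.g. N or a third allele) is ignored
    return a_count, b_count


# Toy reads for one sample j around a hypothetical SNP at position 103 (T > C).
reads = [(100, "ACGTA"), (102, "GTACC"), (103, "CACCG"), (90, "TTTTT")]
a_ij, b_ij = count_alleles(reads, snp_pos=103, ref="T", alt="C")
print(a_ij, b_ij)
```

Reads that do not cover the locus, or that carry a base other than the two annotated alleles, contribute to neither count in this sketch.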