We selected 250,160 SNP loci with allele frequencies above 0.05 from the SNP profiles of 407 breast cancer samples (the subjects) and 415 HapMap cell lines (the controls). We then combined the profiles of the 259,001 SNPs from the breast cancer samples and the HapMap cell lines, ran the combined profiles through the EIGENSTRAT program, and retrieved the top two principal components (PC1 and PC2). From the 273 samples that co-clustered with the HapMap CEU controls, we further selected 171 cases with high ESR1 expression (above 1) and 48 cases with low ESR1 expression (below 0), all with both segmented copy number and methylation measures available.
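
The selection steps can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the principal components are computed with a plain NumPy singular value decomposition rather than EIGENSTRAT, the genotype coding (0/1/2 allele counts), the use of minor allele frequency for the 0.05 cutoff, the per-SNP scaling, and all function names are assumptions made for illustration only.

```python
import numpy as np


def filter_by_maf(genotypes, threshold=0.05):
    """Keep SNPs whose minor allele frequency exceeds the threshold.

    genotypes: samples x SNPs array of allele counts (0, 1 or 2); this
    coding is an assumption, not something stated in the text.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    freq = genotypes.mean(axis=0) / 2.0      # frequency of the counted allele
    maf = np.minimum(freq, 1.0 - freq)       # minor allele frequency
    keep = maf > threshold
    return genotypes[:, keep], keep


def top_two_pcs(genotypes):
    """Return PC1 and PC2 sample scores from a MAF-filtered genotype matrix.

    Plain SVD stand-in for EIGENSTRAT. Input must already be MAF-filtered
    so that no SNP is monomorphic (which would cause division by zero).
    """
    genotypes = np.asarray(genotypes, dtype=float)
    p = genotypes.mean(axis=0) / 2.0
    # Center each SNP and scale by sqrt(p(1-p)); assumed normalization.
    scaled = (genotypes - 2.0 * p) / np.sqrt(p * (1.0 - p))
    u, s, _ = np.linalg.svd(scaled, full_matrices=False)
    return u[:, :2] * s[:2]


def split_by_esr1(expression, high=1.0, low=0.0):
    """Return indices of high (> high) and low (< low) ESR1 expressers."""
    expression = np.asarray(expression, dtype=float)
    return np.flatnonzero(expression > high), np.flatnonzero(expression < low)
```

In use, the combined case and control genotypes would be passed through `filter_by_maf` and then `top_two_pcs`; samples falling in the same region of the PC1/PC2 plane as the CEU controls would be retained, and `split_by_esr1` would be applied to their ESR1 expression values with the thresholds 1 and 0 given above.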