The set of structural inference SNPs we chose for the PCA consists of 12,898 SNPs with low background LD, as measured in the joint PLCO and NHS control samples (r² less than 0.004 for any pair located within 500 kb on the same chromosome). Restricting the set to SNPs with very low local pairwise correlation ensures that the PCA findings reflect the genome-wide pattern of variation and are not overly influenced by regional LD patterns. The WTCCC also adopted this strategy [20].
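The selection criterion can be illustrated with a minimal sketch. It is not the procedure used in this study, whose selection algorithm is not described here. The sketch assumes a greedy left-to-right pass, genotypes coded as minor-allele counts (0, 1, 2), and r² approximated by the squared Pearson correlation of those counts. The function names `select_low_ld_snps` and `principal_components` are hypothetical.

```python
import numpy as np

def select_low_ld_snps(genotypes, chrom, pos, r2_max=0.004, window_bp=500_000):
    """Greedy sketch (an assumption, not the study's method): keep a SNP only if
    its r^2 with every already-kept SNP within window_bp on the same chromosome
    is below r2_max. genotypes is a samples x SNPs array of allele counts."""
    order = np.lexsort((pos, chrom))  # sort by chromosome, then by position
    kept = []
    for j in order:
        if genotypes[:, j].std() == 0:
            continue  # monomorphic SNP: correlation undefined, so skip it
        ok = True
        # Walk back through kept SNPs, nearest first, until leaving the window.
        for k in reversed(kept):
            if chrom[k] != chrom[j] or pos[j] - pos[k] > window_bp:
                break
            r = np.corrcoef(genotypes[:, j], genotypes[:, k])[0, 1]
            if r * r >= r2_max:
                ok = False
                break
        if ok:
            kept.append(j)
    return np.array(sorted(kept), dtype=int)

def principal_components(genotypes, n_components=2):
    """Sample scores on the leading principal components via SVD of the
    column-centered genotype matrix."""
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]
```

In this sketch the selected columns would be passed on as `principal_components(genotypes[:, kept])`. Because the pass is greedy, the retained set depends on the scan order, so a different ordering could retain a different, equally valid set of SNPs.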