To assemble a set of common SNPs informative for inference of population substructure in GWAS (hereafter, structure inference SNPs), we first identified a set of 40,817 autosomal SNPs common to the Affymetrix 500K, Illumina HumanHap300 and Illumina HumanHap550 arrays, filtered to require a completion rate greater than 95% in both CGEMS scans, a minor allele frequency (MAF) >5%, and an exact-test P-value for fit to Hardy-Weinberg proportions >10⁻³ in both control sets. From this pool, using the selection algorithm we have described, we selected 12,898 structure inference SNPs with low background LD in the joint PLCO and NHS control samples (r² less than 0.004 for any pair located within 500 kb on the same chromosome). The detailed list is provided in Table S2, together with a visual representation of the chromosomal positions and observed MAF of the SNPs (Figure S1).
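The filtering and selection steps can be illustrated with a minimal Python sketch. It is not the selection algorithm used here, which is described elsewhere; it substitutes a generic greedy, position-ordered pruning pass to show how the stated thresholds (completion rate >95%, MAF >5%, r² <0.004 within 500 kb) could be applied. The `Snp` class and all function names are hypothetical, r² is approximated by the squared correlation of genotype counts rather than estimated from haplotypes, and the Hardy-Weinberg exact test is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    """Hypothetical container for one SNP (illustration only)."""
    name: str
    chrom: str
    pos: int          # base-pair position
    genotypes: list   # minor-allele counts (0, 1, 2) per sample; None = missing


def completion_rate(snp):
    """Fraction of samples with a called genotype."""
    called = [g for g in snp.genotypes if g is not None]
    return len(called) / len(snp.genotypes)


def maf(snp):
    """Minor allele frequency among called genotypes."""
    called = [g for g in snp.genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1 - freq)


def passes_qc(snp, min_completion=0.95, min_maf=0.05):
    """Completion-rate and MAF filters (HWE test not included in this sketch)."""
    return completion_rate(snp) > min_completion and maf(snp) > min_maf


def r_squared(a, b):
    """Squared correlation of genotype counts over samples called at both SNPs.

    Assumption: this genotype-based value stands in for haplotype-based r2.
    """
    pairs = [(x, y) for x, y in zip(a.genotypes, b.genotypes)
             if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        return 0.0
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def prune_low_ld(snps, max_r2=0.004, window_bp=500_000):
    """Greedy pass: keep a SNP only if its r2 with every already-kept SNP
    within window_bp on the same chromosome is below max_r2."""
    kept = []
    for snp in sorted(snps, key=lambda s: (s.chrom, s.pos)):
        neighbours = [k for k in kept
                      if k.chrom == snp.chrom and snp.pos - k.pos <= window_bp]
        if all(r_squared(snp, k) < max_r2 for k in neighbours):
            kept.append(snp)
    return kept
```

In this sketch, a candidate pool would first be reduced with `passes_qc` and then passed to `prune_low_ld`. A greedy pass of this kind depends on the order in which SNPs are visited, so it will not in general reproduce the 12,898 SNPs reported above.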