groups we determined the informativeness (In) [22] for each of >300K SNPs. The 20,000 SNPs with the highest In values were then selected. To ensure a more uniform genome-wide distribution and to minimize linkage disequilibrium, the set of putative European substructure ancestry informative markers (ESAIMS) was chosen by taking the markers with the highest In subject to a minimum inter-SNP distance of >500 Kb. This resulted in a set of 1441 SNPs (Table S1).
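
The spacing-constrained selection can be illustrated with a minimal sketch. The text does not specify the exact procedure, so the greedy strategy below (rank by In, then keep a SNP only if it lies more than 500 Kb from every SNP already kept on the same chromosome) is an assumption, as are the function name `select_spaced_markers` and the input tuple layout `(snp_id, chrom, pos, in_value)`. In values are taken as precomputed inputs.

```python
from collections import defaultdict

MIN_DISTANCE_BP = 500_000  # minimum inter-SNP distance from the text (>500 Kb)
TOP_N = 20_000             # size of the initial high-In candidate pool


def select_spaced_markers(snps, top_n=TOP_N, min_distance=MIN_DISTANCE_BP):
    """Greedily pick high-In SNPs that are mutually more than min_distance apart.

    snps: iterable of (snp_id, chrom, pos, in_value) tuples (hypothetical layout).
    Returns the selected SNP ids in order of decreasing In.
    """
    # Step 1: keep only the top_n SNPs ranked by informativeness.
    ranked = sorted(snps, key=lambda s: s[3], reverse=True)[:top_n]

    kept_positions = defaultdict(list)  # chromosome -> positions already kept
    selected = []

    # Step 2: walk down the ranking; accept a SNP only if it is far enough
    # from every previously accepted SNP on the same chromosome.
    for snp_id, chrom, pos, _in_value in ranked:
        if all(abs(pos - p) > min_distance for p in kept_positions[chrom]):
            kept_positions[chrom].append(pos)
            selected.append(snp_id)

    return selected


if __name__ == "__main__":
    toy = [
        ("rs1", "1", 100_000, 0.9),
        ("rs2", "1", 400_000, 0.8),  # within 500 Kb of rs1
        ("rs3", "1", 700_000, 0.7),
        ("rs4", "2", 100_000, 0.6),  # different chromosome
    ]
    print(select_spaced_markers(toy))
```

Because higher-ranked SNPs are considered first, a greedy pass of this kind favors the most informative marker in each region; other thinning schemes (for example, fixed windows) would also satisfy the distance constraint but could yield a different marker set.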