The current study identifies SNPs that are particularly informative for European population substructure (Tables S1 and S2). These comprise two SNP sets: one that distinguishes substructure along the “north/south” gradient and another that distinguishes substructure along a west-east gradient among the northern European groups tested. Together, these ESAIMs appear to provide good control for subpopulation differences in the NYCP individuals, as demonstrated by testing a real dataset using both EIGENSTRAT and structured association methods. Additional studies will be necessary to further optimize the ESAIM sets and, in particular, to determine their efficacy in other European and European American sample groups that may have different ancestral representation. Finally, it is worth noting that particularly informative ESAIMs may correspond to population selection events and hence may also be linked to important biologic processes. The most informative locus for the “north/south” distinction, a lactase gene-associated SNP, has been previously noted in this regard [6]. Another strong candidate for selection is the IRF4 gene, an important regulator of immunologic responses [30–33]. Ongoing studies are examining these and other genes for evidence of positive selection in different subgroups.