A set of 952 self-identified participants of diverse European descent, genotyped with >300K SNPs, was used for the first phase of the European population substructure analysis. This participant group consisted predominantly of European Americans, together with smaller numbers of individuals from Italy and Spain (see Methods). To reduce potential noise created by continental admixture, this study included only individuals with no evidence of non-European continental ancestry (see Methods). The genotypes were examined using the principal component analysis (PCA) algorithm implemented in the EIGENSTRAT program [11], a computational method that enables rapid analysis of very large datasets. Multiple criteria, including ANOVA, a split-half reliability test (see Methods), and a test for normality of distribution, indicated that substructure was present in multiple principal components (Table 1). However, most of the variance among the populations was observed in the first principal component (PC), which accounted for more than five times the variance of the second PC.
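To illustrate the kind of calculation involved, the following is a minimal sketch of PCA on a genotype matrix and of the ratio between the variances of the first two PCs. It is not the EIGENSTRAT implementation and does not use the study data: the genotypes are simulated for two hypothetical subpopulations, and the function names (`genotype_pca`, `simulate_two_groups`), sample sizes, and allele-frequency settings are all assumptions chosen for illustration. The per-SNP normalization by allele frequency follows the general approach commonly used for genotype PCA.

```python
import numpy as np


def genotype_pca(genotypes):
    """PCA of an (individuals x SNPs) matrix of 0/1/2 allele counts.

    Returns per-individual PC scores and the variance (eigenvalue) of each PC,
    sorted from largest to smallest.
    """
    g = np.asarray(genotypes, dtype=float)
    freq = g.mean(axis=0) / 2.0               # estimated allele frequency per SNP
    keep = (freq > 0.0) & (freq < 1.0)        # drop monomorphic SNPs
    g, freq = g[:, keep], freq[keep]
    # Center each SNP and scale by a binomial-variance term.
    z = (g - 2.0 * freq) / np.sqrt(freq * (1.0 - freq))
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    eigvals = s ** 2 / (g.shape[0] - 1)       # variance along each PC
    scores = u * s                            # coordinates of individuals on the PCs
    return scores, eigvals


def simulate_two_groups(n_per_group=100, n_snps=2000, seed=0):
    """Hypothetical data: two groups whose allele frequencies differ slightly."""
    rng = np.random.default_rng(seed)
    base = rng.uniform(0.1, 0.9, n_snps)
    shift = rng.normal(0.0, 0.05, n_snps)
    p1 = np.clip(base + shift, 0.02, 0.98)
    p2 = np.clip(base - shift, 0.02, 0.98)
    g1 = rng.binomial(2, p1, size=(n_per_group, n_snps))
    g2 = rng.binomial(2, p2, size=(n_per_group, n_snps))
    labels = np.repeat([0, 1], n_per_group)
    return np.vstack([g1, g2]), labels


genotypes, labels = simulate_two_groups()
scores, eigvals = genotype_pca(genotypes)

# Ratio of the variance of PC1 to that of PC2 (the quantity compared in the text).
ratio = eigvals[0] / eigvals[1]
print(f"PC1/PC2 variance ratio: {ratio:.2f}")
```

In this simulated setting the single axis of differentiation between the two groups is captured by the first PC, so its variance exceeds that of the second PC; the size of the ratio depends entirely on the simulation settings and should not be compared with the value reported for the study sample.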