Chunk #6 — Results — Principal Component and Cluster Analyses Show Major Differences between European Populations

Source: Analysis and application of European genetic substructure using 300 K SNP information.
Embedded: yes

Text

The same dataset was also examined using a Bayesian clustering algorithm (STRUCTURE) [21]. For these analyses we examined three sets of >3500 SNPs that were selected randomly, subject only to the criterion that the minimum inter-SNP distance was >500 Kb (see Methods). This was done both to ensure genome-wide distribution and to eliminate linkage disequilibrium between SNPs. This analysis, similar to our previously reported studies, was most consistent with two population groups (K = 2) explaining the major substructure in this set of European individuals (Figure 1C). The distribution of the individuals at K = 2 was similar to that shown on the first axis of the PCA (Figure 1D), and the individual population contributions were highly correlated with the first PC scores (r² > 0.95 for each of the three random sets compared with the 500K SNP data analyzed by PCA).
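As an illustration only, the SNP selection and the correlation check described above could be sketched as follows. This is a minimal greedy sketch, not the procedure from the Methods: the function names `thin_snps` and `r_squared`, the `(snp_id, chromosome, position_bp)` tuple layout, and the use of a random seed to produce different sets are all assumptions made for this example.

```python
import random
from collections import defaultdict

MIN_GAP_BP = 500_000  # minimum inter-SNP distance stated in the text (>500 Kb)


def thin_snps(snps, n_target, min_gap=MIN_GAP_BP, seed=0):
    """Randomly pick up to n_target SNPs such that any two picks on the
    same chromosome are more than min_gap base pairs apart.

    snps: iterable of (snp_id, chromosome, position_bp) tuples
          (hypothetical layout, assumed for this sketch).
    """
    rng = random.Random(seed)
    candidates = list(snps)
    rng.shuffle(candidates)  # random visiting order gives a random selection

    chosen = []
    kept_positions = defaultdict(list)  # chromosome -> positions already kept
    for snp_id, chrom, pos in candidates:
        if len(chosen) >= n_target:
            break
        # Keep the SNP only if it is far enough from every kept SNP
        # on the same chromosome.
        if all(abs(pos - p) > min_gap for p in kept_positions[chrom]):
            kept_positions[chrom].append(pos)
            chosen.append((snp_id, chrom, pos))
    return chosen


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)
```

Under these assumptions, calling `thin_snps` with three different seeds would yield three random SNP sets, and `r_squared` could be applied to each individual's K = 2 cluster contribution and first PC score to obtain a value comparable to the r² reported above.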