The clustering was performed with PLINK [23]. First, variants were filtered with the QC criteria described above. The resulting set of SNPs was pruned for linkage disequilibrium (LD) with a window size of 50 SNPs, a step size of 5 SNPs, and a pairwise correlation (r²) threshold of 0.2 (--indep-pairwise 50 5 0.2). Pairwise identity-by-state (IBS) similarity between individuals was then computed from the pruned data (--genome), and individuals were clustered on the resulting similarity matrix (--cluster). This yielded the four clusters shown in Fig. S2, which also shows the self-identified ethnicity/race composition of each cluster. Cluster 1 primarily contains Asians and some Hispanics. Cluster 2 contains almost all Whites and some Hispanics. Cluster 3 contains almost all Blacks. Cluster 4 consists mostly of Hispanics (see Table S1 for details).
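The three PLINK steps described above can be sketched as a small Python wrapper. This is an illustrative reconstruction, not the authors' actual script. The file prefixes, the function names, the use of --extract and --read-genome to chain the steps, and the optional --K constraint are all assumptions. The text does not state how the number of clusters was fixed at four, so --K is left off by default. PLINK 1.x binary-format input (.bed/.bim/.fam) that has already passed QC is also assumed.

```python
import shutil
import subprocess


def build_plink_commands(bfile, out_prefix, k=None, plink="plink"):
    """Build the three PLINK 1.x invocations as argument lists.

    bfile, out_prefix and k are hypothetical names chosen for this sketch.
    """
    pruned_list = f"{out_prefix}_prune.prune.in"  # written by --indep-pairwise

    # Step 1: LD pruning (window 50 SNPs, step 5 SNPs, r^2 threshold 0.2).
    prune = [plink, "--bfile", bfile,
             "--indep-pairwise", "50", "5", "0.2",
             "--out", f"{out_prefix}_prune"]

    # Step 2: pairwise IBS between individuals on the pruned SNP set.
    genome = [plink, "--bfile", bfile,
              "--extract", pruned_list,
              "--genome",
              "--out", f"{out_prefix}_ibs"]

    # Step 3: clustering, reusing the IBS values computed in step 2.
    cluster = [plink, "--bfile", bfile,
               "--extract", pruned_list,
               "--read-genome", f"{out_prefix}_ibs.genome",
               "--cluster",
               "--out", f"{out_prefix}_cluster"]
    if k is not None:
        # Assumption: request a fixed number of clusters.
        cluster += ["--K", str(k)]

    return [prune, genome, cluster]


def run_pipeline(bfile, out_prefix, k=None, plink="plink"):
    """Run the three steps in order, stopping at the first failure."""
    if shutil.which(plink) is None:
        raise FileNotFoundError(f"PLINK executable not found: {plink}")
    for cmd in build_plink_commands(bfile, out_prefix, k=k, plink=plink):
        subprocess.run(cmd, check=True)
```

Calling `build_plink_commands("qc_data", "run")` only assembles the command lines, which makes the parameters easy to inspect before anything is executed. `run_pipeline` then executes them in order. Keeping the pruned SNP list and the IBS file as explicit intermediate outputs mirrors the order of the steps in the text.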