For the initial STRUCTURE analyses, we selected random SNPs subject to a minimum inter-SNP distance of 500 kb; there was no evidence of LD between adjacent markers in any self-identified ethnic set (r2 < 0.2). The selected sets contained 3500 to 4500 SNPs suitable for STRUCTURE analysis. Larger SNP sets require very long computation times to obtain accurate parameter estimates in studies with large sample sizes.
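
The selection step described above can be illustrated with a minimal sketch. It is not the procedure actually used in this study. It assumes SNPs are given as (chromosome, base-pair position) pairs, and the function names `thin_snps` and `genotype_r2` are hypothetical. The r2 helper is the squared Pearson correlation of 0/1/2 genotype counts, which is a common approximation and is assumed here rather than taken from the text.

```python
import bisect
import random

MIN_DIST_BP = 500_000  # minimum inter-SNP distance stated in the text (500 kb)


def thin_snps(snps, min_dist=MIN_DIST_BP, seed=0):
    """Randomly select SNPs so that no two on a chromosome are closer than min_dist.

    snps: iterable of (chrom, position_bp) tuples (assumed input layout).
    Returns the selected SNPs sorted by chromosome label and position.
    """
    rng = random.Random(seed)
    candidates = list(snps)
    rng.shuffle(candidates)  # random visiting order gives a random selection

    kept = {}  # chrom -> sorted list of accepted positions
    for chrom, pos in candidates:
        accepted = kept.setdefault(chrom, [])
        i = bisect.bisect_left(accepted, pos)
        # Only the nearest accepted neighbour on each side needs checking.
        left_ok = i == 0 or pos - accepted[i - 1] >= min_dist
        right_ok = i == len(accepted) or accepted[i] - pos >= min_dist
        if left_ok and right_ok:
            accepted.insert(i, pos)

    return [(c, p) for c in sorted(kept) for p in kept[c]]


def genotype_r2(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 counts)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic marker carries no LD information
    return (sxy * sxy) / (sxx * syy)
```

In such a sketch, `thin_snps` would be run once per marker panel, and `genotype_r2` would then be applied to each pair of adjacent selected markers within each ethnic set to confirm that values stay below the 0.2 threshold.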