LD was calculated from the phased genotype data accompanying the 1000 Genomes Project pilot release (17). The calculation was performed with VCFtools (33), retaining variant pairs with r² ≥ 0.80 that lie no more than 200 kb apart. The VCFtools results were then consolidated so that, for every variant in our database, a list of linked variants is accessible for each of the three populations, together with the r² value of each pair.
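
The consolidation step can be illustrated with a minimal sketch. It assumes the pairwise results are in the tab-separated layout that VCFtools writes for haplotype-based r² (columns CHR, POS1, POS2, N_CHR, R^2, D, Dprime). The function name `consolidate_ld`, the sample rows, and the population label `POP_A` are hypothetical and are not taken from the pipeline described here.

```python
import csv
import io
from collections import defaultdict


def consolidate_ld(hap_ld_text, min_r2=0.80, max_dist=200_000):
    """Build a variant -> [(linked variant, r2), ...] map from pairwise LD rows.

    Assumes VCFtools-style tab-separated columns: CHR, POS1, POS2, N_CHR, R^2.
    Variants are keyed by (chromosome, position); this keying is an assumption.
    """
    linked = defaultdict(list)
    reader = csv.DictReader(io.StringIO(hap_ld_text), delimiter="\t")
    for row in reader:
        r2 = float(row["R^2"])
        pos1, pos2 = int(row["POS1"]), int(row["POS2"])
        # "not (r2 >= min_r2)" also rejects NaN values, which compare False.
        if not (r2 >= min_r2) or abs(pos2 - pos1) > max_dist:
            continue
        a, b = (row["CHR"], pos1), (row["CHR"], pos2)
        # Record the pair in both directions so either variant can be looked up.
        linked[a].append((b, r2))
        linked[b].append((a, r2))
    return dict(linked)


# Hypothetical rows for illustration only; not real 1000 Genomes data.
SAMPLE = (
    "CHR\tPOS1\tPOS2\tN_CHR\tR^2\tD\tDprime\n"
    "1\t1000\t1500\t120\t0.95\t0.10\t1.00\n"
    "1\t1000\t9000\t120\t0.40\t0.05\t0.70\n"
    "1\t1500\t2500\t120\t0.85\t0.09\t0.98\n"
)

# One consolidated map per population; "POP_A" is a placeholder label.
ld_by_population = {"POP_A": consolidate_ld(SAMPLE)}
```

Recording each pair in both directions means a lookup on either variant returns its partner, which matches the per-variant access described above at the cost of storing every pair twice.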