For our simulations we used the 381 diploid European individuals from the phase 1 release of the 1000 Genomes Project (June 2011)2. These comprise 87 CEPH individuals of Northern European ancestry (CEU), 93 Finnish individuals (FIN), 89 British individuals from England and Scotland (GBR), 98 Tuscan individuals (TSI), and 14 individuals from the Iberian peninsula (IBS). Genotype calls and haplotypic phase were inferred from low-coverage sequencing (4x) using an imputation strategy that borrowed information across samples and loci. The 762 haplotypes were split at random into two panels of 381 haplotypes each; one panel was used to build the simulated data, and the other was used as the imputation reference panel. We simulated data for 100 samples by randomly sampling pairs of haplotypes, without replacement, from the simulation panel. All simulation results were generated over 10 distinct 5 Mb regions (50 Mb in total) across the genome, randomly chosen to represent the average genome-wide recombination rate and SNP density (Supplementary Note). Reads spanning polymorphic sites identified in the 1000 Genomes Project were simulated assuming a fixed error rate of 1%,