Genotypes and, where possible, haplotypes were inferred for all SNPs and short indels, and for most larger deletions (see Supplementary Information and Table 1). For the low-coverage data, statistically phased genotypes were derived by using LD structure in addition to sequence information at each site, guided in part by the HapMap 3 phased haplotypes. SNP genotype accuracy varied considerably by pilot, coverage and allele frequency. In the low-coverage project, the overall genotype error rate (based on a consensus of multiple methods) was 1-3% (Fig. 2c, Supplementary Fig. 10). The accuracy at heterozygous sites, a more sensitive measure than overall accuracy, was approximately 90% for the lowest-frequency variants, increased to over 95% for intermediate frequencies and dropped to 70-80% for the highest-frequency variants (that is, those where the reference allele is the rare allele). We note that these numbers are derived from sites that can be genotyped using array technology, and performance may be lower in harder-to-access regions of the genome. We find only minor differences in genotype accuracy between populations, reflecting differences in coverage as well as haplotype diversity and extent of LD.
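
To illustrate why accuracy at heterozygous sites can be a more sensitive measure than the overall error rate, the following is a minimal sketch and not the project's evaluation pipeline. It assumes genotypes are coded as the number of non-reference alleles (0, 1 or 2) and that an array-based truth set is available for comparison. The function names `overall_error_rate` and `het_accuracy` are hypothetical, the exact definitions used in the paper may differ, and the toy data are invented purely for illustration.

```python
from typing import Sequence


def overall_error_rate(called: Sequence[int], truth: Sequence[int]) -> float:
    """Fraction of all compared genotypes where the call differs from the truth."""
    if len(called) != len(truth):
        raise ValueError("called and truth must have the same length")
    if not truth:
        raise ValueError("no genotypes to compare")
    mismatches = sum(c != t for c, t in zip(called, truth))
    return mismatches / len(truth)


def het_accuracy(called: Sequence[int], truth: Sequence[int]) -> float:
    """Fraction of truly heterozygous genotypes (truth == 1) that were called heterozygous.

    Assumption: this is one plausible definition of 'accuracy at heterozygous sites'.
    """
    if len(called) != len(truth):
        raise ValueError("called and truth must have the same length")
    calls_at_true_hets = [c for c, t in zip(called, truth) if t == 1]
    if not calls_at_true_hets:
        raise ValueError("truth set contains no heterozygous genotypes")
    return sum(c == 1 for c in calls_at_true_hets) / len(calls_at_true_hets)


# Toy low-frequency variant (invented data): 100 samples, 10 true heterozygotes.
# The caller misses 2 heterozygotes and reports them as homozygous reference.
truth = [0] * 90 + [1] * 10
called = [0] * 90 + [1] * 8 + [0] * 2

print(overall_error_rate(called, truth))  # 0.02
print(het_accuracy(called, truth))        # 0.8
```

In this toy case the overall error rate is only 2% because the many homozygous-reference genotypes are easy to call correctly, whereas one in five true heterozygotes is missed. This is the sense in which the heterozygous measure exposes errors that the overall rate dilutes.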