rate of 0.2%. Very roughly, these correspond to the performance of early versions of next-generation re-sequencing technologies; newer versions of these technologies can generate longer and more accurate reads and should thus outperform the simulations presented here. We then re-sequenced between 100 and 400 individuals at different depths and used our approach to reconstruct haplotypes and genotypes for each individual. Note that the simulated reads are typically too short to carry useful information on phase, because a read will generally include at most one site that truly differs from the reference sequence. In addition, given the large number of bases examined, the reads will also suggest a large number of false polymorphic sites. To control false-positive variant calls, it is imperative to confirm true polymorphic sites by examining similar overlapping reads, either from the same individual or, potentially, from other individuals who share a similar haplotype.
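
The scale of the false-positive problem, and the benefit of requiring confirmation from overlapping reads, can be illustrated with a rough calculation. The sketch below is not part of the method described here. It assumes that the 0.2% figure is a per-base error rate, that errors are independent across reads, and that any erroneous base counts as a spurious variant (so the result is an upper bound, since confirming errors would also have to agree on the same alternate base). The function names, the region size of one million sites, and the depth of 4 are hypothetical values chosen only for illustration.

```python
from math import comb


def prob_at_least_k_errors(depth, error_rate, k):
    """Probability that at least k of `depth` reads covering one site
    carry a sequencing error there (binomial model, independent errors)."""
    prob_fewer_than_k = sum(
        comb(depth, i) * error_rate**i * (1.0 - error_rate) ** (depth - i)
        for i in range(k)
    )
    return 1.0 - prob_fewer_than_k


def expected_false_sites(n_sites, depth, error_rate, k):
    """Expected number of sites flagged as polymorphic purely because of
    sequencing error when k supporting reads are required."""
    return n_sites * prob_at_least_k_errors(depth, error_rate, k)


if __name__ == "__main__":
    ERROR_RATE = 0.002   # assumed per-base error rate (0.2%)
    N_SITES = 1_000_000  # hypothetical region size, illustration only
    DEPTH = 4            # hypothetical per-site read depth, illustration only

    for k in (1, 2):
        n_false = expected_false_sites(N_SITES, DEPTH, ERROR_RATE, k)
        print(f"reads required = {k}: ~{n_false:.1f} expected false sites")
```

Under these assumptions, requiring a second supporting read lowers the expected number of false sites by more than two orders of magnitude, which is the intuition behind confirming candidate sites with overlapping reads before calling them polymorphic.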