Chunk #1 — RESULTS — Haplotype estimation

Source: Fast and accurate genotype imputation in genome-wide association studies through pre-phasing.
Embedded: yes

Text

These results demonstrate that pre-phasing can greatly speed up the imputation process, but the accuracy of imputation with this shortcut may depend on how well the GWAS haplotypes were estimated. The accuracy of computationally estimated haplotypes depends on a number of factors, including marker density, relatedness of sampled individuals, sample size, and demography [12,13]. In founder populations [14], long-range haplotypes can be estimated very accurately even with modest sample sizes [9]. For example, by comparing the results of population- and trio-based phasing in Finnish samples from the FUSION study of type 2 diabetes [2,15], we estimate that population phasing produces fewer than one switch error [16] per 5.5 megabases (Mb). These results were aided by the relatively large number of genotyped individuals (>2,000) and by the fact that Finland is a founder population in which long haplotypes are shared between apparently unrelated individuals. In more diverse populations, haplotype estimation may often be less accurate. For example, average distances between switches in European GWAS datasets are typically in the range of 0.6 to 1.4 Mb [17].
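To make the switch-error metric concrete, the following is a minimal sketch of how one might compare an estimated phasing against a reference phasing (for example, one derived from trios) and convert the count into an average distance between switches. It is not the procedure used in the study: the function names `count_switch_errors` and `mb_per_switch`, the tuple-based haplotype encoding, and the toy genotypes are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical encoding: one (allele on haplotype A, allele on haplotype B)
# tuple per marker, with markers listed in genomic order.
Phasing = List[Tuple[int, int]]


def count_switch_errors(reference: Phasing, estimate: Phasing) -> int:
    """Count switch errors in `estimate` relative to `reference`.

    Only sites that are heterozygous in the reference are informative.
    A switch error is counted each time the orientation of the estimate
    (same as, or flipped relative to, the reference) changes between
    consecutive heterozygous sites.
    """
    if len(reference) != len(estimate):
        raise ValueError("phasings must cover the same markers")

    orientations = []
    for (r_a, r_b), (e_a, e_b) in zip(reference, estimate):
        if r_a == r_b:
            continue  # homozygous site: phase is undefined, skip it
        if {r_a, r_b} != {e_a, e_b}:
            raise ValueError("genotypes disagree between phasings")
        orientations.append(e_a == r_a)

    # Each change of orientation between neighbouring het sites is one switch.
    return sum(1 for prev, cur in zip(orientations, orientations[1:]) if prev != cur)


def mb_per_switch(n_switches: int, span_bp: int) -> float:
    """Average distance between switches, in megabases, over `span_bp` base pairs."""
    if n_switches == 0:
        return float("inf")  # no switches observed in the phased span
    return span_bp / n_switches / 1e6


# Toy data (made up): the estimate flips orientation once, at the fourth marker.
reference = [(0, 1), (1, 0), (0, 0), (0, 1), (1, 0)]
estimate = [(0, 1), (1, 0), (0, 0), (1, 0), (0, 1)]
switches = count_switch_errors(reference, estimate)
print(switches, mb_per_switch(switches, 5_500_000))
```

Counting changes of orientation, rather than individual mismatched sites, means that a single switch followed by a long correctly phased stretch is penalized once, which is why the result is naturally reported as a distance between switches.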