Chunk #77 — Discussion — Computational strategies for imputation with large, sequence-based reference panels

Source: Genotype imputation with thousands of genomes.
Embedded: yes

Text

Another technique for increasing the efficiency of imputation is called “pre-phasing.” The idea is to (pre-)phase the assayed genotypes in a GWAS dataset, then impute directly into the inferred haplotypes; this speeds up imputation by more than an order of magnitude at the cost of a small amount of accuracy (B. Howie and C. Fuchsberger, unpublished data). In principle, most imputation methods could use this approach, and researchers can already download implementations based on the IMPUTE2 and MaCH models (the MaCH implementation is called “minimac”). We have found that k_hap has similar accuracy characteristics in both unphased and pre-phased GWAS datasets (data not shown), so we view pre-phasing as being complementary to our surrogate family approximation: both approaches speed up imputation, and they can be used together for even greater efficiency.
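To make the "impute directly into the inferred haplotypes" step concrete, the following is a minimal sketch of haploid imputation under a generic Li-Stephens-style copying model. It is a toy illustration, not the IMPUTE2 or minimac algorithm: the function name `impute_haplotype`, the uniform `switch` rate, and the `err` mismatch rate are all assumptions introduced here. Once a study individual has been pre-phased, each of its two haplotypes can be passed through a routine of this kind separately, with a hidden state space of N reference haplotypes per site rather than pairs of reference haplotypes.

```python
def impute_haplotype(ref_haps, observed, switch=0.01, err=0.001):
    """Posterior probability of allele 1 at each site of one pre-phased haplotype.

    ref_haps: N reference haplotypes, each a list of L alleles (0 or 1).
    observed: L alleles for the study haplotype; None marks an untyped site.
    switch, err: hypothetical per-site switch and mismatch rates (toy values).
    """
    n, num_sites = len(ref_haps), len(observed)

    def emission(k, j):
        # Untyped sites carry no information about which haplotype is copied.
        if observed[j] is None:
            return 1.0
        return 1.0 - err if ref_haps[k][j] == observed[j] else err

    def normalise(values):
        total = sum(values)
        return [v / total for v in values]

    # Forward pass: probability of copying haplotype k at site j given sites 0..j.
    forward = [normalise([emission(k, 0) / n for k in range(n)])]
    for j in range(1, num_sites):
        prev = forward[-1]
        jump = switch * sum(prev) / n  # switch to a uniformly chosen haplotype
        forward.append(normalise(
            [emission(k, j) * ((1.0 - switch) * prev[k] + jump) for k in range(n)]
        ))

    # Backward pass: likelihood of sites j+1..L-1 given haplotype k at site j.
    backward = [[1.0] * n for _ in range(num_sites)]
    for j in range(num_sites - 2, -1, -1):
        weighted = [emission(k, j + 1) * backward[j + 1][k] for k in range(n)]
        jump = switch * sum(weighted) / n
        backward[j] = normalise(
            [(1.0 - switch) * weighted[k] + jump for k in range(n)]
        )

    # Combine both passes and average the reference alleles under the posterior.
    dosages = []
    for j in range(num_sites):
        post = normalise([forward[j][k] * backward[j][k] for k in range(n)])
        dosages.append(sum(p * ref_haps[k][j] for k, p in enumerate(post)))
    return dosages


if __name__ == "__main__":
    # Toy reference panel (made-up data) and one pre-phased study haplotype.
    panel = [[0, 0, 0, 0, 0], [1, 1, 1, 1, 1], [0, 1, 0, 1, 0]]
    study_hap = [1, None, 1, None, 1]  # typed at sites 0, 2, 4 only
    print(impute_haplotype(panel, study_hap))
```

In this sketch the cost of one forward-backward pass grows linearly with the number of reference haplotypes and sites. Real implementations add recombination maps, site-specific mutation rates, and template selection (such as the surrogate family approximation discussed here), none of which is modelled above.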