Chunk #56 — Results — Computational benchmarking

Source
Genotype imputation with thousands of genomes.

Text

Table 2 illustrates the computational benefits of our surrogate family approximation. In the Cosmopolitan reference panel with 4800 haplotypes, reducing k_hap from 4800 to 500 decreased IMPUTE2's running time by a factor of 4.8. Another way of viewing this is to notice that with k_hap fixed at 500, IMPUTE2's running time increased by only a factor of 1.4 when moving from a panel with 1000 haplotypes to a panel with 4800 haplotypes. By comparison, Beagle's running time increased by a factor of 9 with the same panels. In this setting, fixing k_hap fixes the cost of the imputation calculations used by IMPUTE2, so the 1.4-fold increase in running time at k_hap = 500 reflects the additional time needed to evaluate a larger number of haplotypes when choosing which 500 to use for imputation. Preliminary experiments suggest that this evaluation step could be shortened by ignoring divergent haplotypes after the first few iterations of the algorithm (data not shown), which would make the overall running time almost independent of the number of reference haplotypes for fixed k_hap.
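
The reasoning behind the 1.4-fold figure can be made concrete with a small back-of-the-envelope calculation. The sketch below is not taken from the paper: it assumes, purely for illustration, that running time splits into a selection term that grows linearly with the number of reference haplotypes N and an imputation term that is constant once the surrogate family size is fixed at 500. The function name `selection_share` and the linear form are hypothetical. Under that assumption, the reported slowdown between the 1000- and 4800-haplotype panels determines how much of the running time the selection step would account for.

```python
def selection_share(n_small, n_large, slowdown):
    """Split running time into selection and imputation under an ASSUMED model.

    Assumed (hypothetical) model, not stated in the source:
        T(N) = a * N + c
    where a * N is the cost of evaluating N reference haplotypes and c is the
    imputation cost, constant because the surrogate family size is fixed.

    Solving T(n_large) / T(n_small) = slowdown for c / a gives
        c / a = (n_large - slowdown * n_small) / (slowdown - 1).
    """
    if slowdown <= 1.0:
        raise ValueError("slowdown must be greater than 1")
    if n_large <= slowdown * n_small:
        raise ValueError("slowdown is too large for a non-negative fixed cost")

    # Fixed imputation cost expressed in units of 'per-haplotype selection cost'.
    fixed_over_a = (n_large - slowdown * n_small) / (slowdown - 1.0)

    # Fraction of total running time spent on selection at each panel size.
    share_small = n_small / (n_small + fixed_over_a)
    share_large = n_large / (n_large + fixed_over_a)
    return fixed_over_a, share_small, share_large


if __name__ == "__main__":
    # Panel sizes and the 1.4-fold slowdown are the values quoted in the text.
    ratio, small, large = selection_share(1000, 4800, 1.4)
    print(f"imputation cost is about {ratio:.0f} haplotype evaluations")
    print(f"selection share at N=1000: {small:.1%}")
    print(f"selection share at N=4800: {large:.1%}")
```

Under this assumed linear model, selection would make up roughly 11% of the running time with 1000 reference haplotypes and roughly 36% with 4800. This fits the observation that shortening the evaluation step would leave the running time almost independent of panel size. The actual cost structure of IMPUTE2 may differ, so these shares are illustrative only.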