Chunk #23 — Online Methods — Computational Costs

Source: Fast and accurate genotype imputation in genome-wide association studies through pre-phasing.
Embedded: yes

Text

Many existing imputation methods (e.g., MaCH and IMPUTE1) use analytical integration to account for the unknown phase of GWAS genotypes. The computational cost of this approach is proportional to the number of GWAS individuals (N), the number of genotyped markers in the reference panel (M_REF), and the square of the number of reference haplotypes (H), i.e., O(N × M_REF × H^2). Some methods, such as fastPHASE (ref. 23) and Beagle (ref. 25), reduce H by grouping similar haplotypes into clusters. The quadratic term applies to every marker, whether it is typed in the GWAS or only in the reference panel. Consequently, the computational cost grows quickly with reference panel size, and running these methods on modern reference datasets can be time-consuming.
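
As a rough illustration of this scaling, the sketch below evaluates the product of N, the number of reference markers, and the square of H for two panel sizes. The function name and the input sizes are hypothetical, chosen only for illustration. The result is a relative operation count up to an unknown constant factor, not a runtime estimate for any particular program.

```python
def relative_cost(n_individuals: int, n_ref_markers: int, n_ref_haplotypes: int) -> int:
    """Relative operation count for the scaling N * M_REF * H^2.

    Hypothetical helper: constant factors are ignored, so only ratios
    between two calls are meaningful.
    """
    return n_individuals * n_ref_markers * n_ref_haplotypes ** 2


# Illustrative sizes only (not taken from the paper).
base = relative_cost(n_individuals=1_000, n_ref_markers=100_000, n_ref_haplotypes=100)
doubled = relative_cost(n_individuals=1_000, n_ref_markers=100_000, n_ref_haplotypes=200)

# Doubling the number of reference haplotypes quadruples the cost,
# while doubling N or the marker count would only double it.
ratio = doubled / base
print(ratio)  # 4.0
```

Under this scaling, doubling H multiplies the cost by four, which is why growth in the reference panel dominates the running time.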