Chunk #3 — Materials and Methods — IMPUTE2 algorithm

Source: Genotype imputation with thousands of genomes.
Embedded: yes

Text

To reduce the computational burden of Step 1, Howie et al. (2009) introduced an approximation that restricts each phasing update to a set of k template haplotypes, which are chosen separately for each individual at each iteration; the other templates are implicitly assigned copying probabilities of zero. The k templates are chosen by computing Hamming distances between an individual's current sampled haplotypes and each possible template haplotype. We refer to the k templates with the smallest distances as “surrogate family members” because they (ideally) share recent ancestry with the study individual. [These haplotypes were called “informed conditioning states” in the Howie et al. (2009) paper and early versions of the IMPUTE2 documentation. We now prefer the nomenclature used here because of the approximation’s relationship to the “surrogate parent” phasing method of Kong et al. (2008).]
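The template-selection step described above can be illustrated with a minimal sketch. It is not the IMPUTE2 implementation: the function names (`hamming_distance`, `select_surrogate_family`), the 0/1 allele encoding, the toy data, and the tie-breaking rule (lower template index first) are all assumptions made for illustration. The sketch also handles only one sampled haplotype at a time, whereas the text refers to an individual's current sampled haplotypes (plural); how distances for the two haplotypes are combined is not specified here.

```python
from typing import List, Sequence


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length haplotypes differ."""
    if len(a) != len(b):
        raise ValueError("haplotypes must cover the same set of sites")
    return sum(x != y for x, y in zip(a, b))


def select_surrogate_family(
    current_hap: Sequence[int],
    templates: Sequence[Sequence[int]],
    k: int,
) -> List[int]:
    """Return the indices of the k templates closest to current_hap.

    Templates that are not returned are the ones that would implicitly
    receive a copying probability of zero in the phasing update.
    Ties are broken by template index (an assumption of this sketch).
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    # Pair each distance with its template index so sorting is deterministic.
    ranked = sorted(
        (hamming_distance(current_hap, t), i) for i, t in enumerate(templates)
    )
    return [i for _, i in ranked[:k]]


# Toy data (hypothetical): four template haplotypes at five sites.
templates = [
    [0, 0, 1, 1, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [1, 1, 0, 0, 1],
]
current = [0, 0, 1, 1, 1]  # one of the individual's current sampled haplotypes

chosen = select_surrogate_family(current, templates, k=2)
print(chosen)
```

In this toy example the distances from `current` to the four templates are 1, 3, 2, and 4, so with k = 2 the selected templates are those at indices 0 and 2. Because the selection is repeated for each individual at each iteration, the chosen set can change as the sampled haplotypes are updated.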