paperKB
coga / coga-kb
Help
Sign in

Chunk #62 — Online Methods — Step 3: Approximate HMM decoding

Source
Fast and accurate long-range phasing in a UK Biobank cohort.
Embedded
yes

Text

Second, we compute an approximate Viterbi decoding of a diploid HMM similar to the Li-Stephens model46 using the sets of local reference haplotypes found above. A path through the HMM consists of a sequence of state pairs (one maternal reference haplotype and one paternal reference haplotype) at each location; we score a path according to the number of transitions on the maternal side, the number of transitions on the paternal side, and the number (and types) of Mendel errors between the proband and surrogate parents. An exact Viterbi decoding of this HMM using dynamic programming requires O(MK3) time (for K2 state pairs and O(K) possible transitions per position), which is too expensive for us; instead, we perform the dynamic programming within a beam search, pruning the search space from K2 state pairs to the top P=100–200 state pairs at each location and thus limiting the complexity to O(MKP). We then phase the proband according to the approximate Viterbi path.