Chunk #61 — Online Methods — Step 3: Approximate HMM decoding

Source
Fast and accurate long-range phasing in a UK Biobank cohort.
Embedded
yes

Text

First, we compile a set of reference haplotypes for the proband for each SNP block. This procedure begins analogously to the first component of step 2, identifying long haplotype matches using a fast O(MN) search within a seed-and-extend framework. To ensure that both maternal and paternal surrogates are represented among the reference haplotypes, we augment the set of long haplotype matches with complementary haplotypes found using LSH. In total, we store K≤80 reference haplotypes per block.
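The selection logic described above can be illustrated with a minimal sketch. It is not the authors' implementation: a brute-force scan stands in for both the fast O(MN) seed-and-extend search and the LSH lookup, and the function names, the `n_long` parameter, and the 0/1/2 genotype encoding are assumptions made for illustration. Only the cap of 80 haplotypes per block comes from the text.

```python
import numpy as np

K_MAX = 80  # cap on stored reference haplotypes per block (from the text)


def longest_compatible_run(genotype, hap):
    """Longest run of sites where `hap` could be one of the two haplotypes
    underlying the diploid `genotype` (0/1/2 alternate-allele counts)."""
    incompatible = ((genotype == 0) & (hap == 1)) | ((genotype == 2) & (hap == 0))
    best = run = 0
    for bad in incompatible:
        run = 0 if bad else run + 1
        best = max(best, run)
    return best


def compile_reference_set(genotype, panel, n_long=40, k_max=K_MAX):
    """Return indices of up to `k_max` panel haplotypes for one SNP block.

    Hypothetical helper: brute force replaces seed-and-extend and LSH.
    """
    genotype = np.asarray(genotype)
    panel = np.asarray(panel)

    # 1) Long haplotype matches: rank panel haplotypes by longest compatible run.
    runs = np.array([longest_compatible_run(genotype, h) for h in panel])
    order = np.argsort(-runs, kind="stable")
    chosen = [int(i) for i in order[:min(n_long, k_max)]]
    chosen_set = set(chosen)

    # 2) Complementary haplotypes: for each long match h, look for the panel
    #    haplotype closest (Hamming distance) to genotype - h, so that the
    #    other parent's surrogate is also represented.
    for i in list(chosen):
        if len(chosen) >= k_max:
            break
        complement = np.clip(genotype - panel[i], 0, 1)
        dist = (panel != complement).sum(axis=1)
        for j in np.argsort(dist, kind="stable"):
            j = int(j)
            if j not in chosen_set:
                chosen.append(j)
                chosen_set.add(j)
                break
    return chosen


# Toy block: 6 SNPs, 4 panel haplotypes (made-up data for illustration).
toy_genotype = np.array([1, 1, 0, 2, 1, 0])
toy_panel = np.array([
    [1, 0, 0, 1, 1, 0],  # compatible with the genotype at every site
    [0, 1, 0, 1, 0, 0],  # exact complement of the haplotype above
    [1, 1, 1, 0, 1, 1],
    [0, 0, 1, 0, 0, 1],
])
toy_refs = compile_reference_set(toy_genotype, toy_panel, n_long=1, k_max=2)
```

In the toy call, one long match is selected first and its complement is then added, so the returned set covers both haplotypes implied by the genotype. A real implementation would replace both brute-force scans with the indexed searches named in the text.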