Chunk #26 — Methods — Initialization and MCMC iterations

Source: Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel.
Embedded: yes

Text

The experience of the 1000GP analysis group is that HMM-based phasing approaches such as Thunder and Impute2 are slow to converge when applied to low-coverage sequence data if the starting haplotype estimates are initialised randomly. The Beagle method has been observed not to have this property, and Thunder and Impute2 benefit from using an initial set of haplotypes estimated by Beagle. The 1000GP Phase 1 haplotypes were estimated in this way: Beagle was run first, and the resulting haplotypes were then used as initial estimates in the Thunder model [1].