
Chunk #31 — Methods — Initialization and MCMC iterations

Source
Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel.

Text

Finally, to complete the model, we use only a subset of all available haplotypes when updating each individual, as done in SHAPEIT2. We used a carefully chosen subset of K1 = 400 haplotypes that most closely match the haplotypes of the individual being updated [10]. The haplotype matching is carried out on overlapping windows of size W = 0.1 Mb. We also found it useful to add a set of K2 = 200 randomly chosen haplotypes to help the mixing of the MCMC, giving K = 600 conditioning haplotypes in total. Using such a large number of conditioning haplotypes is practical because the complexity of SHAPEIT2 is linear in K.
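
The selection step described above can be illustrated with a minimal sketch for a single window. It is not the SHAPEIT2 implementation. The function names, the use of Hamming distance as the matching criterion, the choice to measure each candidate against the nearer of the individual's two haplotypes, and the toy sizes are all assumptions made for illustration. The text itself specifies only K1 = 400 closest haplotypes, K2 = 200 random ones, and matching on overlapping 0.1 Mb windows.

```python
import random


def hamming(a, b):
    """Number of sites at which two equal-length allele sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def select_conditioning(own_haps, panel, k_closest, k_random, rng):
    """Return indices into `panel` for one window: closest matches, then random extras.

    own_haps : current haplotype estimates of the individual being updated
    panel    : candidate haplotypes from the other individuals, same window
    (Hypothetical helper; names and signature are not from the source.)
    """
    # Distance of each candidate to the nearer of the individual's haplotypes.
    # Hamming distance is an assumption of this sketch.
    dist = [min(hamming(h, own) for own in own_haps) for h in panel]
    # sorted() is stable, so ties are broken by panel order.
    order = sorted(range(len(panel)), key=lambda i: dist[i])
    closest = order[:k_closest]
    # Draw the random extras only from candidates not already chosen.
    remaining = order[k_closest:]
    extras = rng.sample(remaining, min(k_random, len(remaining)))
    return closest + extras


# Toy window: 6 candidate haplotypes over 5 sites
# (the text uses K1 = 400 and K2 = 200 instead of 2 and 2).
own = [[0, 0, 1, 1, 0], [1, 0, 1, 0, 0]]
panel = [
    [0, 0, 1, 1, 0],  # identical to own[0]
    [1, 1, 0, 1, 1],
    [1, 0, 1, 0, 1],  # one mismatch with own[1]
    [0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
]
chosen = select_conditioning(own, panel, k_closest=2, k_random=2, rng=random.Random(0))
```

In this toy case the first two entries of `chosen` are the two best-matching candidates (indices 0 and 2), and the last two are drawn at random from the rest. In a full run the same selection would presumably be repeated for each overlapping window, so the conditioning set can change along the chromosome.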