Chunk #25 — Methods — The phasing model for low coverage sequence data

Source
Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel.

Text

We use the SHAPEIT2 model for the terms $P(X_{\{1\}1}, X_{\{1\}2} \mid H)$ and $P(X_{\{s\}1}, X_{\{s\}2}, X_{\{s-1\}1}, X_{\{s-1\}2} \mid H)$. We do not give more details here since a complete description can be found in the SHAPEIT2 paper [8]. The genotype likelihoods enter the model in the term $P(R \mid X_1, X_2)$ as a product over all $L$ sites:

$$P(R \mid X_1, X_2) = \prod_{l=1}^{L} P(R \mid G_l = A_{lX_{l1}} + A_{lX_{l2}})$$

which implies that

$$P(R \mid X_{\{1\}1}, X_{\{1\}2}) = \prod_{l=b_1}^{e_1} P(R \mid X_{l1}, X_{l2})$$

$$P(R \mid X_{\{s\}1}, X_{\{s\}2}, X_{\{s-1\}1}, X_{\{s-1\}2}) = \prod_{l=b_{s-1}}^{e_s} P(R \mid X_{l1}, X_{l2})$$
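
The per-site product above can be illustrated with a minimal sketch. It is not the SHAPEIT2 implementation, and the names and data layout are assumptions made for illustration: `gl[l][g]` is taken to hold the genotype likelihood of the read data at site `l` for genotype `g` (0, 1 or 2 copies of the alternate allele), `alleles[l][k]` is read as the 0/1 allele carried at site `l` by the candidate haplotype with label `k`, and `begin` and `end` stand in for the segment boundaries written as b and e in the text. The function sums log-likelihoods rather than multiplying probabilities, which is a common way to avoid numerical underflow over many sites.

```python
import math


def segment_log_likelihood(gl, alleles, x1, x2, begin, end):
    """Log of the product over sites begin..end (inclusive) of
    P(R | G_l = alleles[l][x1[l]] + alleles[l][x2[l]]).

    All argument names and layouts are illustrative assumptions.
    """
    total = 0.0
    for l in range(begin, end + 1):
        # Genotype implied by the two haplotype labels at site l.
        genotype = alleles[l][x1[l]] + alleles[l][x2[l]]
        total += math.log(gl[l][genotype])
    return total


# Toy data: 3 sites, 2 candidate haplotypes (labels 0 and 1).
gl = [
    [0.70, 0.20, 0.10],
    [0.10, 0.80, 0.10],
    [0.05, 0.15, 0.80],
]
alleles = [
    [0, 1],
    [0, 1],
    [1, 1],
]
x1 = [0, 0, 0]  # label of the first haplotype at each site
x2 = [1, 1, 1]  # label of the second haplotype at each site

ll = segment_log_likelihood(gl, alleles, x1, x2, 0, 2)
```

In this toy case the implied genotypes are 1, 1 and 2, so the likelihood is the product 0.2 × 0.8 × 0.8. Restricting `begin` and `end` to a sub-range gives the contribution of a single segment or of two adjacent segments, mirroring the two restricted products in the text.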