Chunk #5 — INITIAL EVALUATION OF IMPUTED GENOTYPES AND HAPLOTYPES — HAPLOTYPING

Source: MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes.
Embedded: yes

Text

and tallied three measures of haplotyping quality [Marchini et al., 2006]: (1) the number of incorrectly imputed missing genotypes; (2) among heterozygous sites, the number of consecutive sites that are phased incorrectly with respect to each other (that is, the number of “flips” required to transform the estimated haplotypes into the true haplotypes, after masking incorrectly imputed sites); and (3) the number of perfectly inferred haplotypes. The three measures were averaged over all 100 regions, and the results are summarized in Table I. For comparison, the table also includes results for PHASE [Stephens and Scheet, 2005b; Stephens et al., 2001] and fastPHASE [Scheet and Stephens, 2006], two state-of-the-art haplotyping algorithms [Marchini et al., 2006], and for BEAGLE [Browning, 2006] and PL-EM [Qin et al., 2002], two alternative haplotyping algorithms that are very computationally efficient. Table I shows that our method is competitive on all three measures: compared with PHASE, the second-best method, it yields slightly fewer incorrectly imputed genotypes, requires slightly fewer flips to transform imputed haplotypes into the true haplotypes, and produces slightly more haplotypes that are correctly inferred over the entire 1 Mb stretch. Furthermore, note that estimates of haplotypes and missing genotypes