Chunk #46 — DISCUSSION

Source: MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes.
Embedded: yes

Text

Our method uses an HMM to describe genetic variation along each haplotype. When HMMs are applied to genetic data, there are many opportunities for computational efficiencies [Abecasis et al., 2002; Gudbjartsson et al., 2000; Idury and Elston, 1997; Kruglyak and Lander, 1998; Lander and Green, 1987]. In the Methods section we describe several optimizations that we have already implemented, including a general strategy for reducing the memory requirements of the Baum algorithm [Baum, 1972; Wheeler and Hughey, 2000]. We expect that further efficiencies will be forthcoming. Our model is implemented in the MaCH package (freely available, with C++ source code, from our website: http://www.sph.umich.edu/csg/abecasis/mach/). Our implementation can be used to carry out all the analyses described in this paper. Specifically, it can estimate haplotypes; impute missing genotypes in a variety of populations, using the HapMap sample or another set of densely genotyped individuals as a reference; analyze shotgun re-sequencing data from high-throughput technologies now being developed; and carry out simple tests of association.
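
One general way to reduce the memory needed by forward-backward computations on an HMM is checkpointing, the idea underlying reduced-space approaches such as those discussed by Wheeler and Hughey [2000]. The sketch below is a minimal Python illustration of that idea under stated assumptions, not the C++ code in MaCH: the function names (`posteriors_full`, `posteriors_checkpointed`), the block length of roughly the square root of the number of positions, and the toy two-state model are all hypothetical choices made for this example. It stores the forward vector only at block boundaries and recomputes the vectors inside each block during the backward sweep.

```python
import math

def normalize(v):
    total = sum(v)
    return [x / total for x in v]

def init_alpha(obs0, start, emit):
    # Forward vector at the first position, rescaled to sum to 1.
    return normalize([start[s] * emit[s][obs0] for s in range(len(start))])

def forward_step(alpha, obs_t, trans, emit):
    # alpha_t(j) is proportional to emit[j][obs_t] * sum_i alpha_{t-1}(i) * trans[i][j]
    n = len(alpha)
    return normalize([emit[j][obs_t] * sum(alpha[i] * trans[i][j] for i in range(n))
                      for j in range(n)])

def backward_step(beta, obs_t, trans, emit):
    # beta_{t-1}(i) is proportional to sum_j trans[i][j] * emit[j][obs_t] * beta_t(j)
    n = len(beta)
    return normalize([sum(trans[i][j] * emit[j][obs_t] * beta[j] for j in range(n))
                      for i in range(n)])

def posterior(alpha, beta):
    # Posterior state probabilities at one position.
    return normalize([a * b for a, b in zip(alpha, beta)])

def posteriors_full(obs, start, trans, emit):
    # Reference version: keeps all T forward vectors in memory.
    T = len(obs)
    alphas = [init_alpha(obs[0], start, emit)]
    for t in range(1, T):
        alphas.append(forward_step(alphas[-1], obs[t], trans, emit))
    beta = [1.0] * len(start)
    out = [None] * T
    for t in range(T - 1, -1, -1):
        out[t] = posterior(alphas[t], beta)
        if t > 0:
            beta = backward_step(beta, obs[t], trans, emit)
    return out

def posteriors_checkpointed(obs, start, trans, emit):
    # Keeps about sqrt(T) checkpoints plus one block of about sqrt(T) vectors.
    # The posteriors are collected in a list only so they can be compared;
    # a memory-conscious caller would consume each one as it is produced.
    T = len(obs)
    k = max(1, math.isqrt(T))  # block length (an assumed, simple choice)
    alpha = init_alpha(obs[0], start, emit)
    checkpoints = {0: alpha}
    for t in range(1, T):
        alpha = forward_step(alpha, obs[t], trans, emit)
        if t % k == 0:
            checkpoints[t] = alpha
    beta = [1.0] * len(start)
    out = [None] * T
    for block_start in range(((T - 1) // k) * k, -1, -k):
        block_end = min(block_start + k, T)
        # Recompute the forward vectors inside this block from its checkpoint.
        block = [checkpoints[block_start]]
        for t in range(block_start + 1, block_end):
            block.append(forward_step(block[-1], obs[t], trans, emit))
        for t in range(block_end - 1, block_start - 1, -1):
            out[t] = posterior(block[t - block_start], beta)
            if t > 0:
                beta = backward_step(beta, obs[t], trans, emit)
    return out

# Toy two-state model (hypothetical values, for illustration only).
start = [0.6, 0.4]
trans = [[0.9, 0.1], [0.2, 0.8]]
emit = [[0.7, 0.3], [0.1, 0.9]]
obs = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0]

full = posteriors_full(obs, start, trans, emit)
ckpt = posteriors_checkpointed(obs, start, trans, emit)
max_diff = max(abs(a - b) for fa, fc in zip(full, ckpt) for a, b in zip(fa, fc))
```

Both functions should give the same posterior state probabilities. The checkpointed version recomputes each forward vector once, so it roughly doubles the forward-pass work in exchange for working memory that grows with the square root of the sequence length rather than linearly.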