Chunk #25 — Materials and Methods — Choice of conditioning states

Source: A flexible and accurate genotype imputation method for the next generation of genome-wide association studies.
Embedded: yes

Text

One solution, implemented in the phasing routine of the MACH software, is to use only a random subset of the available haplotypes for each update. For example, when sampling a new haplotype pair in Step 1 of our algorithm for Scenario A, we could build the conditional distribution from a random subset of k of the available haplotypes, rather than from all of them as in the default approach. This approximation to the model will generally decrease accuracy, but it also causes the computational burden of the phasing updates to increase linearly (for fixed k), rather than quadratically, with the number of chromosomes in the dataset.
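
The scaling argument above can be illustrated with a minimal Python sketch. It is not the MACH or IMPUTE implementation: the function names, the indexing of haplotypes as 2i and 2i+1 for individual i, and the cost model (one update costs an amount proportional to the number of conditioning haplotypes) are all assumptions made for illustration.

```python
import random


def choose_conditioning_states(n_haplotypes, own, k, rng):
    """Pick the haplotypes used to build the conditional distribution.

    Hypothetical helper: `own` holds the indices of the haplotypes that
    belong to the individual being updated, which are always excluded.
    With k=None (or k at least as large as the pool) every other
    haplotype is used; otherwise k are drawn at random without replacement.
    """
    candidates = [h for h in range(n_haplotypes) if h not in own]
    if k is None or k >= len(candidates):
        return candidates
    return rng.sample(candidates, k)


def conditioning_states_per_sweep(n_individuals, k=None, seed=0):
    """Total number of conditioning haplotypes used in one pass over all
    individuals. Under the assumed cost model this is a proxy for the
    work done in one round of phasing updates."""
    rng = random.Random(seed)
    n_haplotypes = 2 * n_individuals
    total = 0
    for i in range(n_individuals):
        own = {2 * i, 2 * i + 1}  # assumed layout: two haplotypes per individual
        total += len(choose_conditioning_states(n_haplotypes, own, k, rng))
    return total


if __name__ == "__main__":
    for n in (100, 200, 400):
        full = conditioning_states_per_sweep(n)          # all other haplotypes
        subset = conditioning_states_per_sweep(n, k=10)  # random subset, k = 10
        print(n, full, subset)
```

Under these assumptions, doubling the sample from 100 to 200 individuals raises the total from 19,800 to 79,600 (roughly fourfold) when all haplotypes are used, but only from 1,000 to 2,000 when k = 10.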