Chunk #9 — Introduction

Source: A flexible and accurate genotype imputation method for the next generation of genome-wide association studies.
Embedded: yes

Text

We apply the same inference principles to Scenario B as to Scenario A: at each MCMC iteration we phase all of the observed data, pooling information across samples typed on common sets of SNPs to estimate each haplotype pair, then perform haploid imputation assuming that all of the haplotype guesses are correct. One novelty of this scenario is that, at SNPs in U2, the reference panel may contain thousands of chromosomes, in contrast to HapMap Phase II panels that contain only 120–180 chromosomes each. In principle, this added depth should improve imputation accuracy at SNPs in U2, with notable gains at rare SNPs. The latter point is especially relevant because rare SNPs are an important source of power in imputation analyses [5],[6]. Scenario B also introduces the problem of multiple reference panels genotyped on different, hierarchical sets of SNPs. Many next-generation imputation datasets will follow this paradigm, which presents modeling challenges that remain largely unexplored.
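The nesting of SNP sets across panels can be made concrete with a small bookkeeping sketch. It is an illustration only, not the authors' implementation: the `Panel` class, the rsIDs, and the panel sizes are hypothetical. It also assumes that T denotes SNPs typed in every dataset and U1 denotes SNPs typed only in the small haploid panel; only U2 is named in the passage above. The sketch counts how many reference chromosomes are available at each SNP and checks that the SNP sets are nested.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Panel:
    """A genotype dataset: a name, its chromosome count, and the SNPs it is typed on."""
    name: str
    n_chromosomes: int
    snps: frozenset


def reference_depth(snp, panels):
    """Total number of chromosomes, summed over panels, that are typed at `snp`."""
    return sum(p.n_chromosomes for p in panels if snp in p.snps)


def is_hierarchical(panels):
    """True if the panels' SNP sets are nested (each one a subset of the next larger)."""
    ordered = sorted(panels, key=lambda p: len(p.snps))
    return all(a.snps <= b.snps for a, b in zip(ordered, ordered[1:]))


# Hypothetical SNP sets (assumed notation; only U2 is named in the text).
T = frozenset({"rs1", "rs2"})   # typed in every dataset
U2 = frozenset({"rs3"})         # typed in both reference panels, not in the study
U1 = frozenset({"rs4"})         # typed only in the small haploid panel

# Hypothetical panel sizes, chosen to mirror the contrast described in the text.
haploid_ref = Panel("haploid_reference", 120, T | U1 | U2)
diploid_ref = Panel("diploid_reference", 2000, T | U2)
study = Panel("study", 1000, T)

references = [haploid_ref, diploid_ref]
depth_by_snp = {snp: reference_depth(snp, references) for snp in sorted(T | U1 | U2)}
nested = is_hierarchical([haploid_ref, diploid_ref, study])
```

Under these assumed sizes, a SNP in U2 is supported by both reference panels, whereas a SNP in U1 is supported only by the small haploid panel. This difference in depth is what motivates the expected accuracy gain at U2.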