Chunk #2 — Introduction

Source: A flexible and accurate genotype imputation method for the next generation of genome-wide association studies.
Embedded: yes

Text

To date, most imputation analyses have used reference panels composed of haplotypes from Phase II of the International HapMap Project, together with study samples genotyped on commercial genome-wide SNP arrays. Figure 1 depicts this arrangement, which we call Scenario A. To understand how imputation methods work in this setting, it helps to observe that the SNPs exist in a natural hierarchy, such that they can be partitioned into two disjoint sets: a set T that is typed in both the study sample and the reference panel, and a set U that is untyped in the study sample but typed in the reference panel. Informally, most imputation methods phase the study genotypes at SNPs in T and look for perfect or near matches between the resulting haplotypes and the corresponding partial haplotypes in the reference panel; haplotypes that match at SNPs in T are assumed to also match at SNPs in U. This assumption is the fundamental basis of genotype imputation.
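
The matching idea can be illustrated with a deliberately simplified sketch. It is not the model used by any particular imputation method. It assumes a single study haplotype that has already been phased, alleles coded as 0/1, and a plain mismatch count as the notion of a "near match". The function name `impute_by_matching` and the data layout are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def impute_by_matching(study_alleles: Dict[int, int],
                       reference: List[List[int]]) -> List[float]:
    """Toy imputation of one phased study haplotype (illustrative only).

    study_alleles maps SNP index -> observed allele (0/1); its keys form set T.
    reference holds full haplotypes typed at every SNP (T and U together).
    Returns one value per SNP: the observed allele at SNPs in T, and the mean
    allele among the best-matching reference haplotypes at SNPs in U.
    """
    typed = sorted(study_alleles)  # set T

    def mismatches(ref_hap: List[int]) -> int:
        # Count disagreements with the study haplotype at SNPs in T only.
        return sum(ref_hap[i] != study_alleles[i] for i in typed)

    # Keep every reference haplotype that attains the smallest mismatch count.
    best = min(mismatches(h) for h in reference)
    matches = [h for h in reference if mismatches(h) == best]

    imputed = []
    for i in range(len(reference[0])):
        if i in study_alleles:
            # SNP in T: keep the observed allele.
            imputed.append(float(study_alleles[i]))
        else:
            # SNP in U: copy (average) the alleles carried by the matches.
            imputed.append(sum(h[i] for h in matches) / len(matches))
    return imputed


# Hypothetical panel of three haplotypes over five SNPs; T = {0, 2, 4}, U = {1, 3}.
reference_panel = [
    [0, 1, 0, 1, 1],
    [0, 0, 0, 0, 1],
    [1, 1, 1, 1, 0],
]
unique_match = impute_by_matching({0: 1, 2: 1, 4: 0}, reference_panel)
tied_match = impute_by_matching({0: 0, 2: 0, 4: 1}, reference_panel)
```

In the first call only the third reference haplotype matches at T, so its alleles are copied at U. In the second call two reference haplotypes match equally well at T but disagree at U, so the sketch reports an intermediate value there, which hints at why practical methods weigh matches probabilistically rather than picking a single one.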