Chunk #7 — Genotype Imputation

Source: Critical Issues in the Inclusion of Genetic and Epigenetic Information in Prevention and Intervention Trials.
Embedded: yes

Text

There are numerous approaches to the direct imputation of genotypes. Initially, the goal of such approaches was to exploit linkage disequilibrium to allow the testing of ungenotyped markers (Clark & Li, 2007). For example, a known functional SNP that was not genotyped may be highly correlated with multiple nearby genotyped SNPs. Given an accurate reference panel as the source of the correlations between markers, the ungenotyped marker can be imputed and tested for association (Clark & Li, 2007). As GWAS data became widely available, the use of imputation grew, with the goal of creating common marker sets that allow meta-analysis of datasets genotyped on different GWAS panels. These approaches rely on assigning genotypes that exceed a specified level of certainty and then analyzing the resulting inferred genotypes with standard methods (Lin & Huang, 2007). Genome-wide SNP imputation is commonly performed using Impute2 (Howie, Donnelly, & Marchini, 2009) or MACH (Li, Willer, Ding, Scheet, & Abecasis, 2010) on data that have been pre-phased using SHAPEIT (Delaneau, Zagury, & Marchini, 2013). Pre-phasing refers to the computational process of constructing haplotypes, that is, the ordered sets of alleles carried together along a single chromosome, prior to imputation.
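
The step of assigning genotypes above a specified level of certainty can be illustrated with a minimal Python sketch. It is not the method of any particular imputation program. It assumes that each imputed marker comes with three posterior genotype probabilities (zero, one, or two copies of the alternate allele), and the function name `call_genotype`, the 0.9 threshold, and the example probabilities are all hypothetical choices made for illustration.

```python
from typing import Optional, Sequence


def call_genotype(probs: Sequence[float], threshold: float = 0.9) -> Optional[int]:
    """Return the most probable alternate-allele count (0, 1, or 2),
    or None when no genotype reaches the certainty threshold.

    The 0.9 default is an illustrative assumption, not a recommended value.
    """
    if len(probs) != 3:
        raise ValueError("expected three genotype probabilities")
    # Index of the genotype with the highest posterior probability.
    best = max(range(3), key=lambda g: probs[g])
    # Keep the call only if it is certain enough; otherwise treat it as missing.
    return best if probs[best] >= threshold else None


# Hypothetical posterior probabilities for three individuals at one marker.
posteriors = [
    (0.98, 0.02, 0.00),  # confidently homozygous reference
    (0.05, 0.93, 0.02),  # confidently heterozygous
    (0.40, 0.55, 0.05),  # too uncertain to call
]

calls = [call_genotype(p) for p in posteriors]
print(calls)  # [0, 1, None]
```

Calls returned as `None` would be treated as missing data in the subsequent association analysis, so a stricter threshold trades sample size for genotype accuracy.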