Chunk #14 — Methods — Imputation Performance Metrics

Source: Assessment of genotype imputation performance using 1000 Genomes in African American studies.
Embedded: yes

Text

observed agreement between imputed and true genotypes (i.e., the concordance rate) and subtracting chance agreement, computed as the sum of products of the marginal genotype frequencies that would be expected if genotypes were called at random [30]. Because chance agreement is high when a single genotype predominates, the IQS metric is particularly useful for evaluating imputation accuracy at low-frequency SNPs. Concordance and IQS results were averaged across all masked SNPs and then across the 10 different sets of randomly masked SNPs. Third, we calculated r2hat (the estimated squared correlation between each imputed genotype and its true underlying genotype) from the genotype dosage values and then averaged the r2hat values across all polymorphic imputed SNPs. Each software program generates its own imputation quality metric for each SNP (info for IMPUTE2, allelic r2 for BEAGLE, and r2 for MaCH and MaCH-Admix), as reviewed by Marchini and Howie [2]. These program-specific metrics are highly correlated [2], but because they are defined differently, we calculated r2hat (script available at http://www.sph.umich.edu/csg/yli/software.html) to obtain a single, common metric for assessing imputation quality across the programs.
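
To make the chance correction concrete, the following is a minimal sketch of the per-SNP calculations, not the authors' code or the cited script. It assumes genotypes are coded as 0, 1, or 2 copies of one allele and that IQS takes the usual chance-corrected form (observed agreement minus chance agreement, divided by one minus chance agreement). The function names `concordance_and_iqs` and `dosage_r2hat` are hypothetical. The r2hat function uses the variance-ratio definition reported for MaCH (observed dosage variance over the expected binomial variance); whether the cited script computes exactly this quantity is an assumption.

```python
from collections import Counter


def concordance_and_iqs(true_genotypes, imputed_genotypes):
    """Return (concordance, IQS) for one SNP from best-guess genotype calls.

    Assumes genotypes are coded 0/1/2 and both lists cover the same samples.
    """
    if len(true_genotypes) != len(imputed_genotypes) or not true_genotypes:
        raise ValueError("need two non-empty genotype lists of equal length")
    n = len(true_genotypes)

    # Observed agreement: fraction of samples whose imputed call matches truth.
    p_obs = sum(t == i for t, i in zip(true_genotypes, imputed_genotypes)) / n

    # Chance agreement: sum over genotype classes of the product of the
    # marginal frequencies (true marginal times imputed marginal).
    true_counts = Counter(true_genotypes)
    imputed_counts = Counter(imputed_genotypes)
    p_chance = sum(true_counts[g] * imputed_counts[g] for g in true_counts) / n**2

    if p_chance == 1.0:
        # Both sets are monomorphic for the same genotype: IQS is undefined.
        return p_obs, float("nan")
    return p_obs, (p_obs - p_chance) / (1.0 - p_chance)


def dosage_r2hat(dosages):
    """Variance-ratio r2hat for one SNP from dosages in [0, 2].

    Assumption: observed dosage variance divided by 2p(1-p), where p is the
    allele frequency estimated from the mean dosage.
    """
    n = len(dosages)
    mean = sum(dosages) / n
    p = mean / 2.0
    expected_var = 2.0 * p * (1.0 - p)
    if expected_var == 0.0:
        # Monomorphic SNP: excluded from averaging in the text above.
        return float("nan")
    observed_var = sum((d - mean) ** 2 for d in dosages) / n
    return observed_var / expected_var


# Illustrative rare SNP (made-up data): 2 heterozygotes among 100 samples,
# with every sample imputed as homozygous for the major allele.
truth = [0] * 98 + [1] * 2
imputed = [0] * 100
concordance, iqs = concordance_and_iqs(truth, imputed)
print(concordance, iqs)
```

In this made-up example the concordance is 0.98 even though no carrier of the rare allele was imputed correctly, whereas IQS is 0 because the chance agreement is also 0.98. Averaging across masked SNPs, and then across masked sets, would be a plain mean over these per-SNP values.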