
Chunk #14 — MATERIALS AND METHODS — IMPUTATION ACCURACY

Source
Practical considerations for imputation of untyped markers in admixed populations.
Embedded
yes

Text

Each program returns the full probability distribution of the imputed genotypes at each SNP for each individual. We generated discrete imputed genotypes by accepting a call if the posterior probability for a genotype reached a pre-specified threshold and recording the genotype as missing otherwise. For the phase II data, 10,224 of the 10,788 SNPs experimentally genotyped on chromosome 22 were present in the combined reference panel (Fig. 1). We masked the experimentally determined genotypes for a randomly selected 2% of these SNPs in the study sample, yielding ~200 masked SNPs for each of the three reference panels (phase II CEU, phase II YRI, and phase II CEU+YRI). Similarly, when using the phase III reference data, we masked 200 of the SNPs in the study sample. Because both experimentally determined and imputed genotypes are called with some degree of error, we cannot know which call (if either) is correct when the two disagree, so we report concordance rather than accuracy. Concordance was defined as the proportion of genotype calls for which both imputed alleles matched the experimentally determined genotype call for a SNP, averaged over all masked SNPs. The genotype error rate was defined as one minus the concordance.
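The calling, masking, and concordance steps described above can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the 0/1/2 genotype coding, the 0.9 threshold, and the random seed are all hypothetical choices made for illustration. The sketch also assumes that genotypes left uncalled (missing) are excluded from the concordance denominator, which is one reading of "proportion of genotype calls".

```python
import random


def call_genotype(posteriors, threshold=0.9):
    """Return the index (0, 1 or 2) of the most probable genotype if its
    posterior reaches the threshold, otherwise None (missing).
    The 0.9 default is a placeholder, not a value taken from the text."""
    best = max(range(len(posteriors)), key=lambda g: posteriors[g])
    return best if posteriors[best] >= threshold else None


def select_masked_snps(snp_ids, fraction=0.02, seed=0):
    """Randomly pick a fraction of SNPs whose observed genotypes are hidden."""
    rng = random.Random(seed)
    n_masked = round(len(snp_ids) * fraction)
    return sorted(rng.sample(snp_ids, n_masked))


def snp_concordance(imputed, observed):
    """Proportion of calls at one SNP where the imputed genotype equals the
    experimental genotype. Pairs with a missing call on either side are
    skipped (an assumption of this sketch)."""
    pairs = [(i, o) for i, o in zip(imputed, observed)
             if i is not None and o is not None]
    if not pairs:
        return None
    return sum(i == o for i, o in pairs) / len(pairs)


def mean_concordance(imputed_by_snp, observed_by_snp):
    """Average the per-SNP concordance over all masked SNPs with any calls."""
    per_snp = [snp_concordance(imputed_by_snp[s], observed_by_snp[s])
               for s in imputed_by_snp]
    per_snp = [c for c in per_snp if c is not None]
    return sum(per_snp) / len(per_snp)


# Toy example: two masked SNPs, two individuals each.
imputed = {"snp_a": [0, 1], "snp_b": [2, 2]}
observed = {"snp_a": [0, 1], "snp_b": [2, 1]}
concordance = mean_concordance(imputed, observed)
error_rate = 1 - concordance
```

With unphased genotypes coded as allele counts, "both imputed alleles matched" reduces to equality of the two codes, which is why a simple comparison is used here. Raising the threshold in `call_genotype` trades more missing calls for fewer discordant ones.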