Chunk #7 — RESULTS — Evaluation of imputation accuracy using sequence data

Source
Fast and accurate genotype imputation in genome-wide association studies through pre-phasing.

Text

One caveat to the comparisons above is that SNPs on GWAS arrays tend to be more common (e.g., see Supplementary Fig. 1) and easier to impute than unascertained SNPs [2]. We addressed this issue by performing a cross-validation in the EUR panel of 1000 Genomes Phase I, which includes a more complete set of SNPs discovered by low-pass whole-genome and high-pass exome sequencing in >1,000 individuals. For each of the 381 EUR individuals in turn, we masked genotypes on chromosome 10 at all sites except those included on Affymetrix 500k SNP arrays, then imputed the missing sites using the Affymetrix 500k scaffold and the remaining 760 EUR haplotypes. To mimic pre-phasing in a GWAS, we reduced the EUR dataset to sites present on the array scaffold, re-phased the genotypes, and then used these estimated haplotypes when imputing masked genotypes for a given individual.
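
The leave-one-out masking described above can be illustrated with a minimal Python sketch. It covers only the bookkeeping: hiding a target individual's genotypes at non-scaffold sites and collecting the other individuals' haplotypes as the reference panel. The names `mask_to_scaffold` and `leave_one_out`, the 0/1 allele coding, and the toy panel are assumptions made for illustration and do not come from the paper; the re-phasing and imputation steps themselves are not implemented here.

```python
from typing import Iterator, List, Optional, Set, Tuple

Haplotype = List[int]                      # alleles coded 0/1, one entry per site
Individual = Tuple[Haplotype, Haplotype]   # the two haplotypes of one person


def mask_to_scaffold(genotype: List[int],
                     scaffold_sites: Set[int]) -> List[Optional[int]]:
    """Keep genotypes at array (scaffold) sites; hide all others as None."""
    return [g if site in scaffold_sites else None
            for site, g in enumerate(genotype)]


def leave_one_out(
    panel: List[Individual],
    scaffold_sites: Set[int],
) -> Iterator[Tuple[int, List[Optional[int]], List[int], List[Haplotype]]]:
    """Yield one fold per individual: (index, masked genotype, true genotype,
    reference haplotypes taken from every other individual)."""
    for target, (hap_a, hap_b) in enumerate(panel):
        # Unphased genotype = allele count (0, 1 or 2) at each site.
        truth = [a + b for a, b in zip(hap_a, hap_b)]
        masked = mask_to_scaffold(truth, scaffold_sites)
        # With N individuals the reference holds 2 * (N - 1) haplotypes.
        reference = [hap
                     for other, pair in enumerate(panel) if other != target
                     for hap in pair]
        yield target, masked, truth, reference


# Toy panel: 3 individuals, 4 sites; sites 0 and 2 play the role of the array.
toy_panel = [
    ([0, 1, 0, 1], [1, 1, 0, 0]),
    ([0, 0, 0, 1], [0, 1, 1, 1]),
    ([1, 1, 0, 0], [1, 0, 0, 0]),
]
folds = list(leave_one_out(toy_panel, scaffold_sites={0, 2}))
```

In each fold, an imputation tool would fill in the `None` entries from the reference haplotypes, and the result would be compared against the held-out `truth`. The same bookkeeping gives the count in the text: 381 individuals leave 2 × 380 = 760 reference haplotypes per fold.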