Chunk #11 — Results — In-sample imputation and GWAS imputation accuracy

Source
Fast and accurate long-range phasing in a UK Biobank cohort.

Text

We next investigated the utility of Eagle for genotype imputation. First, to project the imputation accuracy that will be achievable in the UK population using LRP-based methods once a reference panel of N≈150,000 sequenced UK samples becomes available (Supplementary Fig. 1), we performed in-sample imputation of masked genotypes in the UK Biobank data set (Online Methods and Supplementary Note). In these benchmarks (Supplementary Fig. 2 and Supplementary Tables 9–11), Eagle and SHAPEIT2 both achieved mean in-sample imputation R² > 0.75 down to a minor allele frequency of 0.1%. As in our switch error benchmarks (Table 1), Eagle was slightly more accurate than SHAPEIT2 run with default parameters and achieved accuracy similar to SHAPEIT2 run with K=200 states; compared to SHAPEIT2 10×15K, Eagle 1×150K was much more accurate (Supplementary Fig. 2 and Supplementary Table 9). In-sample imputation on N samples bears some similarities to standard GWAS phasing and imputation on a target sample using a reference panel of size N (as both tasks entail copying shared haplotypes, identified based on data at typed SNPs, from a set of N samples); however, the two tasks also