Chunk #11 — Results — In-sample imputation and GWAS imputation accuracy

Source: Fast and accurate long-range phasing in a UK Biobank cohort.
Embedded: yes

Text

We next investigated the utility of Eagle for genotype imputation. First, to project the imputation accuracy that will be achievable in the UK population using LRP-based methods once a reference panel of N≈150,000 sequenced UK samples becomes available (Supplementary Fig. 1), we performed in-sample imputation of masked genotypes in the UK Biobank data set (Online Methods and Supplementary Note). In these benchmarks (Supplementary Fig. 2 and Supplementary Tables 9–11), Eagle and SHAPEIT2 both achieved mean in-sample imputation R² > 0.75 down to a minor allele frequency of 0.1%. As in our switch error benchmarks (Table 1), Eagle was slightly more accurate than SHAPEIT2 run with default parameters and achieved accuracy similar to SHAPEIT2 run with K=200 states; compared to SHAPEIT2 10×15K, Eagle 1×150K was much more accurate (Supplementary Fig. 2 and Supplementary Table 9). In-sample imputation on N samples bears some similarities to standard GWAS phasing and imputation on a target sample using a reference panel of size N (as both tasks entail copying shared haplotypes, identified based on data at typed SNPs, from a set of N samples); however, the two tasks also