Chunk #63 — Results — Scenario B — Accuracy comparison on full dataset

Source: A flexible and accurate genotype imputation method for the next generation of genome-wide association studies.
Embedded: yes

Text

One important point about Figure 5 is that the IMPUTE v2 curve is in exactly the same place as in Figure 4. It follows from our modeling approach that simply adding SNPs to the set U1, as we have done here, will not affect the imputation of SNPs in U2. By contrast, Figure 5 shows that adding SNPs to U1 actually makes BEAGLE's imputation results worse at SNPs in U2: between the restricted and full datasets, the best-guess discordance increased from 3.46% to 4.01% in panel A and from 0.93% to 1.04% in panel B. We observed a similar decline in accuracy at rare SNPs, which are not shown separately in Figure 5. Hence, in the full Scenario B dataset, which we regard as a more realistic application of these methods, IMPUTE v2 achieves a best-guess discordance that is 15–18% smaller than BEAGLE's. In the Discussion, we propose an explanation for the change in BEAGLE's results between the full and restricted datasets.
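
To put the change in BEAGLE's results in proportional terms, the discordance values quoted above can be converted into relative increases. The sketch below is illustrative arithmetic only: the helper name `relative_change` is an assumption made for this example, and the resulting percentages are derived here rather than reported in the text. They describe BEAGLE's restricted-versus-full change and should not be confused with the 15–18% gap between IMPUTE v2 and BEAGLE, which cannot be recomputed from this passage because the IMPUTE v2 discordance values are not quoted in it.

```python
def relative_change(before: float, after: float) -> float:
    """Return the change from `before` to `after` as a fraction of `before`."""
    return (after - before) / before


# BEAGLE best-guess discordance (%) at U2 SNPs, as quoted in the text:
# (restricted dataset, full dataset)
beagle_discordance = {
    "panel A": (3.46, 4.01),
    "panel B": (0.93, 1.04),
}

for panel, (restricted, full) in beagle_discordance.items():
    change = relative_change(restricted, full)
    print(f"{panel}: {restricted}% -> {full}% ({change:+.1%} relative)")
```

Under these numbers, BEAGLE's best-guess discordance rises by roughly 16% in panel A and roughly 12% in panel B when SNPs are added to U1.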