
Chunk #10 — Results — Phasing accuracy

Source
Fast and accurate long-range phasing in a UK Biobank cohort.

Text

Finally, we assessed the accuracy of analysis options for efficiently phasing N≈150,000 samples. In addition to Eagle (run on all N≈150,000 samples together, 1×150K), we considered batching approaches requiring up to 3× the running time of Eagle 1×150K. Based on our running time benchmarks (Fig. 2a and Supplementary Table 1), SHAPEIT2 or HAPI-UR analysis of the data in 10 batches of N≈15,000 samples (10×15K) satisfied this constraint. We benchmarked each method on three chromosome-scale tests: the short arm of chromosome 1 (26,695 SNPs), chromosome 10 (31,090 SNPs), and chromosome 20 (16,367 SNPs), amounting to 12% of the genome. Our results (Supplementary Table 8) confirmed our previous benchmarks (Fig. 2) and were consistent across chromosomes. In particular, we observed that Eagle analysis of all N≈150,000 samples together completed 3× faster than SHAPEIT2 10×15K analysis while achieving a 77% (1%) decrease in switch error rate: 0.31% (0.01%) for Eagle 1×150K vs. 1.35% (0.05%) for SHAPEIT2 10×15K (Supplementary Table 8).
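
The arithmetic behind the reported comparison can be checked with a short sketch. The Python below is illustrative only: `switch_error_rate` uses the common definition of switch error (phase flips between consecutive heterozygous sites, divided by the number of consecutive pairs), which is an assumption here because this paragraph does not describe the paper's exact evaluation procedure. The function names and the toy haplotypes are hypothetical; only the two error rates and the three SNP counts come from the text.

```python
def switch_error_rate(true_hap, inferred_hap):
    """Fraction of consecutive heterozygous-site pairs whose relative phase is wrong.

    Each argument lists the allele (0 or 1) carried by the first haplotype
    at every heterozygous site, in genomic order. This is the common
    textbook definition, assumed here rather than taken from the paper.
    """
    if len(true_hap) != len(inferred_hap):
        raise ValueError("haplotypes must cover the same heterozygous sites")
    # True where the inferred phase matches the truth at a site.
    agree = [t == i for t, i in zip(true_hap, inferred_hap)]
    # A switch occurs wherever agreement flips between neighbouring sites.
    switches = sum(a != b for a, b in zip(agree, agree[1:]))
    opportunities = len(agree) - 1
    return switches / opportunities if opportunities > 0 else 0.0


def relative_decrease(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`."""
    return 1.0 - improved / baseline


# Toy haplotypes (made up for illustration): two switches over five pairs.
toy_rate = switch_error_rate([0, 1, 0, 0, 1, 1], [0, 1, 1, 1, 0, 1])
print(toy_rate)  # 0.4

# Switch error rates quoted in the text, in percent.
decrease = relative_decrease(baseline=1.35, improved=0.31)
print(f"{decrease:.0%}")  # 77%

# SNP counts of the three chromosome-scale tests quoted in the text.
total_snps = 26_695 + 31_090 + 16_367
print(total_snps)  # 74152
```

The relative decrease, 1 − 0.31/1.35 ≈ 0.77, matches the 77% figure quoted for Eagle 1×150K against SHAPEIT2 10×15K.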