
Chunk #18 — Discussion

Source
Reference-based phasing using the Haplotype Reference Consortium panel.

Text

We note that Eagle2 targets a distinct user group compared to very recent work on phasing very large cohorts (refs. 13, 14). In particular, our Eagle1 method (ref. 13) is targeted at phasing very large (N > 100,000) cohorts and achieves much lower accuracy than both Eagle2 and previous methods when used to phase smaller cohorts. The SHAPEIT3 method (ref. 14) is likewise targeted at phasing “biobank scale datasets.” The information provided in the paper describing SHAPEIT3 (ref. 14) indicates that its primary advance is removing a quadratic complexity component of the SHAPEIT2 algorithm that becomes significant as N increases beyond 10,000 samples; this computational speedup comes at the expense of reduced accuracy. The benchmarks in ref. 14 suggest that if used to perform HRC-based phasing at Nref = 32,470, SHAPEIT3 would be ≈3x faster but roughly 20% less accurate than SHAPEIT2; in contrast, Eagle2 is ≈20x faster and ≈10% more accurate than SHAPEIT2 at this sample size. (In practice, the SHAPEIT license precludes its use for reference-based phasing on the Sanger and Michigan HRC servers.)
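
To make the scaling argument concrete, the short sketch below works through the arithmetic implied by the figures quoted above. It is illustrative only. The helper `growth_factor` and all variable names are hypothetical, and the Eagle2-versus-SHAPEIT3 ratio assumes that both quoted speedups are measured against a comparable SHAPEIT2 baseline, which the text suggests but which rests on separate benchmarks.

```python
# Illustrative arithmetic only: every number is an approximate figure quoted
# in the text; no new measurements are introduced.

def growth_factor(n_from: int, n_to: int, exponent: int) -> float:
    """Factor by which a cost term proportional to N**exponent grows
    when the sample size increases from n_from to n_to."""
    return (n_to / n_from) ** exponent

N_THRESHOLD = 10_000  # sample size beyond which the quadratic term is said to matter
N_REF = 32_470        # HRC reference panel size quoted in the text

linear_growth = growth_factor(N_THRESHOLD, N_REF, 1)
quadratic_growth = growth_factor(N_THRESHOLD, N_REF, 2)

# Approximate speedups relative to SHAPEIT2, as quoted in the text.
SHAPEIT3_SPEEDUP = 3.0
EAGLE2_SPEEDUP = 20.0

# Assumption: both speedups refer to a comparable SHAPEIT2 baseline.
eagle2_vs_shapeit3 = EAGLE2_SPEEDUP / SHAPEIT3_SPEEDUP

print(f"linear term grows {linear_growth:.2f}x")
print(f"quadratic term grows {quadratic_growth:.2f}x")
print(f"implied Eagle2 vs SHAPEIT3 speed ratio: about {eagle2_vs_shapeit3:.1f}x")
```

Under these assumptions, a cost term quadratic in N grows roughly tenfold between N = 10,000 and N = 32,470, while a linear term grows only about threefold, which is consistent with the quadratic component becoming the dominant cost at HRC scale.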