Chunk #34 — Methods — 1000GP phase 1 low coverage sequence data

Source: Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel.
Embedded: yes

Text

Beagle was run with 20 iterations instead of the default 10; all other settings were left at their defaults. SHAPEIT2 was run with 78 iterations: 12 stages of 4 pruning iterations (48 in total) plus 30 main iterations. The estimation was carried out in windows of size W = 0.1 Mb, using k = 600 conditioning haplotypes: 400 chosen by Hamming distance and 200 chosen at random. All computations were performed on a cluster of ~1000 CPU nodes. SHAPEIT2 and Beagle required ~289 and ~99 CPU months, respectively, to phase the whole-genome 1000GP Phase 1 data set.
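
The composition of the conditioning set and the iteration count can be illustrated with a minimal Python sketch. It is not SHAPEIT2's implementation: the function names, the toy panel of random 0/1 haplotypes, and the reading of "chosen by Hamming distance" as "smallest distance to the target haplotype" are all assumptions made for illustration, and the sketch ignores the per-window nature of the real selection.

```python
import random


def hamming(a, b):
    """Number of positions at which two equal-length haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def choose_conditioning(target, panel, n_nearest, n_random, rng):
    """Pick indices of conditioning haplotypes from `panel` (hypothetical helper).

    The first `n_nearest` indices are the panel haplotypes with the smallest
    Hamming distance to `target`; the remaining `n_random` are drawn uniformly
    without replacement from the haplotypes not already selected.
    """
    if n_nearest + n_random > len(panel):
        raise ValueError("panel is smaller than the requested conditioning set")
    # Sort panel indices by distance to the target (ties broken by index).
    order = sorted(range(len(panel)), key=lambda i: hamming(target, panel[i]))
    nearest = order[:n_nearest]
    rest = order[n_nearest:]
    extra = rng.sample(rest, n_random)
    return nearest + extra


# Toy data (assumption): 2000 random haplotypes over 50 biallelic sites.
rng = random.Random(0)
n_sites = 50
panel = [[rng.randint(0, 1) for _ in range(n_sites)] for _ in range(2000)]
target = [rng.randint(0, 1) for _ in range(n_sites)]

# k = 600 conditioning haplotypes: 400 by Hamming distance, 200 at random.
chosen = choose_conditioning(target, panel, n_nearest=400, n_random=200, rng=rng)

# Iteration schedule from the text: 12 stages of 4 pruning iterations plus 30 main.
total_iterations = 12 * 4 + 30
```

Mixing nearest and random haplotypes in this way keeps the conditioning set locally similar to the target while still sampling the rest of the panel; the 400/200 split is the one reported in the text.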