Chunk #34 — Methods — 1000GP phase 1 low coverage sequence data

Source
Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel.
Embedded
yes

Text

Beagle was run with 20 iterations instead of the default 10; all other settings were left at their defaults. SHAPEIT2 was run with 78 iterations: 12 stages of 4 pruning iterations plus 30 main iterations. Estimation was carried out in windows of size W = 0.1 Mb using k = 600 conditioning haplotypes, of which 400 were chosen by Hamming distance and 200 at random. All computations were run on a cluster of ~1,000 CPU nodes. Phasing the whole-genome 1000GP Phase 1 data set required ~289 CPU months for SHAPEIT2 and ~99 CPU months for Beagle.
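
As a quick consistency check on the figures quoted above, the following minimal Python sketch recomputes the SHAPEIT2 iteration total, the number of conditioning haplotypes, and the rough ratio of compute cost between the two tools. The constant and function names are illustrative assumptions and are not options of SHAPEIT2 or Beagle; the CPU-month values are the approximate ones stated in the text.

```python
# Settings quoted from the text. All names here are hypothetical labels
# chosen for illustration, not SHAPEIT2 or Beagle command-line options.
PRUNING_STAGES = 12
ITERATIONS_PER_PRUNING_STAGE = 4
MAIN_ITERATIONS = 30

HAPLOTYPES_BY_HAMMING = 400
HAPLOTYPES_AT_RANDOM = 200

# Approximate CPU months reported for phasing the whole genome.
CPU_MONTHS = {"SHAPEIT2": 289, "Beagle": 99}


def total_iterations() -> int:
    """Pruning iterations across all stages plus the main iterations."""
    return PRUNING_STAGES * ITERATIONS_PER_PRUNING_STAGE + MAIN_ITERATIONS


def conditioning_haplotypes() -> int:
    """Haplotypes chosen by Hamming distance plus those chosen at random."""
    return HAPLOTYPES_BY_HAMMING + HAPLOTYPES_AT_RANDOM


def cost_ratio() -> float:
    """Ratio of SHAPEIT2 to Beagle compute cost, from the approximate figures."""
    return CPU_MONTHS["SHAPEIT2"] / CPU_MONTHS["Beagle"]


if __name__ == "__main__":
    print(total_iterations())         # 78
    print(conditioning_haplotypes())  # 600
    print(round(cost_ratio(), 1))     # 2.9
```

Both totals agree with the values given in the text (78 iterations, k = 600), and on these approximate figures SHAPEIT2 used roughly three times the CPU time of Beagle.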