
Chunk #33 — Methods — 1000GP phase 1 low coverage sequence data

Source
Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel.

Text

We downloaded the genotype likelihoods (GLs) for 1,092 1000GP samples from ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20110521/. This dataset contains GLs for 36,820,992 SNPs, 1,384,273 short bi-allelic indels, and 14,017 structural variations (SVs). The GLs were computed using SNPtools [15] for SNPs, the method of [16] for indels, and the method of [17] for SVs. We ran Beagle and SHAPEIT2 on the whole genome in chunks of 1.4 Mb, with 0.2 Mb overlaps between flanking chunks.
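
The chunking scheme described above can be illustrated with a minimal sketch. It rests on assumptions the text does not confirm: that each window is 1.4 Mb long including its overlap, that consecutive windows share exactly 0.2 Mb, and that coordinates are 1-based and inclusive. The function name `chunk_windows` and the example chromosome length are hypothetical, and this is not the authors' pipeline code.

```python
def chunk_windows(chrom_length, chunk_size=1_400_000, overlap=200_000):
    """Yield 1-based inclusive (start, end) windows covering a chromosome.

    Assumption: chunk_size includes the overlap, so each new window
    starts (chunk_size - overlap) bases after the previous one.
    """
    if chrom_length < 1:
        raise ValueError("chrom_length must be at least 1")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - overlap
    start = 1
    while True:
        # The last window is truncated at the end of the chromosome.
        end = min(start + chunk_size - 1, chrom_length)
        yield (start, end)
        if end == chrom_length:
            break
        start += step


# Hypothetical 4 Mb chromosome, used only to illustrate the windows.
windows = list(chunk_windows(4_000_000))
for start, end in windows:
    print(f"{start}-{end}")
```

Each window could then be passed as a region to the phasing tool, with the shared 0.2 Mb used to join the haplotypes of adjacent chunks. How the overlapping results are merged is not specified in this passage.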