
Chunk #0 — Online methods — Union site list

Source
A reference panel of 64,976 haplotypes for genotype imputation.
Embedded
yes

Text

Each study provided the most recent version of its haplotypes in VCF format, with one VCF per autosome. For each cohort, bcftools (v0.2.0-rc12) was used to create a SNP-only site list spanning all autosomes, with alternate and total allele counts, from these per-chromosome haplotypes. Multiallelic SNPs were split into biallelic records using ‘bcftools norm’. The per-cohort site lists were merged into a single file using an in-house Perl script that correctly merges alternate and total allele counts. Using bcftools, we then created two site lists, MAC2 and MAC5, containing only sites with a minor allele count (MAC) across all studies of >= 2 and >= 5, respectively. These site lists contained 95,855,206 and 51,060,347 sites, respectively.
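
The merge and MAC filtering steps can be illustrated with a minimal Python sketch. This is not the in-house Perl script, which is not shown in the text. It assumes each per-cohort site list is available as records of (chromosome, position, reference allele, alternate allele, alternate allele count, total allele count), that sites are matched on the first four fields, and that counts are summed only over cohorts in which a site is present. The function names and the toy records are hypothetical.

```python
from collections import defaultdict


def merge_site_lists(cohort_site_lists):
    """Sum alternate (AC) and total (AN) allele counts per site across cohorts.

    Assumption: a site is identified by (chrom, pos, ref, alt), and a cohort
    contributes to a site only if the site appears in that cohort's list.
    """
    merged = defaultdict(lambda: [0, 0])
    for sites in cohort_site_lists:
        for chrom, pos, ref, alt, ac, an in sites:
            counts = merged[(chrom, pos, ref, alt)]
            counts[0] += ac
            counts[1] += an
    return {site: tuple(counts) for site, counts in merged.items()}


def minor_allele_count(ac, an):
    """The minor allele is whichever of ALT or REF is rarer."""
    return min(ac, an - ac)


def filter_by_mac(merged, min_mac):
    """Keep only sites whose combined MAC is at least min_mac."""
    return {
        site: counts
        for site, counts in merged.items()
        if minor_allele_count(*counts) >= min_mac
    }


# Toy per-cohort site lists (hypothetical values, for illustration only).
cohort_a = [("1", 1000, "A", "G", 1, 200), ("1", 2000, "C", "T", 3, 200)]
cohort_b = [
    ("1", 1000, "A", "G", 1, 100),
    ("1", 2000, "C", "T", 2, 100),
    ("1", 3000, "G", "A", 1, 100),
]

merged = merge_site_lists([cohort_a, cohort_b])
mac2 = filter_by_mac(merged, 2)  # analogous to the MAC2 list
mac5 = filter_by_mac(merged, 5)  # analogous to the MAC5 list
```

In this toy input, the site at position 1000 is a singleton in each cohort but reaches a combined MAC of 2, so it passes the MAC2 threshold only after merging. This is why the threshold is applied to counts across all studies and not per cohort.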