Chunk #51 — Materials and Methods — Imputation of genetic data

Source: Bayesian test for colocalisation between pairs of genetic association studies using summary statistics.
Embedded: yes

Text

To speed up imputation, the genome was split into small chunks that were phased and imputed separately and then reassembled. This was done with the ChunkChromosome tool (http://genome.sph.umich.edu/wiki/ChunkChromosome), specifying chunks of 1000 SNPs with an overlap window of 200 SNPs on each side, which improves accuracy near the chunk edges during phasing. Each chunk was phased with MACH1, with the number of states set to 300 and the number of MCMC rounds set to 20 for all chunks. The phased haplotypes were then used by Minimac, with 1000 Genomes European ancestry reference haplotypes (phase1 version 3, March 2012), to impute SNPs not genotyped on the Illumina array. Variants with a minor allele frequency (MAF) less than 0.001 were also excluded post-imputation. The data were then collated in a probability format that can be read by the R package snpStats [39].
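The chunking scheme and the post-imputation MAF filter can be illustrated with a minimal Python sketch. This is not the ChunkChromosome implementation, whose handling of chromosome ends and short final chunks may differ. The names chunk_snps and minor_allele_frequency are hypothetical, and the sketch assumes SNPs are indexed 0..n-1 along a chromosome and that imputed genotypes are available as per-sample allele dosages in the range 0 to 2.

```python
from typing import List, NamedTuple, Sequence


class Chunk(NamedTuple):
    core_start: int  # first SNP index of the core region (inclusive)
    core_end: int    # one past the last core SNP index (exclusive)
    start: int       # first SNP index including the left overlap
    end: int         # one past the last SNP index including the right overlap


def chunk_snps(n_snps: int, core_size: int = 1000, overlap: int = 200) -> List[Chunk]:
    """Split n_snps ordered SNPs into core windows padded by an overlap on each side.

    Assumption: the overlap is simply truncated at the chromosome ends.
    """
    chunks = []
    for core_start in range(0, n_snps, core_size):
        core_end = min(core_start + core_size, n_snps)
        chunks.append(
            Chunk(
                core_start,
                core_end,
                max(0, core_start - overlap),
                min(n_snps, core_end + overlap),
            )
        )
    return chunks


def minor_allele_frequency(dosages: Sequence[float]) -> float:
    """Estimate MAF from per-sample allele dosages (0 to 2) at one variant."""
    p = sum(dosages) / (2 * len(dosages))  # frequency of the counted allele
    return min(p, 1 - p)


if __name__ == "__main__":
    # Hypothetical chromosome with 2500 SNPs, using the settings in the text.
    for chunk in chunk_snps(2500):
        print(chunk)

    # Keep a variant only if its MAF is at least the 0.001 threshold.
    keep = minor_allele_frequency([0, 0, 0, 2]) >= 0.001
    print(keep)
```

In a scheme like this, each padded window (start to end) would be phased and imputed on its own, and only the core SNPs of each chunk would be kept when the chunks are reassembled, so that every SNP is taken from a window in which it has flanking context.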