Chunk #37 — Methods — Imputed coverage estimation

Source: Smokescreen: a targeted genotyping array for addiction research.
Embedded: yes

Text

Imputation coverage was estimated for the Smokescreen array using an approach similar to that described by Nelson et al. [62]. 1000 Genomes Project Phase 1 data (March 2012 release) were extracted for the Smokescreen content to serve as the imputation inference set, and all 1000 Genomes Project Phase 1 SNPs were used as the reference set. Within each population (EUR, ASN, YRI), samples were divided into groups of 10. Each group in turn was retained in the inference set, excluded from the reference set, and imputed separately from the other groups. Samples with known or cryptic relatedness to other 1000 Genomes Project samples were excluded from both the inference set groups and the reference sets. Beagle v4.0 (release 1230) was used with default settings for phasing and imputation of chromosomes 1 to 22 [90]. Imputation was run separately for each chromosome, and the results were then combined across all SNPs and sample groups. For each SNP, we computed the correlation between the imputed dosages and the measured genotypes from the 1000 Genomes Project (obsRSQ). The obsRSQ values were then summarized for each population overall (genome-wide), for the addiction genes, and for the fine-mapping regions.
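
The group-wise hold-out scheme and the per-SNP accuracy measure described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names (`make_groups`, `leave_group_out`, `obs_rsq`) are hypothetical, the grouping simply takes consecutive samples, and the Beagle phasing and imputation step is omitted entirely. The text describes obsRSQ as a correlation; the sketch assumes, from the name, that the squared Pearson correlation is intended, and the `squared` flag returns the unsquared value if that assumption is wrong.

```python
import math


def make_groups(sample_ids, group_size=10):
    """Split one population's samples into consecutive groups.

    Assumption: groups are formed in list order; the last group may be
    smaller than group_size if the sample count is not a multiple of it.
    """
    return [sample_ids[i:i + group_size]
            for i in range(0, len(sample_ids), group_size)]


def leave_group_out(sample_ids, group_size=10):
    """Yield (inference_samples, reference_samples) once per group.

    The held-out group forms the inference set and is removed from the
    reference set, so no sample is imputed from its own genotypes.
    """
    for group in make_groups(sample_ids, group_size):
        held_out = set(group)
        reference = [s for s in sample_ids if s not in held_out]
        yield group, reference


def obs_rsq(dosages, genotypes, squared=True):
    """Pearson correlation between imputed dosages and measured genotypes
    for a single SNP, squared by default (an assumption, see lead-in).

    Both inputs are per-sample values on the 0 to 2 allele-count scale.
    Returns NaN when either vector has no variance (e.g. a SNP that is
    monomorphic in these samples), since the correlation is undefined.
    """
    n = len(dosages)
    if n != len(genotypes) or n < 2:
        raise ValueError("need two equal-length vectors with >= 2 samples")
    mean_d = sum(dosages) / n
    mean_g = sum(genotypes) / n
    cov = sum((d - mean_d) * (g - mean_g)
              for d, g in zip(dosages, genotypes))
    var_d = sum((d - mean_d) ** 2 for d in dosages)
    var_g = sum((g - mean_g) ** 2 for g in genotypes)
    if var_d == 0 or var_g == 0:
        return float("nan")
    r = cov / math.sqrt(var_d * var_g)
    return r * r if squared else r
```

In use, each `(inference, reference)` pair from `leave_group_out` would define one imputation run per chromosome, and `obs_rsq` would be applied per SNP once the imputed dosages from all groups have been combined.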