Chunk #40 — Introduction — Practical considerations and opportunities

Source: The ENIGMA Consortium: large-scale collaborative analyses of neuroimaging and genetic data.
Embedded: yes

Text

First, all genotype data were imputed to the HapMap3 reference dataset to account for differences among the genotyping chips used across cohorts (the reference was subsequently updated to the 1000 Genomes dataset for ENIGMA2, which began in May 2012). The imputation protocols (detailed at enigma.ini.usc.edu) and the standard reference datasets allow genotypes to be reported consistently at the same set of genetic loci across cohorts. Imputation effectively adds prior knowledge to the data and may thereby also increase the statistical power of a study. Not all ENIGMA cohorts are Caucasian: the GOBS cohort consists of Mexican-Americans, and the NIMH cohort contains a substantial number of African-Americans. As in any GWAS, population structure is taken into account during the statistical modeling of associations, so that ancestry-related differences in SNP allele frequencies are not picked up as spurious associations. In the imputation step as well, the appropriate reference populations are used for each individual; in some cases these may be Yoruban or Hispanic, in addition to the CEU population that is used to represent Caucasians. However, it is not a computationally trivial task to