
Chunk #17 — SAMPLE QUALITY — Population substructure

Source
Quality control procedures for genome-wide association studies.

Text

One strategy for avoiding bias induced by population stratification is to ensure that study samples are drawn from a relatively homogeneous population. One of the sites in the eMERGE network represents such a sample, as over 98% of the study sample self-reported “Caucasian” on a study questionnaire. This percentage is consistent with data from the 2000 Census [24], and self-report often shows very high correspondence with genetically inferred ancestry [25]. Some clinics record ethnicity via observer report (typically by a clerk or nurse’s aide). Even in these settings, observer-reported ancestry closely matches genetically inferred ancestry, especially for populations of European descent [26]. However, diverse population-based samples are often desirable for genetic association studies focused on characterizing previous GWAS or candidate gene discoveries that were made in a single population [27]. Further, combining samples from multiple sites for a joint analysis may introduce population stratification in the combined sample if both allele frequencies and outcomes differ between sites.
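The last point can be illustrated with a small worked example. The sketch below is not taken from the source: the two sites, their sample sizes, allele frequencies, and case rates are hypothetical numbers chosen for illustration, and the function names are made up. It builds the expected allele-by-outcome counts for each site under the assumption that the allele and the outcome are independent within a site, then compares the odds ratio within each site to the odds ratio in the pooled sample.

```python
def expected_table(n, allele_freq, case_rate):
    """Expected 2x2 counts (allele x outcome) for one site.

    Assumes the allele and the outcome are independent within the site,
    so each cell is just n times the product of the two marginal rates.
    """
    return {
        ("alt", "case"): n * allele_freq * case_rate,
        ("alt", "control"): n * allele_freq * (1 - case_rate),
        ("ref", "case"): n * (1 - allele_freq) * case_rate,
        ("ref", "control"): n * (1 - allele_freq) * (1 - case_rate),
    }


def odds_ratio(table):
    """Odds ratio of being a case for the alt allele versus the ref allele."""
    return (table[("alt", "case")] * table[("ref", "control")]) / (
        table[("alt", "control")] * table[("ref", "case")]
    )


def pool(table_a, table_b):
    """Combine two sites by adding their counts cell by cell."""
    return {cell: table_a[cell] + table_b[cell] for cell in table_a}


# Hypothetical sites: both the allele frequency and the case rate differ.
site_a = expected_table(n=1000, allele_freq=0.1, case_rate=0.2)
site_b = expected_table(n=1000, allele_freq=0.5, case_rate=0.5)
pooled = pool(site_a, site_b)

print("site A odds ratio:", round(odds_ratio(site_a), 3))
print("site B odds ratio:", round(odds_ratio(site_b), 3))
print("pooled odds ratio:", round(odds_ratio(pooled), 3))
```

With these illustrative numbers, the odds ratio is 1 within each site but roughly 1.85 in the pooled sample, so the apparent association arises entirely from site membership. If only the allele frequency or only the case rate differed between the sites, the pooled odds ratio would stay at 1, which matches the condition stated above that both must differ.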