Chunk #5 — Background and Objectives

Source: The utility of empirically assigning ancestry groups in cross-population genetic studies of addiction.
Embedded: yes

Text

The most common approach to assessing ancestry and population structure is ancestry principal component analysis (PCA), which has been applied to adjust for global and local population structure, such as Eastern-Western European differences [9] and regional differences in China [10]. The principal components (PCs) can be used to exclude outliers from otherwise homogeneous groups and, importantly, as covariates in association analyses to reduce the effects of population stratification. However, the statistical justification for excluding outliers is often unclear (for example, based on visual inspection), simplistic, or non-multivariate. Samples that show evidence of admixture or that lack self-reported race/ethnicity/ancestry are often excluded, which unnecessarily reduces sample size. Critically, the PCs included as covariates in GWAS are commonly not chosen empirically; that is, without appropriately assessing whether they are associated with the trait of interest or with a technical artefact such as a batch effect. The practice of blindly including PCs in GWAS (typically 10–20) can negatively impact results. First, if too many PCs are included, association models may become overfit, which can reduce the power to detect etiologically relevant variation
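
To illustrate what choosing PCs empirically might look like, the following minimal Python sketch tests each PC for association with the trait and keeps only those that pass a Bonferroni-corrected threshold. This is not the procedure used in the source. The function names (`pearson_r`, `pc_trait_pvalue`, `select_pcs`), the Fisher z approximation, the `alpha` value, and the toy data are all assumptions made for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pc_trait_pvalue(pc, trait):
    """Two-sided p-value for a PC-trait correlation.

    Assumption: uses the Fisher z-transformation with a normal
    approximation, which is reasonable for large samples only.
    """
    n = len(pc)
    r = pearson_r(pc, trait)
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def select_pcs(pcs, trait, alpha=0.05):
    """Indices of PCs associated with the trait (Bonferroni over all PCs)."""
    threshold = alpha / len(pcs)
    return [i for i, pc in enumerate(pcs)
            if pc_trait_pvalue(pc, trait) < threshold]


# Toy data (hypothetical): 10 mutually orthogonal, mean-zero "PCs" built
# from cosines, and a trait that depends only on PC 0 and PC 2 plus an
# orthogonal nuisance term standing in for noise.
n = 200
pcs = [[math.cos(2 * math.pi * (k + 1) * i / n) for i in range(n)]
       for k in range(10)]
noise = [math.sin(2 * math.pi * 37 * i / n) for i in range(n)]
trait = [0.8 * pcs[0][i] + 0.5 * pcs[2][i] + noise[i] for i in range(n)]

selected = select_pcs(pcs, trait)
print("PCs retained as covariates:", selected)
```

In practice the same test could also be run against technical variables such as genotyping batch, so that PCs tracking an artefact are identified separately from those tracking the trait. A joint regression on all PCs would be a reasonable alternative to the per-PC test used here.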