
Chunk #24 — Pitfalls of the analysis — Pitfall 3: Population stratification similarity

Source
Pitfalls of predicting complex traits from SNPs.
Embedded
yes

Text

Prediction accuracy can also be inflated if the discovery and validation samples contain similar patterns of population stratification but the eventual target population is not similarly stratified. For example, this could occur if discovery and validation samples are independently sampled from a stratified population such as European Americans [63]. Whether this inflation should be viewed as a pitfall depends on the ultimate goal of the analysis. If the goal is to conduct prediction in European Americans, it is entirely appropriate to leverage ancestry information to the fullest extent possible, and this inflation is not a pitfall (because discovery, validation and target samples are similarly stratified). On the other hand, if the goal is to assess the prediction accuracy that could be achieved in less structured target populations, then this inflation is a pitfall. As an example, we show that population stratification inflated prediction accuracy in the FHS analysis (see Box 3 for details). A more serious problem is when there is confounding between ancestry and disease status both in discovery and validation case-control
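
The inflation described in this passage can be illustrated with a small simulation. The sketch below is not the FHS analysis and is not taken from the paper; it is a toy model under stated assumptions: two hypothetical subpopulations (A and B) whose allele frequencies differ by a fixed amount at every SNP, a phenotype whose mean differs between subpopulations for non-genetic reasons, and no causal SNPs at all. All function names, parameter values and sample sizes are illustrative choices. A score built from marginal SNP effects in a stratified discovery sample appears predictive in a similarly stratified validation sample, but not in an unstratified target sample drawn from subpopulation A alone.

```python
import random


def simulate(n, frac_b, freqs_a, freqs_b, shift, rng):
    """Simulate n individuals; each comes from subpopulation B with probability frac_b.

    Genotypes are allele counts (0, 1 or 2). The phenotype depends only on
    subpopulation membership plus noise, so no SNP is causal (an assumption
    of this toy model).
    """
    genos, phenos = [], []
    for _ in range(n):
        in_b = rng.random() < frac_b
        freqs = freqs_b if in_b else freqs_a
        genos.append([(rng.random() < p) + (rng.random() < p) for p in freqs])
        phenos.append((shift if in_b else 0.0) + rng.gauss(0.0, 1.0))
    return genos, phenos


def mean(xs):
    return sum(xs) / len(xs)


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def marginal_effects(genos, phenos):
    """Per-SNP least-squares slope of phenotype on allele count."""
    my = mean(phenos)
    betas = []
    for j in range(len(genos[0])):
        col = [row[j] for row in genos]
        mg = mean(col)
        sxx = sum((a - mg) ** 2 for a in col)
        sxy = sum((a - mg) * (b - my) for a, b in zip(col, phenos))
        betas.append(sxy / sxx if sxx > 0 else 0.0)
    return betas


def scores(genos, betas):
    """Weighted allele-count score for each individual."""
    return [sum(b * g for b, g in zip(betas, row)) for row in genos]


def run(seed=1, n_snps=200, n=400, shift=1.0):
    rng = random.Random(seed)
    base = [rng.uniform(0.2, 0.8) for _ in range(n_snps)]
    delta = [rng.choice((-0.1, 0.1)) for _ in range(n_snps)]
    freqs_a = [p + d for p, d in zip(base, delta)]
    freqs_b = [p - d for p, d in zip(base, delta)]

    # Discovery and validation are independent draws from the same 50/50 mixture.
    disc_g, disc_y = simulate(n, 0.5, freqs_a, freqs_b, shift, rng)
    val_g, val_y = simulate(n, 0.5, freqs_a, freqs_b, shift, rng)
    # The target sample is unstratified: subpopulation A only.
    tgt_g, tgt_y = simulate(n, 0.0, freqs_a, freqs_b, shift, rng)

    betas = marginal_effects(disc_g, disc_y)
    r2_val = r_squared(scores(val_g, betas), val_y)
    r2_tgt = r_squared(scores(tgt_g, betas), tgt_y)
    return r2_val, r2_tgt


r2_validation, r2_target = run()
print(f"R^2 in stratified validation sample: {r2_validation:.3f}")
print(f"R^2 in unstratified target sample:   {r2_target:.3f}")
```

Because the simulated phenotype has no genetic component, any validation R^2 above the near-zero target value reflects the score acting as a proxy for ancestry. Whether that counts as a pitfall depends, as the passage notes, on whether the intended target population shares the same structure.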