Theoretical and empirical quantification of the accuracy of polygenic scores in ancestry divergent populations.
- Authors: Wang, Ying; Guo, Jing; Ni, Guiyan; Yang, Jian; Visscher, Peter M; Yengo, Loic
- Year: 2020
- Journal: Nature Communications
- PMID: 32737319
- DOI: 10.1038/s41467-020-17719-y
- PMCID: PMC7395791
Polygenic scores (PGS) have been widely used to predict disease risk using variants identified from genome-wide association studies (GWAS). To date, most GWAS have been conducted in populations of European ancestry, which limits the use of GWAS-derived PGS in non-European ancestry populations. Here, we derive a theoretical model of the relative accuracy (RA) of PGS across ancestries. We show through extensive simulations that the RA of PGS based on genome-wide significant SNPs can be predicted accurately from modelling linkage disequilibrium (LD), minor allele frequencies (MAF), cross-population correlations of causal SNP effects and heritability. We find that LD and MAF differences between ancestries can explain between 70 and 80% of the loss of RA of European-based PGS in African ancestry for traits like body mass index and type 2 diabetes. Our results suggest that causal variants underlying common genetic variation identified in European ancestry GWAS are mostly shared across continents.
Trans-ancestry relative prediction accuracy of PGS in different simulation scenarios. Relative accuracies (RA) were calculated as the ratio of the squared correlation between PGS and simulated trait in UK Biobank (UKB) participants of non-European ancestry over the same squared correlation estimated in 10,000 independent UKB participants of European ancestry (Methods). We varied the trait heritability (h² = 0.25 and 0.5) and the number of causal variants (MC = 1000, 5000 and 10,000) in the simulations. RAobs refers to the observed RA. RApred1 is the RA predicted using Eq. (1), based on parameters calculated from pairs of PGS-SNPs and known causal variants within 100 kb; RApred2 is the RA predicted using Eq. (2), based on pairs of PGS-SNPs and candidate causal variants; RApred3 is the naive RA predicted using Eq. (1) when assuming that the PGS-SNPs are the causal variants. The numbers under the ancestry labels on the x-axis denote the pairwise FST between the discovery population and the target population, calculated using HapMap3 SNPs (see Supplementary Note 2). Boxes represent the first and third quartiles, and whiskers extend to 1.5 times the interquartile range. The points represent the RA for 100 replicates. The median estimates are shown as the horizontal line in the boxes.
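The RA defined in this legend is a ratio of two squared correlations, which can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names `squared_correlation` and `relative_accuracy` are hypothetical, and the toy data are simulated here purely for demonstration.

```python
import random


def squared_correlation(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def relative_accuracy(pgs_target, trait_target, pgs_ref, trait_ref):
    """RA = R^2(PGS, trait) in the target sample / R^2 in the reference sample."""
    return (squared_correlation(pgs_target, trait_target)
            / squared_correlation(pgs_ref, trait_ref))


# Toy data (assumption: simulated here, not taken from the paper).
rng = random.Random(0)
n = 5000
# Reference sample: the PGS explains about half of the trait variance.
pgs_ref = [rng.gauss(0, 1) for _ in range(n)]
trait_ref = [g + rng.gauss(0, 1) for g in pgs_ref]
# Target sample: the PGS signal is attenuated, so its R^2 is lower.
pgs_target = [rng.gauss(0, 1) for _ in range(n)]
trait_target = [0.5 * g + rng.gauss(0, 1) for g in pgs_target]

ra = relative_accuracy(pgs_target, trait_target, pgs_ref, trait_ref)
print(f"RA = {ra:.2f}")
```

Under these toy settings the expected squared correlations are 0.5 in the reference sample and 0.2 in the target sample, so the RA should fall near 0.4, subject to sampling noise.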
Impact of negative selection on PGS trans-ancestry relative accuracies. Relative accuracies (RA) of PGS in different ancestries under various strengths of negative selection. Traits were simulated with a heritability h² = 0.5 and assuming MC = 5000 causal variants. Negative selection was modelled using a parameter S such that smaller values of S indicate stronger selection. Values of S are denoted S1 and S2 in the discovery population and target populations, respectively. We considered three scenarios: (a) S1 = S2 = −0.5; (b) S1 = −0.5, S2 = −0.75; and (c) S1 = −0.75, S2 = −0.5. RAobs, RApred1, RApred2 and RApred3 labels are defined as in the legend of Fig. 1. Boxes represent the first and third quartiles, and whiskers extend to 1.5 times the interquartile range. The points represent the RA for 100 replicates. The median estimates are shown as the horizontal line in the boxes.
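The legend does not spell out how S enters the simulation. A common parameterisation, assumed here and not confirmed by this text, lets the variance of a causal effect scale as [2p(1 − p)]^S, where p is the minor allele frequency, so that more negative S gives rarer variants larger effects. The sketch below illustrates that assumed model; `effect_sd` and `draw_effects` are hypothetical names.

```python
import random


def effect_sd(maf, s):
    """Standard deviation of a causal effect under the ASSUMED model
    var(beta) proportional to [2p(1-p)]^S (proportionality constant set to 1)."""
    if not 0.0 < maf <= 0.5:
        raise ValueError("maf must lie in (0, 0.5]")
    return (2.0 * maf * (1.0 - maf)) ** (s / 2.0)


def draw_effects(mafs, s, rng):
    """Draw one normally distributed effect per causal variant."""
    return [rng.gauss(0.0, effect_sd(p, s)) for p in mafs]


rng = random.Random(1)
mafs = [0.01, 0.1, 0.5]  # illustrative minor allele frequencies

# Compare effect-size spread for the two S values used in the legend.
for s in (-0.5, -0.75):
    sds = [round(effect_sd(p, s), 2) for p in mafs]
    print(f"S = {s}: effect SD by MAF {mafs} -> {sds}")

effects = draw_effects(mafs, -0.5, rng)
```

Drawing effects with different S in the discovery and target populations, as in scenarios (b) and (c), would then amount to calling `draw_effects` with S1 for one population and S2 for the other.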
Trans-ancestry relative prediction accuracy of PGS of five quantitative traits and three common diseases. a–c Relative accuracies (RA) are calculated as the ratio of the squared correlation between PGS and traits/diseases in UKB participants of non-European ancestry over the same squared correlation estimated in ~20,000 independent UKB participants of European ancestry. We report here only ancestry-trait/disease pairs with a significant reduction in RA (Wald test, p-value < 0.05). Data for all ancestry-trait/disease pairs are provided in Supplementary Table 2. RApred(LD+MAF) refers to the RA predicted from Eq. (2) using only information from LD and MAF differences between ancestries. RAobs refers to the observed RA calculated using independent genome-wide significant trait-associated SNPs. Panels d–f show the proportion of the loss of accuracy (LOA) explained by LD and MAF (Supplementary Fig. 12), calculated as 100% × (1 − RApred(LD+MAF))/(1 − RAobs). The grey dashed lines are y = 100% and y = 50%. Error bars represent the standard errors of the observed RA or of the proportion of LOA explained by LD and MAF in each ancestry-trait/disease pair; their calculation is detailed in Supplementary Note 7.
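The LOA formula in this legend can be written out as a small helper. This is a sketch only: the function name `loa_explained` is hypothetical, and the input values in the example are illustrative numbers, not results from the paper.

```python
def loa_explained(ra_pred_ld_maf, ra_obs):
    """Percentage of the loss of accuracy (LOA) explained by LD and MAF:
    100 * (1 - RApred(LD+MAF)) / (1 - RAobs)."""
    if ra_obs == 1.0:
        # No observed loss of accuracy, so the proportion is undefined.
        raise ValueError("ra_obs equals 1: no loss of accuracy to explain")
    return 100.0 * (1.0 - ra_pred_ld_maf) / (1.0 - ra_obs)


# Illustrative values (assumption, not from the paper):
# predicted RA of 0.4 from LD and MAF alone, observed RA of 0.2.
pct = loa_explained(0.4, 0.2)
print(f"LOA explained by LD and MAF: {pct:.0f}%")
```

With these illustrative inputs the predicted loss is 0.6 and the observed loss is 0.8, so LD and MAF account for 75% of the observed loss. A value of 100% would mean that LD and MAF differences fully explain the drop in accuracy.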
| Title | Authors | Journal | Year | Link |
|---|---|---|---|---|
| Ancient DNA reveals pervasive directional selection across West Eurasia. | Akbari A et al. | - | 2026 | - |
| Cross-Ancestry Polygenic Prediction: Comparing Methods and Assessing Transferability Across Traits. | Momin MM et al. | - | 2026 | - |
| Mind the gap: Characterizing bias due to population mismatch in two-sample Mendelian randomization. | Li J et al. | - | 2026 | - |
| The effect of long-range linkage disequilibrium on allele-frequency dynamics under stabilizing selection. | Negm S et al. | - | 2026 | - |
| Three open questions in polygenic score portability. | Wang JY et al. | - | 2026 | - |
| Characterizing selection on complex traits through conditional frequency spectra. | Patel RA et al. | - | 2025 | - |
| Common and rare variant analyses implicate late-infancy cerebellar development and immune genes in ADHD. | Zhong Y et al. | - | 2025 | - |
| Complex genetic effects linked to plasma protein abundance in the UK Biobank. | Sigurdsson AI et al. | - | 2025 | - |
| Data Representation Bias and Conditional Distribution Shift Drive Predictive Performance Disparities in Multi-Population Machine Learning | Kumar S et al. | - | 2025 | - |
| Eleven Grand Challenges for Inflammatory Bowel Disease Genetics and Genomics. | Gibson G et al. | - | 2025 | - |
| Evaluating polygenic risk score prediction performance for Alzheimer's disease in a population-based Hispanic cohort using single- and multi-ancestry models. | Xu Y et al. | - | 2025 | - |
| Family-based genome-wide association study designs for increased power and robustness. | Guan J et al. | - | 2025 | - |
| Fine-scale population structure and widespread conservation of genetic effect sizes between human groups across traits. | Hu S et al. | - | 2025 | - |
| Hidden structure in polygenic scores and the challenge of disentangling ancestry interactions in admixed populations. | Aw AJ et al. | - | 2025 | - |
| How accurate is genomic prediction across wild populations? | Aase K et al. | - | 2025 | - |
| Leveraging genome-wide association studies to better understand the etiology of cancers. | Sonehara K et al. | - | 2025 | - |
| Methodological opportunities in genomic data analysis to advance health equity. | Lehmann B et al. | - | 2025 | - |
| Novel Method for Predicting Lp(a) From Genomic Testing Identifies ASCVD Risk Across a Diverse Cohort. | Telis N et al. | - | 2025 | - |
| Pan-UK Biobank genome-wide association analyses enhance discovery and resolution of ancestry-enriched effects. | Karczewski KJ et al. | - | 2025 | - |
| Polygenic prediction and gene regulation networks. | Poyatos JF | - | 2025 | - |
| Polygenic prediction of body mass index and obesity through the life course and across ancestries. | Smit RAJ et al. | - | 2025 | - |
| Polygenic prediction of human complex traits using ancient DNA. | Mathieson I | - | 2025 | - |
| Polygenic Risk Scores for Preeclampsia Prediction Beyond Gold-Standard Clinical Models in Multiethnic Populations. | Ardissino M et al. | - | 2025 | - |
| Population structure limits the use of genomic data for predicting phenotypes and managing genetic resources in forest trees. | Slavov GT et al. | - | 2025 | - |
| Precision medicine for obesity: current evidence and insights for personalization of obesity pharmacotherapy. | Anazco D et al. | - | 2025 | - |
| Recent Statistical Innovations in Human Genetics. | Balding DJ et al. | - | 2025 | - |
| Risk factors affecting polygenic score performance across diverse cohorts. | Hui D et al. | - | 2025 | - |
| Statistical construction of calibrated prediction intervals for polygenic score-based phenotype prediction. | Xu C et al. | - | 2025 | - |
| Testing for differences in polygenic scores in the presence of confounding. | Blanc J et al. | - | 2025 | - |
| The distribution of highly deleterious variants across human ancestry groups. | Stolyarova A et al. | - | 2025 | - |
| Transferability of polygenic risk scores for metabolic and cardiovascular traits in an underrepresented population. | Pasookhush P et al. | - | 2025 | - |
| Analysis of Evolutionary Conservation, Expression Level, and Genetic Association at a Genome-wide Scale Reveals Heterogeneity Across Polygenic Phenotypes. | Giel AS et al. | - | 2024 | - |
| Aspiring toward equitable benefits from genomic advances to individuals of ancestrally diverse backgrounds. | Wang Y et al. | - | 2024 | - |
| Assessing the predictive efficacy of European-based systolic blood pressure polygenic risk scores in diverse Brazilian cohorts. | Teixeira SK et al. | - | 2024 | - |
| Bayesian approach to assessing population differences in genetic risk of disease with application to prostate cancer. | Timmins IR et al. | - | 2024 | - |
| Body mass index stratification optimizes polygenic prediction of type 2 diabetes in cross-biobank analyses. | Ojima T et al. | - | 2024 | - |
| BridgePRS leverages shared genetic effects across ancestries to increase polygenic risk score portability. | Hoggart CJ et al. | - | 2024 | - |
| Calibrated prediction intervals for polygenic scores across diverse contexts. | Hou K et al. | - | 2024 | - |
| Causal interpretations of family GWAS in the presence of heterogeneous effects. | Veller C et al. | - | 2024 | - |
| Distinct genetic liability profiles define clinically relevant patient strata across common diseases. | Trastulla L et al. | - | 2024 | - |
| Divorce, genetic risk, and suicidal thoughts and behaviors in a sample with recurrent major depressive disorder. | Edwards AC et al. | - | 2024 | - |
| Genetic and molecular architecture of complex traits. | Lappalainen T et al. | - | 2024 | - |
| Genetic modifiers of rare variants in monogenic developmental disorder loci. | Kingdom R et al. | - | 2024 | - |
| Genomic analysis of intracranial and subcortical brain volumes yields polygenic scores accounting for variation across ancestries. | García-Marín LM et al. | - | 2024 | - |
| Leveraging functional genomic annotations and genome coverage to improve polygenic prediction of complex traits within and between ancestries. | Zheng Z et al. | - | 2024 | - |
| Managing differential performance of polygenic risk scores across groups: Real-world experience of the eMERGE Network. | Lewis ACF et al. | - | 2024 | - |
| Mapping the relative accuracy of cross-ancestry prediction. | Lupi AS et al. | - | 2024 | - |
| Multi-trait GWAS for diverse ancestries: mapping the knowledge gap. | Troubat L et al. | - | 2024 | - |
| Novel ancestry-specific primary open-angle glaucoma loci and shared biology with vascular mechanisms and cell proliferation. | Lo Faro V et al. | - | 2024 | - |
| Optimizing clinico-genomic disease prediction across ancestries: a machine learning strategy with Pareto improvement. | Gao Y et al. | - | 2024 | - |
| Phenomewide Association Study of Health Outcomes Associated With the Genetic Correlates of 25 Hydroxyvitamin D Concentration and Vitamin D Binding Protein Concentration. | Kresge HA et al. | - | 2024 | - |
| Polygenic prediction for underrepresented populations through transfer learning by utilizing genetic similarity shared with European populations. | Zhu Y et al. | - | 2024 | - |
| Population Heterogeneity and Selection of Coronary Artery Disease Polygenic Scores. | Debernardi C et al. | - | 2024 | - |
| Principles and methods for transferring polygenic risk scores across global populations. | Kachuri L et al. | - | 2024 | - |
| Recent advances in polygenic scores: translation, equitability, methods and FAIR tools. | Xiang R et al. | - | 2024 | - |
| The association between DNA methylation and human height and a prospective model of DNA methylation-based height prediction. | Wang Z et al. | - | 2024 | - |
| The Mexican Biobank Project promotes genetic discovery, inclusive science and local capacity building. | Sohail M et al. | - | 2024 | - |
| Worldwide distribution of genetic factors related to severity of COVID-19 infection. | Esteban ME et al. | - | 2024 | - |
| 150 risk variants for diverticular disease of intestine prioritize cell types and enable polygenic prediction of disease susceptibility. | Wu Y et al. | - | 2023 | - |
| 15 years of GWAS discovery: Realizing the promise. | Abdellaoui A et al. | - | 2023 | - |
| Addressing the Challenge of Biomedical Data Inequality: An Artificial Intelligence Perspective. | Gao Y et al. | - | 2023 | - |
| Amplification is the primary mode of gene-by-sex interaction in complex human traits. | Zhu C et al. | - | 2023 | - |
| A new method for multiancestry polygenic prediction improves performance across diverse populations. | Zhang H et al. | - | 2023 | - |
| An overview of DNA methylation-derived trait score methods and applications. | Nabais MF et al. | - | 2023 | - |
| Applying polygenic risk score methods to pharmacogenomics GWAS: challenges and opportunities. | Zhai S et al. | - | 2023 | - |
| Biobank-scale methods and projections for sparse polygenic prediction from machine learning. | Raben TG et al. | - | 2023 | - |
| Boosting the power of genome-wide association studies within and across ancestries by using polygenic scores. | Campos AI et al. | - | 2023 | - |
| Bridging the diversity gap: Analytical and study design considerations for improving the accuracy of trans-ancestry genetic prediction. | Bocher O et al. | - | 2023 | - |
| Can polygenic risk scores help explain disease prevalence differences around the world? A worldwide investigation. | Jain PR et al. | - | 2023 | - |
| Causal effects on complex traits are similar for common variants across segments of different continental ancestries within admixed individuals. | Hou K et al. | - | 2023 | - |
| Comparing Pruning and Thresholding with Continuous Shrinkage Polygenic Score Methods in a Large Sample of Ancestrally Diverse Adolescents from the ABCD Study®. | Ahern J et al. | - | 2023 | - |
| Cross-ancestry analyses identify new genetic loci associated with 25-hydroxyvitamin D. | Wang X et al. | - | 2023 | - |
| Does ethnicity influence dementia, stroke and mortality risk? Evidence from the UK Biobank. | Bonnechère B et al. | - | 2023 | - |
| Estimation and implications of the genetic architecture of fasting and non-fasting blood glucose. | Qiao Z et al. | - | 2023 | - |
| Factors affecting the accuracy of genomic prediction in joint pig populations. | Zhao W et al. | - | 2023 | - |
| Genetic risk prediction in Hispanics/Latinos: milestones, challenges, and social-ethical considerations. | Maldonado BL et al. | - | 2023 | - |
| Global Biobank analyses provide lessons for developing polygenic risk scores across diverse cohorts. | Wang Y et al. | - | 2023 | - |
| Integration of genetic fine-mapping and multi-omics data reveals candidate effector genes for hypertension. | van Duijvenboden S et al. | - | 2023 | - |
| Low and differential polygenic score generalizability among African populations due largely to genetic diversity. | Majara L et al. | - | 2023 | - |
| Molecular genetics of neuropsychiatric illness: some musings. | Janardhanan M et al. | - | 2023 | - |
| Optimal strategies for learning multi-ancestry polygenic scores vary across traits. | Lehmann B et al. | - | 2023 | - |
| Polygenic prediction across populations is influenced by ancestry, genetic architecture, and methodology. | Wang Y et al. | - | 2023 | - |
| Polygenic risk score prediction of multiple sclerosis in individuals of South Asian ancestry. | Breedon JR et al. | - | 2023 | - |
| Polygenic risk scores for disease risk prediction in Africa: current challenges and future directions. | Fatumo S et al. | - | 2023 | - |
| Polygenic scores in cancer. | Yang X et al. | - | 2023 | - |
| Polygenic scoring accuracy varies across the genetic ancestry continuum. | Ding Y et al. | - | 2023 | - |
| Advances in integrative African genomics. | Zhang C et al. | - | 2022 | - |
| African-specific alleles modify risk for asthma at the 17q12-q21 locus in African Americans. | Washington C et al. | - | 2022 | - |
| A saturated map of common genetic variants associated with human height. | Yengo L et al. | - | 2022 | - |
| Assessing polygenic risk score models for applications in populations with under-represented genomics data: an example of Vietnam. | Pham D et al. | - | 2022 | - |
| Challenges and Opportunities for Developing More Generalizable Polygenic Risk Scores. | Wang Y et al. | - | 2022 | - |
| Clarifying the causes of consistent and inconsistent findings in genetics. | Dattani S et al. | - | 2022 | - |
| Concerns about the use of polygenic embryo screening for psychiatric and cognitive traits. | Lencz T et al. | - | 2022 | - |
| Development and validation of a polygenic hazard score to predict prognosis and adjuvant chemotherapy benefit in early-stage non-small cell lung cancer. | Li DH et al. | - | 2022 | - |
| Gattaca as a lens on contemporary genetics: marking 25 years into the film's "not-too-distant" future. | Ogbunugafor CB et al. | - | 2022 | - |
| Gene-based polygenic risk scores analysis of alcohol use disorder in African Americans. | Lai D et al. | - | 2022 | - |
| Genome-wide association study identifies TNFSF15 associated with childhood asthma. | Kim KW et al. | - | 2022 | - |
| Genome-wide risk prediction of common diseases across ancestries in one million people. | Mars N et al. | - | 2022 | - |
| Glaucoma Genetic Risk Scores in the Million Veteran Program. | Waksmunski AR et al. | - | 2022 | - |
| Human genetic admixture through the lens of population genomics. | Gopalan S et al. | - | 2022 | - |
| Importance of Including Non-European Populations in Large Human Genetic Studies to Enhance Precision Medicine. | Ju D et al. | - | 2022 | - |
| Improving polygenic prediction in ancestrally diverse populations. | Ruan Y et al. | - | 2022 | - |
| Incorporating family history of disease improves polygenic risk scores in diverse populations. | Hujoel MLA et al. | - | 2022 | - |
| Large uncertainty in individual polygenic risk score estimation impacts PRS-based risk stratification. | Ding Y et al. | - | 2022 | - |
| Leveraging fine-mapping and multipopulation training data to improve cross-population polygenic risk scores. | Weissbrod O et al. | - | 2022 | - |
| PGS-server: accuracy, robustness and transferability of polygenic score methods for biobank scale studies. | Yang S et al. | - | 2022 | - |
| Polygenic prediction of educational attainment within and between families from genome-wide association analyses in 3 million individuals. | Okbay A et al. | - | 2022 | - |
| Polygenic risk, population structure and ongoing difficulties with race in human genetics. | Kaplan JM et al. | - | 2022 | - |
| Polygenic risk score improves the accuracy of a clinical risk score for coronary artery disease. | King A et al. | - | 2022 | - |
| Polygenic Risk Score in African populations: progress and challenges. | Adam Y et al. | - | 2022 | - |
| Polygenic score accuracy in ancient samples: Quantifying the effects of allelic turnover. | Carlson MO et al. | - | 2022 | - |
| Population differentiation of polygenic score predictions under stabilizing selection. | Yair S et al. | - | 2022 | - |
| Portability of 245 polygenic scores when derived from the UK Biobank and applied to 9 ancestry groups from the same cohort. | Privé F et al. | - | 2022 | - |
| Special Issue editorial: Leveraging genetically informative study designs to understand the development and familial transmission of psychopathology. | Wilson S et al. | - | 2022 | - |
| Stability of polygenic scores across discovery genome-wide association studies. | Schultz LM et al. | - | 2022 | - |
| Ten challenges for clinical translation in psychiatric genetics. | Derks EM et al. | - | 2022 | - |
| The SCRIPT trial: study protocol for a randomised controlled trial of a polygenic risk score to tailor colorectal cancer screening in primary care. | Saya S et al. | - | 2022 | - |
| Towards a global view of multiple sclerosis genetics. | Jacobs BM et al. | - | 2022 | - |
| Transferability of genetic loci and polygenic scores for cardiometabolic traits in British Pakistani and Bangladeshi individuals. | Huang QQ et al. | - | 2022 | - |
| Genetic Risk Prediction of COVID-19 Susceptibility and Severity in the Indian Population. | Prakrithi P et al. | - | 2021 | - |
| Genetic risk scores for cardiometabolic traits in sub-Saharan African populations. | Ekoru K et al. | - | 2021 | - |
| Haplotype-aware inference of human chromosome abnormalities. | Ariad D et al. | - | 2021 | - |
| Improved prediction of fracture risk leveraging a genome-wide polygenic risk score. | Lu T et al. | - | 2021 | - |
| Inclusion of variants discovered from diverse populations improves polygenic risk score transferability. | Cavazos TB et al. | - | 2021 | - |
| Incorporating European GWAS findings improve polygenic risk prediction accuracy of breast cancer among East Asians. | Ji Y et al. | - | 2021 | - |
| Incorporating functional priors improves polygenic prediction accuracy in UK Biobank and 23andMe data sets. | Márquez-Luna C et al. | - | 2021 | - |
| Leveraging Single-Cell RNA-seq Data to Uncover the Association Between Cell Type and Chronic Liver Diseases. | Ye X et al. | - | 2021 | - |
| Maintenance of Complex Trait Variation: Classic Theory and Modern Data. | Koch EM et al. | - | 2021 | - |
| New Polygenic Risk Score to Predict High Myopia in Singapore Chinese Children. | Lanca C et al. | - | 2021 | - |
| Populations, Traits, and Their Spatial Structure in Humans. | Sohail M et al. | - | 2021 | - |
| Quantifying genetic heterogeneity between continental populations for human height and body mass index. | Guo J et al. | - | 2021 | - |
| Statistical models and computational tools for predicting complex traits and diseases. | Chung W | - | 2021 | - |
| The evolution of group differences in changing environments. | Harpak A et al. | - | 2021 | - |
| The omnigenic model and polygenic prediction of complex traits. | Mathieson I | - | 2021 | - |
| Validation of an Integrated Risk Tool, Including Polygenic Risk Score, for Atherosclerotic Cardiovascular Disease in Multiple Ethnicities and Ancestries. | Weale ME et al. | - | 2021 | - |
| Accurate and Scalable Construction of Polygenic Scores in Large Biobank Data Sets. | Yang S et al. | - | 2020 | - |
| Demographic history mediates the effect of stratification on polygenic scores. | Zaidi AA et al. | - | 2020 | - |
| Human Demographic History Impacts Genetic Risk Prediction across Diverse Populations. | Martin AR et al. | - | 2020 | - |
| Polygenic Scores for Height in Admixed Populations. | Bitarello BD et al. | - | 2020 | - |