Ancestry deconvolution and partial polygenic score can improve susceptibility predictions in recently admixed individuals.
- Authors
- Marnetto, Davide; Pärna, Katri; Läll, Kristi; Molinaro, Ludovica; Montinaro, Francesco; Haller, Toomas; Metspalu, Mait; Mägi, Reedik; Fischer, Krista; Pagani, Luca
- Year
- 2020
- Journal
- Nature communications
- PMID
- 32242022
- DOI
- 10.1038/s41467-020-15464-w
- PMCID
- PMC7118071
Polygenic Scores (PSs) describe the genetic component of an individual's quantitative phenotype or their susceptibility to diseases with a genetic basis. Currently, PSs rely on population-dependent contributions of many associated alleles, with limited applicability to understudied populations and recently admixed individuals. Here we introduce a combination of local ancestry deconvolution and partial PS computation to account for the population-specific nature of the association signals in individuals with admixed ancestry. We demonstrate partial PS to be a proxy for the total PS and that a portion of the genome is enough to improve susceptibility predictions for the traits we test. By combining partial PSs from different populations, we are able to improve trait predictability in admixed individuals with some European ancestry. These results may extend the applicability of PSs to subjects with a complex history of admixture, where current methods cannot be applied.
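The idea of an ancestry-specific partial PS can be illustrated with a minimal sketch. It assumes phased haplotypes coded as effect-allele counts (0 or 1), one local ancestry label per haplotype and SNP, and per-SNP effect sizes taken from a GWAS in the matching population. The function name, the input layout and the toy values are hypothetical, and the normalization step described in the paper is not reproduced here.

```python
from typing import Dict, Sequence, Tuple


def ancestry_specific_partial_ps(
    betas: Sequence[float],
    haplotypes: Tuple[Sequence[int], Sequence[int]],
    local_ancestry: Tuple[Sequence[str], Sequence[str]],
) -> Dict[str, Tuple[float, float]]:
    """Return {ancestry: (partial score, fraction of haplotype-SNP slots used)}.

    Hypothetical helper: each effect allele contributes its beta only to the
    partial score of the ancestry assigned to the haplotype segment carrying it.
    """
    n_snps = len(betas)
    totals: Dict[str, float] = {}
    counts: Dict[str, int] = {}
    for alleles, labels in zip(haplotypes, local_ancestry):
        if len(alleles) != n_snps or len(labels) != n_snps:
            raise ValueError("haplotypes and ancestry labels must match betas in length")
        for beta, allele, label in zip(betas, alleles, labels):
            totals[label] = totals.get(label, 0.0) + beta * allele
            counts[label] = counts.get(label, 0) + 1
    slots = 2 * n_snps  # two haplotypes per individual
    return {label: (totals[label], counts[label] / slots) for label in totals}


# Toy individual with four SNPs (all values are made up for illustration).
betas = [0.5, -0.25, 1.0, 0.75]
hap_a = [1, 0, 1, 1]
hap_b = [0, 1, 1, 0]
anc_a = ["EUR", "EUR", "AFR", "AFR"]
anc_b = ["EUR", "AFR", "AFR", "AFR"]

result = ancestry_specific_partial_ps(betas, (hap_a, hap_b), (anc_a, anc_b))
for ancestry, (score, fraction) in sorted(result.items()):
    print(ancestry, score, fraction)
```

In this toy case the partial scores add up to the total PS because a single set of effect sizes is used; with population-specific effect sizes each partial score would be computed from its own summary statistics.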
Schematic workflow. A graphical representation of the workflow we adopted to obtain normalized PS and ancestry-specific pPS. White boxes represent input data; the two key steps, ancestry deconvolution and partial PS computation, have an orange background.
Population-wide Polygenic Scores (PS) and ancestry-specific partial PS (aspPS). PS distributions for seven reference populations (pastel colors), three admixed populations (yellow) and their respective ancestry-specific partial PS (red and blue). Reference population medians are represented with dashed lines. The width of the boxplots is proportional to the median size of the ancestry fraction used to compute each aspPS. Four different PS for different phenotypes are shown: (a) T2D (ref. 28), (b) breast cancer (ref. 30), (c) height (ref. 29), (d) BMI (ref. 29). Significant differences with randomly assigned ancestral components are encoded as: *: p ≤ 0.05, **: p ≤ 0.005, ***: p ≤ 10⁻⁵ (one-sided Wilcoxon signed-rank test). Sample sizes and exact P-values are reported in Supplementary Data 1. For each distribution, the box represents the interquartile range (IQR = Q3 − Q1), the line across the box indicates the median, and the whiskers extend to the most extreme data points within Q1 − 1.5 × IQR and Q3 + 1.5 × IQR; outliers are omitted. CEU: North-West Europeans from Utah; IBS: Iberians from Spain; TSI: Tuscans from Italy; CHB: Han from Beijing; YRI: Yoruba from Nigeria; LWK: Luhya from Kenya; GUMUZ: Gumuz from Ethiopia; EGYPT: Egyptians; ETHIOPIA: Amhara, Oromo, Wolayta and Ethiopian Somali from Ethiopia; ASW: African-Americans from South-West USA.
pPS predictivity. We entered pPS obtained from genomic subsets of variable size (the same subsets that result from local ancestry analysis in our admixed individuals) into four trait prediction models, using a non-admixed sample set derived from EstBB. Each point represents the performance of a different subset of the genome, applied to all individuals in the population; the horizontal axis reports the fraction of genomic SNPs included in each subset, and the vertical axis its predictivity, expressed as R² (Nagelkerke R² for binary traits). Red dots represent pPS that do not significantly improve the base model without PS (p > 0.05, likelihood ratio test). The dashed line represents the total PS predictivity. (a) Type 2 diabetes, (b) breast cancer, (c) height, (d) BMI.
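The two statistics named in this caption can be written out as a short sketch. It assumes the log-likelihoods of the fitted logistic models are already available, that the base and full models differ by exactly one parameter (the PS), and that the gain in Nagelkerke R² is taken as the difference between the two models; the function names and toy log-likelihoods are hypothetical and not taken from the paper.

```python
import math


def nagelkerke_r2(ll_null: float, ll_model: float, n: int) -> float:
    """Nagelkerke R2 from the log-likelihoods of an intercept-only and a fitted model."""
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)  # upper bound of Cox-Snell R2
    return cox_snell / max_cox_snell


def lrt_pvalue_1df(ll_base: float, ll_full: float) -> float:
    """Likelihood ratio test p-value for nested models differing by one parameter.

    For one degree of freedom the chi-square survival function equals
    erfc(sqrt(x / 2)), so no external statistics library is needed.
    """
    stat = max(2.0 * (ll_full - ll_base), 0.0)
    return math.erfc(math.sqrt(stat / 2.0))


# Toy numbers (hypothetical): balanced case-control set of 100 individuals.
n = 100
ll_null = -n * math.log(2)  # intercept-only model
ll_base = -60.0             # non-genetic covariates only
ll_full = -57.0             # covariates plus a partial PS

added = nagelkerke_r2(ll_null, ll_full, n) - nagelkerke_r2(ll_null, ll_base, n)
p_value = lrt_pvalue_1df(ll_base, ll_full)
print(f"added Nagelkerke R2 = {added:.3f}, LRT p = {p_value:.4f}")
```

Under the caption's convention, a subset whose `p_value` exceeds 0.05 would be drawn as a red dot.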
Population-wide PS and aspPS in UKBB admixed individuals. PS distributions for four reference populations (pastel colors), three admixed populations (yellow) and their respective ancestry-specific partial PS (red, blue, green). Reference population medians are represented with dashed lines. The width of the boxplots is proportional to the median size of the ancestry fraction used to compute each aspPS. Two different PS are shown: (a) height (ref. 29) and (d) BMI (ref. 29). Significant differences with randomly assigned ancestral components are encoded as: *: p ≤ 0.05, **: p ≤ 0.005, ***: p ≤ 10⁻⁵ (one-sided Wilcoxon signed-rank test). Sample sizes and exact P-values are reported in Supplementary Data 1. For each distribution, the box represents the interquartile range (IQR = Q3 − Q1), the line across the box indicates the median, and the whiskers extend to the most extreme data points within Q1 − 1.5 × IQR and Q3 + 1.5 × IQR; outliers are omitted. (c) PS bias, defined as the mean PS difference not explained by trait difference, is compared with FST against the reference population. All populations extracted from UKBB are represented; for UK EURAFR the fraction of European ancestry is shown. UK EUR is the reference population for UKBB-based PSs, while UK EAS is the reference for BBJ-based PSs. EUR indicates European descent, EAS East Asian descent, AFR African descent; combinations indicate admixed samples. FAREUR indicates Europeans far from the UKBB core.
Predictivity in admixed genomes. Each plot shows the improvement in R² when adding a PS to the base, non-genetic model. The line color depicts which PS configuration has been used: traditional total PS (PSUKBB or PSBBJ according to the biobank of origin), partial ancestry-specific PSs or combined ancestry-specific PS (casPS). Dots represent the realized R² improvement in each set without resampling, while bars represent the standard deviation derived from n = 5000 bootstrap replications. (a) Added R² for height in UKBB samples with admixed African and European ancestry; no casPS was available. (b) Added R² for height in UKBB samples with admixed East Asian and European ancestry. (c) Added R² for BMI in UKBB samples with admixed African and European ancestry; no casPS was available. (d) Added R² for BMI in UKBB samples with admixed East Asian and European ancestry. EUR indicates European descent, EAS East Asian descent, AFR African descent; combinations indicate admixed samples.
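The added R² and its bootstrap standard deviation can be sketched as follows. This is an illustration under simplifying assumptions, not the authors' pipeline: the base model holds a single non-genetic covariate, the trait is continuous, the data are simulated, and 200 replications are used instead of the 5000 reported in the caption so that the example runs quickly. All names are hypothetical.

```python
import math
import random
from statistics import fmean, stdev


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def added_r2(y, covariate, ps):
    """R2 gain from adding `ps` to a linear model that already contains `covariate`."""
    r_yc, r_yp, r_cp = pearson(y, covariate), pearson(y, ps), pearson(covariate, ps)
    r2_base = r_yc ** 2
    # Closed form for the R2 of a two-predictor linear regression.
    r2_full = (r_yc ** 2 + r_yp ** 2 - 2 * r_yc * r_yp * r_cp) / (1 - r_cp ** 2)
    return r2_full - r2_base


def bootstrap_sd(y, covariate, ps, n_boot, seed):
    """Standard deviation of the added R2 over resamples of individuals with replacement."""
    rng = random.Random(seed)
    n = len(y)
    replicates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        replicates.append(
            added_r2([y[i] for i in idx], [covariate[i] for i in idx], [ps[i] for i in idx])
        )
    return stdev(replicates)


# Simulated data: the trait depends on one covariate, the PS and noise.
sim = random.Random(0)
covariate = [sim.gauss(0, 1) for _ in range(200)]
ps = [sim.gauss(0, 1) for _ in range(200)]
y = [0.5 * c + 0.4 * p + sim.gauss(0, 1) for c, p in zip(covariate, ps)]

point_estimate = added_r2(y, covariate, ps)                # the "dot" in each panel
sd = bootstrap_sd(y, covariate, ps, n_boot=200, seed=1)   # the "bar" in each panel
print(f"added R2 = {point_estimate:.3f} +/- {sd:.3f}")
```

With more than one covariate in the base model, the closed-form expression would need to be replaced by a general least-squares fit.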
| Title | Authors | Journal | Year | Link |
|---|---|---|---|---|
| Clinical translation of polygenic scores for prostate cancer screening. | Ratner D et al. | — | 2026 | → |
| Clinical use of polygenic risk scores: current status, barriers and future directions. | Kullo IJ | — | 2026 | → |
| Admixed and single-continental genome segments of the same ancestry have distinct linkage disequilibrium patterns. | Lee H et al. | — | 2025 | → |
| CADET: Enhanced transcriptome-wide association analyses in admixed samples using eQTL summary data. | Head ST et al. | — | 2025 | → |
| Characterizing features affecting local ancestry inference performance in admixed populations. | Honorato-Mauer J et al. | — | 2025 | → |
| Data simulation to optimize frameworks for genome-wide association studies in diverse populations. | Mugo JW et al. | — | 2025 | → |
| Decreased Clearance of Low-Density Lipoprotein Cholesterol is Causally Associated With Increased Mortality of Septic Shock. | Takahashi N et al. | — | 2025 | → |
| Fine-scale population structure and widespread conservation of genetic effect sizes between human groups across traits. | Hu S et al. | — | 2025 | → |
| Hidden structure in polygenic scores and the challenge of disentangling ancestry interactions in admixed populations. | Aw AJ et al. | — | 2025 | → |
| Improved allele frequencies in gnomAD through local ancestry inference. | Kore P et al. | — | 2025 | → |
| Incorporating multiracial and multiethnic experiences into genetic counseling practice and research: A necessary opportunity. | Lowe C et al. | — | 2025 | → |
| Leveraging global genetics resources to enhance polygenic prediction across ancestrally diverse populations. | Pain O | — | 2025 | → |
| Leveraging local ancestry and cross-ancestry genetic architecture to improve genetic prediction of complex traits in admixed populations. | Zhou G et al. | — | 2025 | → |
| Multiomics in atherosclerotic cardiovascular disease. | Nordestgaard LT et al. | — | 2025 | → |
| Opportunities and challenges of local ancestry in genetic association analyses. | Sun Q et al. | — | 2025 | → |
| Psychiatric genetics in the diverse landscape of Latin American populations. | Bruxel EM et al. | — | 2025 | → |
| STREAM-PRS: a multi-tool pipeline for streamlining polygenic risk score computation. | Becelaere S et al. | — | 2025 | → |
| The accuracy of polygenic score models for BMI and Type II diabetes in the Native Hawaiian population. | Lo YC et al. | — | 2025 | → |
| The Estonian Biobank's journey from biobanking to personalized medicine. | Milani L et al. | — | 2025 | → |
| Tracing human genetic histories and natural selection with precise local ancestry inference. | Lerga-Jaso J et al. | — | 2025 | → |
| Admix-kit: an integrated toolkit and pipeline for genetic analyses of admixed populations. | Hou K et al. | — | 2024 | → |
| Assessing the Risk Stratification of Breast Cancer Polygenic Risk Scores in a Brazilian Cohort. | Barreiro RAS et al. | — | 2024 | → |
| Complex trait susceptibilities and population diversity in a sample of 4,145 Russians. | Usoltsev D et al. | — | 2024 | → |
| Genetic and molecular architecture of complex traits. | Lappalainen T et al. | — | 2024 | → |
| Improving polygenic risk prediction in admixed populations by explicitly modeling ancestral-differential effects via GAUDI. | Sun Q et al. | — | 2024 | → |
| Methodologies underpinning polygenic risk scores estimation: a comprehensive overview. | Ndong Sima CAA et al. | — | 2024 | → |
| Polygenic risk for suicide attempt is associated with lifetime suicide attempt in US soldiers independent of parental risk. | Stein MB et al. | — | 2024 | → |
| Principles and methods for transferring polygenic risk scores across global populations. | Kachuri L et al. | — | 2024 | → |
| Promoting equity in polygenic risk assessment through global collaboration. | Kullo IJ | — | 2024 | → |
| Recent advances in polygenic scores: translation, equitability, methods and FAIR tools. | Xiang R et al. | — | 2024 | → |
| shaPRS: Leveraging shared genetic effects across traits or ancestries improves accuracy of polygenic scores. | Kelemen M et al. | — | 2024 | → |
| The PRIMED Consortium: Reducing disparities in polygenic risk assessment. | Kullo IJ et al. | — | 2024 | → |
| Causal effects on complex traits are similar for common variants across segments of different continental ancestries within admixed individuals. | Hou K et al. | — | 2023 | → |
| FairPRS: adjusting for admixed populations in polygenic risk scores using invariant risk minimization. | Machado Reyes D et al. | — | 2023 | → |
| Global Biobank analyses provide lessons for developing polygenic risk scores across diverse cohorts. | Wang Y et al. | — | 2023 | → |
| Implementing Reporting Standards for Polygenic Risk Scores for Atherosclerotic Cardiovascular Disease. | Smith JL et al. | — | 2023 | → |
| Improving genetic risk prediction across diverse population by disentangling ancestry representations. | Gyawali PK et al. | — | 2023 | → |
| Local Ancestry Inference for Complex Population Histories | Pearson A et al. | — | 2023 | — |
| Power of inclusion: Enhancing polygenic prediction with admixed individuals. | Tanigawa Y et al. | — | 2023 | → |
| Strategies for the Genomic Analysis of Admixed Populations. | Tan T et al. | — | 2023 | → |
| A Principal Component Informed Approach to Address Polygenic Risk Score Transferability Across European Cohorts. | Pärna K et al. | — | 2022 | → |
| A saturated map of common genetic variants associated with human height. | Yengo L et al. | — | 2022 | → |
| Challenges and Opportunities for Developing More Generalizable Polygenic Risk Scores. | Wang Y et al. | — | 2022 | → |
| Clinical utility of polygenic risk scores for coronary artery disease. | Klarin D et al. | — | 2022 | → |
| Development of a clinical polygenic risk score assay and reporting workflow. | Hao L et al. | — | 2022 | → |
| Enrichment analyses identify shared associations for 25 quantitative traits in over 600,000 individuals from seven diverse ancestries. | Smith SP et al. | — | 2022 | → |
| Gene-based polygenic risk scores analysis of alcohol use disorder in African Americans. | Lai D et al. | — | 2022 | → |
| Genetics and epigenetics of self-injurious thoughts and behaviors: Systematic review of the suicide literature and methodological considerations. | Mirza S et al. | — | 2022 | → |
| Genome-wide risk prediction of common diseases across ancestries in one million people. | Mars N et al. | — | 2022 | → |
| Including diverse and admixed populations in genetic epidemiology research. | Caliebe A et al. | — | 2022 | → |
| Incorporating family history of disease improves polygenic risk scores in diverse populations. | Hujoel MLA et al. | — | 2022 | → |
| Leveraging fine-mapping and multipopulation training data to improve cross-population polygenic risk scores. | Weissbrod O et al. | — | 2022 | → |
| Long-Lived Individuals Show a Lower Burden of Variants Predisposing to Age-Related Diseases and a Higher Polygenic Longevity Score. | Torres GG et al. | — | 2022 | → |
| Polygenic Risk Score in African populations: progress and challenges. | Adam Y et al. | — | 2022 | → |
| Polygenic risk scores for CARDINAL study. | Adebamowo CA et al. | — | 2022 | → |
| Polygenic Risk Scores for Cardiovascular Disease: A Scientific Statement From the American Heart Association. | O'Sullivan JW et al. | — | 2022 | → |
| SALAI-Net: species-agnostic local ancestry inference network. | Oriol Sabat B et al. | — | 2022 | → |
| Use of Polygenic Risk Scores for Coronary Heart Disease in Ancestrally Diverse Populations. | Dikilitas O et al. | — | 2022 | → |
| Admixture Has Shaped Romani Genetic Diversity in Clinically Relevant Variants. | Font-Porterias N et al. | — | 2021 | → |
| Allele frequency differentiation at height-associated SNPs among continental human populations. | Chen M et al. | — | 2021 | → |
| Changes in the fine-scale genetic structure of Finland through the 20th century. | Kerminen S et al. | — | 2021 | → |
| Detecting Genetic Ancestry and Adaptation in the Taiwanese Han People. | Lo YH et al. | — | 2021 | → |
| Genetic propensity for risky behavior and depression and risk of lifetime suicide attempt among urban African Americans in adolescence and young adulthood. | Rabinowitz JA et al. | — | 2021 | → |
| Inclusion of variants discovered from diverse populations improves polygenic risk score transferability. | Cavazos TB et al. | — | 2021 | → |
| Multi-Omic Approaches to Identify Genetic Factors in Metabolic Syndrome. | Clark KC et al. | — | 2021 | → |
| Neuropsychiatric Genetics of Psychosis in the Mexican Population: A Genome-Wide Association Study Protocol for Schizophrenia, Schizoaffective, and Bipolar Disorder Patients and Controls. | Camarena B et al. | — | 2021 | → |
| New Polygenic Risk Score to Predict High Myopia in Singapore Chinese Children. | Lanca C et al. | — | 2021 | → |
| Populations, Traits, and Their Spatial Structure in Humans. | Sohail M et al. | — | 2021 | → |
| Responsible use of polygenic risk scores in the clinic: potential benefits, risks and gaps. | Polygenic Risk Score Task Force of the International Common Disease Alliance | — | 2021 | → |
| Statistical genetics and polygenic risk score for precision medicine. | Konuma T et al. | — | 2021 | → |
| Polygenic Scores for Height in Admixed Populations. | Bitarello BD et al. | — | 2020 | → |
| Validation of a Genome-Wide Polygenic Score for Coronary Artery Disease in South Asians. | Wang M et al. | — | 2020 | → |