Chunk #8 — RESULTS — Simulations

Source: Improving polygenic prediction in ancestrally diverse populations.
Embedded: yes
Text

We first applied single-discovery methods to GWAS summary statistics generated from 100K simulated EUR samples and 20K non-EUR (EAS or AFR) samples. We evaluated predictive performance, measured by the squared correlation (R²) between the simulated and predicted phenotypes, in 20K target samples, which were evenly split into a validation dataset and a testing dataset (Fig. 2; Supplementary Table 1). As expected, when the target population was EUR, PRS trained on the larger EUR GWAS were substantially more accurate than PRS trained on non-EUR GWAS (Fig. 2; left panels). However, when the target population was EAS or AFR, PRS trained on ancestry-matched non-EUR GWAS were more predictive than EUR PRS (Fig. 2; right panels), even though the non-EUR GWAS were much smaller (20K vs. 100K samples). Among the three single-discovery methods examined, the Bayesian methods (LDpred2 and PRS-CS) consistently outperformed PT. PRS-CS appeared to be more accurate than LDpred2 in both within- and cross-population prediction when the discovery GWAS was well-powered, whereas LDpred2 was more accurate when the discovery sample size was limited, likely reflecting the strengths and limitations of the different priors used by the two methods (Supplementary Note).
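
The evaluation metric described above can be illustrated with a minimal Python sketch. It computes R² as the squared Pearson correlation between phenotypes and predicted scores, splits the target samples evenly into validation and testing halves, and reports testing R² for the candidate score that performs best in validation. Using the validation half to choose among candidate scores is an assumption about how the split is used. The function names, the candidate labels, and the toy data are hypothetical and are not taken from the study.

```python
import numpy as np


def squared_correlation(phenotype, score):
    """Squared Pearson correlation (R^2) between phenotype and predicted score."""
    phenotype = np.asarray(phenotype, dtype=float)
    score = np.asarray(score, dtype=float)
    r = np.corrcoef(phenotype, score)[0, 1]
    return r ** 2


def split_validation_testing(n_samples, seed=0):
    """Randomly split sample indices into two equal halves."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)
    half = n_samples // 2
    return perm[:half], perm[half:]


def tune_and_evaluate(phenotype, candidate_scores, seed=0):
    """Select the candidate with the highest validation R^2 (assumed use of
    the validation half) and report its R^2 in the held-out testing half.

    candidate_scores maps a hypothetical label to one score per sample.
    """
    phenotype = np.asarray(phenotype, dtype=float)
    val_idx, test_idx = split_validation_testing(len(phenotype), seed)
    val_r2 = {
        label: squared_correlation(phenotype[val_idx], np.asarray(s)[val_idx])
        for label, s in candidate_scores.items()
    }
    best = max(val_r2, key=val_r2.get)
    test_r2 = squared_correlation(
        phenotype[test_idx], np.asarray(candidate_scores[best])[test_idx]
    )
    return best, test_r2


# Toy data only: a genetic component plus noise stands in for a simulated trait.
rng = np.random.default_rng(1)
n_target = 20_000
genetic = rng.normal(size=n_target)
phenotype = genetic + rng.normal(size=n_target)
candidates = {
    "less_noisy": genetic + 0.5 * rng.normal(size=n_target),
    "more_noisy": genetic + 3.0 * rng.normal(size=n_target),
}
best_label, test_r2 = tune_and_evaluate(phenotype, candidates)
print(best_label, round(test_r2, 3))
```

Reporting R² only in the testing half keeps the accuracy estimate separate from the samples used to choose among candidates, so the reported figure is not inflated by the selection step.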