Chunk #53 — Methods — Deterministic accuracy of PGS in trans-ancestry genetic prediction

Source: Theoretical and empirical quantification of the accuracy of polygenic scores in ancestry divergent populations.
Embedded: yes

Text

When using Eq. (1) with known causal variants, we first matched GWS SNPs to the causal variants to calculate the LD correlations and allele frequencies between populations (results shown as RApred1 in the simulations). This was done by setting the window centred on each GWS SNP to 100 kb and then selecting the pairs that included known causal variants. The window size was based on the report that ~95% of top lead SNPs (with MAF > 0.01) identified from GWASs lie within 100 kb of the causal variants in European ancestry [21]. Although the causal variants are often unknown or unobserved in a classical GWAS, they are usually tagged by numerous SNPs. Therefore, we took advantage of the information on the fine-mapping precision of GWASs and selected as candidate causal variants those SNPs in LD (r² > 0.45) with GWS SNPs and located within the 100 kb window [21]. These pairs of GWS SNPs and candidate causal variants were then used in Eq. (2), with results referred to as RApred2 in the simulations. When assuming the PGS-SNPs to be the causal variants, we estimated the accuracies using Eq. (1) with the LD correlation replaced by 1 (results denoted as RApred3 in the simulations). The LD correlations were estimated using PLINK 1.90 [52] (--r).
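
The candidate-selection step used for RApred2 can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Snp` record, the `select_candidate_pairs` function, and the `ld_r` lookup of signed pairwise LD correlations (for example, parsed from PLINK `--r` output) are hypothetical names introduced here. The sketch also assumes that "within 100 kb" means at most 100,000 bp from the GWS SNP; if the window is instead 100 kb wide in total, `max_dist_bp` would be 50,000.

```python
from typing import NamedTuple


class Snp(NamedTuple):
    snp_id: str
    chrom: str
    pos: int  # base-pair position


def select_candidate_pairs(gws_snps, other_snps, ld_r,
                           max_dist_bp=100_000, min_r2=0.45):
    """Return (gws_id, candidate_id, r) for SNPs close to a GWS SNP with r^2 > min_r2.

    ld_r maps (snp_id_a, snp_id_b) to the signed LD correlation r.
    max_dist_bp is an assumed reading of the 100 kb window (see lead-in).
    """
    pairs = []
    for g in gws_snps:
        for s in other_snps:
            # Candidates must sit on the same chromosome and inside the window.
            if s.chrom != g.chrom or abs(s.pos - g.pos) > max_dist_bp:
                continue
            # Look the pair up in either order; skip pairs with no LD estimate.
            r = ld_r.get((g.snp_id, s.snp_id))
            if r is None:
                r = ld_r.get((s.snp_id, g.snp_id))
            if r is not None and r * r > min_r2:
                pairs.append((g.snp_id, s.snp_id, r))
    return pairs


# Toy data, invented purely for illustration.
gws = [Snp("rs1", "1", 1_000_000)]
others = [
    Snp("rs2", "1", 1_050_000),  # in window, r^2 = 0.49: kept
    Snp("rs3", "1", 1_080_000),  # in window, r^2 = 0.25: dropped
    Snp("rs4", "1", 1_200_000),  # strong LD but 200 kb away: dropped
    Snp("rs5", "2", 1_010_000),  # different chromosome: dropped
]
ld_r = {
    ("rs1", "rs2"): -0.7,
    ("rs1", "rs3"): 0.5,
    ("rs1", "rs4"): 0.9,
    ("rs1", "rs5"): 0.9,
}

pairs = select_candidate_pairs(gws, others, ld_r)
print(pairs)
```

The sketch keeps the signed correlation r rather than r², since the text reports LD correlations estimated with `--r`; the r² threshold is applied only as a filter.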