Chunk #15 — Results — Performance of the method on simulated data

Source: Theoretical and empirical quantification of the accuracy of polygenic scores in ancestry divergent populations.
Embedded: yes

Text

Our simulations utilise existing genotypes at ~1.1 million common HapMap3 SNPs imputed in 351,983 unrelated UK Biobank (UKB) participants. These participants were categorised into four ancestry-homogeneous groups corresponding to European ancestry (EUR; N_EUR = 333,263), East-Asian ancestry (EAS; N_EAS = 2,257), South-Asian ancestry (SAS; N_SAS = 9,448) and African ancestry (AFR; N_AFR = 7,015). The European ancestry group was further divided into a discovery set of N = 313,284 participants in which GWS SNPs were identified (Methods), a validation set in which the within-European-ancestry accuracy of PGS was quantified, and a reference group in which we predicted the accuracy of PGS. A thorough description of how these groups were defined is given in the Methods section. Because our main focus is to predict the fraction of the RA that can be attributed to differences in allele frequencies and LD between populations, we assumed that effect sizes of causal variants are perfectly correlated across populations, i.e. ρ_b = 1, and that heritability is constant across populations, i.e. \begin{document}$$h_2^2 = h_1^2 =