
Chunk #5 — Materials and Methods — Polygenic risk score using a single training population

Source
Multiethnic polygenic risk scores improve risk prediction in diverse populations.

Text

The parameters $R^2_{LD}$ and $P_T$ are commonly tuned on validation data to optimize prediction accuracy (International Schizophrenia Consortium et al., 2009; Stahl et al., 2012). While in theory this procedure is susceptible to overfitting, in practice validation sample sizes are typically large, and $R^2_{LD}$ and $P_T$ are selected from a small discrete set of parameter choices, so overfitting is considered to have a negligible effect. Accordingly, in this work, we consider $R^2_{LD} \in \{0.1, 0.2, 0.5, 0.8\}$ and $P_T \in \{1.0, 0.8, 0.5, 0.4, 0.3, 0.2, 0.1, 0.08, 0.05, 0.02, 0.01, 10^{-3}, 10^{-4}, 10^{-5}, 10^{-6}, 10^{-7}, 10^{-8}\}$, and we always report results corresponding to the best choices of these parameters. In all of our primary analyses involving two training populations (see below), values of $R^2_{LD}$ and $P_T$ were optimized based only on the PRS in a single training population, to ensure that the PRS using two training populations did not gain any relative advantage from the optimization of these parameters.
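
The tuning procedure described above amounts to an exhaustive search over the 4 × 17 grid of parameter pairs. The following Python sketch illustrates the idea only and is not the authors' implementation. It makes several assumptions: prediction accuracy is measured as the squared correlation between the PRS and the phenotype; `tune_prs`, `build_prs`, and `toy_build_prs` are hypothetical names; the data are simulated; and the toy score builder applies only the P-value threshold, ignoring the LD-pruning parameter, which a real pipeline would pass to its clumping step.

```python
import itertools
import numpy as np

# Parameter grids as listed in the text.
R2_LD_GRID = [0.1, 0.2, 0.5, 0.8]
P_T_GRID = [1.0, 0.8, 0.5, 0.4, 0.3, 0.2, 0.1, 0.08, 0.05, 0.02, 0.01,
            1e-3, 1e-4, 1e-5, 1e-6, 1e-7, 1e-8]


def squared_correlation(prs, phenotype):
    """Squared Pearson correlation (assumed accuracy metric)."""
    if np.std(prs) == 0:
        return 0.0  # a constant score carries no predictive information
    return float(np.corrcoef(prs, phenotype)[0, 1] ** 2)


def tune_prs(build_prs, phenotype, r2_grid=R2_LD_GRID, pt_grid=P_T_GRID):
    """Return (r2_ld, p_t, accuracy) for the best pair on validation data.

    build_prs(r2_ld, p_t) is a hypothetical callable that returns one
    score per validation sample for the given parameter pair.
    """
    best = None
    for r2_ld, p_t in itertools.product(r2_grid, pt_grid):
        acc = squared_correlation(build_prs(r2_ld, p_t), phenotype)
        if best is None or acc > best[2]:
            best = (r2_ld, p_t, acc)
    return best


# Simulated validation data (illustrative only, not from the paper).
rng = np.random.default_rng(0)
n_samples, n_snps, n_causal = 200, 50, 5
genotypes = rng.binomial(2, 0.3, size=(n_samples, n_snps)).astype(float)
true_beta = np.zeros(n_snps)
true_beta[:n_causal] = 0.5
phenotype = genotypes @ true_beta + rng.normal(size=n_samples)

# Stand-ins for training-population summary statistics.
beta_hat = true_beta + rng.normal(scale=0.1, size=n_snps)
p_values = rng.uniform(size=n_snps)
p_values[:n_causal] = 1e-9


def toy_build_prs(r2_ld, p_t):
    """Toy score: P-value thresholding only; r2_ld is ignored here."""
    keep = p_values <= p_t
    return genotypes[:, keep] @ beta_hat[keep]


best_r2_ld, best_p_t, best_acc = tune_prs(toy_build_prs, phenotype)
print(best_r2_ld, best_p_t, round(best_acc, 3))
```

Because the toy builder ignores `r2_ld`, all four pruning values tie and the search returns the first one in the grid. In the two-population setting described above, the same search would be run on the single-population PRS only, and the selected pair would then be reused unchanged.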