Finally, we sought to determine the proportion of variance (R²) in persistent externalizing explained by variants within the SNP set. To obtain a total R² across the SNPs tested, we used a polygenic risk score approach with five-fold within-sample validation, following recommended methods (Vrieze, McGue, Miller, Hicks, & Iacono, 2013). To avoid the gross overfitting that occurs when the same data serve as both discovery and target samples, the data were first split into five nearly equal-sized groups (N = 337, subgroup Ns = 67, with one random group having N = 68). Next, we conducted linear association analyses in PLINK (exactly as in the full-sample analysis) five times, each time using four-fifths of the total data set (i.e., leaving out one subgroup). The group left out of each analysis then became the target sample for computing polygenic risk scores. As such, each participant was included in a ‘discovery’ sample four times and in the target sample once. Importantly, no individual was included in both the discovery and target samples within any single analysis.
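The fold assignment described above can be illustrated with a minimal Python sketch. It is not the code used in the study: the function names `assign_folds` and `discovery_target_splits`, the random seed, and the use of integer participant indices are all hypothetical, and the PLINK association and scoring steps are not reproduced. The sketch only shows how a sample of 337 can be partitioned so that each participant is held out exactly once; note that a strictly even split of 337 gives groups of 67 or 68.

```python
import random
from collections import Counter


def assign_folds(n_participants, k=5, seed=0):
    """Randomly assign each participant index to one of k nearly equal folds.

    Hypothetical helper for illustration; the seed is an arbitrary assumption.
    """
    # Cycle through fold labels 0..k-1 so fold sizes differ by at most one,
    # then shuffle so that membership is random.
    labels = [i % k for i in range(n_participants)]
    random.Random(seed).shuffle(labels)
    return labels


def discovery_target_splits(labels, k=5):
    """Yield (discovery, target) index lists, holding out one fold at a time."""
    for held_out in range(k):
        target = [i for i, fold in enumerate(labels) if fold == held_out]
        discovery = [i for i, fold in enumerate(labels) if fold != held_out]
        yield discovery, target


labels = assign_folds(337, k=5, seed=0)
fold_sizes = sorted(Counter(labels).values())
splits = list(discovery_target_splits(labels, k=5))

for discovery, target in splits:
    # In the actual workflow, association weights would be estimated on
    # `discovery` and risk scores computed for `target`.
    print(len(discovery), len(target))
```

Because the target indices are exactly the complement of the discovery indices in each iteration, no participant can contribute to both the SNP weights and the scores evaluated with those weights.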