First, we estimated the effect of the GPS at each P-value threshold to identify the most predictive score (based on model R²) for each alcohol phenotype. We then tested whether relationship status moderated the association between the genome-wide polygenic scores and each alcohol phenotype. Where we found evidence of a significant interaction, we fitted a more robust model for evaluating G × E [41], which includes all G × covariate and E × covariate interaction terms. Finally, we tested for sex-specific G × E by including a three-way interaction term. We considered estimates significant at α = 0.05 (two-sided test). Because FinnTwin12 is a family-based data set, we evaluated all hypotheses using linear mixed models with a random intercept for each family, fitted with the lme4 package [54] in R version 3.5.1 [55]. We estimated effect sizes (ΔR²) using a method designed for mixed-effects models [56], as implemented in the MuMIn package [57].
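The sequence of models described above can be written out schematically. The notation below is an illustrative assumption and is not taken from the original text: y_ij is the alcohol phenotype for individual i in family j, G the polygenic score, E relationship status, S sex, C_k the remaining covariates, and u_j the family-level random intercept. The explicit lower-order G × S and E × S terms in the last model, and the definition of ΔR² as a difference between nested models, are likewise assumptions of this sketch.

```latex
\begin{align}
% Basic G x E model with a random intercept per family
y_{ij} &= \beta_0 + \beta_G G_{ij} + \beta_E E_{ij} + \beta_{GE}\, G_{ij} E_{ij}
          + \beta_S S_{ij} + \sum_k \gamma_k C_{kij} + u_j + \varepsilon_{ij}, \\
u_j &\sim \mathcal{N}(0, \sigma_u^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2) \\[4pt]
% More robust model: add every G x covariate and E x covariate term
% (here sex is treated as one of the covariates)
y_{ij} &= \beta_0 + \beta_G G_{ij} + \beta_E E_{ij} + \beta_{GE}\, G_{ij} E_{ij}
          + \beta_S S_{ij} + \beta_{GS}\, G_{ij} S_{ij} + \beta_{ES}\, E_{ij} S_{ij} \notag \\
       &\quad + \sum_k \left( \gamma_k C_{kij} + \delta_k\, G_{ij} C_{kij}
          + \theta_k\, E_{ij} C_{kij} \right) + u_j + \varepsilon_{ij} \\[4pt]
% Sex-specific G x E: the robust model plus the three-way term
y_{ij} &= \left[\text{right-hand side of the robust model}\right]
          + \beta_{GES}\, G_{ij} E_{ij} S_{ij} \\[4pt]
% Effect size of a term as the change in R^2 between nested models
\Delta R^2 &= R^2_{\text{with term}} - R^2_{\text{without term}}
\end{align}
```

In this notation, the moderation test corresponds to β_GE and the sex-specific test to β_GES, each evaluated at the two-sided α stated above.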