The statistical approach was designed to address the study aims (i–iii). First (i), Elastic Net (EN) regression models were used to identify the specific PD criteria that best predicted lifetime AUD (logistic model) over and above the other criteria [8–10]. The EN method can handle many predictor variables and sets the regression coefficients of uninformative variables to zero, thereby removing them from the model automatically. It was designed to improve on the performance of classic regression methods when the number of predictor variables is large. To examine the robustness of the findings, the variable-selection analysis was replicated using age of AUD onset (Cox’s proportional-hazards model) as the outcome, and both outcomes were further analyzed in the wave 2 data using all the available PD information (totaling 4 partial replications). EN models were estimated using the “glmnet” R package, version 2.0-5. The supplementary material provides more details on EN regression.
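
As a brief illustration of how the EN penalty performs variable selection, the logistic EN objective can be written in the general form used in the glmnet documentation. The notation here is introduced for illustration and is not taken from the main text: N is the number of participants, x_i the vector of PD criteria for participant i, y_i the binary lifetime AUD indicator, and λ and α the tuning parameters. The values of λ and α used in the study are not stated in this section.

$$
\min_{\beta_0,\,\beta}\; -\frac{1}{N}\sum_{i=1}^{N}\Big[\,y_i\,(\beta_0 + x_i^{\top}\beta) - \log\!\big(1 + e^{\beta_0 + x_i^{\top}\beta}\big)\Big] \;+\; \lambda\Big[\frac{1-\alpha}{2}\,\lVert\beta\rVert_2^2 + \alpha\,\lVert\beta\rVert_1\Big], \qquad \lambda \ge 0,\; 0 \le \alpha \le 1 .
$$

The L1 term, α‖β‖₁, is what allows coefficients to be set exactly to zero, while the L2 term, (1−α)‖β‖₂²/2, stabilizes the estimates when predictors are correlated. Setting α = 1 gives the lasso and α = 0 gives ridge regression. For the age-of-onset outcome, the logistic log-likelihood would be replaced by the Cox log partial likelihood, with the same penalty.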