Regression models were used to test the association between the polygenic risk scores based on smoking (predictor variable) and the alcohol and cannabis variables (outcome variables). Linear regression models were used for continuous outcome variables and logistic regression models for dichotomous outcome variables. Regression analyses were carried out in Stata (version 9.0) and corrected for family clustering by employing the robust cluster option. Sex and birth cohort were added as covariates. To make clear how much variance is explained by the risk score itself and how much by the covariates, the R² is presented for the regression model including only the polygenic risk score (model 1), the regression model with risk score and sex (model 2) and the regression model with risk score, sex and age (model 3).
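
The nested-model comparison can be illustrated with a minimal sketch in Python. It is not the Stata analysis itself. It assumes a continuous outcome fitted by ordinary least squares, and all data, variable names (`prs`, `sex`, `age`, `outcome`) and effect sizes are synthetic and hypothetical. Family clustering is left out because the robust cluster correction changes the standard errors, not the coefficient estimates or the R². For the dichotomous outcomes a pseudo-R² from the logistic model would be needed instead.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(y, predictors):
    """R^2 of an OLS fit of y on an intercept plus the given predictor columns."""
    n = len(y)
    cols = [[1.0] * n] + predictors
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(b * c[i] for b, c in zip(beta, cols)) for i in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic data (hypothetical effect sizes, for illustration only).
rng = random.Random(0)
n_obs = 500
prs = [rng.gauss(0, 1) for _ in range(n_obs)]             # polygenic risk score
sex = [float(rng.random() < 0.5) for _ in range(n_obs)]   # 0/1 indicator
age = [rng.uniform(18, 60) for _ in range(n_obs)]
outcome = [0.3 * p + 0.5 * s + 0.02 * a + rng.gauss(0, 1)
           for p, s, a in zip(prs, sex, age)]

r2_model1 = r_squared(outcome, [prs])            # risk score only
r2_model2 = r_squared(outcome, [prs, sex])       # risk score + sex
r2_model3 = r_squared(outcome, [prs, sex, age])  # risk score + sex + age

print(f"model 1 R2: {r2_model1:.4f}")
print(f"model 2 R2: {r2_model2:.4f}")
print(f"model 3 R2: {r2_model3:.4f}")
```

Because the models are nested, the R² cannot decrease from model 1 to model 3. The R² of model 1 is the variance explained by the risk score alone, and the increments in models 2 and 3 show what the covariates add.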