Chunk #13 — Methods — Statistical methods — Multiple imputation

Source: Childhood conduct disorder trajectories, prior risk factors and cannabis use at age 16: birth cohort study.
Embedded: yes

Text

Multiple imputation was based on the 7218 individuals with information on the conduct problem trajectory, the main exposure considered, which was derived for those respondents who had four or more of the six SDQ measures required for the original mixture modelling [10]. The preliminary multinomial regression models described above employed list-wise deletion; hence, sample sizes varied from the full sample of 7218 for gender to as few as 4000 for some of the less complete measures. Consequently, the complete-case sample available for multivariable models was smaller than 7218. To address this problem, missing data were imputed by chained equations [26] using the ice routine [27] in Stata, restoring the sample size to 7218 for all analyses. The imputation model contained the conduct problem trajectory assignment, the other outcomes and risk factors described above, and a number of additional auxiliary variables known to be related both to missingness and to the key variables of interest in our models. One hundred data sets were imputed, and these data were exported to Latent Gold. Final model estimates were pooled using Rubin's rules [28].
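
As an illustration of the pooling step, the following is a minimal Python sketch of Rubin's rules for a single scalar parameter. It is not the authors' code (the analyses were run in Stata and Latent Gold); the function name `pool_rubin` and the example numbers are hypothetical. Given one estimate and its squared standard error from each of the m imputed data sets, the pooled estimate is the mean of the estimates, and the total variance is the mean within-imputation variance plus (1 + 1/m) times the between-imputation variance.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one scalar parameter across m imputed data sets (Rubin's rules).

    estimates: per-imputation point estimates (hypothetical inputs)
    variances: per-imputation squared standard errors
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # within-imputation variance
    b = variance(estimates)        # between-imputation variance (divisor m - 1)
    total = u_bar + (1 + 1 / m) * b
    return q_bar, total


# Illustrative values only, not results from the study.
est, tot = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
pooled_se = math.sqrt(tot)
```

With one hundred imputations, as used here, the (1 + 1/m) correction is close to 1, so the total variance is approximately the sum of the within- and between-imputation components.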