Chunk #17 — Methods — Missing data

Source: Clinical and social outcomes of adolescent self harm: population based birth cohort study.
Embedded: yes

Text

Our primary analyses were conducted on an imputed dataset based on those with complete data on self harm with and without suicidal intent at 16 years (n=4799). The number of respondents with complete data (on outcome and all covariates) ranged from 1743 (for not achieving ≥3 A levels) to 2777 (for not achieving ≥5 GCSE or equivalent A*-C grades). To create multiple copies of the dataset in which missing values were replaced by imputed values sampled from their predictive distribution, we conducted multiple imputation by chained equations using the ice command in Stata [37]. This method assumes that data are missing at random, that is, that any systematic differences between the missing and the observed values can be explained by differences in observed data [38]. Overall, we generated 100 imputed datasets for each outcome of interest. In the imputation models we included all variables used in the analysis along with several additional auxiliary variables. These included variables found to be predictive of missingness (see supplementary tables 1a and b), indicators of socioeconomic adversity, personal characteristics, and maternal psychopathology as well as strong correlates of the