The predictions from the final model were then tested against illegal behavior, including arrests. Because the illegal behaviors were reported as binary outcomes (i.e., whether a behavior did or did not occur), we evaluated them using receiver operating characteristic (ROC) curves, which assess the diagnostic utility of an assessment tool by plotting the tradeoff between its sensitivity and specificity in predicting the outcome across decision thresholds. ROC curves were estimated using the ROCR package (Sing, Sander, Beerenwinkel, & Lengauer, 2005) in R. All descriptive statistics (means, standard deviations, and Pearson correlations) and the unconditional multilevel models are based on the raw, non-imputed data set.
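
To make the sensitivity and specificity tradeoff concrete, the following is a minimal sketch of how an ROC curve and its area under the curve (AUC) can be computed from predicted scores and a binary outcome. It is written in plain Python for illustration only and is not the ROCR-based analysis reported here; the function names (`roc_curve`, `auc`) and the example values (`hypothetical_scores`, `hypothetical_arrests`) are assumptions invented for this sketch and are not study data.

```python
def roc_curve(scores, labels):
    """Return (fpr, tpr) lists with one point per distinct score threshold.

    scores: predicted values (higher = more likely to show the behavior)
    labels: observed binary outcome (1 = behavior occurred, 0 = did not)
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative case")

    # Rank cases from the highest to the lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Cases tied at the same score cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        tpr.append(tp / pos)  # sensitivity at this threshold
        fpr.append(fp / neg)  # 1 - specificity at this threshold
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2
        for k in range(1, len(fpr))
    )


# Hypothetical illustration values, not data from the study.
hypothetical_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
hypothetical_arrests = [1, 1, 0, 1, 0, 0, 1, 0]

fpr, tpr = roc_curve(hypothetical_scores, hypothetical_arrests)
area = auc(fpr, tpr)
print(f"AUC = {area:.2f}")
```

For these illustrative values the AUC is 0.75, meaning that a randomly chosen case with the behavior receives a higher predicted score than a randomly chosen case without it 75% of the time. An AUC of 0.50 corresponds to chance-level discrimination and 1.00 to perfect discrimination.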