As seen in Supplementary Table S1, several sites included only medicated patients, whereas other sites included only non-medicated patients. To assess whether the high performance reported above for classifying medication status could be explained by site differences in the covariates (e.g., site ID) rather than by the neuroimaging data, we performed the following control experiments. We reran both the medication and main diagnosis classifications under three conditions: using neuroimaging data only (without covariates), using covariates only, and using neuroimaging data after removing the effect of the covariates by multiple linear regression. To maintain independence between the training and test sets, regression coefficients were estimated on the training data only and then applied to the test data.
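
The covariate-removal step can be illustrated with a minimal sketch. It is not the pipeline used in this study. It assumes that covariates are already numerically encoded (e.g., site ID as indicator columns), that an intercept is included, and that ordinary least squares is fitted per feature. The function names `fit_covariate_model` and `remove_covariates` and the synthetic data are hypothetical and serve only to show that the coefficients come from the training set alone.

```python
import numpy as np


def fit_covariate_model(X_train, C_train):
    """Estimate least-squares coefficients (intercept + covariates) on training data only."""
    design = np.column_stack([np.ones(len(C_train)), C_train])
    beta, *_ = np.linalg.lstsq(design, X_train, rcond=None)
    return beta  # shape: (1 + n_covariates, n_features)


def remove_covariates(X, C, beta):
    """Subtract the covariate-predicted part of X using previously fitted coefficients."""
    design = np.column_stack([np.ones(len(C)), C])
    return X - design @ beta


# Synthetic stand-in data (assumption): 2 covariates, 5 imaging features.
rng = np.random.default_rng(0)
n_train, n_test, n_feat = 80, 20, 5
C_train = rng.normal(size=(n_train, 2))
C_test = rng.normal(size=(n_test, 2))
W = rng.normal(size=(2, n_feat))
X_train = C_train @ W + 0.1 * rng.normal(size=(n_train, n_feat))
X_test = C_test @ W + 0.1 * rng.normal(size=(n_test, n_feat))

# Coefficients are estimated on the training set only ...
beta = fit_covariate_model(X_train, C_train)
# ... and the same coefficients are applied to both sets.
X_train_res = remove_covariates(X_train, C_train, beta)
X_test_res = remove_covariates(X_test, C_test, beta)
```

In a cross-validation setting, the fit would be repeated inside each fold so that no test-fold sample contributes to the regression coefficients.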