Our findings highlight the impact of validation schemes on classification performance and suggest poor discrimination between OCD patients and HC when data from multiple sites are combined. In contrast, discrimination between patient subgroups defined by medication status enabled fair classification of individual subjects. However, our control experiments indicated that non-brain covariates such as age, sex and site can heavily affect classification performance, depending on how the structural neuroimaging data relate to those covariates. Even after removal of the covariate effects, the results still indicated that medication use is associated with substantial, widely distributed differences in brain anatomy, whereas the gross gray matter anatomy of patients with OCD was comparable to that of healthy controls. Taken together, these results also suggest that clinical heterogeneity contributes to the poor performance of structural MRI as a disease biomarker.
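To make the two methodological points above concrete (site-aware validation and covariate removal), the following is a minimal sketch on synthetic data, not the procedure used in this study. It assumes a leave-one-site-out split and a linear regression of each feature on age and sex that is fitted on the training sites only and then applied to the held-out site. All names (`fit_covariate_model`, `remove_covariates`, `leave_one_site_out`) and all data are hypothetical.

```python
import numpy as np

def fit_covariate_model(X_train, C_train):
    """Least-squares fit of every feature on the covariates plus an intercept."""
    design = np.column_stack([np.ones(len(C_train)), C_train])
    beta, *_ = np.linalg.lstsq(design, X_train, rcond=None)
    return beta

def remove_covariates(X, C, beta):
    """Subtract the covariate-predicted part of the features."""
    design = np.column_stack([np.ones(len(C)), C])
    return X - design @ beta

def leave_one_site_out(sites):
    """Yield (train_idx, test_idx) with one whole site held out per fold."""
    for site in np.unique(sites):
        held_out = sites == site
        yield np.where(~held_out)[0], np.where(held_out)[0]

# Synthetic stand-in data (assumption): 4 sites, 30 subjects each, 5 features.
rng = np.random.default_rng(0)
n_subjects, n_features = 120, 5
sites = np.repeat(np.arange(4), 30)
age = rng.normal(35, 10, n_subjects)
sex = rng.integers(0, 2, n_subjects)
C = np.column_stack([age, sex])
X = 0.05 * age[:, None] + 0.3 * sex[:, None] + rng.normal(size=(n_subjects, n_features))

folds = list(leave_one_site_out(sites))
for train_idx, test_idx in folds:
    # Fit the covariate model on training sites only to avoid leakage.
    beta = fit_covariate_model(X[train_idx], C[train_idx])
    X_train_res = remove_covariates(X[train_idx], C[train_idx], beta)
    X_test_res = remove_covariates(X[test_idx], C[test_idx], beta)
    # A classifier would be trained on X_train_res and evaluated on X_test_res here.
```

Site itself is not regressed out in this sketch, because a held-out site has no corresponding term in a model fitted on the remaining sites; handling site effects would require a separate design choice.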