Chunk #6 — Materials and methods — Multivariate classification and validation

Source
Structural neuroimaging biomarkers for obsessive-compulsive disorder in the ENIGMA-OCD consortium: medication matters.
Embedded
yes

Text

Participants with >10% missing entries were excluded (n = 276), and median imputation was used for missing MRI data on the training set. Continuous features were centered to a median of zero and scaled by their interquartile range. FreeSurfer variables were combined with the covariates age, sex, and site by concatenating the individual feature vectors. Categorical covariates were one-hot encoded prior to classification. All analyses were performed separately for pediatric and adult patients, and for both groups combined. Common MVPA classifiers were applied: support vector machine (SVM) with linear and non-linear (radial basis function, RBF) kernels, logistic regression (LR) with L1 and L2 regularization, Gaussian process classification (GPC) with a linear kernel, and two decision-tree-based ensemble methods, namely the random forest classifier (RFC) and the XGBoost (XGB) algorithm [29–32]. A neural network was also implemented (fully connected; 3 hidden layers with 60, 40, and 20 nodes, respectively). SVM and LR classifiers were applied both with and without automatic dimensionality reduction via principal component analysis (PCA), using the minimal number of components explaining 90% of the variance. Hyper-parameters for SVM (linear and non-linear), LR and XGB were optimized
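
The preprocessing steps described in this chunk (exclusion by missingness, median imputation, median/interquartile-range scaling, one-hot encoding, and concatenation) can be illustrated with a minimal standard-library Python sketch. This is not the authors' code: the function names, the use of `None` for a missing entry, the toy data, and the choice to fit the scaler on the imputed training data are all assumptions made for illustration.

```python
from statistics import median, quantiles


def keep_complete_enough(rows, max_missing=0.10):
    """Drop rows (participants) whose fraction of missing entries (None) exceeds max_missing."""
    return [r for r in rows if sum(v is None for v in r) / len(r) <= max_missing]


def fit_medians(train_rows):
    """Per-feature median of the observed (non-missing) training values."""
    return [median(v for v in col if v is not None) for col in zip(*train_rows)]


def impute(rows, medians):
    """Replace each missing entry with the training median of its feature."""
    return [[m if v is None else v for v, m in zip(r, medians)] for r in rows]


def fit_robust_scaler(train_rows):
    """Per-feature (median, IQR) computed on imputed training data."""
    params = []
    for col in zip(*train_rows):
        q1, _, q3 = quantiles(col, n=4, method="inclusive")
        params.append((median(col), (q3 - q1) or 1.0))  # fall back to 1.0 if the IQR is zero
    return params


def robust_scale(rows, params):
    """Center each feature on its median and divide by its interquartile range."""
    return [[(v - med) / iqr for v, (med, iqr) in zip(r, params)] for r in rows]


def one_hot(values, categories):
    """Encode each categorical value as a 0/1 indicator vector over the given categories."""
    return [[1.0 if v == c else 0.0 for c in categories] for v in values]


# Hypothetical toy training data: two continuous features, one missing entry.
# The exclusion step is not applied here because these toy rows have only two features.
train = [[1.0, 10.0], [2.0, None], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
sites = ["A", "B", "A", "C", "B"]  # hypothetical site labels

medians = fit_medians(train)
imputed = impute(train, medians)
scaler = fit_robust_scaler(imputed)
scaled = robust_scale(imputed, scaler)

# Concatenate the scaled continuous features with the one-hot encoded site covariate.
features = [x + s for x, s in zip(scaled, one_hot(sites, ["A", "B", "C"]))]
```

In this sketch the medians and interquartile ranges are estimated from training data only and would then be reused unchanged for any held-out data via `impute` and `robust_scale`, which avoids information leaking from the test set into preprocessing.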