Chunk #21 — 2. Methods — 2.7. Random Forest Classification Model and Parameters

Source: Random Forest Classification of Alcohol Use Disorder Using fMRI Functional Connectivity, Neuropsychological Functioning, and Impulsivity Measures.
Embedded: yes

Text

first, the model trains itself on the training data, building each tree from a bootstrap sample of that data (bootstrap aggregating, or bagging). At the second level, the algorithm randomly selects a subset of features as split candidates at each node while growing a decision tree for group classification. To maximize classification accuracy (by reducing error or impurity), only the single best feature (variable) from this random subset is used to split each internal node. This process is repeated recursively until one of three conditions is met: (i) the tree has reached a specified depth, (ii) the number of samples in a node falls below a set threshold, or (iii) all the samples in a node belong to the same category [84]. Some of the important concepts and parameters of the Random Forest classification method are listed in Table A1 (see Appendix A).
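The two levels of randomization and the three stopping conditions described above can be illustrated with a minimal pure-Python sketch. It is not the implementation used in this study. The function names (gini, best_split, grow_tree, fit_forest, predict_forest), the parameter names (max_depth, min_samples, n_sub), and the default values are hypothetical choices made for illustration. Gini impurity is assumed as the impurity measure, and the forest is assumed to combine its trees by majority vote.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (assumed impurity measure)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, feature_ids):
    """Best (feature, threshold) among the candidate features, or None
    if no candidate split lowers the impurity of the node."""
    best, best_score = None, gini(y)
    for f in feature_ids:
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def grow_tree(X, y, rng, max_depth=5, min_samples=2, n_sub=1, depth=0):
    """Grow one tree recursively; leaves are class labels, internal nodes dicts."""
    majority = Counter(y).most_common(1)[0][0]
    # Stopping conditions: (i) depth limit reached, (ii) too few samples
    # in the node, (iii) all samples in the node share one category.
    if depth >= max_depth or len(y) < min_samples or len(set(y)) == 1:
        return majority
    # Second level of randomization: random feature subset at this node.
    feature_ids = rng.sample(range(len(X[0])), n_sub)
    split = best_split(X, y, feature_ids)
    if split is None:
        return majority
    f, t = split
    left = [i for i, row in enumerate(X) if row[f] <= t]
    right = [i for i, row in enumerate(X) if row[f] > t]
    args = (rng, max_depth, min_samples, n_sub, depth + 1)
    return {
        "feature": f,
        "threshold": t,
        "left": grow_tree([X[i] for i in left], [y[i] for i in left], *args),
        "right": grow_tree([X[i] for i in right], [y[i] for i in right], *args),
    }

def predict_tree(node, row):
    """Follow the splits from the root until a leaf label is reached."""
    while isinstance(node, dict):
        side = "left" if row[node["feature"]] <= node["threshold"] else "right"
        node = node[side]
    return node

def fit_forest(X, y, n_trees=25, seed=0, **tree_params):
    """First level of randomization: one bootstrap sample per tree (bagging)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # with replacement
        forest.append(grow_tree([X[i] for i in idx], [y[i] for i in idx],
                                rng, **tree_params))
    return forest

def predict_forest(forest, row):
    """Majority vote over the individual tree predictions."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]
```

In this sketch, X is a list of feature rows and y a list of group labels, so a call such as fit_forest(X, y, n_trees=25, n_sub=1) followed by predict_forest(forest, new_row) returns the label that most trees vote for. The exhaustive threshold search in best_split is chosen for readability, not efficiency.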