Chunk #27 — 2. Materials and Methods — 2.7. RF Classification Model and Parameters

Source
Random Forest Classification of Alcohol Use Disorder Using EEG Source Functional Connectivity, Neuropsychological Functioning, and Impulsivity Measures.
Embedded
yes

Text

of variables randomly selected for the splitting decision at each node. The RF uses two levels of randomness to construct the ensemble of trees. At the first level, each tree is trained on a bootstrap sample of the training data (bootstrap aggregating, or bagging). At the second level, the algorithm randomly selects a subset of features as split candidates at each node while growing a decision tree for group classification. To maximize classification accuracy (by reducing the error or impurity), only the single best feature (variable) among this random subset is selected at each internal node. This process is repeated recursively until one of three conditions is met: (i) the tree has reached a specified depth, (ii) the number of samples in a node falls below a set threshold, or (iii) all the samples in a node belong to the same category [106]. Some of the important concepts and parameters of the RF classification method are listed in Box A1 (see Appendix A).
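The procedure described above can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation. The function names (`fit_forest`, `grow_tree`, `best_split`), the parameter names (`mtry`, `max_depth`, `min_samples`), the toy data, the use of Gini impurity as the split criterion, midpoint thresholds, and majority voting across trees are all assumptions made for illustration. The source states only that the split reduces "errors or impurity".

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (assumed split criterion)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, feature_ids):
    """Best (feature, threshold) among the candidate features, or None if
    no candidate split lowers the impurity of the node."""
    best, best_score, n = None, gini(y), len(y)
    for f in feature_ids:
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between neighbouring values
            left = [y[i] for i in range(n) if X[i][f] <= t]
            right = [y[i] for i in range(n) if X[i][f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def grow_tree(X, y, rng, max_depth, min_samples, mtry, depth=0):
    # Stopping conditions (i)-(iii): depth limit reached, too few samples,
    # or all samples in the node share one class.
    if depth >= max_depth or len(y) < min_samples or len(set(y)) == 1:
        return majority(y)
    # Second level of randomness: random feature subset at this node.
    feature_ids = rng.sample(range(len(X[0])), mtry)
    split = best_split(X, y, feature_ids)
    if split is None:
        return majority(y)
    f, t = split
    left = [i for i in range(len(y)) if X[i][f] <= t]
    right = [i for i in range(len(y)) if X[i][f] > t]
    args = (rng, max_depth, min_samples, mtry, depth + 1)
    return {
        "feature": f,
        "threshold": t,
        "left": grow_tree([X[i] for i in left], [y[i] for i in left], *args),
        "right": grow_tree([X[i] for i in right], [y[i] for i in right], *args),
    }

def predict_tree(node, row):
    # Internal nodes are dicts; anything else is a leaf label.
    while isinstance(node, dict):
        branch = "left" if row[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node

def fit_forest(X, y, n_trees=25, max_depth=5, min_samples=2, mtry=1, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # First level of randomness: bootstrap sample (drawn with replacement).
        idx = [rng.randrange(len(y)) for _ in range(len(y))]
        forest.append(grow_tree([X[i] for i in idx], [y[i] for i in idx],
                                rng, max_depth, min_samples, mtry))
    return forest

def predict_forest(forest, row):
    """Majority vote over the trees (assumed aggregation rule)."""
    return majority([predict_tree(tree, row) for tree in forest])

# Hypothetical toy data: two features, two well-separated groups.
X = [[1.0, 5.0], [1.2, 4.8], [0.9, 5.1], [1.1, 5.3],
     [3.0, 1.0], [3.2, 0.8], [2.9, 1.1], [3.1, 1.2]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(X, y)
print(predict_forest(forest, [1.0, 5.0]), predict_forest(forest, [3.0, 1.0]))
```

In this sketch, `mtry` plays the role of the number of variables randomly selected at each node, while `max_depth` and `min_samples` correspond to stopping conditions (i) and (ii). Condition (iii) is the `len(set(y)) == 1` check.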