Chunk #66 — Discussion — Outlook

Source: Random forest versus logistic regression: a large-scale benchmark experiment.
Embedded: yes

Text

In this paper, we mainly focus on RF with default parameters as implemented in the widely used package randomForest, and only briefly consider parameter tuning, using a tuning procedure implemented in the package tuneRanger, as an outlook. The rationale for this choice was to provide evidence on default values, and thereby on the analysis strategy most researchers currently apply in practice. The development of reliable and practical parameter tuning strategies, however, is crucial, and more attention should be devoted to it in the future. Tuning strategies should themselves be compared in benchmark studies. Beyond the special case of RF, particular attention should be given to the development of user-friendly tools such as tuneRanger [4], considering that one of the main reasons for using default values is probably ease of use, an important aspect in the hectic academic context. By presenting the results on the average superiority of RF with default values over LR, we by no means want to establish these default values as definitive. Instead, our study is intended as a fundamental first step towards well-designed studies providing solid, well-delimited evidence on performance.