Secondly, like all real-data studies, our study considers datasets that follow different, unknown distributions. It is not possible to control the characteristics of these datasets that may be relevant to the performance of RF and LR. Simulations fill this gap and often yield valuable insights into the performance of methods in various settings that a real-data study cannot provide.