
Chunk #27 — Methods — Inclusion criteria and subgroup analyses

Source
Random forest versus logistic regression: a large-scale benchmark experiment.

Text

We conjecture that, in published studies, datasets are occasionally removed from the experiment a posteriori because the results do not meet the researchers' expectations or hopes. While the vast majority of researchers certainly do not cheat consciously, such practices may introduce substantial bias into the conclusions of a benchmarking experiment; see previous literature [27] for theoretical and empirical investigations of this problem. Therefore, “fishing for datasets” after completion of the benchmark experiment should be prohibited; see Rule 4 of the “ten simple rules for reducing over-optimistic reporting” [28].