Chunk #35 — Methods — Availability of data and materials

Source
Random forest versus logistic regression: a large-scale benchmark experiment.

Text

We place emphasis on the reproducibility of our results. Firstly, the code implementing all our analyses is fully available from GitHub [35]. For visualization-only purposes, the benchmarking results are available from this link, so that our graphics can be regenerated quickly with a mouse click. The code to recompute these results, i.e. to conduct the benchmarking study, is also available from GitHub. Secondly, because we use specific versions of R and add-on packages, our results may become difficult to reproduce in the future due to software updates; we therefore also provide a Docker image [36]. Docker automates the deployment of applications inside a so-called “Docker container” [37]. We use it to create an R environment with all the packages we need in their correct versions. Note that Docker is not strictly necessary here, since all our code is available from GitHub, but it is very practical for providing a reproducible environment and thus for reproducible research in the long term.
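To illustrate how a Docker image can freeze an R environment, the following is a minimal sketch of a Dockerfile, not the image referenced in [36]. The base image tag, the package name and version, the CRAN mirror URL, and the `/home/benchmark` path are all assumptions chosen for illustration; the actual versions used in the study should be taken from the authors' repository.

```dockerfile
# Illustrative sketch only: every tag, version, and path below is a placeholder,
# not the configuration actually used by the authors.

# Start from a base image that ships one fixed R version.
FROM rocker/r-ver:4.3.2

# Install the 'remotes' helper, then pin an add-on package to an exact version.
# 'randomForest' and '4.7-1.1' are example values (assumption).
RUN Rscript -e "install.packages('remotes', repos = 'https://cloud.r-project.org')" \
 && Rscript -e "remotes::install_version('randomForest', version = '4.7-1.1', repos = 'https://cloud.r-project.org')"

# Copy the analysis code into the image (hypothetical location).
COPY . /home/benchmark
WORKDIR /home/benchmark

# Open an interactive R session by default.
CMD ["R"]
```

Because the R version and each package version are fixed when the image is built, rerunning the analysis inside a container started from that image should not be affected by later software updates on the host.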