These results should be interpreted with caution: they may depend strongly on the particular distribution of the meta-features over the 243 datasets, and confounding may be an issue. Nevertheless, we conclude from the “Explaining differences: datasets’ meta-features” section that meta-features substantially affect Δacc. This underlines the importance of defining clear inclusion criteria for the datasets in a benchmark experiment and of considering the distributions of their meta-features.