The reason for this is that, even if the true distribution of the data in both branches were very similar, random variation in the sample, combined with the deterministic variable and cutpoint selection strategy of classification trees, makes it extremely unlikely that the same splitting variable, and also the exact same cutpoint, would be selected in both branches. However, even a slightly different cutpoint in the same variable would, strictly speaking, represent an interaction. The literature therefore states that classification trees cannot (or rather, are extremely unlikely to) represent additive functions that consist only of main effects, whereas they are well suited to representing complex interactions.
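The argument can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the procedure of any particular tree implementation: the data-generating model, the function names (`gini`, `best_cutpoint`, `simulate`), the sample size and the seed are all hypothetical choices made for illustration. The response depends additively on two predictors, with a true step at 0.5 in each. The first split is fixed at `x1 = 0.5` for simplicity, and the Gini-optimal cutpoint in `x2` is then searched for separately in each branch.

```python
import random


def gini(pos, n):
    """Gini impurity of a node with `pos` positive cases among `n` cases."""
    if n == 0:
        return 0.0
    p = pos / n
    return 2.0 * p * (1.0 - p)


def best_cutpoint(xs, ys):
    """Cutpoint on xs that minimises the weighted Gini impurity of the two child nodes."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    total_pos = sum(y for _, y in pairs)
    best_cut, best_score = None, float("inf")
    left_pos = 0
    for i in range(1, n):
        left_pos += pairs[i - 1][1]
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cutpoint can separate tied predictor values
        score = (i * gini(left_pos, i)
                 + (n - i) * gini(total_pos - left_pos, n - i)) / n
        if score < best_score:
            best_score = score
            # Candidate cutpoints are midpoints between adjacent observed values.
            best_cut = (pairs[i - 1][0] + pairs[i][0]) / 2.0
    return best_cut


def simulate(n, rng):
    """Assumed model: purely additive main effects of x1 and x2, no interaction."""
    data = []
    for _ in range(n):
        x1, x2 = rng.random(), rng.random()
        p = 0.2 + 0.3 * (x1 > 0.5) + 0.3 * (x2 > 0.5)
        y = 1 if rng.random() < p else 0
        data.append((x1, x2, y))
    return data


rng = random.Random(1)  # arbitrary seed, for reproducibility only
data = simulate(2000, rng)

# First split fixed at x1 = 0.5 (an assumption of this sketch).
left = [(x2, y) for x1, x2, y in data if x1 <= 0.5]
right = [(x2, y) for x1, x2, y in data if x1 > 0.5]

cut_left = best_cutpoint(*zip(*left))
cut_right = best_cutpoint(*zip(*right))
print("cutpoint in x2, left branch: ", cut_left)
print("cutpoint in x2, right branch:", cut_right)
```

Although the effect of `x2` is identical in both branches by construction, each cutpoint is estimated from a different subsample, so the two selected values do not coincide exactly. The fitted tree therefore formally encodes an interaction between `x1` and `x2` that is absent from the data-generating model.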