Classification and regression trees are provably Bayes consistent, i.e. in principle they can approximate any decision boundary, whether linear or highly nonlinear, given a sufficiently large data set and allowed to grow at a proper rate (see, e.g., Devroye, Györfi, and Lugosi 1996). For linear functions, the problem from a practical point of view is that a single tree’s step-function approximation will be rather poor. Ensembles of trees, however, can approximate functions more smoothly by averaging over the single trees’ step-functions.