After the early seminal work on automated interaction detection by Morgan and Sonquist (1963), the two most popular algorithms for classification and regression trees (abbreviated as classification trees in most of the following), CART and C4.5, were introduced independently by Breiman et al. (1984) and Quinlan (1986, 1993), respectively. Their nonparametric approach and the straightforward interpretability of the results have added much to the popularity of classification trees (cf., e.g., Hannöver, Richard, Hansen, Martinovich, and Kordy 2002; Kitsantas, Moore, and Sly 2007, for applications to the treatment effect in patients with eating disorders and to determinants of adolescent smoking habits). As an advancement of single classification trees, random forests (Breiman 2001a) and their predecessor, bagging (Breiman 1996a, 1998), are so-called “ensemble methods”, in which an ensemble or committee of classification trees is aggregated for prediction. This section introduces the main concepts of classification trees, which are then employed as so-called “base learners” in the ensemble methods bagging and random forests.