Chunk #61 — The Methods — How Do Ensemble Methods Work? — Bagging

Source
An introduction to recursive partitioning: rationale, application, and characteristics of classification and regression trees, bagging, and random forests.

Text

Grandvalet (2004) also points out that the key effect of bagging is that it equalizes the influence of particular observations. This is beneficial in the case of “bad” leverage points, but may be harmful when “good” leverage points, which could improve the model fit, are downweighted. The same effect can be achieved not only by bootstrap sampling, as in standard bagging, but also by subsampling (Grandvalet 2004), which is preferable in many applications because it guarantees unbiased variable selection (Strobl, Boulesteix, Zeileis, and Hothorn 2007; see also section “Bias in variable selection and variable importance”). Ensemble construction can also be viewed in the context of Bayesian model averaging (cf., e.g., Domingos 1997; Hoeting, Madigan, Raftery, and Volinsky 1999, for an introduction). For random forests, which we will consider in the next section, Breiman (2001a, p. 25) states that they may also be viewed as a Bayesian procedure, and continues: “Although I doubt that this is a fruitful line of exploration, if it could explain the bias reduction, I might become more of a Bayesian.”
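The resampling argument above can be illustrated with a small simulation. The following is only a sketch and is not taken from the cited papers: the function names, the sample size, the number of rounds, and the subsampling fraction of 0.632 are all assumptions chosen for illustration. It estimates how often each observation enters a resampled learning set. Under both schemes every observation, including any leverage point, is left out of roughly one third of the ensemble members, which is one way to see why its influence on the aggregated prediction is limited.

```python
import random

def bootstrap_indices(n, rng):
    """Draw n indices with replacement, as in standard bagging."""
    return [rng.randrange(n) for _ in range(n)]

def subsample_indices(n, fraction, rng):
    """Draw round(fraction * n) distinct indices without replacement."""
    return rng.sample(range(n), round(fraction * n))

def inclusion_rate(draw, n, n_rounds):
    """Fraction of rounds in which each observation appears at least once."""
    hits = [0] * n
    for _ in range(n_rounds):
        for i in set(draw()):  # count each observation once per round
            hits[i] += 1
    return [h / n_rounds for h in hits]

rng = random.Random(0)   # fixed seed, so the run is reproducible
n, n_rounds = 50, 2000   # illustrative sizes (assumptions)
fraction = 0.632         # assumed subsample fraction, close to the bootstrap's
                         # expected share of distinct observations

boot = inclusion_rate(lambda: bootstrap_indices(n, rng), n, n_rounds)
sub = inclusion_rate(lambda: subsample_indices(n, fraction, rng), n, n_rounds)

print("bootstrap inclusion rate: min %.3f, max %.3f" % (min(boot), max(boot)))
print("subsample inclusion rate: min %.3f, max %.3f" % (min(sub), max(sub)))
```

In a bootstrap sample an included observation may appear more than once, whereas in a subsample it appears exactly once. The sketch does not demonstrate the consequences of that difference for variable selection; it only shows that the two schemes include each observation at a similar rate.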