where $x_{i,1},\dots,x_{i,p}$ stand for the observed values of $X_{1},\dots,X_{p}$ for the $i$th observation. As an illustration, we display in Fig. 1 the partial dependence plots obtained by logistic regression and random forest for three simulated datasets representing classification problems, each including $n=1000$ independent observations. For each dataset, the variable $Y$ is simulated according to the formula $\log (P(Y=1)/P(Y=0))=\beta _{0}+\beta _{1}X_{1}+\beta _{2}X_{2}+\beta _{3}X_{1}X_{2}+\beta _{4}X_{1}^{2}$. The first dataset (top) represents the linear scenario ($\beta_{1}\neq 0$, $\beta_{2}\neq 0$, $\beta_{3}=\beta_{4}=0$), the second dataset (middle) an interaction ($\beta_{1}\neq 0$, $\beta_{2}\neq 0$, $\beta_{3}\neq 0$, $\beta_{4}=0$) and the third (bottom) a case of non-linearity ($\beta_{1}=\beta_{2}=\beta_{3}=0$, $\beta_{4}\neq 0$). For all three datasets, the random vector $(X_{1},X_{2})^{\top}$ follows the distribution $\mathcal {N}_{2}(0,I)$, with $I$ representing the identity matrix. The data points are shown in the left column, while the right column displays the PDPs for RF, for logistic regression, and for the true logistic regression model (i.e. with the true coefficient values instead of fitted values). We see that RF captures the dependence and non-linearity structures in cases 2 and 3, whereas logistic regression, as expected, does not.
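The simulation design and the partial dependence computation can be sketched in a few lines of Python. This is only a minimal illustration, not the code used to produce Fig. 1: the coefficient values in `SCENARIOS` are hypothetical (the text states only which coefficients are non-zero), the function names `simulate`, `true_prob` and `partial_dependence` are ours, and we assume the partial dependence is computed on the probability scale $P(Y=1)$. Only the true model is evaluated here.

```python
import math
import random

# Hypothetical coefficients (beta0, ..., beta4); the text only says which are non-zero.
SCENARIOS = {
    "linear":      (0.0, 1.0, 1.0, 0.0, 0.0),
    "interaction": (0.0, 1.0, 1.0, 2.0, 0.0),
    "non-linear":  (-1.0, 0.0, 0.0, 0.0, 1.0),
}


def true_prob(x, beta):
    """P(Y=1 | X1, X2) under the logistic model with the given coefficients."""
    b0, b1, b2, b3, b4 = beta
    x1, x2 = x
    eta = b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2 + b4 * x1 ** 2
    return 1.0 / (1.0 + math.exp(-eta))


def simulate(n, beta, seed=0):
    """Draw (X1, X2) ~ N2(0, I) and a binary Y from the logistic model."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        X.append(row)
        y.append(1 if rng.random() < true_prob(row, beta) else 0)
    return X, y


def partial_dependence(predict, X, j, grid):
    """For each grid value v, average predict() over all observed rows
    after replacing feature j by v (the other features keep their observed values)."""
    pdp = []
    for v in grid:
        total = 0.0
        for row in X:
            modified = list(row)
            modified[j] = v
            total += predict(modified)
        pdp.append(total / len(X))
    return pdp


# Partial dependence on X1 (index 0) of the true model, on a grid from -3 to 3.
grid = [g / 2 for g in range(-6, 7)]
for name, beta in SCENARIOS.items():
    X, y = simulate(1000, beta, seed=1)
    pdp = partial_dependence(lambda row: true_prob(row, beta), X, 0, grid)
    print(name, [round(v, 2) for v in pdp])
```

Because `partial_dependence` only requires a callable that returns a prediction for one observation, the predicted-probability function of a fitted logistic regression or random forest could be passed in place of `true_prob` to obtain the other curves of the kind shown in Fig. 1.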