The importance of accounting for relevant confounds is exemplified in Fig 6e, which shows a strong apparent association between total white matter volume and fat-free body mass (one scatter point per subject) without de-confounding. In fact, this association is largely driven by the average differences in body mass and head size between sexes (see color coding) and disappears after adjusting for sex, age and head size. This is an example of Simpson’s paradox43, in which suboptimal pooling across variables (here, sex) results in a misleading association. Other pitfalls include failing to consider study population selection bias44 and inappropriate de-confounding of variables that are caused by (and not feeding into) the variables of interest45. While there is no guarantee that UK Biobank is an unbiased sample of the full population, that does not imply that studies using subsets of the data have to retain any biases (though again it is still possible for bias to arise44); one important aspect of study design will be the method of sub-selection of Biobank subjects to feed into an analysis. In the case of focused