Databases of genetic variation are important for our understanding of human population history and biology1–5, and they also provide critical resources for the clinical interpretation of variants observed in patients with rare Mendelian diseases6,7. Filtering candidate variants by their frequency in unselected individuals is a key step in any pipeline for discovering causal variants in Mendelian disease patients, and the efficacy of such filtering depends on both the size and the ancestral diversity of the available reference data.
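
As an illustration of the frequency-filtering step, the following minimal Python sketch keeps only candidate variants whose highest allele frequency across reference populations falls at or below a cutoff. The `Variant` class, the population labels, the example frequencies, and the 0.1% threshold are all hypothetical and chosen for illustration; they are not taken from any particular pipeline or data set.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Variant:
    """A candidate variant with per-population reference allele frequencies."""
    variant_id: str
    pop_af: Dict[str, float]  # hypothetical population label -> allele frequency


def popmax_af(variant: Variant) -> float:
    """Highest allele frequency seen in any reference population.

    A variant absent from the reference data (no entries) is treated as
    having frequency 0.0.
    """
    return max(variant.pop_af.values(), default=0.0)


def filter_rare(variants: List[Variant], max_af: float = 0.001) -> List[Variant]:
    """Keep variants whose maximum population frequency is <= max_af.

    The default cutoff of 0.001 is an illustrative assumption only.
    """
    return [v for v in variants if popmax_af(v) <= max_af]


# Illustrative (made-up) candidates from a single patient.
candidates = [
    Variant("var_A", {"pop1": 0.0, "pop2": 0.0}),
    Variant("var_B", {"pop1": 0.0002, "pop2": 0.03}),  # common in pop2 only
    Variant("var_C", {}),  # not observed in the reference data
]

kept = filter_rare(candidates)
print([v.variant_id for v in kept])
```

In this toy example, `var_B` is removed only because the reference data include `pop2`; with `pop1` alone it would pass the cutoff. This is one way to see why ancestral diversity, as well as sample size, affects how well such filtering works.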