Both of these state-reduction approaches speed up imputation, but our cross-validations show that IMPUTE2 attains higher accuracy than Beagle in practice, especially at low-frequency variants in datasets that have higher haplotype diversity (e.g. those with recent African ancestry). We suggest that this is because clustering models have inherent difficulties capturing low-frequency variation: by grouping similar haplotypes into clusters, these methods obscure the differences between those haplotypes, which reduces the ability to impute low-frequency variants. This could explain why the accuracy disparity between IMPUTE2 and Beagle was largest in African populations, which have higher genetic diversity than non-African populations and hence a larger fraction of low-frequency haplotypes. Methods like Beagle may be able to make up some of this ground by using more clusters, but this will further increase the computational load.
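The argument about clustering can be illustrated with a deliberately simplified sketch. It is not the model used by IMPUTE2 or Beagle: the reference panel, the hand-assigned cluster labels (`c1`, `c2`), the mismatch probability `EPS`, and both scoring functions are hypothetical and chosen only to make the effect visible. One rare haplotype carries the minor allele at an untyped marker and differs from nine common haplotypes at a single typed marker, so a coarse clustering places all ten in the same cluster.

```python
# Toy illustration only: NOT the actual IMPUTE2 or Beagle model.
EPS = 0.01  # assumed per-marker mismatch probability (hypothetical value)

# Each entry: (alleles at typed markers, allele at one untyped marker, cluster label).
COMMON = (0, 0, 0, 0, 0, 0)
RARE = (0, 0, 0, 0, 0, 1)    # differs from COMMON at one typed marker
OTHER = (1, 1, 1, 1, 1, 1)
panel = [(COMMON, 0, "c1")] * 9 + [(RARE, 1, "c1")] + [(OTHER, 0, "c2")] * 10


def haplotype_level(study):
    """Weight every reference haplotype by how well it matches the study haplotype."""
    num = den = 0.0
    for typed, untyped, _ in panel:
        w = 1.0
        for a, b in zip(typed, study):
            w *= (1 - EPS) if a == b else EPS
        num += w * untyped
        den += w
    return num / den


def cluster_level(study):
    """Summarise each cluster by its allele frequencies, then weight the clusters."""
    clusters = {}
    for typed, untyped, label in panel:
        clusters.setdefault(label, []).append((typed, untyped))
    num = den = 0.0
    for members in clusters.values():
        n = len(members)
        w = float(n)  # prior weight proportional to cluster size
        for j, obs in enumerate(study):
            freq = sum(t[j] for t, _ in members) / n
            freq = min(max(freq, EPS), 1 - EPS)  # avoid zero probabilities
            w *= freq if obs == 1 else 1 - freq
        untyped_freq = sum(u for _, u in members) / n
        num += w * untyped_freq
        den += w
    return num / den


# A study haplotype that matches the rare reference haplotype at the typed markers.
study = RARE
p_hap = haplotype_level(study)
p_cluster = cluster_level(study)
print(f"haplotype-level: {p_hap:.3f}, cluster-level: {p_cluster:.3f}")
```

In this toy panel the haplotype-level estimate of the minor-allele probability stays near 0.92, whereas the cluster-level estimate falls to about 0.10, the allele's frequency within its cluster. Splitting `c1` so that the rare haplotype has its own cluster would recover the signal, at the cost of more states to evaluate.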