A crucial but missing component of microbiome research is a robust, comprehensive reference set of microbiome samples, together with metadata about those samples, that is available for public, unrestricted use. Such a dataset would characterize what we know about the diversity of the human microbiome and its relationship to the health and lifestyle choices of individuals, providing much-needed context against which to compare the findings of focused studies, such as those on particular disease populations. This reference would allow researchers to place their studies in the framework of what is already known and so better interpret observed patterns (compelling examples can be found in [17, 18]). It would also enable stringent hypothesis testing and evaluation of effect sizes. A robust reference dataset must be built on a cross-sectional study design in order to understand the variation in the population, while also including rich longitudinal components to enable an understanding of how species structure changes over time.