The biggest barrier to integrating biological knowledge with agnostic GEWIS data may be the lack of ontologies designed to bring together information not only on SNPs, genes, and pathways, but also on their relevant environmental substrates, known relationships to disease, metabolic parameters, and toxicological information. The creation of such a database is arguably one of the most important contributions of the Human Genome Epidemiology Network (HuGE NET) project129, but it is highly labor-intensive because expert curation of the literature is needed; the network's valuable series of reviews on specific topics130,131 does not replace the need for a searchable database that could provide prior covariate information in a systematic and unbiased manner. Automatic literature-mining approaches132,133 have been developed that can help assign sets of genes to shared pathways or interaction networks. However, these approaches remain vulnerable to bias in what is investigated and published: the current literature on G×E interactions is very sparse, highly subject to publication bias, and poorly replicated, and it tends to reflect a “looking under the lamppost” mentality in terms of what gets studied. Other genomic or pathway ontologies134-136 tend to be limited to purely genetic information and are therefore only partially useful for G×E modeling.