We view this group of 50 hallmarks as an initial set, derived from the gene set clusters whose relationship to a biological theme was clear during manual review. Notably, this first set already covers a broad range of cellular processes and represents about half of the gene sets in MSigDB. Encouraged by the current results, which demonstrate an increase in signal strength and the good summarization capability of the hallmarks, we plan to enhance and expand the collection. We believe this collection will prove to be a valuable resource for the community and will provide more precise results when used with enrichment analysis methods.