We queried the Gene Expression Omnibus (GEO; Barrett et al., 2007) and ArrayExpress (Rustici et al., 2013) for relevant human, mouse, or rat expression datasets for each of the 73 clusters from the previous step. Each dataset was required to contain at least 3 samples in each phenotypic class. We also verified that none of these datasets had been used to define any of the hallmarks. For subsequent steps, we chose 43 of these clusters, for each of which we identified at least 3 datasets for refinement and a fourth, independent dataset for validation. These 43 clusters were annotated with 50 biological themes, because seven clusters gave rise to two themes each owing to heterogeneity within their founding gene sets (Supplemental Experimental Procedures, Note 1 and Table S4). We plan to continue assigning themes and processing the remaining clusters to develop additional hallmarks in the future.
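
The selection criteria above can be illustrated with a minimal sketch. It is not the procedure actually used for the hallmarks. It only restates the stated thresholds (at least 3 samples per phenotypic class, at least 3 refinement datasets plus 1 independent validation dataset per cluster) in code. The `Dataset` record, the function names, the cluster and accession labels, and the rule of holding out the last eligible dataset for validation are all hypothetical choices made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass

# Thresholds taken from the text.
MIN_SAMPLES_PER_CLASS = 3
MIN_REFINEMENT_DATASETS = 3
N_VALIDATION_DATASETS = 1


@dataclass
class Dataset:
    """Hypothetical record for one expression dataset."""
    accession: str                      # placeholder identifier, not a real accession
    class_sizes: dict[str, int]         # phenotypic class -> number of samples
    used_to_define_hallmarks: bool = False


def is_eligible(ds: Dataset) -> bool:
    """Apply the per-dataset criteria described in the text."""
    if ds.used_to_define_hallmarks:
        return False
    # Assumption: a dataset with no annotated classes is not usable.
    return bool(ds.class_sizes) and all(
        n >= MIN_SAMPLES_PER_CLASS for n in ds.class_sizes.values()
    )


def select_clusters(candidates: dict[str, list[Dataset]]) -> dict[str, dict]:
    """Keep clusters with enough eligible datasets for refinement and validation."""
    needed = MIN_REFINEMENT_DATASETS + N_VALIDATION_DATASETS
    selected = {}
    for cluster, datasets in candidates.items():
        eligible = [ds for ds in datasets if is_eligible(ds)]
        if len(eligible) >= needed:
            # Assumption: hold out the last eligible dataset for validation.
            selected[cluster] = {
                "refinement": eligible[:-1],
                "validation": eligible[-1],
            }
    return selected


def theme_count(n_clusters: int, n_split_in_two: int) -> int:
    """Number of themes when some clusters yield two themes and the rest one."""
    return (n_clusters - n_split_in_two) + 2 * n_split_in_two


# Toy input with made-up labels.
candidates = {
    "cluster_A": [
        Dataset("DS1", {"case": 3, "control": 4}),
        Dataset("DS2", {"case": 5, "control": 5}),
        Dataset("DS3", {"case": 3, "control": 3}),
        Dataset("DS4", {"case": 6, "control": 3}),
    ],
    "cluster_B": [
        Dataset("DS5", {"case": 2, "control": 8}),  # too few samples in one class
        Dataset("DS6", {"case": 4, "control": 4}),
        Dataset("DS7", {"case": 3, "control": 3}),
        Dataset("DS8", {"case": 3, "control": 3}),
    ],
}
selected = select_clusters(candidates)
print(sorted(selected))
print(theme_count(43, 7))
```

In this toy input only `cluster_A` passes, because `cluster_B` is left with three eligible datasets once `DS5` is excluded. The `theme_count` helper reproduces the arithmetic in the text: 36 clusters with one theme plus 7 clusters with two themes gives 50.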