We sought biological validation that the hallmark gene sets detect their annotated processes by measuring their performance against experimental data from established protein reporters of pathway activation. For this, we used a subset of the Cancer Cell Line Encyclopedia (CCLE) gene expression dataset (Barretina et al., 2012). Besides gene expression, the CCLE repository (www.broadinstitute.org/ccle) maintains detailed genomic, proteomic, and pharmacologic records for about 1,000 cancer cell lines. We projected the CCLE gene expression dataset onto the hallmark gene sets using ssGSEA. To define protein abundance phenotypes, we used the CCLE reverse-phase protein array (RPPA) data. RPPA is an antibody-based assay that quantifies protein expression and allows concordant interrogation of multiple proteins across many samples (Spurrier et al., 2008). For a large panel of CCLE cell lines, we obtained the RPPA abundances of 8 proteins (AR, BCL2, CDH2 [N-cadherin], ESR1, KDR [VEGFR2], MYC, SMAD3, and STAT5A) and of a variant of STAT3 phosphorylated at Tyr705 (STAT3_pY705). Next, we matched 9 relevant ssGSEA hallmark profiles to those phenotypes using the IC, and assessed the significance of their matching scores using