We next tested the CMap for its ability to produce biologically meaningful connections. While our analysis of replicate measurements demonstrated that L1000 is robust, it is conceivable that as the size of the dataset increased, so might biological and technical noise, thereby obscuring real signal. To address this, we compiled 7,578 perturbational signatures from public sources, from which we identified 1,143 perturbational profiles (across multiple expression platforms; Table S5) that matched a CMap-L1000v1 perturbagen and were therefore eligible for Recall analysis. For each query, we assessed whether it connected to its equivalent in CMap-L1000v1 at a high level of confidence (defined as NP <= 0.05, FDR <= 0.25, and |τ| >= 90). Of the 1,143 queries, 909 (80%) exhibited the expected connectivity. We note that the inference of expression values from landmarks was essential to recovering connections: 20% of connections were lost when the analysis was restricted to landmarks only. Furthermore, 48 query signatures contained zero landmark transcripts and were therefore not analyzable without inference of the remainder of the transcriptome.
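
To make the high-confidence criterion concrete, the following minimal Python sketch applies the three thresholds stated above to a set of query results and reports the fraction recalled. It is an illustration only, not the pipeline used in this work: the function names and the tuple layout are hypothetical, and we assume that NP denotes a nominal p-value, that FDR is expressed as a fraction, and that τ lies on a scale from -100 to 100.

```python
# Thresholds as stated in the text. All names below are hypothetical.
NP_MAX = 0.05       # assumed: NP is a nominal p-value
FDR_MAX = 0.25      # assumed: FDR expressed as a fraction
ABS_TAU_MIN = 90.0  # assumed: tau ranges from -100 to 100


def is_high_confidence(nominal_p, fdr, tau):
    """Return True when a query-to-reference connection meets all three thresholds."""
    return nominal_p <= NP_MAX and fdr <= FDR_MAX and abs(tau) >= ABS_TAU_MIN


def recall(results):
    """Fraction of (nominal_p, fdr, tau) tuples that are high-confidence connections."""
    results = list(results)
    if not results:
        raise ValueError("recall is undefined for an empty result set")
    hits = sum(is_high_confidence(p, q, t) for p, q, t in results)
    return hits / len(results)


if __name__ == "__main__":
    # Toy values for illustration only; these are not data from the study.
    toy = [(0.01, 0.10, 95.0), (0.04, 0.20, -92.0), (0.20, 0.50, 40.0)]
    print(f"toy recall: {recall(toy):.2f}")
```

Because the criterion uses the absolute value of τ, strongly negative connections count as recalled alongside strongly positive ones. Under this definition, the reported 909 of 1,143 queries corresponds to a recall of 909/1,143, or approximately 80%.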