Chunk #19 — Results — CMap query methodology

Source: A Next Generation Connectivity Map: L1000 Platform and the First 1,000,000 Profiles.
Embedded: yes

Text

The connectivity workflow interrogates the CMap database of signatures with a query, a set of differentially expressed genes representing a biological state of interest. Each signature in the database is a weighted average across the three biological replicate perturbations (see STAR Methods). This weighted averaging, the moderated z-score procedure, mitigates the effects of uncorrelated or outlier replicates (Figure 2C). The similarity of the query to each CMap signature is then computed, yielding a rank-ordered list of the 473,647 signatures in the CMap-L1000v1 dataset. However, simply sorting by degree of similarity can be misleading because it does not address issues such as the magnitude of gene expression change or the specificity of observed connections.
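To make the replicate-weighting idea concrete, here is a minimal sketch of a correlation-weighted average of replicate z-score vectors. It is an illustration under stated assumptions, not the procedure from STAR Methods: the use of Spearman correlation, the clipping of negative correlations to zero, the 0.01 weight floor, and all function names (`replicate_weights`, `moderated_signature`) are assumptions introduced here.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    denom = (va * vb) ** 0.5
    return cov / denom if denom else 0.0


def spearman(a, b):
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


def replicate_weights(replicates, floor=0.01):
    """Weight each replicate by its summed correlation with the others.

    Assumptions (not taken from the source text): negative correlations
    are clipped to zero, each raw weight is floored at `floor`, and
    fewer than three replicates are weighted equally.
    """
    n = len(replicates)
    if n < 3:
        return [1.0 / n] * n
    raw = []
    for i in range(n):
        total = sum(
            max(spearman(replicates[i], replicates[j]), 0.0)
            for j in range(n)
            if j != i
        )
        raw.append(max(total, floor))
    norm = sum(raw)
    return [r / norm for r in raw]


def moderated_signature(replicates, floor=0.01):
    """Combine replicate z-score vectors into one weighted-average vector."""
    weights = replicate_weights(replicates, floor)
    n_genes = len(replicates[0])
    return [
        sum(w * rep[g] for w, rep in zip(weights, replicates))
        for g in range(n_genes)
    ]


# Toy data: two concordant replicates and one discordant replicate.
rep_a = [2.0, -1.5, 0.3, 1.1, -0.7]
rep_b = [1.8, -1.2, 0.5, 0.9, -0.4]
rep_c = [-0.9, 1.4, 0.2, -1.0, 2.1]

weights = replicate_weights([rep_a, rep_b, rep_c])
signature = moderated_signature([rep_a, rep_b, rep_c])
```

In this toy case the discordant replicate receives only the floor weight, so the combined signature stays close to the average of the two concordant replicates. That is the sense in which a correlation-weighted average dampens outlier replicates compared with a plain mean.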