The essence of our computational screen, driven by the supervised machine-learning algorithm MetaNeighbor (Crow et al., 2017), is to detect whether a given set of genes shows correlated expression among cells of the same identity (Figure 2A). Because our single-cell transcriptomes derive from 6 PCPs, this data structure allowed us to characterize the similarity between all cell pairs using the co-variation of expression levels in many known gene sets, and to measure whether a given set correctly links cells of known identity. In a network formalism, cells are linked as probabilistically related based on the similarity of their transcriptional profiles across a given set of genes (Figure 2A). This network classifies cells based on their proximity: closely linked neighbors are predicted to share an identity (see Methods). A subset of PCP labels is first applied to cells, giving a sub-network of known identities that can classify unlabeled cells. We then hold back the PCP identity of some cells and attempt to predict their identities using this sub-network. The efficacy of this test, reported as mean area under