The Moran dataset 47 was obtained from GEO (accession GSE8397). The U133A and U133B CEL files were processed separately. The data were read in using the ReadAffy function from the R affy package 98, and Robust Multi-array Average (RMA) normalization was then applied. The U133A and U133B expression data were merged after RMA. Probe annotation and mapping to HGNC symbols were done using the biomaRt R package 99. Differential expression analysis was performed using limma 100, with age and gender as covariates. The Lesnick dataset 46 was obtained from GEO (accession GSE7621). Data were processed as for the Moran dataset; however, age was not available for use as a covariate. The Dijkstra dataset 50 was obtained from GEO (accession GSE49036) and processed as above, with gender and RNA integrity number (RIN) values used as covariates. As the transcriptome datasets measured gene expression in the substantia nigra, we only kept cell types that are present in the substantia nigra or ventral midbrain for our EWCE 11 analysis. We computed a new specificity matrix based on the substantia nigra or ventral midbrain