We performed UMAP with the RunUMAP function in the Seurat package (v.3.2.0), using LSI components 2 to 40 for the PBMC multiome dataset, 2 to 50 for the CRC tumor dataset, 2 to 30 for the PBMC scATAC-seq dataset and 2 to 100 for the BICCN mouse brain dataset. The first LSI component was excluded from each analysis because it typically captures sequencing depth (technical variation) and was highly correlated with the total number of counts per cell. The RunUMAP function uses the uwot R package to compute two-dimensional UMAP coordinates (ref. 48; https://CRAN.R-project.org/package=uwot).
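The component selection described above can be illustrated with a minimal sketch in Python (NumPy only). It is not the Seurat or uwot code used in this work: the helper names `depth_correlation` and `select_components` are hypothetical, and the embedding is toy data in which the first component is constructed to track per-cell total counts. The sketch only shows how one might check the correlation of each component with sequencing depth and then retain components 2 to 40 before passing the reduced matrix to a UMAP implementation.

```python
import numpy as np


def depth_correlation(embedding, total_counts):
    """Pearson correlation of each component (column) with per-cell total counts."""
    emb = embedding - embedding.mean(axis=0)
    depth = total_counts - total_counts.mean()
    num = emb.T @ depth
    den = np.linalg.norm(emb, axis=0) * np.linalg.norm(depth)
    return num / den


def select_components(embedding, first, last):
    """Return components first..last (1-based, inclusive), e.g. 2..40."""
    return embedding[:, first - 1:last]


# Toy data (assumption): 500 cells, 40 components, random values throughout.
rng = np.random.default_rng(0)
n_cells, n_comp = 500, 40
total_counts = rng.integers(1_000, 50_000, size=n_cells).astype(float)
embedding = rng.normal(size=(n_cells, n_comp))

# Assumption for illustration: component 1 is made to track sequencing depth.
embedding[:, 0] = total_counts / total_counts.max()

corr = depth_correlation(embedding, total_counts)
print("component most correlated with depth:", int(np.argmax(np.abs(corr))) + 1)

# Drop component 1 and keep components 2 to 40, as for the PBMC multiome dataset.
reduced = select_components(embedding, 2, 40)
print("matrix passed to UMAP has shape:", reduced.shape)
```

In this sketch the upper bound (40) is the only value that would change between datasets; the lower bound stays at 2 so that the depth-associated component is always left out.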