We normalized gene expression UMI count data using SCTransform (ref. 45) and performed PCA on the SCTransform Pearson residual matrix using the RunPCA function in Seurat. We found the 20 nearest neighbors for each cell using the FindNeighbors function, with dims = 1:50 to use the first 50 principal components. We annotated cell types in the PBMC dataset by label transfer from a publicly available multimodal PBMC reference dataset (ref. 42). We identified anchor cells (ref. 41) between the query and reference datasets using the FindTransferAnchors function in Seurat v4, with reference.reduction = 'spca' to use a precomputed dimension reduction stored in the reference object. We then computed a cell type prediction for each cell in the query using the TransferData function in Seurat. As erythrocytes are not nucleated and the query PBMC dataset was derived from cell nuclei, any erythrocyte prediction in the query is necessarily incorrect. We therefore reassigned the small number of cells predicted as erythrocytes to the most common predicted class among each cell's 20 nearest neighbors.
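The neighbor-based reassignment of erythrocyte predictions can be illustrated with a minimal sketch. This is not the code used in the analysis, which was carried out in Seurat. The function name `reassign_label`, the label string "Eryth", and the toy neighbor lists are hypothetical. The sketch also makes two assumptions that the text does not specify: neighbors that themselves carry the erythrocyte label are excluded from the vote, and a cell whose neighbors are all predicted as erythrocytes is left unchanged.

```python
from collections import Counter


def reassign_label(labels, neighbors, bad_label="Eryth"):
    """Replace `bad_label` with the most common label among each cell's neighbors.

    labels    : list of predicted class names, one per cell
    neighbors : list of lists; neighbors[i] holds the indices of the
                nearest neighbors of cell i (20 per cell in the text above)
    """
    new_labels = list(labels)
    for cell, label in enumerate(labels):
        if label != bad_label:
            continue
        # Vote with the original predictions so that the result does not
        # depend on the order in which cells are processed.
        # Assumption: neighbors carrying `bad_label` do not get a vote.
        votes = Counter(
            labels[n] for n in neighbors[cell] if labels[n] != bad_label
        )
        if votes:
            # Ties are broken by first occurrence among the neighbors.
            new_labels[cell] = votes.most_common(1)[0][0]
    return new_labels


# Toy example with five cells and hand-written neighbor lists.
labels = ["CD14 Mono", "CD14 Mono", "Eryth", "CD4 T", "Eryth"]
neighbors = [[1, 2], [0, 2], [0, 1, 3], [0, 4], [2]]
fixed = reassign_label(labels, neighbors)
```

In the toy example, cell 2 is reassigned to the majority class of its three neighbors, while cell 4 keeps its label because its only neighbor is also predicted as an erythrocyte. In a real analysis the neighbor indices would come from the same nearest-neighbor search used for the rest of the workflow.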