To demonstrate the potential of atomic sketch integration to perform ‘community-wide’ analysis, we first considered scRNA-seq datasets of the human lung. During the COVID-19 pandemic, scRNA-seq data from respiratory tissues have been collected widely, particularly by the Human Cell Atlas Lung Biological Network58. Leveraging a recently published ‘database’ of scRNA-seq studies59, as well as a collection of openly released lung and upper airway datasets from the Human Cell Atlas60, we assembled a group of 19 datasets spanning 1,525,710 total individual cells. We created an atomic dictionary consisting of 5,000 cells from each dataset (95,000 total atoms), integrated these cells, and then reconstructed the full datasets. Our atomic sketch integration procedure performed all these steps (including preprocessing) in 55 minutes, using a single computational core. We found that the integrated latent space preserved the neighbor relationships between cell types independently assigned in each dataset (‘KNN purity’), while also mixing cells across datasets (‘Local Inverse Simpson’s Index’) (Supplementary Fig. 5c–e).
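The two evaluation metrics named above can be illustrated with a minimal sketch. It is not the implementation used in this study: it assumes brute-force Euclidean neighbors in the latent space, and it substitutes a simple unweighted inverse Simpson's index for the perplexity-weighted Local Inverse Simpson's Index. The function names (`knn_indices`, `knn_purity`, `inverse_simpson`), the choice of k = 15, and the toy data are all hypothetical.

```python
import numpy as np


def knn_indices(X, k):
    """Brute-force k nearest neighbors (self excluded), Euclidean distance."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(d2, np.inf)                       # never pick self
    return np.argsort(d2, axis=1)[:, :k]


def knn_purity(X, labels, k=15):
    """Mean fraction of each cell's k neighbors that share its label."""
    labels = np.asarray(labels)
    nn = knn_indices(X, k)
    return float(np.mean(labels[nn] == labels[:, None]))


def inverse_simpson(X, batches, k=15):
    """Mean inverse Simpson's index of dataset labels among k neighbors.

    Assumption: unweighted neighborhoods, a simplification of LISI.
    Ranges from 1 (one dataset per neighborhood) up to the number of datasets.
    """
    _, codes = np.unique(np.asarray(batches), return_inverse=True)
    n_batches = int(codes.max()) + 1
    nn = knn_indices(X, k)
    scores = np.empty(X.shape[0])
    for i, neighbors in enumerate(codes[nn]):
        p = np.bincount(neighbors, minlength=n_batches) / k
        scores[i] = 1.0 / np.sum(p ** 2)
    return float(scores.mean())


# Toy latent space (hypothetical): two well-separated cell types,
# each drawn evenly from two datasets that are fully intermixed.
rng = np.random.default_rng(0)
n_per_type = 200
types = np.repeat(["A", "B"], n_per_type)
batches = np.tile(["d1", "d2"], n_per_type)
X = rng.normal(size=(2 * n_per_type, 10))
X[types == "B"] += 20.0  # shift type B far from type A

purity = knn_purity(X, types)        # high when cell types stay separated
lisi = inverse_simpson(X, batches)   # near 2 when the two datasets mix
print(f"kNN purity: {purity:.3f}, inverse Simpson's index: {lisi:.3f}")
```

In this toy setting, a well-integrated latent space corresponds to purity close to 1 together with an inverse Simpson's index close to the number of datasets. The all-pairs distance matrix is practical only for small inputs; at the scale of 1,525,710 cells an approximate neighbor search would be needed.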