
Chunk #26 — Results — Utilizing dictionary learning for massively scalable integration

Source
Dictionary learning for integrative, multimodal and scalable single-cell analysis.
Embedded
yes

Text

We were inspired by previous work on ‘geometric sketching’, which first selects a representative subset of cells (a ‘sketch’) across all datasets, integrates them, and then propagates the integrated result back to the full dataset [54]. This pioneering approach substantially improves the scalability of integration because the heaviest computational steps are run only on subsets of the data. However, it depends on the results of principal component analysis, which must first be performed on the full dataset. As datasets continue to grow in scale, more sophisticated computational infrastructure is required to load full collections of data into memory, and even dimensionality reduction can become a limiting step. We aimed to devise a strategy that could integrate large compendiums of datasets without ever needing to simultaneously analyze, or perform intensive computation on, the full set of cells.
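
The select, integrate, propagate pattern described above can be illustrated with a toy sketch in Python. This is neither the geometric sketching algorithm nor the authors' method: uniform random sampling stands in for sketch selection, and a simple per-dataset mean shift stands in for integration. All function names and the simulated data are hypothetical and chosen for illustration only.

```python
import numpy as np

def draw_sketch(n_cells, sketch_size, rng):
    """Pick a subset of cell indices.

    Assumption: uniform random sampling, used here as a stand-in
    for a real sketching method.
    """
    return rng.choice(n_cells, size=min(sketch_size, n_cells), replace=False)

def integrate_sketches(sketches):
    """Toy integration: each dataset's offset from the pooled sketch mean."""
    global_mean = np.mean(np.vstack(sketches), axis=0)
    return [s.mean(axis=0) - global_mean for s in sketches]

def propagate(dataset, offset):
    """Apply the correction learned on the sketch to every cell."""
    return dataset - offset

rng = np.random.default_rng(0)

# Two simulated datasets (cells x features) separated by a batch shift.
datasets = [
    rng.normal(0.0, 1.0, size=(5000, 10)),
    rng.normal(3.0, 1.0, size=(8000, 10)),
]

# 1. Sketch: keep 500 cells per dataset.
sketches = [d[draw_sketch(len(d), 500, rng)] for d in datasets]

# 2. Integrate: the correction is estimated from the sketched cells only.
offsets = integrate_sketches(sketches)

# 3. Propagate: apply the learned correction to the full datasets.
corrected = [propagate(d, o) for d, o in zip(datasets, offsets)]

# Mean absolute difference between dataset means, before and after.
gap_before = np.abs(datasets[0].mean(axis=0) - datasets[1].mean(axis=0)).mean()
gap_after = np.abs(corrected[0].mean(axis=0) - corrected[1].mean(axis=0)).mean()
print(f"gap before: {gap_before:.3f}, gap after: {gap_after:.3f}")
```

The point of the design is that the expensive step (here, `integrate_sketches`) only ever sees the sketched cells, while the step applied to the full data (`propagate`) is a cheap per-cell operation.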