Chunk #77 — STAR Methods — QUANTIFICATION AND STATISTICAL ANALYSIS — ICA based analysis and clustering

Source: Molecular Diversity and Specializations among the Cells of the Adult Mouse Brain.
Embedded: yes

Text

To detect and remove cells with high scores on doublet and outlier ICs, we simulated a Gaussian centered at the mode of the IC cell loading distribution and flagged cells situated in the far right tail of that distribution. The mode was detected by performing a kernel density estimation of the IC loadings using the density() function in R, and the standard deviation was calculated across all scores for that IC. Doublets and outliers were identified as cells whose upper-bound p-value was less than 0.01 (FDR-corrected). Only ICs annotated as Biological ICs were included in the generation of the SNN graph for clustering; Technical ICs (Doublet, Outlier, and Artifact) were not included. We note, however, that additional “technical” influences may exist in Biological ICs. Our goal was to subcluster the data such that, as far as possible, cells with strong loadings on each Biological IC defined their own particular subcluster. To do this, we clustered the cells across a range of the parameters k (number of nearest neighbors used in SNN generation) and r (resolution parameter in SLM), inspected the
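As an illustration of the doublet and outlier flagging step described in this section, the following is a minimal Python sketch and not the authors' R code. It makes several assumptions that the text does not confirm: the kernel density estimate uses a Gaussian kernel with Silverman's rule-of-thumb bandwidth (the default of density() in R), the FDR correction is Benjamini-Hochberg, and the function names (kde_mode, bh_adjust, flag_high_loading_cells) and the simulated loadings are hypothetical.

```python
import math
import random
import statistics


def kde_mode(values, grid_size=256):
    """Return the location of the peak of a Gaussian kernel density estimate."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo  # degenerate case: all loadings identical
    n = len(values)
    sd = statistics.stdev(values)
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    # Assumption: Silverman's rule-of-thumb bandwidth; the source does not state one.
    spread = min(sd, iqr / 1.34) if iqr > 0 else sd
    bw = 0.9 * spread * n ** (-0.2)
    best_x, best_density = lo, -1.0
    for i in range(grid_size):
        x = lo + (hi - lo) * i / (grid_size - 1)
        # Unnormalized density is enough to locate the maximum.
        density = sum(math.exp(-0.5 * ((x - v) / bw) ** 2) for v in values)
        if density > best_density:
            best_x, best_density = x, density
    return best_x


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values (assumed FDR method)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def flag_high_loading_cells(loadings, alpha=0.01):
    """Flag cells in the far right tail of one IC's loading distribution."""
    mode = kde_mode(loadings)
    # Standard deviation is taken across all scores for the IC, as in the text.
    sd = statistics.stdev(loadings)
    # Upper-tail p-value under a Gaussian centered at the mode.
    pvals = [0.5 * math.erfc((x - mode) / (sd * math.sqrt(2))) for x in loadings]
    return [q < alpha for q in bh_adjust(pvals)]


# Hypothetical data: 1000 background cells plus 20 cells with very high loadings.
random.seed(0)
loadings = [random.gauss(0.0, 1.0) for _ in range(1000)]
loadings += [random.gauss(12.0, 0.5) for _ in range(20)]
flags = flag_high_loading_cells(loadings)
print(sum(flags), "cells flagged out of", len(flags))
```

Centering the Gaussian at the mode rather than the mean keeps the reference distribution anchored on the bulk of the cells, so a small population with extreme loadings shifts the center very little. In this sketch the standard deviation is still inflated by those extreme cells, which makes the test conservative.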