Chunk #28 — Results and discussion — Detection of count outliers

Source
Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2.

Text

Parametric methods for detecting differential expression can have gene-wise estimates of LFC overly influenced by individual outliers that do not fit the distributional assumptions of the model [24]. An example of such an outlier would be a gene with single-digit counts for all samples, except one sample with a count in the thousands. As the aim of differential expression analysis is typically to find consistently up- or down-regulated genes, it is useful to consider diagnostics for detecting individual observations that overly influence the LFC estimate and P value for a gene. A standard outlier diagnostic is Cook’s distance [25], which is defined within each gene for each sample as the scaled distance that the coefficient vector, $\vec{\beta}_{i}$, of a linear model or GLM would move if the sample were removed and the model refit.
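
The definition above can be illustrated with a minimal sketch. It assumes an ordinary least-squares fit on log2-transformed counts rather than the negative binomial GLM used in the paper, so it is not DESeq2's implementation. The counts, the two-group design, and the function names are hypothetical and chosen to mimic the example of a gene with single-digit counts in all samples but one. The first function follows the definition literally, deleting each sample and refitting. The second uses the standard closed form based on leverages, which should give the same values for a linear model.

```python
import numpy as np

def cooks_distance_loo(X, y):
    """Cook's distance per sample for an OLS fit, by deleting each sample and refitting."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # full-data coefficients
    resid = y - X @ beta
    s2 = resid @ resid / (n - p)                   # residual variance estimate
    xtx = X.T @ X
    d = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                   # drop sample i
        beta_i, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        delta = beta - beta_i                      # how far the coefficients moved
        d[i] = delta @ xtx @ delta / (p * s2)      # scaled distance
    return d

def cooks_distance_closed_form(X, y):
    """Same quantity without refitting, using residuals and leverages."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (n - p)
    h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)  # leverage of each sample
    return resid**2 / (p * s2) * h / (1.0 - h) ** 2

# Hypothetical gene: two groups of four samples, one extreme count in the last sample.
counts = np.array([3, 5, 2, 4, 6, 4, 5, 3000])
X = np.column_stack([np.ones(8), np.repeat([0.0, 1.0], 4)])  # intercept + group indicator
y = np.log2(counts + 1.0)                                    # log scale, pseudocount of 1

d_loo = cooks_distance_loo(X, y)
d_closed = cooks_distance_closed_form(X, y)
print(np.round(d_loo, 3))
print("most influential sample index:", int(np.argmax(d_loo)))
```

In this toy design every sample has the same leverage, so the sample with the largest residual, the one carrying the count of 3000, receives the largest Cook's distance. Any cutoff for calling a sample an outlier would be a separate choice that this sketch does not make.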