Chunk #50 — Results and discussion — Comparative benchmarks — Benchmark for RNA sequencing data

Source: Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2.

Text

False positive rate. To evaluate the false positive rate of the algorithms, we considered mock comparisons from a dataset with many samples and no known condition dividing the samples into distinct groups. Because no true differences are expected between groups formed at random from such samples, any gene called significant in a mock comparison counts as a false positive. We used the RNA-seq data of Pickrell et al. [17] for lymphoblastoid cell lines derived from unrelated Nigerian individuals. We chose a set of 26 RNA-seq samples of the same read length (46 base pairs) from male individuals. We randomly drew ten samples from this set without replacement to compare five against five, and repeated this process 30 times. We estimated the false positive rate associated with a critical value of 0.01 by dividing the number of P values less than 0.01 by the total number of tests; genes with zero sum of read counts across samples were excluded. The results over the 30 replications, summarized in Figure 7, indicated that all algorithms generally controlled the number of false positives. DESeq (old) and Cuffdiff 2 appeared overly conservative in this analysis, yielding false positive rates below the nominal 0.01 and so not using up their type-I error budget.
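The mock-comparison procedure can be illustrated with a minimal Python sketch. It is not the code used for the benchmark. The count matrix is synthetic rather than the Pickrell et al. data, and the per-gene test is an exact permutation test on the difference in group means, used here only as a stand-in for the benchmarked algorithms. The function names `permutation_pvalue` and `mock_false_positive_rate` are hypothetical. We also assume that zero-count genes are excluded based on the ten drawn samples, which the text does not specify.

```python
import random
from itertools import combinations


def permutation_pvalue(values, group_a_idx):
    """Exact two-sided permutation P value for a difference in group means.

    Stand-in test only: the benchmarked tools use their own count models.
    """
    n = len(values)
    k = len(group_a_idx)
    total = sum(values)

    def abs_diff(idx):
        sum_a = sum(values[i] for i in idx)
        return abs(sum_a / k - (total - sum_a) / (n - k))

    observed = abs_diff(group_a_idx)
    splits = list(combinations(range(n), k))
    # Small tolerance so that splits tied with the observed one are counted.
    extreme = sum(1 for idx in splits if abs_diff(idx) >= observed - 1e-12)
    return extreme / len(splits)


def mock_false_positive_rate(counts, rng, n_per_group=5, alpha=0.01):
    """False positive rate of one mock comparison.

    counts: list of genes, each a list of read counts (one per sample).
    """
    n_samples = len(counts[0])
    # Draw 2 * n_per_group samples without replacement; the first half
    # forms one mock group and the second half the other.
    drawn = rng.sample(range(n_samples), 2 * n_per_group)
    group_a = tuple(range(n_per_group))
    n_tests = 0
    n_hits = 0
    for gene in counts:
        values = [gene[j] for j in drawn]
        if sum(values) == 0:
            continue  # assumption: exclude genes with zero total in the drawn samples
        n_tests += 1
        if permutation_pvalue(values, group_a) < alpha:
            n_hits += 1
    return n_hits / n_tests if n_tests else float("nan")


rng = random.Random(0)

# Synthetic stand-in for a 26-sample count matrix with no group structure
# (NOT the Pickrell et al. data): 100 genes, one mean per gene.
counts = []
for _ in range(100):
    mu = rng.uniform(5, 500)
    counts.append([max(0, round(rng.gauss(mu, mu ** 0.5))) for _ in range(26)])

# Repeat the five-versus-five mock comparison 30 times.
fprs = [mock_false_positive_rate(counts, rng) for _ in range(30)]
print(f"mean FPR over {len(fprs)} replications: {sum(fprs) / len(fprs):.4f}")
```

With five samples per group, the exact permutation test has only 252 distinct splits, so its smallest attainable P value is 2/252 (about 0.008). The stand-in test therefore sits close to the 0.01 threshold by construction, and should not be read as representative of any algorithm in the benchmark.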