A note on using permutation-based false discovery rate estimates to compare different analysis methods for microarray data.
- Authors: Xie, Yang; Pan, Wei; Khodursky, Arkady B
- Year: 2005
- Journal: Bioinformatics (Oxford, England)
- PMID: 16188930
- DOI: 10.1093/bioinformatics/bti685
MOTIVATION: False discovery rate (FDR) is defined as the expected percentage of false positives among all the claimed positives. In practice, with the true FDR unknown, an estimated FDR can serve as a criterion to evaluate the performance of various statistical methods, provided that the estimated FDR approximates the true FDR well or, at least, does not improperly favor or disfavor any particular method. Permutation methods have become popular for estimating FDR in genomic studies. The purpose of this paper is twofold. First, we investigate theoretically and empirically whether the standard permutation-based FDR estimator is biased and, if so, whether the bias inappropriately favors or disfavors any method. Second, we propose a simple modification of the standard permutation method to yield a better FDR estimator, which can in turn serve as a fairer criterion to evaluate various statistical methods.

RESULTS: Both simulated and real data examples are used for illustration and comparison. Three commonly used test statistics, the sample mean, the SAM statistic and Student's t-statistic, are considered. The results show that the standard permutation method overestimates FDR. The overestimation is most severe for the sample mean statistic and least severe for the t-statistic, with the SAM statistic lying between the two extremes, suggesting that one has to be cautious when using the standard permutation-based FDR estimates to evaluate various statistical methods. In addition, our proposed FDR estimation method is simple and outperforms the standard method.
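To make the quantity under discussion concrete, the following is a minimal sketch of a standard permutation-based FDR estimate for the three statistics named in the abstract. It is not the authors' code. Several details are assumptions made for illustration, because the abstract does not give them: a one-sample design in which the null distribution is generated by random sign flips applied to whole samples, a proportion of true nulls taken to be 1, the function names, and the value of the SAM fudge constant `s0`. The modification proposed in the paper is not described in the abstract and is not shown here.

```python
import math
import random


def _mean_se(x):
    """Return the sample mean of x and the standard error of that mean."""
    n = len(x)
    m = sum(x) / n
    var = sum((v - m) ** 2 for v in x) / (n - 1)
    return m, math.sqrt(var / n)


def mean_stat(x):
    """Sample mean statistic."""
    return _mean_se(x)[0]


def t_stat(x):
    """One-sample Student's t-statistic (assumes the values are not all equal)."""
    m, se = _mean_se(x)
    return m / se


def sam_stat(x, s0=0.1):
    """SAM-style statistic; s0 is a fudge constant (value assumed here)."""
    m, se = _mean_se(x)
    return m / (se + s0)


def permutation_fdr(data, stat, threshold, n_perm=100, seed=0):
    """Estimate FDR for the rule |stat| > threshold.

    data is a list of genes, each a list of per-sample values.
    Null statistics come from flipping the sign of whole samples,
    so every gene sees the same flips within one permutation.
    """
    rng = random.Random(seed)
    n_samples = len(data[0])

    # Number of genes claimed positive on the observed data.
    claimed = sum(abs(stat(row)) > threshold for row in data)
    if claimed == 0:
        return 0.0

    # Count genes exceeding the threshold in each permuted data set.
    null_hits = 0
    for _ in range(n_perm):
        signs = [rng.choice((-1.0, 1.0)) for _ in range(n_samples)]
        for row in data:
            flipped = [s * v for s, v in zip(signs, row)]
            if abs(stat(flipped)) > threshold:
                null_hits += 1

    # Average false-positive count over permutations, divided by the
    # number of claimed positives, capped at 1.
    expected_false = null_hits / n_perm
    return min(1.0, expected_false / claimed)
```

In this sketch the permuted data still contain the truly non-null genes, so their permuted statistics also contribute to the null counts. This is one plausible route to the overestimation reported in the abstract, though the abstract itself does not spell out the mechanism.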
External
| Title | Authors | Journal | Year | Link |
|---|---|---|---|---|
| Asymmetric integration of various cancer datasets for identifying risk-associated variants and genes. | Wang R et al. | - | 2025 | - |
| Consistent differential effects of bupropion and mirtazapine in major depression. | Strobl EV | - | 2025 | - |
| Effective integration of multi-omics with prior knowledge to identify biomarkers via explainable graph neural networks. | Tripathy RK et al. | - | 2025 | - |
| Unique Behavior Profiles that Specify Mental Distress in Autism | Strobl EV | - | 2025 | - |
| Large changes in detected selection signatures after a selection limit in mice bred for voluntary wheel-running behavior. | Hillis DA et al. | - | 2024 | - |
| Cancer metabolites: promising biomarkers for cancer liquid biopsy. | Wang W et al. | - | 2023 | - |
| Disturbance and restoration of soil microbial communities after in-situ thermal desorption in a chlorinated hydrocarbon contaminated site. | Shentu J et al. | - | 2023 | - |
| JUMPptm: Integrated software for sensitive identification of post-translational modifications and its application in Alzheimer's disease study. | Poudel S et al. | - | 2023 | - |
| The systematic comparison between Gaussian mirror and Model-X knockoff models. | Chen S et al. | - | 2023 | - |
| Null-free False Discovery Rate Control Using Decoy Permutations. | He K et al. | - | 2022 | - |
| Quantitative Comparison of Statistical Methods for Analyzing Human Metabolomics Data. | Henglin M et al. | - | 2022 | - |
| Deep Profiling of Microgram-Scale Proteome by Tandem Mass Tag Mass Spectrometry. | Liu D et al. | - | 2021 | - |
| Habitat heterogeneity induced by pyrogenic organic matter in wildfire-perturbed soils mediates bacterial community assembly processes. | Zhang L et al. | - | 2021 | - |
| High-dimensional variable selection for ordinal outcomes with error control. | Fu H et al. | - | 2021 | - |
| Neural-level associations of non-verbal pragmatic comprehension in young Finnish autistic adults. | Kotila A et al. | - | 2021 | - |
| Bioinformatics Methods for Mass Spectrometry-Based Proteomics Data Analysis. | Chen C et al. | - | 2020 | - |
| Deep Multilayer Brain Proteomics Identifies Molecular Networks in Alzheimer's Disease Progression. | Bai B et al. | - | 2020 | - |
| Transcriptomic and open chromatin atlas of high-resolution anatomical regions in the rhesus macaque brain. | Yin S et al. | - | 2020 | - |
| A Review of Microarray Datasets: Where to Find Them and Specific Characteristics. | Alonso-Betanzos A et al. | - | 2019 | - |
| Circulating microparticle proteins obtained in the late first trimester predict spontaneous preterm birth at less than 35 weeks' gestation: a panel validation with specific characterization by parity. | McElrath TF et al. | - | 2019 | - |
| MAP: model-based analysis of proteomic data to detect proteins with significant abundance changes. | Li M et al. | - | 2019 | - |
| Polygenic approaches to detect gene-environment interactions when external information is unavailable. | Lin WY et al. | - | 2019 | - |
| Metabolic oscillations on the circadian time scale in *Drosophila* cells lacking clock genes. | Rey G et al. | - | 2018 | - |
| statTarget: A streamlined tool for signal drift correction and interpretations of quantitative mass spectrometry-based omics data. | Luan H et al. | - | 2018 | - |
| A permutation-based non-parametric analysis of CRISPR screen data. | Jia G et al. | - | 2017 | - |
| False discovery rate control incorporating phylogenetic tree increases detection power in microbiome-wide multiple testing. | Xiao J et al. | - | 2017 | - |
| Genome-wide DNA methylation and transcriptome analyses reveal genes involved in immune responses of pig peripheral blood mononuclear cells to poly I:C. | Wang H et al. | - | 2017 | - |
| Robust gene selection methods using weighting schemes for microarray data analysis. | Kang S et al. | - | 2017 | - |
| A Fuzzy Permutation Method for False Discovery Rate Control. | Yang YH et al. | - | 2016 | - |
| Evaluation of proteomic biomarkers associated with circulating microparticles as an effective means to stratify the risk of spontaneous preterm birth. | Cantonwine DE et al. | - | 2016 | - |
| Long intergenic non-coding RNA expression signature in human breast cancer. | Zhang Y et al. | - | 2016 | - |
| Analytical methods in untargeted metabolomics: state of the art in 2015. | Alonso A et al. | - | 2015 | - |
| Screening of feature genes in distinguishing different types of breast cancer using support vector machine. | Wang Q et al. | - | 2015 | - |
| Assessing multivariate gene-metabolome associations with rare variants using Bayesian reduced rank regression. | Marttinen P et al. | - | 2014 | - |
| Integration of microarray profiles associated with cardiomyopathy and the potential role of Ube3a in apoptosis. | Zhang J et al. | - | 2014 | - |
| An efficient weighted graph strategy to identify differentiation associated genes in embryonic stem cells. | Zhang J et al. | - | 2013 | - |
| Genetic markers of comorbid depression and alcoholism in women. | Procopio DO et al. | - | 2013 | - |
| Signal propagation in protein interaction network during colorectal cancer progression. | Jiang Y et al. | - | 2013 | - |
| A comparison of two classes of methods for estimating false discovery rates in microarray studies. | Hansen E et al. | - | 2012 | - |
| Improving power of genome-wide association studies with weighted false discovery rate control and prioritized subset analysis. | Lin WY et al. | - | 2012 | - |
| Literature aided determination of data quality and statistical significance threshold for gene expression studies. | Xu L et al. | - | 2012 | - |
| Presenting the uncertainties of odds ratios using empirical-Bayes prediction intervals. | Lin WY et al. | - | 2012 | - |
| Analysis of phosphoproteomics data. | Schaab C | - | 2011 | - |
| An Exponential-Gamma Convolution Model for Background Correction of Illumina BeadArray Data. | Chen M et al. | - | 2011 | - |
| Genomic regions identified by overlapping clusters of nominally-positive SNPs from genome-wide studies of alcohol and illegal substance dependence. | Johnson C et al. | - | 2011 | - |
| An approach to evaluate the reliability of hybridization-based and sequencing-based gene expression profiling technologies. | Yang DY et al. | - | 2010 | - |
| False discovery rate and permutation test: an evaluation in ERP data analysis. | Lage-Castellanos A et al. | - | 2010 | - |
| Incorporating prior knowledge to facilitate discoveries in a genome-wide association study on age-related macular degeneration. | Lin WY et al. | - | 2010 | - |
| Molecular and anatomical signatures of sleep deprivation in the mouse brain. | Thompson CL et al. | - | 2010 | - |
| Power calculations for multicenter imaging studies controlled by the false discovery rate. | Suckling J et al. | - | 2010 | - |
| Role of spectral counting in quantitative proteomics. | Lundgren DH et al. | - | 2010 | - |
| Statistical methods for integrating multiple types of high-throughput data. | Xie Y et al. | - | 2010 | - |
| A Bayesian approach to efficient differential allocation for resampling-based significance testing. | Jensen ST et al. | - | 2009 | - |
| Comments on the analysis of unbalanced microarray data. | Kerr KF | - | 2009 | - |
| Evaluating reproducibility of differential expression discoveries in microarray studies by considering correlated molecular changes. | Zhang M et al. | - | 2009 | - |
| Large-scale detection of ubiquitination substrates using cell extracts and protein microarrays. | Merbl Y et al. | - | 2009 | - |
| Potential bias in GO::TermFinder. | Flight RM et al. | - | 2009 | - |
| Probability fold change: a robust computational approach for identifying differentially expressed gene lists. | Deng X et al. | - | 2009 | - |
| Properties of balanced permutations. | Southworth LK et al. | - | 2009 | - |
| Wnt antagonist gene polymorphisms and renal cancer. | Hirata H et al. | - | 2009 | - |
| An efficient method to identify differentially expressed genes in microarray experiments. | Qin H et al. | - | 2008 | - |
| A new test statistic based on shrunken sample variance for identifying differentially expressed genes in small microarray experiments. | Hirakawa A et al. | - | 2008 | - |
| Apparently low reproducibility of true differential expression discoveries in microarray studies. | Zhang M et al. | - | 2008 | - |
| Comments on 'On correcting the overestimation of the permutation-based false discovery rate estimator'. | Xie Y | - | 2008 | - |
| Estimating the false discovery rate using mixed normal distribution for identifying differentially expressed genes in microarray data analysis. | Hirakawa A et al. | - | 2008 | - |
| On correcting the overestimation of the permutation-based false discovery rate estimator. | Jiao S et al. | - | 2008 | - |
| Protein identification and Peptide expression resolver: harmonizing protein identification with protein expression data. | Kearney P et al. | - | 2008 | - |
| Ranking analysis of F-statistics for microarray data. | Tan YD et al. | - | 2008 | - |
| Robustified MANOVA with applications in detecting differentially expressed genes from oligonucleotide arrays. | Xu J et al. | - | 2008 | - |
| Universal false discovery rate estimation methodology for genome-wide association studies. | Forner K et al. | - | 2008 | - |
| A comprehensive evaluation of SAM, the SAM R-package and a simple modification to improve its performance. | Zhang S | - | 2007 | - |
| A constrained polynomial regression procedure for estimating the local False Discovery Rate. | Dalmasso C et al. | - | 2007 | - |
| Empirical Bayes identification [correction of identication] of tumor progression genes from microarray data. | Ghosh D et al. | - | 2007 | - |
| Estimating p-values in small microarray experiments. | Yang H et al. | - | 2007 | - |
| Incorporating prior information via shrinkage: a combined analysis of genome-wide location data and gene expression data. | Xie Y et al. | - | 2007 | - |
| Ventral tegmental transcriptome response to intermittent nicotine treatment and withdrawal in BALB/cJ, C57BL/6ByJ, and quasi-congenic RQI mice. | Vadasz C et al. | - | 2007 | - |
| A simple implementation of a normal mixture approach to differential gene expression in multiclass microarrays. | McLachlan GJ et al. | - | 2006 | - |
| Construction of null statistics in permutation-based multiple testing for multi-factorial microarray experiments. | Gao X | - | 2006 | - |
| Epistemological issues in omics and high-dimensional biology: give the people what they want. | Mehta TS et al. | - | 2006 | - |
| Identification of estrogen-responsive genes in the parenchyma and fat pad of the bovine mammary gland by microarray analysis. | Li RW et al. | - | 2006 | - |
| Inheritance patterns of transcript levels in F1 hybrid mice. | Cui X et al. | - | 2006 | - |
| Survival analysis of longitudinal microarrays. | Rajicic N et al. | - | 2006 | - |