Reduction of selection bias in genomewide studies by resampling.
- Authors: Sun, Lei; Bull, Shelley B
- Year: 2005
- Journal: Genetic Epidemiology
- PMID: 15761913
- DOI: 10.1002/gepi.20068
The accuracy of gene localization, the reliability of locus-specific effect estimates, and the ability to replicate initial claims of linkage and/or association have emerged as major methodological concerns in genomewide studies of complex diseases and quantitative traits. To address the issue of multiple comparisons inherent in genomewide studies, the use of stringent criteria for assessing statistical significance has been generally acknowledged as a strategy to control type I error. However, the application of genomewide significance criteria does not take account of the selection bias introduced into parameter estimates, e.g., estimates of locus-specific effect size of disease/trait loci. Some have argued that reliable locus-specific parameter estimates can only be obtained in an independent sample. In this report, we examine statistical resampling techniques, including cross-validation and the bootstrap, applied to the initial sample to improve the estimation of locus-specific effects. We compare them with the naive method in which all data are used for both hypothesis testing and parameter estimation, as well as with the split-sample approach in which part of the data are reserved for estimation. Upward bias of the naive estimator and inadequacy of the split-sample approach are derived analytically under a simple quantitative trait model. Simulation studies of the resampling methods are performed for both the simple model and a more realistic genomewide linkage analysis. Our results suggest that cross-validation and bootstrap methods can substantially reduce the estimation bias, especially when the effect size is small or there is no genetic effect.
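The selection effect described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' model or code. It assumes a single locus whose effect is estimated by the mean of `n` independent normal observations with unit variance, a one-sided z-test with a hypothetical critical value `crit`, and a simplified bootstrap correction that subtracts the average gap between in-bag and out-of-bag estimates over bootstrap replicates that pass the test. All function names, parameter values, and the form of the correction are assumptions chosen for illustration.

```python
import math
import random
import statistics


def z_stat(xs):
    """z statistic for H0: mean = 0, assuming known unit variance."""
    return statistics.fmean(xs) * math.sqrt(len(xs))


def bootstrap_corrected(xs, crit, n_boot, rng):
    """Naive mean minus the average (in-bag minus out-of-bag) gap, taken over
    bootstrap replicates whose in-bag sample passes the test (illustrative)."""
    n = len(xs)
    gaps = []
    for _ in range(n_boot):
        idx = rng.choices(range(n), k=n)  # resample indices with replacement
        in_bag = [xs[i] for i in idx]
        picked = set(idx)
        out_bag = [xs[i] for i in range(n) if i not in picked]
        if out_bag and z_stat(in_bag) > crit:
            gaps.append(statistics.fmean(in_bag) - statistics.fmean(out_bag))
    bias = statistics.fmean(gaps) if gaps else 0.0
    return statistics.fmean(xs) - bias


def simulate(mu=0.2, n=100, crit=2.5, reps=2000, n_boot=30, seed=1):
    """Average each estimator over the replicates in which it declares a detection."""
    rng = random.Random(seed)
    naive, split, boot = [], [], []
    half = n // 2
    for _ in range(reps):
        xs = [rng.gauss(mu, 1.0) for _ in range(n)]
        # Naive: test and estimate on the same full sample.
        if z_stat(xs) > crit:
            naive.append(statistics.fmean(xs))
            boot.append(bootstrap_corrected(xs, crit, n_boot, rng))
        # Split-sample: test on the first half, estimate on the second half.
        if z_stat(xs[:half]) > crit:
            split.append(statistics.fmean(xs[half:]))
    return {
        "naive_mean": statistics.fmean(naive),
        "split_mean": statistics.fmean(split),
        "boot_mean": statistics.fmean(boot),
        "naive_hits": len(naive),
        "split_hits": len(split),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(name, round(value, 3))
```

Under these assumptions, the naive average among detected replicates should sit clearly above the true `mu`. The split-sample average should sit near `mu` but rest on fewer detections, because testing on half the data costs power. The bootstrap-corrected average should fall between the two in terms of bias.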
External
| Title | Authors | Journal | Year | Link |
|---|---|---|---|---|
| FUTURE-AI: international consensus guideline for trustworthy and deployable artificial intelligence in healthcare. | Lekadir K et al. | - | 2025 | - |
| The winner's curse under dependence: repairing empirical Bayes using convoluted densities. | Hawinkel S et al. | - | 2025 | - |
| Winner's curse in rare variant analysis: effect size estimation bias depends on effect direction and the association method used. | Soave D et al. | - | 2025 | - |
| An evaluation of synthetic data augmentation for mitigating covariate bias in health data. | Juwara L et al. | - | 2024 | - |
| COMT and Neuregulin 1 Markers for Personalized Treatment of Schizophrenia Spectrum Disorders Treated with Risperidone Monotherapy. | Bondrescu M et al. | - | 2024 | - |
| SumVg: Total Heritability Explained by All Variants in Genome-Wide Association Studies Based on Summary Statistics with Standard Error Estimates. | So HC et al. | - | 2024 | - |
| Review and further developments in statistical corrections for Winner's Curse in genetic association studies. | Forde A et al. | - | 2023 | - |
| The hidden factor: accounting for covariate effects in power and sample size computation for a binary trait. | Zhang Z et al. | - | 2023 | - |
| Clarifying the causes of consistent and inconsistent findings in genetics. | Dattani S et al. | - | 2022 | - |
| Estimation of genetic variance contributed by a quantitative trait locus: correcting the bias associated with significance tests. | Xie F et al. | - | 2021 | - |
| Estimation of total mediation effect for high-dimensional omics mediators. | Yang T et al. | - | 2021 | - |
| Selective peak inference: Unbiased estimation of raw and standardized effect size at local maxima. | Davenport S et al. | - | 2020 | - |
| Efficient estimation of disease odds ratios for follow-up genetic association studies. | Hu J et al. | - | 2019 | - |
| Estimating the quality of optimal treatment regimes. | Sies A et al. | - | 2019 | - |
| Estimation of Mediation Effect for High-dimensional Omics Mediators with Application to the Framingham Heart Study | Yang T et al. | - | 2019 | - |
| Power, false discovery rate and Winner's Curse in eQTL studies. | Huang QQ et al. | - | 2018 | - |
| Rank Conditional Coverage and Confidence Intervals in High-Dimensional Problems. | Morrison J et al. | - | 2018 | - |
| Illustrating, Quantifying, and Correcting for Bias in Post-hoc Analysis of Gene-Based Rare Variant Tests of Association. | Grinde KE et al. | - | 2017 | - |
| Model averaging for treatment effect estimation in subgroups. | Bornkamp B et al. | - | 2017 | - |
| The projack: a resampling approach to correct for ranking bias in high-throughput studies. | Zhou YH et al. | - | 2016 | - |
| Using local multiplicity to improve effect estimation from a hypothesis-generating pharmacogenetics study. | Zou W et al. | - | 2016 | - |
| Winner's Curse Correction and Variable Thresholding Improve Performance of Polygenic Risk Modeling Based on Genome-Wide Association Study Summary-Level Data. | Shi J et al. | - | 2016 | - |
| An Empirical Bayes Mixture Model for Effect Size Distributions in Genome-Wide Association Studies. | Thompson WK et al. | - | 2015 | - |
| Associations of toll-like receptor (TLR)-4 single nucleotide polymorphisms and rheumatoid arthritis disease progression: an observational cohort study. | Davis MLR et al. | - | 2015 | - |
| Resampling to Address the Winner's Curse in Genetic Association Analysis of Time to Event. | Poirier JG et al. | - | 2015 | - |
| Single nucleotide polymorphism in the neuroplastin locus associates with cortical thickness and intellectual ability in adolescents. | Desrivières S et al. | - | 2015 | - |
| Re-ranking sequencing variants in the post-GWAS era for accurate causal variant identification. | Faye LL et al. | - | 2013 | - |
| Estimating genetic effects and quantifying missing heritability explained by identified rare-variant associations. | Liu DJ et al. | - | 2012 | - |
| A flexible genome-wide bootstrap method that accounts for ranking and threshold-selection bias in GWAS interpretation and replication study design. | Faye LL et al. | - | 2011 | - |
| BR-squared: a practical solution to the winner's curse in genome-wide scans. | Sun L et al. | - | 2011 | - |
| Quantifying and correcting for the winner's curse in quantitative-trait association studies. | Xiao R et al. | - | 2011 | - |
| Tweedie's Formula and Selection Bias. | Efron B | - | 2011 | - |
| A genome-wide scan for common alleles affecting risk for autism. | Anney R et al. | - | 2010 | - |
| Correcting "winner's curse" in odds ratios from genomewide association findings for major complex human diseases. | Zhong H et al. | - | 2010 | - |
| Shrinkage estimation of effect sizes as an alternative to hypothesis testing followed by estimation in high-dimensional biology: applications to differential gene expression. | Montazeri Z et al. | - | 2010 | - |
| Empirical Bayes and semi-Bayes adjustments for a vast number of estimations. | Strömberg U | - | 2009 | - |
| Genome-wide association for major depressive disorder: a possible role for the presynaptic protein piccolo. | Sullivan PF et al. | - | 2009 | - |
| Quantifying and correcting for the winner's curse in genetic association studies. | Xiao R et al. | - | 2009 | - |
| Ranking bias in association studies. | Jeffries NO | - | 2009 | - |
| Unbiased estimation of odds ratios: combining genomewide association scans with replication studies. | Bowden J et al. | - | 2009 | - |
| Bias-reduced estimators and confidence intervals for odds ratios in genome-wide association studies. | Zhong H et al. | - | 2008 | - |
| Estimating odds ratios in genome scans: an approximate conditional likelihood approach. | Ghosh A et al. | - | 2008 | - |
| Multiple superoxide dismutase 1/splicing factor serine alanine 15 variants are associated with the development and progression of diabetic nephropathy: the Diabetes Control and Complications Trial/Epidemiology of Diabetes Interventions and Complications Genetics study. | Al-Kateb H et al. | - | 2008 | - |
| Association studies in an era of too much information: clinical analysis of new biomarker and genetic data. | Loscalzo J | - | 2007 | - |
| Confidence intervals for putative quantitative trait loci - development and applications of new linkage methods. | Papachristou C et al. | - | 2007 | - |
| Estimating the number and size of the main effects in genome-wide case-control association studies. | Kuo PH et al. | - | 2007 | - |
| Flexible design for following up positive findings. | Yu K et al. | - | 2007 | - |
| Multiple comparisons distortions of parameter estimates. | Jeffries NO | - | 2007 | - |
| Optimal selection of markers for validation or replication from genome-wide association studies. | Greenwood CM et al. | - | 2007 | - |
| Overcoming the winner's curse: estimating penetrance parameters from case-control data. | Zollner S et al. | - | 2007 | - |
| Locus-specific heritability estimation via the bootstrap in linkage scans for quantitative trait loci. | Wu LY et al. | - | 2006 | - |
| Comparison of single-nucleotide polymorphisms and microsatellite markers for linkage analysis in the COGA and simulated data sets for Genetic Analysis Workshop 14: Presentation Groups 1, 2, and 3. | Wilcox MA et al. | - | 2005 | - |
| Resampling methods to reduce the selection bias in genetic effect estimation in genome-wide scans. | Wu LY et al. | - | 2005 | - |