Detecting multiple associations in genome-wide studies.
- Authors: Dudbridge, Frank; Gusnanto, Arief; Koeleman, Bobby P C
- Year: 2006
- Journal: Human Genomics
- PMID: 16595075
- DOI: 10.1186/1479-7364-2-5-310
- PMCID: PMC3500180
Recent developments in the statistical analysis of genome-wide studies are reviewed. Genome-wide analyses are becoming increasingly common in areas such as scans for disease-associated markers and gene expression profiling. The data generated by these studies present new problems for statistical analysis, owing to the large number of hypothesis tests, comparatively small sample size and modest number of true gene effects. In this review, strategies are described for optimising the genotyping cost by discarding unpromising genes at an earlier stage, saving resources for the genes that show a trend of association. In addition, there is a review of new methods of analysis that combine evidence across genes to increase sensitivity to multiple true associations in the presence of many non-associated genes. Some methods achieve this by including only the most significant results, whereas others model the overall distribution of results as a mixture of distributions from true and null effects. Because genes are correlated even when they have no effect, permutation testing is often necessary to estimate the overall significance, but this can be very time consuming. Efficiency can be improved by fitting a parametric distribution to permutation replicates, which can be re-used in subsequent analyses. Methods are also available to generate random draws from the permutation distribution. The review also includes discussion of new error measures that give a more reasonable interpretation of genome-wide studies, together with improved sensitivity. The false discovery rate allows a controlled proportion of positive results to be false, while detecting more true positives; and the local false discovery rate and false-positive report probability give clarity on whether or not a statistically significant test represents a real discovery.
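Two of the error measures named in the abstract can be made concrete with a small sketch. The Python below is illustrative only and is not taken from the review. It assumes independent (or positively dependent) tests for the Benjamini-Hochberg step-up rule, and the function names `benjamini_hochberg` and `fprp` and the example p-values are hypothetical. The false-positive report probability is computed as FPRP = α(1 − π) / (α(1 − π) + (1 − β)π), where α is the significance level, 1 − β the power and π the prior probability of a true association.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up rule (assumes independent tests).

    Returns a list of booleans, True where the hypothesis is rejected
    while controlling the false discovery rate at level q.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order of significance.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= k * q / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


def fprp(alpha, power, prior):
    """False-positive report probability for a test significant at alpha.

    prior is the assumed probability that the association is real.
    """
    false_pos = alpha * (1.0 - prior)  # chance of a significant null effect
    true_pos = power * prior           # chance of a significant true effect
    return false_pos / (false_pos + true_pos)


# Hypothetical p-values from 15 tests, for illustration only.
p_values = [0.0001, 0.0004, 0.0019, 0.0095, 0.0201, 0.0278, 0.0298, 0.0344,
            0.0459, 0.3240, 0.4262, 0.5719, 0.6528, 0.7590, 1.0000]

fdr_hits = benjamini_hochberg(p_values, q=0.05)
bonferroni_hits = [p <= 0.05 / len(p_values) for p in p_values]
print(sum(fdr_hits), "rejected by BH;", sum(bonferroni_hits), "by Bonferroni")
print("FPRP:", round(fprp(alpha=0.05, power=0.8, prior=0.01), 3))
```

On these example values the step-up rule rejects more hypotheses than the Bonferroni threshold, which is the gain in sensitivity the abstract attributes to the false discovery rate. The FPRP call shows the complementary point: when the prior probability of a true association is low, as in a genome-wide scan, a nominally significant result is still more likely to be false than true.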
| # | Section | Preview |
|---|---|---|
| 40 | Concluding remarks | Several aspects of the analysis of genome-wide studies have been discussed, including study design,… |
| 41 | Concluding remarks | The field will continue to develop rapidly as more studies are completed and there is much scope for… |
External
| Title | Authors | Journal | Year | Link |
|---|---|---|---|---|
| Genomic Landscape of Susceptibility to Severe COVID-19 in the Slovenian Population. | Kovanda A et al. | - | 2024 | - |
| Quantifying posterior effect size distribution of susceptibility loci by common summary statistics. | Vsevolozhskaya OA et al. | - | 2020 | - |
| Quantifying posterior effect size distribution of susceptibility loci by common summary statistics | Vsevolozhskaya OA et al. | - | 2019 | - |
| Re-assessment of multiple testing strategies for more efficient genome-wide association studies. | Otani T et al. | - | 2018 | - |
| Multiple Testing in the Context of Gene Discovery in Sickle Cell Disease Using Genome-Wide Association Studies. | Kuo KHM | - | 2017 | - |
| Precision assessment of heterogeneity of lymphedema phenotype, genotypes and risk prediction. | Fu MR et al. | - | 2016 | - |
| Rare Variants Association Analysis in Large-Scale Sequencing Studies at the Single Locus Level. | Jeng XJ et al. | - | 2016 | - |
| Assessing the Probability that a Finding Is Genuine for Large-Scale Genetic Association Studies. | Kuo CL et al. | - | 2015 | - |
| Developing Peripheral Blood Gene Expression-Based Diagnostic Tests for Coronary Artery Disease: a Review. | Rhees B et al. | - | 2015 | - |
| Genetic variations in the VEGF pathway as prognostic factors in metastatic colorectal cancer patients treated with oxaliplatin-based chemotherapy. | Paré-Brunet L et al. | - | 2015 | - |
| Monoacylglycerol lipase (MGLL) polymorphism rs604300 interacts with childhood adversity to predict cannabis dependence symptoms and amygdala habituation: Evidence from an endocannabinoid system-level analysis. | Carey CE et al. | - | 2015 | - |
| Combined analysis with copy number variation identifies risk loci in lung cancer. | Li X et al. | - | 2014 | - |
| Genetic variants in MUC4 gene are associated with lung cancer risk in a Chinese population. | Zhang Z et al. | - | 2013 | - |
| The use of haplotypes in the identification of interaction between SNPs. | Ken-Dror G et al. | - | 2013 | - |
| Travelling the world of gene-gene interactions. | Steen KV | - | 2012 | - |
| "Statistics 101"--a primer for the genetics of complex human disease. | Sinsheimer J | - | 2011 | - |
| Association of a mineralocorticoid receptor gene polymorphism with hypertension in a Spanish population. | Martinez F et al. | - | 2009 | - |
| Gene discovery through imaging genetics: identification of two novel genes associated with schizophrenia. | Potkin SG et al. | - | 2009 | - |
| Addiction Reviews. Preface. | Uhl GR | - | 2008 | - |
| An ensemble learning approach jointly modeling main and interaction effects in genetic association studies. | Zhang Z et al. | - | 2008 | - |
| Association of interacting genes in the toll-like receptor signaling pathway and the antibody response to pertussis vaccination. | Kimman TG et al. | - | 2008 | - |
| Biostatistical aspects of genome-wide association studies. | Ziegler A et al. | - | 2008 | - |
| Genetic variants in peroxisome proliferator-activated receptor-gamma gene are associated with risk of lung cancer in a Chinese population. | Chen D et al. | - | 2008 | - |
| Genome-wide significance for dense SNP and resequencing data. | Hoggart CJ et al. | - | 2008 | - |
| Identification of mitochondrial disease genes through integrative analysis of multiple datasets. | Aiyar RS et al. | - | 2008 | - |
| Molecular genetics of addiction and related heritable phenotypes: genome-wide association approaches identify "connectivity constellation" and drug target genes with pleiotropic effects. | Uhl GR et al. | - | 2008 | - |
| Molecular genetics of successful smoking cessation: convergent genome-wide association study results. | Uhl GR et al. | - | 2008 | - |
| ABCB1 genotype and PGP expression, function and therapeutic drug response: a critical review and recommendations for future research. | Leschziner GD et al. | - | 2007 | - |
| Quantitative mass spectrometry in proteomics: a critical review. | Bantscheff M et al. | - | 2007 | - |
| Transcriptomic signatures in breast cancer. | Fu J et al. | - | 2007 | - |
| A tutorial on statistical methods for population association studies. | Balding DJ | - | 2006 | - |