Gene set analysis of genome-wide association studies: methodological issues and perspectives.
- Authors: Wang, Lily; Jia, Peilin; Wolfinger, Russell D; Chen, Xi; Zhao, Zhongming
- Year: 2011
- Journal: Genomics
- PMID: 21565265
- DOI: 10.1016/j.ygeno.2011.04.006
- PMCID: PMC3852939
Recent studies have demonstrated that gene set analysis, which tests disease association with genetic variants in a group of functionally related genes, is a promising approach for analyzing and interpreting genome-wide association studies (GWAS) data. These approaches aim to increase power by combining association signals from multiple genes in the same gene set. In addition, gene set analysis can also shed more light on the biological processes underlying complex diseases. However, current approaches for gene set analysis are still in an early stage of development in that analysis results are often prone to sources of bias, including gene set size and gene length, linkage disequilibrium patterns and the presence of overlapping genes. In this paper, we provide an in-depth review of the gene set analysis procedures, along with parameter choices and the particular methodology challenges at each stage. In addition to providing a survey of recently developed tools, we also classify the analysis methods into larger categories and discuss their strengths and limitations. In the last section, we outline several important areas for improving the analytical strategies in gene set analysis.
Work flow for gene set analysis of GWAS datasets.
LLM interpretation
This figure is a flow diagram illustrating the workflow for gene set analysis of GWAS datasets. It outlines a four-step sequential process: Preprocessing (assigning SNPs to genes and genes to sets), Formulating hypotheses (competitive vs. self-contained null hypotheses), Constructing test statistics (gene-based vs. SNP-based tests while accounting for bias), and Assessing statistical significance (via permutation tests or parametric models). The process flows linearly from top to bottom, with a bidirectional interaction indicated between the types of test statistics and potential sources of bias.
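The four steps described above can be made concrete with a small sketch. This is a toy illustration under stated assumptions, not the procedure of any specific tool surveyed in the paper: the SNP statistic (absolute difference in mean allele count between cases and controls), the gene-based summary (the largest SNP statistic per gene), the set statistic (the mean of the gene scores), and all names and data below are hypothetical. It tests a self-contained null hypothesis (no gene in the set is associated with disease status) by permuting the case/control labels.

```python
import random

# Step 1 (preprocessing): hypothetical SNP-to-gene map and gene set.
snp_to_gene = {"rs1": "geneA", "rs2": "geneA", "rs3": "geneB", "rs4": "geneC"}
gene_set = {"geneA", "geneB"}

# Toy genotypes (minor-allele counts) for 8 individuals; 1 = case, 0 = control.
genotype_table = {
    "rs1": [2, 2, 1, 2, 0, 0, 1, 0],
    "rs2": [1, 2, 2, 1, 0, 1, 0, 0],
    "rs3": [2, 1, 2, 2, 0, 0, 0, 1],
    "rs4": [0, 1, 0, 1, 1, 0, 1, 0],
}
labels = [1, 1, 1, 1, 0, 0, 0, 0]


def snp_statistic(genotypes, labels):
    """Toy SNP statistic: |mean allele count in cases - mean in controls|."""
    cases = [g for g, y in zip(genotypes, labels) if y == 1]
    controls = [g for g, y in zip(genotypes, labels) if y == 0]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def gene_set_statistic(genotype_table, labels, snp_to_gene, gene_set):
    """Step 3: gene-based test statistic.

    Each gene is scored by its largest SNP statistic, and the set statistic
    is the mean of the gene scores over the genes in the set.
    """
    gene_best = {}
    for snp, genotypes in genotype_table.items():
        gene = snp_to_gene.get(snp)
        if gene in gene_set:
            stat = snp_statistic(genotypes, labels)
            gene_best[gene] = max(gene_best.get(gene, 0.0), stat)
    if not gene_best:
        raise ValueError("no SNP maps to a gene in the gene set")
    return sum(gene_best.values()) / len(gene_best)


def permutation_p_value(genotype_table, labels, snp_to_gene, gene_set,
                        n_perm=1000, seed=0):
    """Steps 2 and 4: self-contained null, assessed by label permutation.

    Shuffling labels keeps each individual's genotypes intact, so the
    linkage disequilibrium between SNPs is preserved in every permutation.
    """
    rng = random.Random(seed)
    observed = gene_set_statistic(genotype_table, labels, snp_to_gene, gene_set)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        permuted = gene_set_statistic(genotype_table, shuffled,
                                      snp_to_gene, gene_set)
        if permuted >= observed:
            exceed += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (exceed + 1) / (n_perm + 1)


observed = gene_set_statistic(genotype_table, labels, snp_to_gene, gene_set)
p_value = permutation_p_value(genotype_table, labels, snp_to_gene, gene_set)
print(f"observed set statistic = {observed}, permutation p = {p_value:.3f}")
```

Taking the maximum SNP statistic per gene favors genes with many SNPs, which is one form of the gene-length bias the abstract mentions; recomputing the same statistic under each permutation is a simple way to account for it. A competitive test would instead compare the genes in the set against genes outside it, for example by resampling gene labels rather than phenotype labels.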
| Name | Type |
|---|---|
| Androgen | drug |
| bipolar disorder | phenotype |
| breast cancer | phenotype |
| causal SNP | variant |
| complex diseases | phenotype |
| C-reactive protein | phenotype |
| Crohn's disease | phenotype |
| cystic fibrosis | phenotype |
| disease | phenotype |
| disease risk | phenotype |
| disease status | phenotype |
| disease susceptibility | phenotype |
| Endometrial cancer | phenotype |
| estrogen | drug |
| GAIN schizophrenia dataset | cohort |
| gene | gene |
| Gene Ontology | drug |
| gene set | drug |
| GeneSet | gene |
| gene set annotation database | drug |
| genetic variants | variant |
| genome-wide association studies | cohort |
| GWAS | cohort |
| GWAS datasets | cohort |
| HapMap | cohort |
| HapMap3 | cohort |
| HNF1A | gene |
| KEGG | drug |
| most significant SNP | variant |
| MSigDB | drug |
| PANTHER Classification System | drug |
| phenotype of interest | phenotype |
| REACTOME | drug |
| schizophrenia | phenotype |
| single nucleotide polymorphisms | variant |
| SNP | variant |
| trait level | phenotype |
| type 2 diabetes | phenotype |
| UGT1A1 | gene |
| UGT1A10 | gene |
| UGT1A3 | gene |
| UGT1A4 | gene |
| UGT1A5 | gene |
| UGT1A6 | gene |
| UGT1A7 | gene |
| UGT1A8 | gene |
| UGT1A9 | gene |