Phenotype harmonization and cross-study collaboration in GWAS consortia: the GENEVA experience.
- Authors: Bennett, Siiri N; Caporaso, Neil; Fitzpatrick, Annette L; Agrawal, Arpana; Barnes, Kathleen; Boyd, Heather A; Cornelis, Marilyn C; Hansel, Nadia N; Heiss, Gerardo; Heit, John A; Kang, Jae Hee; Kittner, Steven J; Kraft, Peter; Lowe, William; Marazita, Mary L; Monroe, Kristine R; Pasquale, Louis R; Ramos, Erin M; van Dam, Rob M; Udren, Jenna; Williams, Kayleen; GENEVA Consortium
- Year: 2011
- Journal: Genetic Epidemiology
- PMID: 21284036
- DOI: 10.1002/gepi.20564
- PMCID: PMC3055921
Genome-wide association study (GWAS) consortia and collaborations formed to detect genetic loci for common phenotypes or investigate gene-environment (G×E) interactions are increasingly common. While these consortia effectively increase sample size, phenotype heterogeneity across studies is a major obstacle to identifying these associations. Investigators must decide how to harmonize previously collected phenotype data obtained with different data collection instruments that cover topics in varying degrees of detail and over diverse time frames. This process has not been described in detail. We describe here some of the strategies and pitfalls associated with combining phenotype data from different studies. Using the Gene Environment Association Studies (GENEVA) multi-site GWAS consortium as an example, this paper guides GWAS consortia through the process of phenotype harmonization and describes key issues that arise when sharing data across disparate studies. GENEVA is unusual in the diversity of its disease endpoints, so the issues it faces as its participating studies share data will be informative for many collaborations. Phenotype harmonization requires identifying common phenotypes, determining the feasibility of cross-study analysis for each, preparing common definitions, and applying appropriate algorithms. Other issues to consider include genotyping timeframes, coordination of parallel efforts by other collaborative groups, analytic approaches, and imputation of genotype data. GENEVA's harmonization efforts and its policy of promoting data sharing and collaboration, both within GENEVA and with outside collaborations, can provide important guidance to ongoing and new consortia.
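The four harmonization steps named in the abstract (identify a common phenotype, check feasibility, prepare a common definition, apply a per-study algorithm) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two studies, their field names (`smoke_code`, `ever_smoked`, `smokes_now`), the three-level smoking definition, and the helper functions are all hypothetical.

```python
from typing import Callable, Dict, List, Optional

# Common definition agreed across studies (hypothetical, for illustration only).
COMMON_LEVELS = {"never", "former", "current"}


def harmonize_study_a(record: dict) -> Optional[str]:
    """Hypothetical Study A stores one coded item: 0=never, 1=former, 2=current."""
    return {0: "never", 1: "former", 2: "current"}.get(record.get("smoke_code"))


def harmonize_study_b(record: dict) -> Optional[str]:
    """Hypothetical Study B asks two yes/no questions instead of one coded item."""
    ever = record.get("ever_smoked")
    now = record.get("smokes_now")
    if ever is None:
        return None
    if not ever:
        return "never"
    if now is None:
        return None  # ever-smoker with unknown current status cannot be classified
    return "current" if now else "former"


# One mapping algorithm per study, all targeting the same common definition.
ALGORITHMS: Dict[str, Callable[[dict], Optional[str]]] = {
    "A": harmonize_study_a,
    "B": harmonize_study_b,
}


def harmonize(study: str, records: List[dict]) -> List[Optional[str]]:
    """Apply the study-specific algorithm to every record of that study."""
    algorithm = ALGORITHMS[study]
    return [algorithm(r) for r in records]


def feasibility(values: List[Optional[str]]) -> float:
    """Fraction of records that map onto the common definition."""
    if not values:
        return 0.0
    return sum(v in COMMON_LEVELS for v in values) / len(values)


study_a = [{"smoke_code": 0}, {"smoke_code": 2}, {"smoke_code": 9}]
study_b = [{"ever_smoked": True, "smokes_now": False}, {"ever_smoked": False}]

harmonized_a = harmonize("A", study_a)
harmonized_b = harmonize("B", study_b)
print(harmonized_a, feasibility(harmonized_a))
print(harmonized_b, feasibility(harmonized_b))
```

In this sketch, records that cannot be mapped are kept as `None` rather than guessed, so the feasibility fraction shows how much of each study would actually contribute to a cross-study analysis. A real consortium would likely also need to account for differences in time frame and question wording, which this example does not model.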
In this knowledge base
| Title | Year | PMID |
|---|---|---|
| Evidence of CNIH3 involvement in opioid dependence. | 2016 | 26239289 |
External
| Title | Authors | Journal | Year | Link |
|---|---|---|---|---|
| Genome-wide association study provides novel insight into the genetic architecture of severe obesity. | Krishnan M et al. | — | 2025 | → |
| Robust Automated Harmonization of Heterogeneous Data Through Ensemble Machine Learning: Algorithm Development and Validation Study. | Yang D et al. | — | 2025 | → |
| A Developmentally-Informative Genome-wide Association Study of Alcohol Use Frequency. | Thomas NS et al. | — | 2024 | → |
| Decoding the exposome: data science methodologies and implications in exposome-wide association studies (ExWASs). | Chung MK et al. | — | 2024 | → |
| Defining the causes of sporadic Parkinson's disease in the global Parkinson's genetics program (GP2). | Towns C et al. | — | 2023 | → |
| Defining the causes of sporadic Parkinson’s disease in the global Parkinson’s genetics program (GP2) | Towns C et al. | — | 2022 | — |
| Genetic Analyses of Enamel Hypoplasia in Multiethnic Cohorts. | Alotaibi RN et al. | — | 2022 | → |
| Identifying Datasets for Cross-Study Analysis in dbGaP using PhenX. | Pan H et al. | — | 2022 | → |
| Infection outcome needs two to tango: human host and the pathogen. | Maurya R et al. | — | 2022 | → |
| Phenotype Harmonization in the GLIDE2 Oral Health Genomics Consortium. | Divaris K et al. | — | 2022 | → |
| A System for Phenotype Harmonization in the National Heart, Lung, and Blood Institute Trans-Omics for Precision Medicine (TOPMed) Program. | Stilp AM et al. | — | 2021 | → |
| Genome Editing Human Pluripotent Stem Cells to Model β-Cell Disease and Unmask Novel Genetic Modifiers. | George MN et al. | — | 2021 | → |
| Using the PhenX Toolkit to Select Standard Measurement Protocols for Your Research Study. | Cox LA et al. | — | 2021 | → |
| Harmonizing behavioral outcomes across studies, raters, and countries: application to the genetic analysis of aggression in the ACTION Consortium. | Luningham JM et al. | — | 2020 | → |
| An Assessment of Environmental Health Measures in the Deepwater Horizon Research Consortia. | Pan H et al. | — | 2019 | → |
| Data Integration Methods for Phenotype Harmonization in Multi-Cohort Genome-Wide Association Studies With Behavioral Outcomes. | Luningham JM et al. | — | 2019 | → |
| National Institute on Drug Abuse genomics consortium white paper: Coordinating efforts between human and animal addiction studies. | Cates HM et al. | — | 2019 | → |
| Precision medicine in Thailand. | Shotelersuk V et al. | — | 2019 | → |
| Genetic susceptibility to infectious diseases: Current status and future perspectives from genome-wide approaches. | Mozzi A et al. | — | 2018 | → |
| Extracting Country-of-Origin from Electronic Health Records for Gene-Environment Studies as Part of the Epidemiologic Architecture for Genes Linked to Environment (EAGLE) Study. | Farber-Eger E et al. | — | 2017 | → |
| Maelstrom Research guidelines for rigorous retrospective data harmonization. | Fortier I et al. | — | 2017 | → |
| Evidence of CNIH3 involvement in opioid dependence. | Nelson EC et al. | — | 2016 | → |
| Neighborhood × Serotonin Transporter Linked Polymorphic Region (5-HTTLPR) interactions for substance use from ages 10 to 24 years using a harmonized data set of African American children. | Windle M et al. | — | 2016 | → |
| Unravelling the human genome-phenome relationship using phenome-wide association studies. | Bush WS et al. | — | 2016 | → |
| Robust Gene-Gene Interaction Analysis in Genome Wide Association Studies. | Kim Y et al. | — | 2015 | → |
| Using the PhenX Toolkit to Add Standard Measures to a Study. | Hendershot T et al. | — | 2015 | → |
| Calibrating longitudinal cognition in Alzheimer's disease across diverse test batteries and datasets. | Gross AL et al. | — | 2014 | → |
| Data compatibility in the addiction sciences: an examination of measure commonality. | Conway KP et al. | — | 2014 | → |
| Data sharing in large research consortia: experiences and recommendations from ENGAGE. | Budin-Ljøsne I et al. | — | 2014 | → |
| Genetic heterogeneity of asthma phenotypes identified by a clustering approach. | Siroux V et al. | — | 2014 | → |
| Genome-wide diet-gene interaction analyses for risk of colorectal cancer. | Figueiredo JC et al. | — | 2014 | → |
| Research and publication trends in hospital medicine. | Dang Do AN et al. | — | 2014 | → |
| Data harmonization and federated analysis of population-based studies: the BioSHaRE project. | Doiron D et al. | — | 2013 | → |
| Meta-analysis methods for genome-wide association studies and beyond. | Evangelou E et al. | — | 2013 | → |
| Characterization of gene-environment interactions for colorectal cancer susceptibility loci. | Hutter CM et al. | — | 2012 | → |
| Exploiting gene expression variation to capture gene-environment interactions for disease. | Idaghdour Y et al. | — | 2012 | → |
| Genome-wide association scan of dental caries in the permanent dentition. | Wang X et al. | — | 2012 | → |
| Measuring alcohol consumption for genomic meta-analyses of alcohol intake: opportunities and challenges. | Agrawal A et al. | — | 2012 | → |
| Research guidelines in the era of large-scale collaborations: an analysis of Genome-wide Association Study Consortia. | Austin MA et al. | — | 2012 | → |
| Using PhenX measures to identify opportunities for cross-study analysis. | Pan H et al. | — | 2012 | → |
| Genome-wide association studies of alcohol intake--a promising cocktail? | Agrawal A et al. | — | 2011 | → |
| Using the PhenX Toolkit to Add Standard Measures to a Study. | Hendershot T et al. | — | 2011 | → |