Including additional controls from public databases improves the power of a genome-wide association study.
- Authors: Mukherjee, Semanti; Simon, Jennifer; Bayuga, Sharon; Ludwig, Emmy; Yoo, Sarah; Orlow, Irene; Viale, Agnes; Offit, Kenneth; Kurtz, Robert C; Olson, Sara H; Klein, Robert J
- Year: 2011
- Journal: Human Heredity
- PMID: 21849791
- DOI: 10.1159/000330149
- PMCID: PMC3171281
Though genome-wide association studies (GWAS) have identified numerous susceptibility loci for common diseases, their use is limited by the expense of genotyping large cohorts of individuals. One potential solution is to use 'additional controls', that is, genotype data from control individuals deposited in public repositories. While this approach has been used by several groups, the genetically heterogeneous nature of the population of the United States makes it potentially problematic. We empirically investigated the utility of this approach in a US-based GWAS. In a small GWAS of pancreatic cancer in New York, we observed clear population structure differences relative to controls from the database of Genotypes and Phenotypes (dbGaP). When we conducted the GWAS using these additional controls, we found large inflation of the test statistic, which was properly corrected by using eigenvectors from principal components analysis as covariates. To deal with errors introduced by combining genotype data from different sources, we propose simultaneously genotyping a small number of controls along with cases and then comparing this group to the additional controls. We show that removing SNPs that show differences between these control groups reduces false-positive findings. Thus, through an empirical approach, this report provides practical guidance for using additional controls from publicly available datasets.
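The three quantitative ideas in the abstract (test-statistic inflation, principal-component covariates, and filtering SNPs that differ between control groups) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the 1-df allelic chi-square test, the `p_threshold` of 1e-3, and the toy counts are all assumptions made for illustration, and a real analysis would normally rely on dedicated GWAS software.

```python
import math
import numpy as np

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 df


def allelic_chi2(alt_a, n_a, alt_b, n_b):
    """Per-SNP 1-df allelic chi-square comparing two groups of diploid samples.

    alt_a, alt_b: alternate-allele counts per SNP; n_a, n_b: individuals per group.
    """
    alt_a = np.asarray(alt_a, dtype=float)
    alt_b = np.asarray(alt_b, dtype=float)
    tot_a, tot_b = 2 * n_a, 2 * n_b          # chromosomes per group
    ref_a, ref_b = tot_a - alt_a, tot_b - alt_b
    num = (tot_a + tot_b) * (alt_a * ref_b - alt_b * ref_a) ** 2
    den = tot_a * tot_b * (alt_a + alt_b) * (ref_a + ref_b)
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(den > 0, num / den, 0.0)  # monomorphic SNPs get 0


def chi2_pvalue_1df(chi2):
    """Upper-tail p-value of a 1-df chi-square statistic (exact, via erfc)."""
    return np.array([math.erfc(math.sqrt(x / 2.0)) for x in np.atleast_1d(chi2)])


def genomic_inflation(chi2):
    """Genomic inflation factor: observed median statistic over the null median."""
    return float(np.median(chi2) / CHI2_1DF_MEDIAN)


def top_pcs(genotypes, k=2):
    """Top-k principal component scores of a samples-by-SNPs genotype matrix.

    The scores would then be entered as covariates in the association model.
    """
    g = np.asarray(genotypes, dtype=float)
    g = g - g.mean(axis=0)
    sd = g.std(axis=0)
    g = g[:, sd > 0] / sd[sd > 0]             # drop monomorphic SNPs, standardize
    u, s, _ = np.linalg.svd(g, full_matrices=False)
    return u[:, :k] * s[:k]


def keep_concordant_snps(alt_own, n_own, alt_pub, n_pub, p_threshold=1e-3):
    """Boolean mask of SNPs whose allele frequencies do NOT differ between
    in-house controls and public controls (p_threshold is a hypothetical cutoff)."""
    chi2 = allelic_chi2(alt_own, n_own, alt_pub, n_pub)
    return chi2_pvalue_1df(chi2) >= p_threshold


# Toy data: 100 in-house controls vs 1000 public controls, two SNPs.
# SNP 0 has the same allele frequency in both groups; SNP 1 differs (0.2 vs 0.5).
mask = keep_concordant_snps([40, 40], 100, [400, 1000], 1000)
print(mask.tolist())  # [True, False]

# Toy genotypes: two groups of three individuals with different allele frequencies.
toy_genotypes = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0],
                 [2, 2, 2, 1], [2, 2, 1, 2], [2, 1, 2, 2]]
pcs = top_pcs(toy_genotypes, k=2)
```

In this sketch, a genomic inflation factor well above 1 would signal the kind of test-statistic inflation the abstract describes, the first principal component separates the two toy groups, and the mask drops the SNP whose frequency differs between control sources before cases are compared with the pooled controls.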
External
| Title | Authors | Journal | Year | Link |
|---|---|---|---|---|
| Genetic Variants Associated with Suspected Neonatal Hypoxic Ischaemic Encephalopathy: A Study in a South African Context. | Foden CJ et al. | - | 2025 | - |
| Mitochondrial genome variants associated with amyotrophic lateral sclerosis and their haplogroup distribution. | Briones MRS et al. | - | 2024 | - |
| Mitochondrial genome variants associated with Amyotrophic Lateral Sclerosis and their haplogroup distribution | Briones MRS et al. | - | 2024 | - |
| PASTRY: achieving balanced power for detecting risk and protective minor alleles in meta-analysis of association studies with overlapping subjects. | Kim EE et al. | - | 2024 | - |
| Increase in power by obtaining 10 or more controls per case when type-1 error is small in large-scale association studies. | Katki HA et al. | - | 2023 | - |
| GAWMerge expands GWAS sample size and diversity by combining array-based genotyping and whole-genome sequencing. | Mathur R et al. | - | 2022 | - |
| Syringohydromyelia in Dogs: The Genomic Component Underlying a Complex Neurological Disease. | Andrino S et al. | - | 2022 | - |
| Genome-wide association analysis identifies a meningioma risk locus at 11p15.5. | Claus EB et al. | - | 2018 | - |
| FOLD: a method to optimize power in meta-analysis of genetic association studies with overlapping subjects. | Kim EE et al. | - | 2017 | - |
| KAT2B polymorphism identified for drug abuse in African Americans with regulatory links to drug abuse pathways in human prefrontal cortex. | Johnson EO et al. | - | 2016 | - |
| Hypothesis testing at the extremes: fast and robust association for high-throughput data. | Zhou YH et al. | - | 2015 | - |
| Genome-wide analysis of the role of copy-number variation in pancreatic cancer risk. | Willis JA et al. | - | 2014 | - |
| Imputation across genotyping arrays for genome-wide association studies: assessment of bias and a correction strategy. | Johnson EO et al. | - | 2013 | - |
| Susceptibility loci associated with specific and shared subtypes of lymphoid malignancies. | Vijai J et al. | - | 2013 | - |
| Population-based case-control association studies. | Hancock DB et al. | - | 2012 | - |