Chunk #5 — MATERIALS AND METHODS — GWAS summary statistics curation and integration

Source
CAUSALdb: a database for disease/trait causal variants identified using summary statistics of genome-wide association studies.

Text

In addition, we integrated GWAS summary statistics from non-UKBB cohorts drawn from several public databases, including GWAS Catalog (8), LD Hub (12), GRASP (10), PhenoScanner (13) and dbGaP (22). We also curated hundreds of summary statistics from the websites of consortia such as PGC (https://www.med.unc.edu/pgc), MAGIC (https://www.magicinvestigators.org), SSGAC (https://www.thessgac.org), and JENGER (http://jenger.riken.jp/en/). We included only studies for which the original publication was available and for which population-related information and sample size were clearly recorded. To remove duplicate GWAS summary statistics among these resources, we identified redundant entries by publication source and retained only the one with the most information. We extracted the sample size, population, and source information from these databases and the original studies. GWAS population information was mapped to the five super-populations (AFR, AMR, EAS, EUR and SAS) of the 1000 Genomes Project (1KGP). To ensure accurate fine-mapping using the 1KGP LD information, we did not include GWASs conducted on mixed populations.
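The curation rules described above (inclusion criteria, de-duplication by publication, mapping to 1KGP super-populations, and exclusion of mixed populations) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The record fields (`pmid`, `trait`, `population`, `sample_size`), the `POPULATION_MAP` lookup, the use of a non-empty field count as a proxy for "most information", and the choice of (publication, trait) as the de-duplication key are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# The five 1KGP super-populations named in the text.
SUPER_POPULATIONS = {"AFR", "AMR", "EAS", "EUR", "SAS"}

# Hypothetical label lookup; the actual mapping table is not given in the source.
POPULATION_MAP = {
    "european": "EUR",
    "east asian": "EAS",
    "african": "AFR",
    "south asian": "SAS",
    "admixed american": "AMR",
}


def map_population(label: Optional[str]) -> Optional[str]:
    """Return a super-population code, or None for missing, unknown, or mixed labels."""
    if not label:
        return None
    key = label.strip().lower()
    if key.upper() in SUPER_POPULATIONS:
        return key.upper()
    # Labels such as "European, East Asian" are not in the lookup and yield None.
    return POPULATION_MAP.get(key)


def information_score(record: dict) -> int:
    """Count non-empty fields (an assumed proxy for 'most information')."""
    return sum(1 for value in record.values() if value not in (None, ""))


def curate(records: List[dict]) -> List[dict]:
    """Apply the inclusion criteria, then keep one record per (publication, trait)."""
    best: Dict[Tuple[str, Optional[str]], dict] = {}
    for rec in records:
        # Require an original publication and a recorded sample size.
        if not rec.get("pmid") or not rec.get("sample_size"):
            continue
        # Require a single population that maps to one super-population.
        super_pop = map_population(rec.get("population"))
        if super_pop is None:
            continue
        rec = {**rec, "super_population": super_pop}
        # Assumed de-duplication key: publication plus trait.
        key = (rec["pmid"], rec.get("trait"))
        if key not in best or information_score(rec) > information_score(best[key]):
            best[key] = rec
    return list(best.values())
```

Keying on publication plus trait, rather than publication alone, avoids collapsing distinct traits reported in the same paper; the source states only that redundancy was identified by publication source.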