Chunk #36 — Data access — GWAS summary statistics and validation data

Source
Leveraging functional annotations in genetic risk prediction for human complex diseases.

Text

For Crohn’s disease, we trained the model using summary statistics from the International Inflammatory Bowel Disease Genetics Consortium (IIBDGC; Ncase = 6,333 and Ncontrol = 15,056) [25]. Samples from the Wellcome Trust Case Control Consortium (WTCCC) were removed from the meta-analysis and used as the validation dataset (Ncase = 1,689 and Ncontrol = 2,891) [26]. For breast cancer, we trained the model using summary statistics from the Genetic Associations and Mechanisms in Oncology (GAME-ON) study (Ncase = 16,003 and Ncontrol = 41,335) [27] and tested its performance using samples from the Cancer Genetic Markers of Susceptibility (CGEMS) study (Ncase = 966 and Ncontrol = 70) [28]. Samples shared between CGEMS and GAME-ON were removed. For trans-ethnic analysis, we used samples from the CIDR-GWAS of breast cancer (Ncase = 1,666 and Ncontrol = 2,038) [29]. For rheumatoid arthritis, we trained the model using summary statistics from a meta-analysis with 5,539 cases and 20,169 controls [30]. WTCCC samples were removed from the meta-analysis and used for validation (Ncase = 1,829 and Ncontrol = 2,892) [26]. For type-II diabetes, the training dataset is Diabetes