Chunk #37 — Data access — GWAS summary statistics and validation data

Source: Leveraging functional annotations in genetic risk prediction for human complex diseases.

Text

a meta-analysis with 5,539 cases and 20,169 controls to train the model [30]. WTCCC samples were removed from the meta-analysis and used for validation (Ncase = 1,829 and Ncontrol = 2,892) [26]. For type-II diabetes, the training dataset was the Diabetes Genetics Replication and Meta-analysis (DIAGRAM) consortium GWAS with 12,171 cases and 56,862 controls [31]. We used samples from the Northwestern NUgene Project for validation (Ncase = 662 and Ncontrol = 517) [32]. Samples from the Institute for Personalized Medicine (IPM) eMERGE project were used for trans-ethnic analysis (African American: Ncase = 517 and Ncontrol = 213; Hispanic: Ncase = 477 and Ncontrol = 102) [33]. The training dataset for celiac disease was a GWAS with 4,533 cases and 10,750 controls [34]. Samples from the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) celiac disease study were used for validation (Ncase = 1,716 and Ncontrol = 530) [35].
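The sample sizes listed above can be gathered into one small table so that totals and case fractions are easy to compare across training and validation sets. The following is a minimal sketch, not part of the original analysis: the `Cohort` class, its field names, and the `summarize` helper are hypothetical names introduced for illustration. The trait for the first two cohorts is left unnamed because the sentence that names it is cut off in this excerpt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cohort:
    """One case/control dataset as described in the text (hypothetical container)."""
    trait: str
    role: str        # "training", "validation", or "trans-ethnic"
    source: str      # short label for the dataset; labels are illustrative
    n_case: int
    n_control: int

    @property
    def n_total(self) -> int:
        return self.n_case + self.n_control

    @property
    def case_fraction(self) -> float:
        return self.n_case / self.n_total


# Counts are copied from the paragraph above. The trait of the first two
# cohorts is not named in this excerpt, so it is left as a placeholder.
UNNAMED = "(trait truncated in excerpt)"
COHORTS = [
    Cohort(UNNAMED, "training", "meta-analysis [30]", 5539, 20169),
    Cohort(UNNAMED, "validation", "WTCCC", 1829, 2892),
    Cohort("type-II diabetes", "training", "DIAGRAM", 12171, 56862),
    Cohort("type-II diabetes", "validation", "NUgene", 662, 517),
    Cohort("type-II diabetes", "trans-ethnic", "IPM eMERGE (African American)", 517, 213),
    Cohort("type-II diabetes", "trans-ethnic", "IPM eMERGE (Hispanic)", 477, 102),
    Cohort("celiac disease", "training", "GWAS [34]", 4533, 10750),
    Cohort("celiac disease", "validation", "NIDDK", 1716, 530),
]


def summarize(cohorts):
    """Return (trait, role, source, total N, case fraction) for each cohort."""
    return [
        (c.trait, c.role, c.source, c.n_total, round(c.case_fraction, 3))
        for c in cohorts
    ]


if __name__ == "__main__":
    for trait, role, source, n_total, frac in summarize(COHORTS):
        print(f"{trait:30s} {role:13s} {source:32s} N={n_total:6d} cases={frac:.3f}")
```

Laying the cohorts out this way makes it visible that case fractions differ widely between the training and validation sets, which is worth keeping in mind when comparing prediction accuracy across traits.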