Chunk #64 — Methods — GWAS data for risk factors and diseases

Source: Causal associations between risk factors and common diseases inferred from GWAS summary data.
Embedded: yes

Text

GWAS data for 22 common diseases came from two community-based studies: Genetic Epidemiology Research on Adult Health and Aging (GERA) [29] and the UKB pilot phase [27]. The GERA data comprised 60,586 individuals of European ancestry. We cleaned the GERA genotype data using standard quality control (QC) filters, excluding SNPs with missing rate ≥0.02, Hardy–Weinberg equilibrium test P-value ≤ 1 × 10⁻⁶, or minor allele count < 1, and removing individuals with missing rate ≥0.02. We then imputed the genotype data to the 1000 Genomes (1000G) reference panel using IMPUTE2 [55]. We used GCTA [56] to estimate the genetic relationship matrix (GRM) of the individuals from a subset of the imputed SNPs (minor allele frequency, MAF ≥0.01; imputation INFO score ≥0.3; and in common with those in HapMap phase 3, HM3), and computed the first 20 principal components (PCs) from the GRM. We removed one of each pair of individuals with estimated genetic relatedness ≥0.05 and retained 53,991 unrelated individuals for analysis. Individual-level ICD-9 codes were not available in dbGaP but had been classified into 22 common diseases (Supplementary Table 4). The disease status
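
The QC and relatedness steps described above can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline, which used GCTA and IMPUTE2; it is a toy version under stated assumptions. The function names are hypothetical. Genotypes are assumed to be a dense float array of 0/1/2 allele counts with NaN for missing calls, and Hardy–Weinberg P-values are assumed to be precomputed by another tool and passed in as `hwe_p`. The GRM uses the usual standardized-genotype form, and the relatedness pruning is a simple greedy stand-in rather than GCTA's own selection procedure. The MAF, INFO and HM3 subsetting is omitted.

```python
import numpy as np


def snp_qc_mask(genotypes, hwe_p, max_missing=0.02, min_hwe_p=1e-6, min_mac=1):
    """Return a boolean mask of SNPs (columns) that pass the QC thresholds.

    genotypes: (n_individuals, n_snps) float array of 0/1/2, NaN = missing.
    hwe_p:     (n_snps,) precomputed HWE test P-values (assumed input).
    """
    called = ~np.isnan(genotypes)
    missing_rate = 1.0 - called.mean(axis=0)
    allele_count = np.nansum(genotypes, axis=0)
    # Minor allele count: the smaller of the two allele counts among called genotypes.
    mac = np.minimum(allele_count, 2 * called.sum(axis=0) - allele_count)
    return (missing_rate < max_missing) & (hwe_p > min_hwe_p) & (mac >= min_mac)


def sample_qc_mask(genotypes, max_missing=0.02):
    """Return a boolean mask of individuals (rows) with missing rate below the threshold."""
    return np.isnan(genotypes).mean(axis=1) < max_missing


def compute_grm(genotypes):
    """GRM from complete genotypes; assumes no NaN and every SNP is polymorphic."""
    p = genotypes.mean(axis=0) / 2.0                      # allele frequency per SNP
    z = (genotypes - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))
    return z @ z.T / genotypes.shape[1]


def top_pcs(grm, k=20):
    """Eigenvectors of the GRM for the k largest eigenvalues, one column per PC."""
    eigenvalues, eigenvectors = np.linalg.eigh(grm)
    order = np.argsort(eigenvalues)[::-1][:k]
    return eigenvectors[:, order]


def unrelated_indices(grm, cutoff=0.05):
    """Greedily drop individuals until no kept pair has relatedness >= cutoff.

    Each pass recomputes all pair counts, so this is only suitable for small
    illustrative inputs, not for a cohort of tens of thousands.
    """
    related = grm >= cutoff
    np.fill_diagonal(related, False)                      # ignore self-relatedness
    keep = np.ones(grm.shape[0], dtype=bool)
    while True:
        # Number of still-kept relatives for each still-kept individual.
        counts = (related & keep[:, None] & keep[None, :]).sum(axis=1)
        if counts.max() == 0:
            break
        keep[np.argmax(counts)] = False                   # drop the most-connected individual
    return np.flatnonzero(keep)
```

In this sketch, a typical order of use would be to apply `snp_qc_mask` and `sample_qc_mask` to the raw genotype array, build the GRM from the retained (and, in the real analysis, imputed and subsetted) SNPs, take `top_pcs(grm, k=20)`, and then restrict the analysis to `unrelated_indices(grm, cutoff=0.05)`.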