Chunk #33 — ONLINE METHODS — Association Analysis — Genomic Quality Control: Principal Component Analysis (PCA) and Relatedness Checking in the core PGC dataset
We performed PCA for all 90 cohorts separately using SNPs with high imputation quality (INFO > 0.8), low missingness (< 1%), MAF > 0.05, and in relative linkage equilibrium after 2 iterations of linkage disequilibrium (LD) pruning (r2 < 0.2, 200-SNP windows). We removed well-known long-range LD regions (the MHC and the chr8 inversion). After these filters, we retained between 57K and 95K autosomal SNPs in each cohort. SNPs present in all 90 cohorts (N=7,561) were used for robust relatedness testing using PLINK v1.94; pairs of subjects with PIHAT > 0.2 were identified and one member of each pair was removed at random, preferentially retaining cases and trio members over case-control members.
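The relatedness filter described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the analysis. The function name `select_removals`, the input layout (tuples of two subject IDs and a PIHAT estimate), and the integer `priority` score encoding "prefer to keep" are all hypothetical, and the handling of subjects that appear in several related pairs is an assumption, since the text does not specify it.

```python
import random


def select_removals(pairs, priority, threshold=0.2, seed=0):
    """Choose one subject to drop from each related pair.

    pairs:     iterable of (id1, id2, pi_hat) tuples (hypothetical layout).
    priority:  dict mapping subject ID to an integer; a higher value means
               the subject is preferentially retained (assumed encoding,
               e.g. cases and trio members above controls).
    threshold: pairs with pi_hat strictly above this value count as related.
    seed:      seed for the random tie-break, for reproducibility.
    """
    rng = random.Random(seed)
    removed = set()
    for id1, id2, pi_hat in pairs:
        if pi_hat <= threshold:
            continue  # not related at this threshold
        if id1 in removed or id2 in removed:
            continue  # assumption: the pair is already resolved by an earlier removal
        p1 = priority.get(id1, 0)
        p2 = priority.get(id2, 0)
        if p1 > p2:
            drop = id2
        elif p2 > p1:
            drop = id1
        else:
            drop = rng.choice([id1, id2])  # equal priority: remove at random
        removed.add(drop)
    return removed


# Toy example: A is a case (priority 1); B, C and D are controls (priority 0).
example_pairs = [("A", "B", 0.50), ("B", "C", 0.10), ("C", "D", 0.25)]
example_priority = {"A": 1, "B": 0, "C": 0, "D": 0}
example_removed = select_removals(example_pairs, example_priority)
```

In this toy input, B is dropped in favour of the case A, the B-C pair falls below the threshold, and one of C or D is dropped at random. In practice the pairs would come from a relatedness report such as the one PLINK produces, restricted to the SNPs shared across cohorts.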