Chunk #7 — Subjects and methods — Quality control

Source
Imputation across genotyping arrays for genome-wide association studies: assessment of bias and a correction strategy.
Embedded
yes

Text

Quality control (QC) procedures, mimicking standard procedures used for GWAS, were conducted in each study separately using PLINK (Purcell et al. 2007) unless otherwise stated. Subjects were excluded for any of the following: call rate < 95%; discordance between reported gender and gender estimated from chromosome X SNP data (FST < 0.2 used to indicate female and FST > 0.8 used to indicate male); or excessive homozygosity based on autosomal SNP data (FST < −0.2 or FST > 0.5). Further, for subject pairs with identity-by-state estimates greater than 99% (indicative of sample duplication or monozygotic twins), we retained the subject with the higher call rate. Identity-by-descent (IBD) estimates were also generated to identify subject pairs (or clusters) with cryptic relatedness. For subjects classified as European American, we identified relative clusters having IBD > 10% (indicative of third-degree relation or closer) and retained the single subject with the highest call rate from each cluster. Since IBD estimates may be inflated in the presence of population stratification, we used the KING program (Manichaikul et al. 2010) to identify clusters among African American subjects. The KING
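
The subject-level exclusion rules and the "keep the highest call rate per cluster" step described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which used PLINK and KING; it only applies the stated thresholds to per-subject summary values that are assumed to have been computed already. The names `Subject`, `x_stat`, `autosomal_stat`, `exclusion_reasons`, and `prune_related` are hypothetical, and two behaviours are assumptions not stated in the text: subjects whose chromosome X statistic falls between 0.2 and 0.8 are left unclassified rather than flagged, and ties in call rate keep the first subject encountered.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Subject:
    sid: str
    call_rate: float       # fraction of SNPs called, between 0 and 1
    reported_sex: str      # "F" or "M"
    x_stat: float          # chromosome X statistic (the text's "FST")
    autosomal_stat: float  # autosomal homozygosity statistic (the text's "FST")


def estimated_sex(x_stat: float) -> Optional[str]:
    """Map the chromosome X statistic to a sex call using the stated cutoffs."""
    if x_stat < 0.2:
        return "F"
    if x_stat > 0.8:
        return "M"
    return None  # assumption: values in [0.2, 0.8] are left unclassified


def exclusion_reasons(s: Subject) -> List[str]:
    """Return every subject-level QC rule that this subject fails."""
    reasons = []
    if s.call_rate < 0.95:
        reasons.append("low_call_rate")
    est = estimated_sex(s.x_stat)
    if est is not None and est != s.reported_sex:
        reasons.append("sex_discordance")
    if s.autosomal_stat < -0.2 or s.autosomal_stat > 0.5:
        reasons.append("excess_homozygosity")
    return reasons


def prune_related(
    subjects: List[Subject],
    pairs: Iterable[Tuple[str, str, float]],
    threshold: float,
) -> List[str]:
    """Cluster subjects linked by a pairwise estimate above `threshold`
    and keep the subject with the highest call rate in each cluster.

    Every ID in `pairs` must belong to a subject in `subjects`.
    Returns the retained subject IDs in sorted order.
    """
    parent = {s.sid: s.sid for s in subjects}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, estimate in pairs:
        if estimate > threshold:
            parent[find(a)] = find(b)

    best = {}
    for s in subjects:
        root = find(s.sid)
        # Strict ">" means a tie keeps the first subject seen (assumption).
        if root not in best or s.call_rate > best[root].call_rate:
            best[root] = s
    return sorted(s.sid for s in best.values())
```

Under these assumptions, `prune_related(subjects, pairs, 0.10)` would correspond to the IBD > 10% rule for relative clusters, and the same function with a threshold of 0.99 would correspond to the identity-by-state rule for duplicates and monozygotic twins.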