
Chunk #9 — Materials and Methods — Study material

Source
Population substructure and control selection in genome-wide association studies.

Text

Further data cleaning of the autosomal SNPs typed in both the PLCO and NHS scans retained SNPs with MAF >5%, a P-value >10⁻⁵ from the exact test of Hardy-Weinberg proportions in both control sets, and a missing-genotype rate <5%. A handful of SNPs whose genotype frequencies differed between the PLCO controls and the NHS controls (P-value <10⁻⁷ based on the 2-df chi-squared test) were removed; these differences were most likely due to informatic inconsistencies in SNP identification between studies. In total, 475,116 autosomal SNPs (hereafter called the testing SNPs) were retained for further analysis.
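
The filtering rules above can be illustrated with a minimal Python sketch. It is not the software used in the study, which the text does not name. The function names, the `(n_aa, n_ab, n_bb, n_missing)` count layout, and the choice to apply the MAF and missingness thresholds within each control set separately are assumptions made for illustration. The Hardy-Weinberg P-value here sums the probabilities of all heterozygote counts that are no more likely than the observed one, which is one common form of the exact test.

```python
from math import exp, lgamma, log


def maf(n_aa, n_ab, n_bb):
    """Minor allele frequency from called genotype counts."""
    n_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / n_alleles
    return min(freq_a, 1.0 - freq_a)


def hwe_exact_p(n_aa, n_ab, n_bb):
    """Exact-test P-value for Hardy-Weinberg proportions.

    Conditions on the allele counts and sums the probabilities of every
    heterozygote count that is no more likely than the observed one.
    """
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab  # copies of allele A
    n_b = 2 * n_bb + n_ab  # copies of allele B

    def log_prob(het):
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (het * log(2) + lgamma(n + 1)
                - lgamma(hom_a + 1) - lgamma(het + 1) - lgamma(hom_b + 1)
                + lgamma(n_a + 1) + lgamma(n_b + 1) - lgamma(2 * n + 1))

    observed = exp(log_prob(n_ab))
    # Valid heterozygote counts share the parity of the allele count.
    candidates = range(n_a % 2, min(n_a, n_b) + 1, 2)
    total = sum(p for p in (exp(log_prob(h)) for h in candidates)
                if p <= observed * (1 + 1e-7))  # tolerance for float ties
    return min(1.0, total)


def chi2_2df_p(counts_1, counts_2):
    """P-value of the 2 x 3 chi-squared test comparing genotype counts.

    Assumes all three genotype classes occur in the pooled data, so the
    statistic has 2 df and its survival function is exp(-x / 2).
    """
    col_totals = [a + b for a, b in zip(counts_1, counts_2)]
    total = sum(col_totals)
    stat = 0.0
    for row in (counts_1, counts_2):
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return exp(-stat / 2.0)


def passes_qc(plco, nhs):
    """Apply the thresholds from the text to one SNP.

    plco, nhs: hypothetical (n_aa, n_ab, n_bb, n_missing) tuples of
    control genotype counts for the two scans.
    """
    for n_aa, n_ab, n_bb, n_missing in (plco, nhs):
        called = n_aa + n_ab + n_bb
        if called == 0:
            return False
        if n_missing / (called + n_missing) >= 0.05:  # missing rate <5%
            return False
        if maf(n_aa, n_ab, n_bb) <= 0.05:             # MAF >5%
            return False
        if hwe_exact_p(n_aa, n_ab, n_bb) <= 1e-5:     # HWE P >10^-5
            return False
    # Remove SNPs whose genotype frequencies differ between control sets.
    return chi2_2df_p(plco[:3], nhs[:3]) >= 1e-7
```

For example, `passes_qc((250, 500, 250, 10), (240, 510, 250, 5))` keeps the SNP, while a SNP with no heterozygotes in one control set, such as `(500, 0, 500, 0)`, fails the Hardy-Weinberg filter. In practice these filters are usually run with an established genetics toolkit rather than hand-written code.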