Further data cleaning of the autosomal SNPs typed in both the PLCO and NHS scans retained SNPs with MAF >5%, a P-value from the exact test of Hardy-Weinberg proportions >10⁻⁵ in both control sets, and a missing-genotype rate <5%. A handful of SNPs whose genotype frequencies differed between the PLCO controls and the NHS controls (P-value <10⁻⁷, 2-df chi-squared test) were also removed; these differences were most likely due to informatic inconsistencies in SNP identification between studies. In total, 475,116 autosomal SNPs (hereafter called the testing SNPs) were retained for further analysis.
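The filtering rules above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names, the representation of each control set as a tuple of genotype counts (AA, AB, BB, missing), and the choice to apply the MAF and missingness thresholds to each control set separately are all assumptions made for illustration. The Hardy-Weinberg P-value is computed by summing the exact probabilities of all heterozygote counts that are no more likely than the observed one, and the 2-df chi-squared P-value uses the closed form exp(-x/2).

```python
import math

# Thresholds taken from the text; how they are applied per study is an assumption.
MIN_MAF = 0.05
MIN_HWE_P = 1e-5
MAX_MISSING = 0.05
MIN_FREQ_DIFF_P = 1e-7


def maf(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts."""
    n = n_aa + n_ab + n_bb
    freq_a = (2 * n_aa + n_ab) / (2 * n)
    return min(freq_a, 1 - freq_a)


def hwe_exact_p(n_aa, n_ab, n_bb):
    """Exact Hardy-Weinberg test: total probability of all heterozygote
    counts that are no more likely than the observed one, given the
    observed allele counts."""
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab          # copies of allele A
    n_b = 2 * n - n_a              # copies of allele B

    def log_prob(het):
        hom_a = (n_a - het) // 2
        hom_b = (n_b - het) // 2
        return (het * math.log(2)
                + math.lgamma(n + 1)
                - math.lgamma(hom_a + 1)
                - math.lgamma(het + 1)
                - math.lgamma(hom_b + 1)
                + math.lgamma(n_a + 1)
                + math.lgamma(n_b + 1)
                - math.lgamma(2 * n + 1))

    # Heterozygote counts must share the parity of the allele counts.
    probs = {het: math.exp(log_prob(het))
             for het in range(n_a % 2, min(n_a, n_b) + 1, 2)}
    observed = probs[n_ab]
    # Small relative tolerance so floating-point ties are counted.
    p = sum(v for v in probs.values() if v <= observed * (1 + 1e-9))
    return min(1.0, p)


def chi2_2df_p(counts_1, counts_2):
    """Pearson chi-squared test on a 2 x 3 genotype table (2 df).
    Assumes both groups are non-empty and every genotype class is
    observed at least once across the two groups."""
    col_totals = [a + b for a, b in zip(counts_1, counts_2)]
    if min(col_totals) == 0:
        raise ValueError("an empty genotype class reduces the degrees of freedom")
    total = sum(col_totals)
    stat = 0.0
    for row in (counts_1, counts_2):
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / total
            stat += (observed - expected) ** 2 / expected
    # Survival function of the chi-squared distribution with 2 df.
    return math.exp(-stat / 2)


def passes_qc(plco, nhs):
    """Each argument is (n_aa, n_ab, n_bb, n_missing) for one control set."""
    for n_aa, n_ab, n_bb, n_missing in (plco, nhs):
        called = n_aa + n_ab + n_bb
        if n_missing / (called + n_missing) >= MAX_MISSING:
            return False
        if maf(n_aa, n_ab, n_bb) <= MIN_MAF:
            return False
        if hwe_exact_p(n_aa, n_ab, n_bb) <= MIN_HWE_P:
            return False
    return chi2_2df_p(plco[:3], nhs[:3]) >= MIN_FREQ_DIFF_P
```

For example, `passes_qc((250, 500, 250, 10), (240, 510, 250, 5))` keeps the SNP, whereas a SNP with no heterozygotes in one control set, such as `(500, 0, 500, 0)`, is dropped by the Hardy-Weinberg filter.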