Pooling of Phase 1 and Phase 2 data was based on individual genotypes. We imposed stringent criteria for SNP call rates and checked for significant disparity in MAFs between series. Only summary data were available for the Texas and IARC-GWA studies. To minimise errors in data harmonisation, we examined SNPs for deviation in MAF in cases and controls across datasets.
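
As an illustration only, the kind of check described here can be sketched as follows. This is not the pipeline used in the study: the function names, the call-rate threshold of 0.95, and the significance cut-off of 1e-4 are hypothetical placeholders, and the disparity test shown is a plain Pearson chi-square on allele counts, chosen for simplicity.

```python
from math import erfc, sqrt


def call_rate(n_called, n_total):
    """Fraction of samples with a non-missing genotype at a SNP."""
    return n_called / n_total


def maf(allele_count, total_alleles):
    """Minor allele frequency from the count of one allele."""
    freq = allele_count / total_alleles
    return min(freq, 1 - freq)


def allele_disparity_p(a1, n1, a2, n2):
    """P-value of a 1-df Pearson chi-square test comparing allele
    counts between two series (a = count of one allele, n = total
    alleles, i.e. twice the number of genotyped samples)."""
    table = [[a1, n1 - a1], [a2, n2 - a2]]
    rows = [n1, n2]
    cols = [a1 + a2, (n1 - a1) + (n2 - a2)]
    total = n1 + n2
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / total
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 df.
    return erfc(sqrt(chi2 / 2))


def passes_qc(n_called, n_total, a1, n1, a2, n2,
              min_call_rate=0.95, alpha=1e-4):
    """Hypothetical filter: keep a SNP only if its call rate is high
    enough and its allele frequencies do not differ significantly
    between the two series. Both thresholds are assumptions."""
    if call_rate(n_called, n_total) < min_call_rate:
        return False
    return allele_disparity_p(a1, n1, a2, n2) >= alpha
```

For example, `passes_qc(990, 1000, 100, 1000, 300, 1000)` would reject a SNP whose allele frequency is 0.10 in one series and 0.30 in the other, while `passes_qc(990, 1000, 100, 1000, 100, 1000)` would retain a SNP with matching frequencies. The same comparison can be run separately in cases and in controls when only summary counts are available.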