Genotypes were imputed to NCBI build 37 using Phase 1 of the 1000 Genomes reference data, with the Asian population selected as the reference ethnicity, as implemented on the Minimac server (http://imputationserver.sph.umich.edu/) [24]. Following imputation, duplicate IDs corresponding to triallelic SNPs were removed. In accordance with our imputation pipeline, we removed SNPs with MAF < 0.01, imputation quality R2 < 0.9, or average call rate < 0.95. Imputation produced a post-imputation analytic sample of 4,009,606 SNPs, which was subjected to further QC. After removal of the major histocompatibility complex (26–33 Mb on chromosome 6) and pruning, the imputed data were used to calculate two ancestry-informative covariates by multidimensional scaling. These covariates were used to adjust for population structure.
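The post-imputation filters described in this paragraph can be illustrated with a minimal sketch. It is not the pipeline actually used: the `Snp` record, its field names, and the helper functions are hypothetical, and two details are assumptions rather than statements from the text, namely that every record sharing a duplicated ID is dropped and that the MHC boundaries (26 and 33 Mb) are inclusive. Only the thresholds and the MHC interval are taken from the paragraph.

```python
from collections import Counter
from dataclasses import dataclass

# Thresholds and MHC interval as stated in the text (build 37 coordinates).
MAF_MIN = 0.01
R2_MIN = 0.9
CALL_RATE_MIN = 0.95
MHC_CHROM = "6"
MHC_START_BP = 26_000_000
MHC_END_BP = 33_000_000


@dataclass(frozen=True)
class Snp:
    """Hypothetical per-SNP summary record; field names are illustrative."""
    snp_id: str
    chrom: str
    pos_bp: int
    maf: float
    r2: float
    call_rate: float


def drop_duplicate_ids(snps):
    """Drop every record whose ID occurs more than once.

    Assumption: all copies of a duplicated ID are removed, not just the extras.
    """
    counts = Counter(s.snp_id for s in snps)
    return [s for s in snps if counts[s.snp_id] == 1]


def passes_qc(snp):
    """Keep a SNP only if it clears all three thresholds."""
    return (
        snp.maf >= MAF_MIN
        and snp.r2 >= R2_MIN
        and snp.call_rate >= CALL_RATE_MIN
    )


def in_mhc(snp):
    """True if the SNP lies in the excluded MHC interval (bounds assumed inclusive)."""
    return snp.chrom == MHC_CHROM and MHC_START_BP <= snp.pos_bp <= MHC_END_BP


def analytic_sample(snps):
    """Duplicate-ID removal followed by the MAF / R2 / call-rate filters."""
    return [s for s in drop_duplicate_ids(snps) if passes_qc(s)]


def ancestry_input(snps):
    """Analytic sample with the MHC removed; pruning and MDS are not shown."""
    return [s for s in analytic_sample(snps) if not in_mhc(s)]
```

In this sketch the MHC exclusion applies only to the subset used for the ancestry covariates, mirroring the order of steps in the text; linkage-disequilibrium pruning and the multidimensional scaling itself would normally be run with dedicated genetics software and are left out here.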