Phasing and imputation were performed separately for each of the 15 datasets with Impute2 v2.1.2 (Supplementary Information),47 using a common set of single nucleotide polymorphisms (SNPs) that passed QC (Table S2). The imputation reference panel was HapMap 3 release 2. We used all available HapMap 3 populations for imputation because a larger reference panel has been shown to decrease imputation error.48, 49 Post-imputation filters were applied to remove SNPs with INFO scores < 0.4 or with minor allele frequency (MAF) < 0.05. We observed high imputation accuracy (as captured by the INFO score) across a range of minor allele frequencies (Figure S1). Concordance between directly genotyped variants and the imputed dosages of the same variants after masking was high (Figure S2).
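
The post-imputation filter can be illustrated with a minimal sketch. It is not the pipeline used in this study: the record type `SnpSummary`, its field names, and the helper functions are hypothetical, and the per-SNP INFO score and coded-allele frequency are assumed to have already been read from the imputation output. Only the two thresholds (INFO < 0.4, MAF < 0.05) come from the text.

```python
from dataclasses import dataclass

# Thresholds stated in the text: SNPs below either value are removed.
INFO_MIN = 0.4
MAF_MIN = 0.05


@dataclass
class SnpSummary:
    """Hypothetical per-SNP summary; field names are illustrative only."""
    snp_id: str
    info: float          # imputation INFO score
    allele_freq: float   # frequency of the coded allele, in [0, 1]


def minor_allele_freq(allele_freq: float) -> float:
    """Fold an allele frequency to the minor allele frequency (at most 0.5)."""
    return min(allele_freq, 1.0 - allele_freq)


def passes_filters(snp: SnpSummary) -> bool:
    """Keep a SNP only if it meets both the INFO and the MAF threshold."""
    return snp.info >= INFO_MIN and minor_allele_freq(snp.allele_freq) >= MAF_MIN


def filter_snps(snps: list[SnpSummary]) -> list[SnpSummary]:
    """Return the SNPs that survive the post-imputation filters."""
    return [snp for snp in snps if passes_filters(snp)]
```

Under these assumptions, a SNP with a coded-allele frequency of 0.97 is removed because its folded MAF is about 0.03, even though its INFO score may be high.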