Genotypes were imputed to the 1000 Genomes20 phase 1 reference within the PGC pipeline16 using SHAPEIT for phasing21 and IMPUTE2 for imputation.22 Imputation was run in 3 Mb chunks with default parameters against the full set of 2186 phased haplotypes (August 2012, 30 069 288 variants, release ‘v3.macGT1’). Samples were then combined within ancestry groups for relatedness testing and calculation of PC covariates. The same filters as above were applied, and one individual was removed from each pair of related or duplicate individuals (pi-hat value >0.2), preferentially retaining cases.
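The relatedness filter described above can be illustrated with a minimal sketch. It is not the PGC pipeline's implementation: the function name `prune_related`, the input layout (pairs of sample IDs with a pi-hat estimate, plus a case/control lookup), the strongest-pair-first ordering, and the tie-break used when both members of a pair have the same status are all assumptions made for illustration. Only the pi-hat threshold of 0.2 and the preference for retaining cases come from the text.

```python
from typing import Dict, Iterable, Set, Tuple

# From the text: pairs with pi-hat > 0.2 are treated as related or duplicate.
PI_HAT_THRESHOLD = 0.2


def prune_related(
    pairs: Iterable[Tuple[str, str, float]],
    is_case: Dict[str, bool],
    threshold: float = PI_HAT_THRESHOLD,
) -> Set[str]:
    """Return the set of sample IDs to remove (hypothetical helper)."""
    removed: Set[str] = set()
    # Assumption: handle the most closely related pairs first.
    for id1, id2, pi_hat in sorted(pairs, key=lambda p: -p[2]):
        if pi_hat <= threshold:
            continue  # pair is not considered related
        if id1 in removed or id2 in removed:
            continue  # pair already broken by an earlier removal
        if is_case[id1] and not is_case[id2]:
            removed.add(id2)  # keep the case, drop the control
        elif is_case[id2] and not is_case[id1]:
            removed.add(id1)  # keep the case, drop the control
        else:
            # Same status on both sides. Assumption: arbitrary but
            # deterministic tie-break, drop the lexicographically larger ID.
            removed.add(max(id1, id2))
    return removed


# Toy data, invented for illustration only.
example_pairs = [("A", "B", 0.5), ("B", "C", 0.25), ("C", "D", 0.1)]
example_status = {"A": False, "B": True, "C": True, "D": False}
to_remove = prune_related(example_pairs, example_status)
```

In this toy input, control A is dropped in favour of case B, one of the two related cases B and C is dropped by the tie-break, and the C–D pair is left alone because its pi-hat is below the threshold. A production filter might instead break ties on genotype call rate, a choice the text does not specify.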