We estimated haplotypes for the full cohort (pre-phasing), followed by haploid imputation (ref. 23). For the pre-phasing step, we only used markers present on both the UK BiLEVE and UK Biobank Axiom arrays. We removed markers that failed quality control in more than one batch, had an overall missing rate greater than 5%, or had a MAF of less than 0.0001. We also removed samples that were identified as outliers for heterozygosity or missing rate. These filters resulted in a dataset with 670,739 autosomal markers in 487,442 samples. Phasing on the autosomes was carried out using SHAPEIT3 (ref. 24; see Methods and https://jmarchini.org/software/). The 1000 Genomes phase 3 dataset (ref. 25) was used as a reference panel, predominantly to help with the phasing of samples with non-European ancestry. In a separate experiment that leveraged phase inferred from mother–father–child trios, we estimated a median phasing switch error rate of 0.229% (see Methods).
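The marker filters applied before pre-phasing can be illustrated with a minimal sketch. It is not the pipeline used in the study: the `Marker` record, its field names, and the example marker identifiers are hypothetical, and only the three thresholds and the requirement that a marker be present on both arrays come from the text.

```python
from dataclasses import dataclass

# Thresholds taken from the text; everything else here is illustrative.
MAX_FAILED_BATCHES = 1    # remove if QC failed in more than one batch
MAX_MISSING_RATE = 0.05   # remove if overall missing rate is above 5%
MIN_MAF = 0.0001          # remove if MAF is below 0.0001


@dataclass
class Marker:
    # Hypothetical record layout, assumed for this example only.
    marker_id: str
    on_bileve_array: bool
    on_biobank_array: bool
    failed_qc_batches: int
    missing_rate: float
    maf: float


def keep_for_prephasing(m: Marker) -> bool:
    """Return True if the marker passes every pre-phasing filter."""
    if not (m.on_bileve_array and m.on_biobank_array):
        return False  # must be present on both arrays
    if m.failed_qc_batches > MAX_FAILED_BATCHES:
        return False
    if m.missing_rate > MAX_MISSING_RATE:
        return False
    if m.maf < MIN_MAF:
        return False
    return True


# Made-up markers, one per filter, plus one that passes.
markers = [
    Marker("example_1", True, True, 0, 0.010, 0.25),
    Marker("example_2", True, False, 0, 0.010, 0.25),    # on one array only
    Marker("example_3", True, True, 2, 0.010, 0.25),     # failed QC in 2 batches
    Marker("example_4", True, True, 0, 0.080, 0.25),     # missing rate above 5%
    Marker("example_5", True, True, 1, 0.050, 0.00005),  # MAF below 0.0001
]
kept = [m.marker_id for m in markers if keep_for_prephasing(m)]
print(kept)
```

A marker is dropped as soon as any single criterion is met, so the filters combine as a logical "or" on removal. Values sitting exactly on a threshold (one failed batch, 5% missing rate, MAF of 0.0001) are retained under this reading of the text.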