less than 3, or depth of coverage less than 7 were set to missing. For indels, cells with depth-normalized quality scores less than 5, or depth of coverage less than 10, were set to missing. Variant sites were then filtered: a site was removed from the dataset if every sample carrying variation had a heterozygous (0/1) genotype call and every one of those samples failed the allele balance cut-off. As with the depth and quality thresholds used for cell filtering above, the allele balance cut-off depended on whether the site was a SNV or an indel: SNVs required at least one sample to carry an alternative allele balance ≥ 15%, and indels required at least one sample to carry an alternative allele balance ≥ 20%. These filters removed 441,406 sites, leaving 8,761,478 variants in the dataset. After subsetting to 1,000 randomly selected individuals, 1,076,707 autosomal variants passed quality control. We further removed variants with a call rate <99% (that is, missing in more than 10 individuals), reducing the number of analysed autosomal variants to 1,044,517. The results of the comparison between TOPMed WGS and BioMe WES data are described in Supplementary Information
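
The cell-level, site-level and call-rate filters described above can be sketched as follows. This is a minimal illustration rather than the pipeline actually used: the `Call` record, its field names and the three function names are hypothetical, genotypes are simplified to biallelic strings, and the handling of sites with no carriers (kept by the site filter) is an assumption, because the text does not specify it. Only the numeric thresholds are taken from the text.

```python
from dataclasses import dataclass, replace
from typing import List, Optional

# Thresholds stated in the text, keyed by variant type.
CELL_MIN_QD = {"snv": 3, "indel": 5}         # depth-normalized quality score
CELL_MIN_DP = {"snv": 7, "indel": 10}        # depth of coverage
MIN_ALT_AB = {"snv": 0.15, "indel": 0.20}    # alternative allele balance


@dataclass(frozen=True)
class Call:
    """One genotype cell (hypothetical record, not a real library type)."""
    gt: Optional[str]   # "0/0", "0/1", "1/1", or None when missing
    qd: float           # depth-normalized quality score
    dp: int             # depth of coverage
    alt_ab: float       # fraction of reads supporting the alternative allele


def mask_cells(calls: List[Call], kind: str) -> List[Call]:
    """Set cells below the quality or depth threshold to missing."""
    return [
        replace(c, gt=None)
        if c.qd < CELL_MIN_QD[kind] or c.dp < CELL_MIN_DP[kind]
        else c
        for c in calls
    ]


def drop_site(calls: List[Call], kind: str) -> bool:
    """True if all carriers are heterozygous and all fail allele balance."""
    carriers = [c for c in calls if c.gt not in (None, "0/0")]
    if not carriers:
        return False  # assumption: sites without carriers are left alone here
    all_het = all(c.gt == "0/1" for c in carriers)
    all_fail = all(c.alt_ab < MIN_ALT_AB[kind] for c in carriers)
    return all_het and all_fail


def passes_call_rate(calls: List[Call], min_percent: int = 99) -> bool:
    """True if at least min_percent of samples have a non-missing genotype."""
    called = sum(c.gt is not None for c in calls)
    # Integer arithmetic avoids floating-point edge cases at exactly 99%.
    return called * 100 >= min_percent * len(calls)
```

In this sketch a site would be processed by calling `mask_cells` first, then discarding it if `drop_site` returns `True`, and finally, after subsetting samples, discarding it if `passes_call_rate` returns `False`. With 1,000 samples, a site missing in exactly 10 individuals has a call rate of 99% and is kept, whereas one missing in 11 is removed.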