individuals from each super-population in the BioMe TOPMed study, and selected markers on chromosome 20 present on the Illumina HumanOmniExpress (8v1-2_A) array. The selected genotypes were phased with Eagle 2.4.1⁸¹, using the 1000 Genomes Project phase 3 (n = 2,504), Haplotype Reference Consortium (HRC, n = 32,470) and TOPMed (n = 96,756) reference panels, excluding the 500 individuals from the TOPMed reference panel. The phased genotypes were imputed with Minimac4¹¹¹ against each reference panel, and imputation accuracy was estimated as the squared correlation coefficient (r²) between the imputed dosages and the genotype calls from the sequence data. Allele frequencies were estimated among all TOPMed individuals inferred to belong to the same super-population, and the r² values were averaged across variants in each MAF category. Variants present in 100 sequenced individuals but absent from the reference panels were assumed to have r² = 0 for the purposes of computing the average r². The minimum MAF to achieve r² > 0.3 was calculated from the average r² in each MAF category by finding the MAF at which the average r² crosses 0.3, using linear interpolation. The average number of rare variants (MAF < 0.5%) and the fraction of imputable rare variants (r²