The underrepresentation of non-European individuals in human genetic studies1 limits the applicability of the results for a large proportion of the world’s population2. Reference genome datasets3–12 are needed to characterize population-specific variation, enable efficient imputation of variants that are not directly genotyped, and extend genome-wide association studies (GWAS) to additional populations. The value of population-specific reference datasets is well recognized, and projects based in the United States and Europe have provided deep characterization of specific populations (for example, Ashkenazi Jews12 and individuals from the Netherlands3 and Iceland13). In particular, data from individuals of Nordic countries have shown how reference genome datasets can be used to drive comprehensive genetic studies across an entire population14. In Africa, populations show complex genetic patterns, smaller blocks of linkage disequilibrium and higher levels of heterozygosity, which provide unique value for genetic studies. Across the continent, early reference genome datasets for diverse populations are being built as part of H3Africa and other studies5,15. A Korean reference genome as well as Japanese and Chinese reference genome datasets have been created, and the formation of