Individuals in the MGBB were genotyped using the Illumina Multi-Ethnic Global array with hg19 coordinates. Variant-level QC filters removed variants with a call rate <98% and those that were duplicated across batches, monomorphic, not confidently mapped to a genomic location or associated with genotyping batch. Sample-level QC filters removed individuals with a call rate <98%, excessive autosomal heterozygosity (±3 s.d. from the mean) or discrepant self-reported and genetically inferred sex. PCs of ancestry were calculated in the 1000 Genomes phase 3 reference panel and subsequently projected onto the MGBB dataset, where a random forest classifier was used to assign ancestral group membership for individuals with a prediction probability >90%. The Michigan Imputation Server was then used to impute missing genotypes with the Haplotype Reference Consortium dataset serving as the reference panel. Imputed genotype dosages were converted to hard-call format and subjected to further QC, where SNPs were removed if they exhibited poor imputation quality (INFO <0.8), low MAF (<1%), deviations from HWE (P < 1 × 10⁻¹⁰) or missingness (variant call rate <98%). Only unrelated individuals (\documentclass[12pt]{minimal} \usepackage{amsmath}