We performed standard quality control (QC) on both the array-genotyped data and the subset of WGS data extracted in step 2 using PLINK [32]. Samples that failed the sex check or had >3% missing data were excluded. SNPs with a missing rate >3% or that failed the Hardy-Weinberg equilibrium (HWE) test (p < 1e-4) were also excluded. A structure analysis was conducted to match sample ancestries to the 1000 Genomes reference haplotypes, and misclassified samples were excluded. In addition, we adopted the standard TOPMed filters (https://topmed.nhlbi.nih.gov/) for variant selection. Variants carrying any of the following labels were excluded: SVM (support vector machine score more negative than −0.5, hence failing the SVM filter), CEN (falls in a centromeric region with inferred reference sequence), DISC (more than 5% Mendelian inconsistencies), EXHET (excessive heterozygosity, with HWE p-value < 1e-6), or CHRXHET (excessive heterozygosity in male chrX).
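The exclusion rules above can be summarized as a minimal Python sketch. It is an illustration of the stated thresholds only, not the pipeline actually used: the `Sample` and `Variant` records, their field names, and the helper functions are hypothetical, and the sketch assumes that per-sample and per-variant summary statistics (missing rate, HWE p-value, sex-check and ancestry results) have already been computed elsewhere, for example by PLINK. It also assumes that the TOPMed labels appear in a semicolon-separated VCF FILTER field.

```python
from dataclasses import dataclass

# Thresholds and labels taken from the text; all names below are hypothetical.
MAX_MISSING_RATE = 0.03  # samples or SNPs with >3% missing data are excluded
HWE_P_THRESHOLD = 1e-4   # SNPs with HWE p < 1e-4 are excluded
EXCLUDED_TOPMED_LABELS = frozenset({"SVM", "CEN", "DISC", "EXHET", "CHRXHET"})


@dataclass
class Sample:
    sample_id: str
    missing_rate: float       # fraction of missing genotypes, 0..1
    sex_check_ok: bool        # result of the sex check
    ancestry_ok: bool = True  # False if ancestry was misclassified


@dataclass
class Variant:
    variant_id: str
    missing_rate: float       # fraction of samples with a missing call, 0..1
    hwe_p: float              # Hardy-Weinberg equilibrium test p-value
    filters: frozenset = frozenset()  # labels from the FILTER field


def parse_filter_field(field: str) -> frozenset:
    """Split a VCF FILTER value such as 'SVM;DISC' into a set of labels."""
    if field in ("PASS", ".", ""):
        return frozenset()
    return frozenset(field.split(";"))


def keep_sample(sample: Sample) -> bool:
    """Keep a sample only if it passes sex check, ancestry and missingness."""
    return (
        sample.sex_check_ok
        and sample.ancestry_ok
        and sample.missing_rate <= MAX_MISSING_RATE
    )


def keep_variant(variant: Variant) -> bool:
    """Keep a variant only if it passes missingness, HWE and TOPMed labels."""
    return (
        variant.missing_rate <= MAX_MISSING_RATE
        and variant.hwe_p >= HWE_P_THRESHOLD
        and not (variant.filters & EXCLUDED_TOPMED_LABELS)
    )


if __name__ == "__main__":
    variants = [
        Variant("v1", 0.00, 0.50, parse_filter_field("PASS")),
        Variant("v2", 0.00, 0.50, parse_filter_field("SVM;DISC")),
        Variant("v3", 0.01, 1e-5),
    ]
    print([v.variant_id for v in variants if keep_variant(v)])  # ['v1']
```

In this sketch, values exactly at a threshold (3% missingness, p = 1e-4) are retained, following the strict inequalities in the text. In PLINK 1.9, the sample missingness, SNP missingness, and HWE thresholds would correspond to options such as `--mind 0.03`, `--geno 0.03`, and `--hwe 1e-4`, although the exact commands used in this study are not stated.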