Multiple genome-wide arrays were used to genotype the COGA sample [23,40–42] (see Supplemental Text). A subset of 47,000 SNPs genotyped across all arrays was used to assess duplicate samples, confirm the reported pedigree structure, and compute ancestral principal components (see Supplemental Text for details). These SNPs were common (minor allele frequency (MAF) > 0.1 in the combined sample), independent (defined as R² < 0.5), and of high quality (missing rate < 2% and Hardy-Weinberg Equilibrium (HWE) p-values > 0.001). After the individuals in a family were assigned to a specific population, family-wise ancestry was designated according to the majority of individual family members (see Lai et al., accompanying paper). Only AA and EA families were included in subsequent analyses because of the low numbers in other groups. Only variants that were not strand-ambiguous (i.e., not A/T or C/G), with missing rates < 5%, MAF > 3%, and HWE p-values > 0.0001, were used for imputation. Genotypes were imputed to the 1000 Genomes cosmopolitan reference panel (Phase 3, version 5, NCBI GRCh37; Supplemental Text) using SHAPEIT2 [43] and Minimac3 [44]. Imputed SNPs with R² < 0.30 were excluded, and genotype