We conducted principal components analysis (PCA) with the program Eigensoft (Price et al., 2006; Patterson et al., 2006) to address population structure in the Yale-UPenn and SAGE samples. SNPs were pruned for linkage disequilibrium (LD) using an r² threshold of 0.8. The first principal component (PC) distinguished AAs from EAs; we therefore analyzed AAs and EAs separately to minimize the effects of population admixture. Within each population, the top three PCs were then used in all analyses to correct for residual population stratification.
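The procedure can be illustrated with a minimal Python sketch. It is not Eigensoft's implementation: the function names `ld_prune` and `top_pcs` are hypothetical, the pruning is a simple greedy pairwise pass rather than a windowed scheme, SNPs are standardized by their sample standard deviation, and the genotypes are simulated for two hypothetical populations purely for illustration.

```python
import numpy as np

def ld_prune(genotypes, r2_threshold=0.8):
    """Greedy pruning (an assumption, not Eigensoft's algorithm): keep a SNP
    only if its r^2 with every previously kept SNP is <= r2_threshold."""
    g = np.asarray(genotypes, dtype=float)
    poly = np.flatnonzero(g.std(axis=0) > 0)      # drop monomorphic SNPs
    r2 = np.atleast_2d(np.corrcoef(g[:, poly], rowvar=False)) ** 2
    kept = []                                     # positions within `poly`
    for j in range(len(poly)):
        if all(r2[j, k] <= r2_threshold for k in kept):
            kept.append(j)
    return poly[np.array(kept, dtype=int)]        # original column indices

def top_pcs(genotypes, n_pcs=3):
    """PC scores from the SVD of the column-standardized genotype matrix.
    Assumes every column passed in is polymorphic."""
    g = np.asarray(genotypes, dtype=float)
    g = (g - g.mean(axis=0)) / g.std(axis=0)
    u, s, _ = np.linalg.svd(g, full_matrices=False)
    return u[:, :n_pcs] * s[:n_pcs]

# Simulated data: 50 individuals from each of two hypothetical populations,
# 200 SNPs coded as 0/1/2 minor-allele counts.
rng = np.random.default_rng(0)
freq_a = rng.uniform(0.1, 0.9, size=200)
freq_b = rng.uniform(0.1, 0.9, size=200)
genotypes = np.vstack([
    rng.binomial(2, freq_a, size=(50, 200)),
    rng.binomial(2, freq_b, size=(50, 200)),
])
# Append an exact copy of SNP 0 (r^2 = 1 with SNP 0) so pruning has work to do.
genotypes = np.hstack([genotypes, genotypes[:, [0]]])

kept = ld_prune(genotypes, r2_threshold=0.8)
pcs = top_pcs(genotypes[:, kept], n_pcs=3)        # one row per individual
```

In this sketch the first column of `pcs` plays the role of the PC that separates the two groups. For the within-population step, the same two functions would be rerun on each group's rows alone and the resulting three columns supplied as covariates to the association model.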