SAGE samples were genotyped on the Illumina Human1M platform at CIDR. We included 4036 unrelated subjects who self-reported as AA (1297) or EA (2740); 60 duplicate genotype samples were excluded from analysis. We used PLINK to perform basic data cleaning before analysis (41). After QC (sample call rate ≥ 97%; SNP call rate ≥ 95%; MAF ≥ 0.005 in controls; HWE P ≥ 0.00001 in controls), 1297 AA and 2740 EA unrelated subjects, with 953,258 and 888,092 autosomal SNPs respectively, were available for further analysis. To obtain a more genetically homogeneous sample and to correct for population stratification in the association analysis, we computed principal components (PCs) using the EIGENSOFT package (42). Specifically, 172,891 pruned SNPs that were common to the AA and EA samples and in low linkage disequilibrium (LD) with one another (genotypic correlation < 0.5) were fed into EIGENSOFT. The top two PCs of the AA and EA samples, together with the Phase II HapMap CEU and YRI samples, are shown in Supplementary Figure 1. Outliers were defined as subjects whose ancestry was at least 3 standard deviations from the