The Yale-UPenn samples were genotyped on the Illumina HumanOmni1-Quad v1.0 microarray, which contains 988,306 autosomal SNPs, at the Center for Inherited Disease Research (CIDR) and the Yale Center for Genome Analysis. Genotypes were called using Illumina GenomeStudio software V2011.1 and genotyping module V1.8.4 (Illumina, San Diego, CA, USA). The SAGE samples were genotyped on the Illumina Human 1M array, which contains 1,069,796 total SNPs. We used the following criteria for genotyping quality control (QC) filtering: 1) call rate > 98%; 2) minor allele frequency (MAF) > 1%; and 3) p value for Hardy-Weinberg equilibrium (HWE) > 5 × 10⁻⁶. In the Yale-UPenn sample, 44,644 SNPs on the microarray and 135 individuals with call rates < 98% were excluded during data cleaning; an additional 62,076 SNPs were removed because of MAF < 1%. After data cleaning and quality control, 5,543 individuals and 889,659 SNPs remained for imputation. After the same QC procedures were applied to the SAGE sample, 39 subjects with call rates < 98% were excluded. Thus, in the SAGE sample, 4,012 individuals and 726,191 SNPs remained for analysis.
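The three SNP-level thresholds can be illustrated with a minimal sketch. It is not the pipeline used in this study: the function name `snp_qc`, the 0/1/2 genotype coding, and the choice of a 1-degree-of-freedom chi-square test for HWE are assumptions made for illustration only (the text does not state which HWE test was applied, and exact tests are often used in practice). The individual-level call-rate filter would be applied analogously across each subject's genotypes.

```python
import math

# Thresholds taken from the QC criteria stated in the text.
CALL_RATE_MIN = 0.98
MAF_MIN = 0.01
HWE_P_MIN = 5e-6


def snp_qc(genotypes):
    """Evaluate one SNP against the three QC criteria.

    genotypes: list with one entry per individual, coded 0, 1, or 2
    (copies of one allele), or None for a missing call. This coding
    is a hypothetical choice for the sketch.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    call_rate = n / len(genotypes) if genotypes else 0.0
    if n == 0:
        return {"call_rate": call_rate, "maf": 0.0, "hwe_p": 1.0, "passes": False}

    n_aa, n_ab, n_bb = called.count(0), called.count(1), called.count(2)
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the first allele
    q = 1.0 - p
    maf = min(p, q)

    if maf == 0.0:
        hwe_p = 1.0  # monomorphic SNP: HWE test is uninformative
    else:
        # Assumed test: Pearson chi-square with 1 degree of freedom.
        expected = (n * p * p, 2 * n * p * q, n * q * q)
        observed = (n_aa, n_ab, n_bb)
        chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
        # Upper-tail probability of chi-square with 1 df.
        hwe_p = math.erfc(math.sqrt(chi2 / 2.0))

    passes = call_rate > CALL_RATE_MIN and maf > MAF_MIN and hwe_p > HWE_P_MIN
    return {"call_rate": call_rate, "maf": maf, "hwe_p": hwe_p, "passes": passes}
```

For example, a SNP with genotype counts 25/50/25 in 100 fully called individuals satisfies all three criteria, whereas one with counts 50/0/50 has the same MAF but fails the HWE criterion because no heterozygotes are observed.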