If a SNP is located in a TFBS of a gene, it may affect the level or timing of that gene's expression. We identified such SNPs according to the procedure shown in Supplementary Figure 1. For each SNP within 5 kb upstream or 1 kb downstream of a transcription start site (TSS), we first extracted 29 bp of DNA sequence on either side of the SNP and then used the MATCH method (10) to predict possible TFBSs in the resulting 59 bp sequence for each alternative allele. A SNP was classified as affecting TFBS activity if MATCH predicted a TFBS with one allele but not with the other, and the difference between the two alleles in the matrix similarity score (MSS) or core similarity score (CSS) was ≥0.2. Both MSS and CSS range from 0 to 1 (10). We performed predictions using all 187 position weight matrices classified as high-quality, non-redundant vertebrate (mouse, rat and human) matrices in TRANSFAC Release 12.1 (11). We used the default set of MATCH score thresholds provided by TRANSFAC to allow for