Gene expression is tightly controlled by transcription factors (TFs) that are recruited to DNA cis-regulatory modules (CRMs). Many TFs have well-documented sequence preferences for their binding sites (transcription factor binding sites, TFBSs) [1]. However, in contrast to the startling simplicity of the amino acid code, the 'regulatory code' at CRMs has a more ambiguous relationship between sequence and function. Chromatin immunoprecipitation (ChIP) coupled with genome-wide analyses has made it possible to map TF binding positions globally in vivo, and in some cases these binding maps serve as good predictors of CRM transcriptional outputs [2-4]. At the same time, these analyses often cannot explain the exact rules underlying TF binding to a given sequence, and functional prediction based on sequence alone has had limited success, particularly in mammalian systems [5].