We now turn to the human (ENCODE [23]) and Drosophila (combined from modENCODE and other studies [2,24-30]) datasets, selecting for analysis those TFs for which position-wise conservation across species generally correlated with PWM information content. This initial filtering ensured that the PWMs included in the analysis reflected the global sequence constraints of these TFs' binding sites and could therefore be used to compare such constraints across TFBS instances, as presented below. Additional filtering criteria were applied to ensure sufficient statistical power (in particular with respect to the total number of sites showing variation) and specificity of the analysis, resulting in a final dataset of 15 Drosophila and 36 human motifs (see Materials and methods and the Supplementary note on TF selection in Additional file 1 for details). As before, we used DGRP data [22] to assess individual variation at Drosophila TFBSs, while for human TFBSs we used Central European (CEU) genotypes sequenced as part of the 1000 Genomes Pilot Project [21] (using a Yoruban population instead of CEU yielded consistent results; not shown). Similar to our findings for the