Analysis of variation at transcription factor binding sites in Drosophila and humans.
- Authors: Spivakov, Mikhail; Akhtar, Junaid; Kheradpour, Pouya; Beal, Kathryn; Girardot, Charles; Koscielny, Gautier; Herrero, Javier; Kellis, Manolis; Furlong, Eileen E M; Birney, Ewan
- Year: 2012
- Journal: Genome Biology
- PMID: 22950968
- DOI: 10.1186/gb-2012-13-9-r49
- PMCID: PMC3491393
BACKGROUND: Advances in sequencing technology have boosted population genomics and made it possible to map the positions of transcription factor binding sites (TFBSs) with high precision. Here we investigate TFBS variability by combining transcription factor binding maps generated by ENCODE, modENCODE, our previously published data and other sources with genomic variation data for human individuals and Drosophila isogenic lines.

RESULTS: We introduce a metric of TFBS variability that takes into account changes in motif match associated with mutation and makes it possible to investigate TFBS functional constraints instance-by-instance as well as in sets that share common biological properties. We also take advantage of the emerging per-individual transcription factor binding data to show evidence that TFBS mutations, particularly at evolutionarily conserved sites, can be efficiently buffered to ensure coherent levels of transcription factor binding.

CONCLUSIONS: Our analyses provide insights into the relationship between individual and interspecies variation and show evidence for the functional buffering of TFBS mutations in both humans and flies. In a broad perspective, these results demonstrate the potential of combining functional genomics and population genetics approaches for understanding gene regulation.
Position-wise variation properties of three well-characterized developmental TFs from Drosophila melanogaster. (a) Interspecies diversity at bound motif positions and motif flanks. Diversity is expressed as 1 - phastCons score [64] per position across 15 insect species, normalized to the same score for the scrambled versions of the same motifs detected within the respective TF-bound regions. TF 'binding logo' representations of motif PWMs are shown below each plot. (b) Within-species diversity at bound motif positions and motif flanks, expressed as genetic diversity (D) [78] per position across 162 isogenic lines of D. melanogaster from the DGRP, normalized to the same metric for the scrambled versions of the motifs detected within the respective TF-bound regions. Asterisks indicate positions showing significantly reduced variation compared to the scrambled motifs (relative diversity <1; permutation test P < 5e-3). TF 'binding logo' representations of motif PWMs are shown below each plot. The non-normalized versions of the same plots, including both TF-bound and all instances of these motifs and their scrambled versions, are shown in Figure S1 in Additional file 1. (c) Within-species diversity per motif position across three score ranges, colored grey to red in increasing order: weak (Twi and Tin, 3 to 5; Bin, 5 to 8), medium (Twi and Tin, 5 to 7; Bin, 8 to 10) and strong (Twi and Tin, >7; Bin, >10). (d) Inverse correlation between individual variation at motif positions (x-axis) and positional information content according to the motifs' PWMs (y-axis). Variation is expressed in the same terms as in (b). Numbers beside the dots indicate motif positions; r is Pearson's correlation coefficient for each TF. The same plots for cross-species variation are shown in Figure S2 in Additional file 1.
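The relationship in panel (d) can be illustrated with a minimal sketch. It assumes that per-position genetic diversity is computed as one minus the sum of squared allele frequencies and that information content is measured in bits against a uniform base background; the record does not state either formula, and the toy alleles and PWM below are hypothetical rather than data from the paper.

```python
import math
from collections import Counter


def gene_diversity(alleles):
    """Diversity at one position: 1 - sum of squared allele frequencies (assumed form)."""
    counts = Counter(alleles)
    n = sum(counts.values())
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def information_content(column):
    """Information content in bits of one PWM column of base probabilities,
    assuming a uniform background (maximum of 2 bits for DNA)."""
    return 2.0 + sum(p * math.log2(p) for p in column if p > 0)


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical three-position motif: each string holds the alleles observed
# across ten lines at one motif position.
toy_alleles = ["AAAAAAAAAA", "ACGTACGTAC", "AAAAAAAACC"]
# Hypothetical PWM columns (probabilities of A, C, G, T) for the same positions.
toy_pwm = [
    [0.97, 0.01, 0.01, 0.01],
    [0.25, 0.25, 0.25, 0.25],
    [0.70, 0.10, 0.10, 0.10],
]

diversity = [gene_diversity(col) for col in toy_alleles]
info = [information_content(col) for col in toy_pwm]
r = pearson_r(diversity, info)
print(f"Pearson r between diversity and information content: {r:.2f}")
```

In this toy case the most informative position is invariant and the uninformative position is the most variable, so r is negative, which is the direction of the trend the caption describes.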
Individual variation of the binding sites for 15 Drosophila and 36 human TFs selected for this study. (a) Distributions of position-wise diversity at motif positions (red), scrambled motifs and motif flanks at the TF-bound regions of Drosophila (left panel) and human (right) TFs; P-values are from Kruskal-Wallis non-parametric significance tests. (b) Violin plots (a combination of boxplots and two mirror-image kernel density plots) showing the correlation between individual variation and information content per motif position for the bound instances of Drosophila (left) and human (right) TFs included in this study (top, red) and their scrambled versions detected within the same bound regions (bottom, grey); P-values are from Wilcoxon two-sample non-parametric significance tests.
Motif mutational load of Drosophila and human TFBSs located within different genomic contexts. (a) Examples of mutational load values for individual instances of four human TFs (ranging from high to very low), showing different combinations of the two parameters that enter this metric: the reduction of PWM match scores at the minor allele ('ΔPWM score') and the number of genotypes carrying the mutation in the population (minor allele frequency (MAF)). (b) Relationship between phylogenetic conservation and motif mutational load for D. melanogaster (left) and human (right) TFs included in this study. Conservation is expressed as per-instance branch length scores (BLSs) for each instance computed against the phylogenetic tree of 12 Drosophila species. The average load for D. melanogaster-specific sites (BLS = 0) is shown separately as these have an exceptionally high motif load. (c) Relationship between motif stringency and motif load in Drosophila (left) and humans (right). Motif stringency is expressed as scaled ranked PWM scores grouped into five incremental ranges of equal size (left to right), with average motif load shown for each range. (d) Relationship between distance from transcription start site (TSS) and motif load in Drosophila (left) and humans (right) for all analyzed TFs excluding CTCF (top) and for CTCF alone (bottom), with average motif load shown for each distance range. (b-d) Average motif load is computed excluding a single maximum value to reduce the impact of outliers. The P-values are from permutation tests, in which permutations are performed separately for each TF and combined into a single statistic as described in Materials and methods.
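A minimal sketch of how the two ingredients named in panel (a) might be combined, and of the outlier handling used in panels (b-d). The exact formula is not given in this record, so the frequency-weighted score loss relative to the major-allele score used here is an assumption, and the function names are hypothetical.

```python
def motif_load(major_score, minor_variants):
    """Assumed load for one motif instance.

    major_score: PWM match score of the major allele (must be positive).
    minor_variants: iterable of (minor_allele_score, minor_allele_frequency).
    Only variants that lower the PWM score contribute to the load.
    """
    if major_score <= 0:
        raise ValueError("major_score must be positive")
    weighted_loss = sum(
        maf * max(0.0, major_score - minor_score)
        for minor_score, maf in minor_variants
    )
    return weighted_loss / major_score


def mean_excluding_max(values):
    """Average after dropping a single maximum value, as described for panels (b-d)."""
    ordered = sorted(values)
    if len(ordered) < 2:
        raise ValueError("need at least two values")
    trimmed = ordered[:-1]
    return sum(trimmed) / len(trimmed)


# Hypothetical instance: major allele scores 10.0; a minor allele at 10%
# frequency lowers the score to 8.0.
example_load = motif_load(10.0, [(8.0, 0.1)])
print(f"example load: {example_load:.3f}")
print(f"trimmed mean: {mean_excluding_max([0.0, 0.01, 0.5]):.3f}")
```

Under this assumed form, a rare mutation with a large score loss and a common mutation with a small score loss can give similar load values, which is the kind of trade-off the examples in panel (a) illustrate.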
Evidence for the 'buffering' of deleterious TFBS variation by neighboring homotypic motifs in Drosophila. (a) Distributions of average motif load per 100 kb window along Drosophila chromosome 2R and chromosome X (yellow; see Figure S5 in Additional file 1 for other chromosomes). Recombination rate distributions along the chromosomes (dashed lines) are from [22] (and are near-identical to an earlier analysis [43]); note that there is no apparent correlation between these two parameters. Regions of high average motif load marked with asterisks are further examined in (b). Average motif load is computed excluding a single maximum value to reduce the impact of outliers. (b) Examples of motif arrangement at regions that fall within 100 kb windows having high average motif load (L >5e-3). Motifs with no detected deleterious variation (L = 0) are colored grey, and those with non-zero load pink (low load) to red (high load). Asterisks refer to similarly labeled peaks from (a). Note that most high-load motifs found in these regions have additional motifs for the same TF in their proximity. (c) Distributions of average load across ranges of phylogenetic conservation for motifs with a single match within a bound region ('singletons', blue) versus those found in pairs ('duplets', red). For equivalent comparison, a random motif out of the duplet was chosen for each bound region and the process was repeated 100 times. Results are shown for the four TFs for which appreciable differences between 'singletons' and 'duplets' were detected. Phylogenetic conservation is expressed in terms of branch length score (BLS) ranges, similarly to Figure 2b. The P-value is from a permutation test for the sum of average load differences for each range between 'singleton' and 'duplet' motifs. Average load was computed excluding a single maximum value. (d) Relationship between the average load per TF and the average number of motifs per bound region. Average load was computed excluding a single maximum value; r is Pearson's correlation coefficient and the P-value is from the correlation test. (e) The difference in motif score between motif pairs mapping to the same bound regions: the one with the highest load versus one with a zero load ('constant'; left) or in random pairs (right). These results suggest that the major alleles of motifs with a high load are generally not 'weaker' than their non-varying neighbors (the P-value is from the Wilcoxon test).
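The resampling step described for panel (c) can be sketched as follows. This is only an illustration of drawing one motif at random from each duplet and repeating the draw 100 times; the function name, the seed and the toy load values are hypothetical, and the exclusion of the maximum value and the permutation test are left out for brevity.

```python
import random


def duplet_average_load(duplets, n_repeats=100, seed=0):
    """Average load over duplets, picking one random motif per bound region.

    duplets: list of (load_a, load_b) pairs, one pair per bound region.
    The random draw is repeated n_repeats times and the per-repeat means
    are averaged, so each region contributes one motif per repeat.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    repeat_means = []
    for _ in range(n_repeats):
        picked = [rng.choice(pair) for pair in duplets]
        repeat_means.append(sum(picked) / len(picked))
    return sum(repeat_means) / len(repeat_means)


# Hypothetical loads for singleton motifs and for motif pairs.
singleton_loads = [0.0, 0.001, 0.002]
duplet_loads = [(0.0, 0.02), (0.01, 0.01), (0.0, 0.004)]

singleton_mean = sum(singleton_loads) / len(singleton_loads)
duplet_mean = duplet_average_load(duplet_loads)
print(f"singletons: {singleton_mean:.4f}, duplets: {duplet_mean:.4f}")
```

Sampling one motif per region keeps the number of observations per bound region equal between the two groups, which appears to be the purpose of the 'equivalent comparison' mentioned in the caption.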
Evidence for the 'buffering' of variation at conserved CTCF binding sites. (a) Proportion of homozygous polymorphic CTCF binding sites with 'buffered' levels of ChIP signal depending on the sites' evolutionary conservation (less conserved, BLS <0.5; more conserved, BLS ≥0.5). Sites at which the minor variant retained at least two-thirds of the major variant's signal were considered as 'buffered'. The P-value is from the Fisher test. Major and minor variants were defined on the basis of the global allele frequency data from [75,76]. (b) Differences in the CTCF binding signal (Δ ChIP signal) at homozygous polymorphic sites that show either 'low' (left) or 'high' (right) disparity in absolute motif match scores (Δ motif score) between the variants (<1 or >1, respectively). The ChIP signals are sign-adjusted relative to the direction of PWM score change. Site-specific signals from multiple individuals with the same genotype, where available, were summarized by mean. The P-value is from the Wilcoxon test. (c) Genotype-specific differences in the CTCF ChIP signal across individuals between homozygous polymorphic sites with appreciable differences in absolute PWM match scores (Δ motif score >1) at less conserved (BLS <0.5, left) and more conserved (BLS >0.5, right) CTCF motifs. The ChIP signals are sign-adjusted relative to the direction of PWM score change. Site-specific signals from multiple individuals with the same variant, where available, were summarized by mean. The P-value is from the Wilcoxon test. (d) An interaction linear model showing that interspecies motif conservation (expressed by branch length scores) reduces the effect of motif mutations on CTCF binding. Shown are the effect plots predicting the relationship between the change of PWM score (at the minor versus the major variant) and the change of the associated ChIP signal at three hypothetical levels of evolutionary conservation: BLS = 0 (low; left); BLS = 0.5 (medium; middle); and BLS = 1 (high; right). Major and minor variants were defined on the basis of the global allele frequency data from [75,76]. (e) An interaction linear model showing that interspecies motif conservation (BLS) reduces the effect of motif stringency on the binding signal. Shown are the effect plots predicting the relationship between motif scores and ranked ChIP signal at three hypothetical conservation levels: BLS = 0 (low; left); BLS = 0.5 (medium; middle); and BLS = 1 (high; right). (f) A schematic illustrating the observed effect of binding site mutations on CTCF binding signal at two polymorphic CTCF sites - one poorly conserved (BLS = 0.03, left) and one highly conserved (BLS = 0.84, right) - that have similar motif match scores (14.9 and 14.2, respectively). Sequences of higher- (top) and lower-scoring alleles (bottom) are shown on the figure. Mutations resulting in a similar loss of score (down to 12.5 and 11.8, respectively) resulted in a 53% loss of CTCF binding signal at the non-conserved site (left, compare the amplitudes of top (blue) to bottom (red) curves), in contrast to a mere 6% at the conserved site (right).
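The 'buffered' criterion from panel (a) can be applied to the two sites in panel (f) as a short worked example. The two-thirds threshold, the motif scores and the signal losses come from the caption; the function names are hypothetical, and normalizing the major-variant signal to 1.0 is a simplification for illustration.

```python
BUFFER_THRESHOLD = 2.0 / 3.0  # minor variant keeps at least two-thirds of the signal


def retained_fraction(major_signal, minor_signal):
    """Fraction of the major variant's ChIP signal retained by the minor variant."""
    if major_signal <= 0:
        raise ValueError("major_signal must be positive")
    return minor_signal / major_signal


def is_buffered(major_signal, minor_signal, threshold=BUFFER_THRESHOLD):
    """True if the minor variant retains at least `threshold` of the major signal."""
    return retained_fraction(major_signal, minor_signal) >= threshold


# The two sites from panel (f), with the major-variant signal normalized to 1.0.
sites = {
    "poorly conserved (BLS = 0.03)": {"major": 14.9, "minor": 12.5, "loss": 0.53},
    "highly conserved (BLS = 0.84)": {"major": 14.2, "minor": 11.8, "loss": 0.06},
}

for name, site in sites.items():
    score_drop = site["major"] - site["minor"]
    buffered = is_buffered(1.0, 1.0 - site["loss"])
    print(f"{name}: score drop {score_drop:.1f}, buffered = {buffered}")
```

Both sites lose about 2.4 score units, yet only the highly conserved site stays above the two-thirds threshold, which is the contrast the schematic is meant to convey.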
| Name | Type |
|---|---|
| 1000 Genomes Pilot Project local | cohort |
| BEINF0 local | drug |
| Bin local | gene |
| Bin local | variant |
| BIN local | gene |
| Biniou local | gene |
| branch length score local | drug |
| Bric-à-brac local | gene |
| CEU | cohort |
| Conserved NFκB site local | variant |
| Conserved TFBS local | variant |
| CTCF | gene |
| DGRP local | cohort |
| DGRP SNPs local | variant |
| D. melanogaster local | cohort |
| Drosophila Genetic Reference Panel local | cohort |
| Drosophila melanogaster | cohort |
| Drosophila Population Genomics Project local | cohort |
| embryo patterning local | phenotype |
| Encyclopedia of DNA Elements local | cohort |
| fly local | cohort |
| gam local | drug |
| gamlss local | drug |
| glucocorticoid receptor | gene |
| Gm12872 local | cohort |
| Gm12873 local | cohort |
| Gm12874 local | cohort |
| Gm12878 | cohort |
| Gm12891 local | cohort |
| Gm12892 local | cohort |
| Gm19238 local | cohort |
| Gm19239 local | cohort |
| Gm19240 local | cohort |
| human | cohort |
| human 1000 Genomes project local | cohort |
| humans | cohort |
| lymphoblastoid lines local | cohort |
| modENCODE local | cohort |
| mutational load local | phenotype |
| neurodevelopmental abnormalities local | phenotype |
| NFκB | gene |
| Pax6 | gene |
| SNP | cohort |
| Ste12 local | gene |
| TFBS load local | phenotype |
| TFBS mutation local | variant |
| TFBS variation local | phenotype |
| Tin local | gene |
| Tin local | variant |
| TIN local | gene |
| Tinman local | gene |
| Twi local | variant |
| Twist | gene |
| Yoruba | cohort |
| ZAGA local | drug |
| Citation | PMID | DOI | Status |
|---|---|---|---|
| BaroloSShadow enhancers: Frequently asked questions about distributed cis-regulatory information and enhancer redundancy.BioEssays20123413514110.1002/bies.20110012122083793PMC3517143 | — | — | — |
| BolouriHDavidsonEHTranscriptional regulatory cascades in development: initial rates, not steady state, determine network kinetics.Proc Natl Acad Sci USA20031009371937610.1073/pnas.153329310012883007PMC170925 | — | — | — |
| BradleyRKLiX-YTrapnellCDavidsonSPachterLChuHCTonkinLABigginMDEisenMBBinding site turnover produces pervasive quantitative changes in transcription factor binding between closely related Drosophila species.PLoS Biol20108e100034310.1371/journal.pbio.100034320351773PMC2843597 | — | — | — |
| BulykMLProtein binding microarrays for the characterization of DNA-protein interactions.Adv Biochem Eng Biotechnol200710465851729081910.1007/10_025PMC2727742 | — | — | — |
| BusheyAMRamosECorcesVGThree subclasses of a Drosophila insulator show distinct and cell type-specific genomic distributions.Genes Dev2009231338501010.1101/gad.179820919443682PMC2701583 | — | — | — |
| CharlesworthBFundamental concepts in genetics: effective population size and patterns of molecular evolution and variation.Nat Rev Genet2009101952051920471710.1038/nrg2526 | — | — | — |
| ChenKvan NimwegenERajewskyNSiegalMLCorrelating gene expression variation with cis-regulatory polymorphism in Saccharomyces cerevisiae.Genome Biol Evol201026977072082928110.1093/gbe/evq054PMC2953268 | — | — | — |
| CostanzoMBaryshnikovaAMyersCLAndrewsBBooneCCharting the genetic interaction map of a cell.Curr Opin Biotechnol201122667410.1016/j.copbio.2010.11.00121111604 | — | — | — |
| CoxDHinkleyDTheoretical Statistics1974London: Chapman & Hall188 | — | — | — |
| CrockerJPotterNErivesADynamic evolution of precise regulatory encodings creates the clustered site signature of enhancers.Nat Commun201019910.1038/ncomms110220981027PMC2963808 | — | — | — |
| CrockerJTamoriYErivesAEvolution acts on enhancer organization to fine-tune gradient threshold readouts.PLoS Biol20086e26310.1371/journal.pbio.006026318986212PMC2577699 | — | — | — |
| DeweyFEChenRCorderoSPOrmondKECaleshuCKarczewskiKJWhirl-CarrilloMWheelerMTDudleyJTByrnesJKCornejoOEKnowlesJWWoonMSangkuhlKGongLThornCFHebertJMCapriottiEDavidSPPavlovicAWestAThakuriaJVBallMPZaranekAWRehmHLChurchGMWestJSBustamanteCDSnyderMAltmanRBPhased whole-genome genetic risk in a family quartet using a major allele reference sequencePLoS Genet201179e10022801010.1371/journal.pgen.1002280PMC317420121935354 | — | — | — |
| Drosophila Population Genomics Project.http://www.dpgp.org | — | — | — |
| ENCODE Motif Browser.http://www.broadinstitute.org/~pouyak/motif-disc/human | — | — | — |
| Ensembl Genome Browser.http://www.ensembl.org/index.html | — | — | — |
| Fiston-LavierA-SSinghNDLipatovMPetrovDADrosophila melanogaster recombination rate calculator.Gene2010463182010.1016/j.gene.2010.04.01520452408 | — | — | — |
| Flybase.http://www.flybase.org | — | — | — |
| FoxJEffect displays in R for generalised linear models.J Stat Software20038127 | — | — | — |
| GarfieldDHaygoodRNielsenWWrayGPopulation genetics of cis-regulatory sequences that operate during embryonic development in the sea urchin Strongylocentrotus purpuratus.Evol Dev20121415216710.1111/j.1525-142X.2012.00532.x23017024 | — | — | — |
| GibsonGEpistasis and pleiotropy as natural properties of transcriptional regulation.Theor Popul Biol199649588910.1006/tpbi.1996.00038813014 | — | — | — |
| GodtDCoudercJLCramtonSELaskiFAPattern formation in the limbs of Drosophila: bric à brac is expressed in both a gradient and a wave-like pattern and is required for specification and proper segmentation of the tarsus.Development1993119799812791055110.1242/dev.119.3.799 | — | — | — |
| GoteaVViselAWestlundJMNobregaMAPennacchioLAOvcharenkoIHomotypic clusters of transcription factor binding sites are a key component of human promoters and enhancers.Genome Res20102056557710.1101/gr.104471.10920363979PMC2860159 | — | — | — |
| HaldaneJBSThe cost of natural selection.J Genet19575551152410.1007/BF02984069 | — | — | — |
| HallikasOPalinKSinjushinaNRautiainenRPartanenJUkkonenETaipaleJGenome-wide prediction of mammalian enhancers based on analysis of transcription-factor binding affinity.Cell2006124475910.1016/j.cell.2005.10.04216413481 | — | — | — |
| HalpernALBrunoWJEvolutionary distances for protein-coding sequences: modeling site-specific residue frequencies.Mol Biol Evol19981591091710.1093/oxfordjournals.molbev.a0259959656490 | — | — | — |
| HareEEPetersonBKIyerVNMeierREisenMBSepsid even-skipped enhancers are functionally conserved in Drosophila despite lack of sequence conservation.PLoS Genet20084e100010610.1371/journal.pgen.100010618584029PMC2430619 | — | — | — |
| HartmanJLIvJLHHartwellLPrinciples for the buffering of genetic variation.Science20012911001100410.1126/science.291.5506.100111232561 | — | — | — |
| HeBZHollowayAKMaerklSJKreitmanMDoes positive selection drive transcription factor binding site turnover? A test with Drosophila cis-regulatory modules.PLoS Genet20117e100205310.1371/journal.pgen.100205321572512PMC3084208 | — | — | — |
| HertzGZStormoGDIdentifying DNA and protein patterns with statistically significant alignments of multiple sequences.Bioinformatics19991556357710.1093/bioinformatics/15.7.56310487864 | — | — | — |
| Human Synthetic Major Allele Data from Dewey.http://datadryad.org/handle/10255/dryad.34659 | — | — | — |
| HustonMBiological Diversity: The Coexistence of Species on Changing Landscapes1994Cambridge, UK: Cambridge University Press | — | — | — |
| JakobsenJSBraunMAstorgaJGustafsonEHSandmannTKarzynskiMCarlssonPFurlongEEMTemporal ChIP-on-chip reveals Biniou as a universal regulator of the visceral muscle transcriptional network.Genes Dev2007212448246010.1101/gad.43760717908931PMC1993875 | — | — | — |
| JunionGSpivakovMGirardotCBraunMGustafsonEBirneyEFurlongEA transcription factor collective defines cardiac cell fate and reflects lineage history.Cell201214847348610.1016/j.cell.2012.01.03022304916 | — | — | — |
| KasowskiMGrubertFHeffelfingerCHariharanMAsabereAWaszakSMHabeggerLRozowskyJShiMUrbanAEHongM-YKarczewskiKJHuberWWeissmanSMGersteinMBKorbelJOSnyderMVariation in transcription factor binding among humans.Science201032823223510.1126/science.118362120299548PMC2938768 | — | — | — |
| KheradpourPStarkARoySKellisMReliable prediction of regulator targets using 12 Drosophila genomes.Genome Res2007171919193110.1101/gr.709040717989251PMC2099599 | — | — | — |
| KimJHeXSinhaSEvolution of regulatory sequences in 12 Drosophila species.PLoS Genet20095e100033010.1371/journal.pgen.100033019132088PMC2607023 | — | — | — |
| KophengnavongTMichnowiczJEBlackwellTKEstablishment of distinct MyoD, E2A, and twist DNA binding specificities by different basic region-DNA conformations.Mol Cell Biol20002026127210.1128/MCB.20.1.261-272.200010594029PMC85082 | — | — | — |
| LiX-YMacArthurSBourgonRNixDPollardDAIyerVNHechmerASimirenkoLStapletonMHendriksCLLChuHCOgawaNInwoodWSementchenkoVBeatonAWeiszmannRCelnikerSEKnowlesDWGingerasTSpeedTPEisenMBBigginMDTranscription factors bind thousands of active and inactive regions in the Drosophila blastoderm.PLoS Biol200862410.1371/journal.pbio.0060024PMC223590218271625 | — | — | — |
| LudwigMZBergmanCPatelNHKreitmanMEvidence for stabilizing selection in a eukaryotic enhancer element.Nature200040356454710.1038/3500061510676967 | — | — | — |
| LuskRWEisenMBEvolutionary mirages: selection on binding site composition creates the illusion of conserved grammars in Drosophila enhancers.PLoS Genet20106e100082910.1371/journal.pgen.100082920107516PMC2809757 | — | — | — |
| MacArthurSLiX-YLiJBrownJBChuHCZengLGrondonaBPHechmerASimirenkoLKernenSVEKnowlesDWStapletonMBickelPBigginMDEisenMBDevelopmental roles of 21 Drosophila transcription factors are determined by quantitative differences in binding to an overlapping set of thousands of genomic regions.Genome Biol200910R8010.1186/gb-2009-10-7-r8019627575PMC2728534 | — | — | — |
| MackayTFCRichardsSStoneEABarbadillaAAyrolesJFZhuDCasillasSHanYMagwireMMCridlandJMRichardsonMFAnholtRRHBarranMBessCBlankenburgKPCarboneMACastellanoDChaboubLDuncanLHarrisZJavaidMJayaseelanJCJhangianiSNJordanKWLaraFLawrenceFLeeSLLibradoPLinheiroRSLymanRFThe Drosophila melanogaster Genetic Reference Panel.Nature201248217317810.1038/nature1081122318601PMC3683990 | — | — | — |
| MahonySBenosPVSTAMP: a web tool for exploring DNA-binding motif similarities.Nucleic Acids Res200735W253W25810.1093/nar/gkm27217478497PMC1933206 | — | — | — |
| MajewskiJPastinenTThe study of eQTL variations by RNA-seq: from SNPs to phenotypes.Trends Genet201127727910.1016/j.tig.2010.10.00621122937 | — | — | — |
| ManolioTGenomewide association studies and assessment of the risk of disease.N Engl J Med201036316617610.1056/NEJMra090598020647212 | — | — | — |
| MauranoMWangHKutyavinTStamatoyannopoulosJWidespread site-dependent buffering of human regulatory polymorphism.PLoS Genet20128e100259910.1371/journal.pgen.100259922457641PMC3310774 | — | — | — |
| McDaniellRLeeB-KSongLLiuZBoyleAPErdosMRScottLJMorkenMAKuceraKSBattenhouseAKeefeDCollinsFSWillardHFLiebJDFureyTSCrawfordGEIyerVRBirneyEHeritable individual-specific and allele-specific chromatin signatures in humans.Science201032823523910.1126/science.118465520299549PMC2929018 | — | — | — |
| MeijsingSHPufallMASoAYBatesDLChenLYamamotoKRDNA binding site sequence directs glucocorticoid receptor structure and activity.Science200932440741010.1126/science.116426519372434PMC2777810 | — | — | — |
| modENCODE Motif Browser.http://www.broadinstitute.org/~pouyak/motif-disc/fly | — | — | — |
| MosesAMChiangDYKellisMLanderESEisenMBPosition specific variation in the rate of evolution in transcription factor binding sites.BMC Evol Biol200331910.1186/1471-2148-3-1912946282PMC212491 | — | — | — |
| MosesAMStatistical tests for natural selection on regulatory regions based on the strength of transcription factor binding sites.BMC Evol Biol2009928610.1186/1471-2148-9-28619995462PMC2800119 | — | — | — |
| MullerHJOur load of mutations.Am J Hum Genet1950211117614771033PMC1716299 | — | — | — |
| NeiMAnalysis of Gene Diversity in Subdivided Populations.Proc Natl Acad Sci USA1973703321332310.1073/pnas.70.12.33214519626PMC427228 | — | — | — |
| NuzhdinSVRychkovaAHahnMWThe strength of transcription-factor binding modulates co-variation in transcriptional networks.Trends Genet201026515310.1016/j.tig.2009.12.00520080313PMC2826431 | — | — | — |
| NègreNBrownCDMaLBristowCAMillerSWWagnerUKheradpourPEatonMLLoriauxPSealfonRLiZIshiiHSpokonyRFChenJHwangLChengCAuburnRPDavisMBDomanusMShahPKMorrisonCAZiebaJSuchySSenderowiczLVictorsenABildNAGrundstadAJHanleyDMacAlpineDMMannervikMA cis-regulatory map of the Drosophila genome.Nature201147152753110.1038/nature0999021430782PMC3179250 | — | — | — |
| OhlssonRBartkuhnMRenkawitzRCTCF shapes chromatin by multiple mechanisms: the impact of 20 years of CTCF research on understanding the workings of chromatin.Chromosoma201011935136010.1007/s00412-010-0262-020174815PMC2910314 | — | — | — |
| PatenBHerreroJFitzgeraldSBealKFlicekPHolmesIBirneyEGenome-wide nucleotide-level mammalian ancestor reconstruction.Genome Res2008181829184310.1101/gr.076521.10818849525PMC2577868 | — | — | — |
| Portales-CasamarEThongjueaSKwonATArenillasDZhaoXValenEYusufDLenhardBWassermanWWSandelinAJASPAR 2010: the greatly expanded open-access database of transcription factor binding profiles.Nucleic Acids Res201038D105101010.1093/nar/gkp95019906716PMC2808906 | — | — | — |
| QuinlanARHallIMBEDTools: a flexible suite of utilities for comparing genomic features.Bioinformatics20102684184210.1093/bioinformatics/btq03320110278PMC2832824 | — | — | — |
| RoySErnstJKharchenkoPVKheradpourPNegreNEatonMLLandolinJMBristowCAMaLLinMFWashietlSArshinoffBIAyFMeyerPERobineNWashingtonNLDi StefanoLBerezikovEBrownCDCandeiasRCarlsonJWCarrAJungreisIMarbachDSealfonRTolstorukovMYWillSAlekseyenkoAAArtieriCBoothBWIdentification of functional elements and regulatory circuits by Drosophila modENCODE.Science2010330178717972117797410.1126/science.1198374PMC3192495 | — | — | — |
| SandmannTGirardotCBrehmeMTongprasitWStolcVFurlongEEMA core transcriptional network for early mesoderm development in Drosophila melanogaster.Genes Dev20072143644910.1101/gad.150900717322403PMC1804332 | — | — | — |
| SchmidtDWilsonMDBallesterBSchwaliePCBrownGDMarshallAKutterCWattSMartinez-JimenezCPMackaySTalianidisIFlicekPOdomDTFive-vertebrate ChIP-seq reveals the evolutionary dynamics of transcription factor binding.Science20103281036104010.1126/science.118617620378774PMC3008766 | — | — | — |
| SiepelABejeranoGPedersenJSHinrichsASHouMRosenbloomKClawsonHSpiethJHillierLWRichardsSWeinstockGMWilsonRKGibbsRAKentWJMillerWHausslerDEvolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes.Genome Res2005151034105010.1101/gr.371500516024819PMC1182216 | — | — | — |
| SisodiyaSMFreeSLWilliamsonKAMitchellTNWillisCStevensJMKendallBEShorvonSDHansonIMMooreATVan HeyningenVPAX6 haploinsufficiency causes cerebral malformation and olfactory dysfunction in humans.Nat Genet20012821421610.1038/9004211431688 | — | — | — |
| STAMP: A Tool-kit for DNA Motif Comparison.http://www.benoslab.pitt.edu/stamp | — | — | — |
| StasinopoulosDMRigbyRAGeneralised additive models for Location Scale and Shape (GAMLSS) in R.J Stat Software200723146 | — | — | — |
| StormoGDZhaoYDetermining the specificity of protein-DNA interactions.Nat Rev Genet2010117517602087732810.1038/nrg2845 | — | — | — |
| SwamyKBSChoC-YChiangSTsaiZT-YTsaiH-KImpact of DNA-binding position variants on yeast gene expression.Nucleic Acids Res2009376991700110.1093/nar/gkp74319767613PMC2790881 | — | — | — |
| The 1000 genomes project consortiumA map of human genome variation from population-scale sequencing.Nature20104671061106710.1038/nature0953420981092PMC3042601 | — | — | — |
| The ENCODE ConsortiumAn integrated Encyclopedia of DNA Elements in the human genome.Nature2012489577410.1038/nature1124722955616PMC3439153 | — | — | — |
| TouzetHVarréJ-SEfficient and accurate P-value computation for position weight matrices.Algorithms Mol Biol2007112151807297310.1186/1748-7188-2-15PMC2238751 | — | — | — |
| TweedieSAshburnerMFallsKLeylandPMcQuiltonPMarygoldSMillburnGOsumi-SutherlandDSchroederASealRZhangHFlyBase: enhancing Drosophila Gene Ontology annotations.Nucleic Acids Res200937D55555910.1093/nar/gkn78818948289PMC2686450 | — | — | — |
| VavouriTElgarGPrediction of cis-regulatory elements using binding site matrices-the successes, the failures and the reasons for both.Curr Opin Genet Dev20051539540210.1016/j.gde.2005.05.00215950456 | — | — | — |
| ViselABlowMLiZZhangTAkiyamaJHoltAPlajzer-FrickIShoukryMWrightCChenFAfzalVRenBRubinEPennacchioLAChIP-seq accurately predicts tissue-specific activity of enhancers.Nature200945785485810.1038/nature0773019212405PMC2745234 | — | — | — |
| WeirauchMTHughesTRConserved expression without conserved regulatory sequence: the more things change, the more they stay the same.Trends Genet201026667410.1016/j.tig.2009.12.00220083321 | — | — | — |
| ZhengWGianoulisTAKarczewskiKJZhaoHSnyderMRegulatory variation within and between species.Annu Rev Genomics Hum Genet2010123273462172194210.1146/annurev-genom-082908-150139 | — | — | — |
| ZhengWZhaoHManceraESteinmetzLMSnyderMGenetic analysis of variation in transcription factor binding in yeast.Nature20104641187118910.1038/nature0893420237471PMC2941147 | — | — | — |
| ZinzenRPGirardotCGagneurJBraunMFurlongEEMCombinatorial binding predicts spatio-temporal cis-regulatory activity.Nature2009462657010.1038/nature0853119890324 | — | — | — |
In this knowledge base
External
| Title | Authors | Journal | Year | Link |
|---|---|---|---|---|
| Genetic coupling of enhancer activity and connectivity in gene expression control. | Ray-Jones H et al. | — | 2025 | → |
| p53motifDB: integration of genomic information and tumour suppressor p53 binding motifs. | Baniulyte G et al. | — | 2025 | → |
| tRNA expression and modifications as critical components in the biology of blood-feeding arthropods. | Kelley M et al. | — | 2025 | → |
| *PC* Gene Affects Milk Production Traits in Dairy Cattle. | Du A et al. | — | 2024 | → |
| MotifQuest: An Automated Pipeline for Motif Database Creation to Improve Peptidomics Database Searching Programs. | Dang TC et al. | — | 2024 | → |
| Structural underpinnings of mutation rate variations in the human genome. | Liu Z et al. | — | 2023 | → |
| Evolution of transcription factor binding through sequence variations and turnover of binding sites. | Krieger G et al. | — | 2022 | → |
| Genetic polymorphisms of *PKLR* gene and their associations with milk production traits in Chinese Holstein cows. | Du A et al. | — | 2022 | → |
| proChIPdb: a chromatin immunoprecipitation database for prokaryotic organisms. | Decker KT et al. | — | 2022 | → |
| Promoter sequence and architecture determine expression variability and confer robustness to genetic variants. | Einarsson H et al. | — | 2022 | → |
| A signature of Neanderthal introgression on molecular mechanisms of environmental responses. | Findley AS et al. | — | 2021 | → |
| *Cis*-acting variation is common across regulatory layers but is often buffered during embryonic development. | Floc'hlay S et al. | — | 2021 | → |
| Promoter sequence and architecture determine expression variability and confer robustness to genetic variants | Einarsson H et al. | — | 2021 | — |
| Transcriptional enhancers and their communication with gene promoters. | Ray-Jones H et al. | — | 2021 | → |
| Functional effects of variation in transcription factor binding highlight long-range gene regulation by epromoters. | Mitchelmore J et al. | — | 2020 | → |
| MAGGIE: leveraging genetic variation to identify DNA sequence motifs mediating transcription factor binding and function. | Shen Z et al. | — | 2020 | → |
| Serpent/dGATAb regulates Laminin B1 and Laminin B2 expression during Drosophila embryogenesis. | Töpfer U et al. | — | 2019 | → |
| Stable enhancers are active in development, and fragile enhancers are associated with evolutionary adaptation. | Li S et al. | — | 2019 | → |
| Using mechanistic models for the clinical interpretation of complex genomic variation. | Peña-Chilet M et al. | — | 2019 | → |
| An information theoretic treatment of sequence-to-expression modeling. | Khajouei F et al. | — | 2018 | → |
| A resource of variant effect predictions of single nucleotide variants in model organisms. | Wagih O et al. | — | 2018 | → |
| Insights into mammalian transcription control by systematic analysis of ChIP sequencing data. | Devailly G et al. | — | 2018 | → |
| Fast, scalable prediction of deleterious noncoding variants from functional and population genomic data. | Huang YF et al. | — | 2017 | → |
| Principles for the regulation of multiple developmental pathways by a versatile transcriptional factor, BLIMP1. | Mitani T et al. | — | 2017 | → |
| 267 Spanish Exomes Reveal Population-Specific Differences in Disease-Related Genetic Variation. | Dopazo J et al. | — | 2016 | → |
| A systematic, large-scale comparison of transcription factor binding site models. | Hombach D et al. | — | 2016 | → |
| Identifying genetic modulators of the connectivity between transcription factors and their transcriptional targets. | Fazlollahi M et al. | — | 2016 | → |
| Non-coding single nucleotide variants affecting estrogen receptor binding and activity. | Bahreini A et al. | — | 2016 | → |
| Nucleotide diversity analysis highlights functionally important genomic regions. | Tatarinova TV et al. | — | 2016 | → |
| System-Wide Associations between DNA-Methylation, Gene Expression, and Humoral Immune Response to Influenza Vaccination. | Zimmermann MT et al. | — | 2016 | → |
| The evolution of gene expression and binding specificity of the largest transcription factor family in primates. | Kapopoulou A et al. | — | 2016 | → |
| The Genetics of Transcription Factor DNA Binding Variation. | Deplancke B et al. | — | 2016 | → |
| The population genomics of rhesus macaques (Macaca mulatta) based on whole-genome sequences. | Xue C et al. | — | 2016 | → |
| Cis-Expression Quantitative Trait Loci Mapping Reveals Replicable Associations with Heroin Addiction in OPRM1. | Hancock DB et al. | — | 2015 | → |
| Cis-regulatory somatic mutations and gene-expression alteration in B-cell lymphomas. | Mathelier A et al. | — | 2015 | → |
| Identification of altered cis-regulatory elements in human disease. | Mathelier A et al. | — | 2015 | → |
| Insights from ENCODE on Missing Proteins: Why β-Defensin Expression Is Scarcely Detected. | Fan Y et al. | — | 2015 | → |
| Large-scale identification of sequence variants influencing human transcription factor occupancy in vivo. | Maurano MT et al. | — | 2015 | → |
| Mechanisms of mutational robustness in transcriptional regulation. | Payne JL et al. | — | 2015 | → |
| Predicting functional regulatory SNPs in the human antimicrobial peptide genes DEFB1 and CAMP in tuberculosis and HIV/AIDS. | Flores Saiffe Farías A et al. | — | 2015 | → |
| Significant expansion of the REST/NRSF cistrome in human versus mouse embryonic stem cells: potential implications for neural development. | Rockowitz S et al. | — | 2015 | → |
| A gene-specific non-enhancer sequence is critical for expression from the promoter of the small heat shock protein gene αB-crystallin. | Jing Z et al. | — | 2014 | → |
| A simple metric of promoter architecture robustly predicts expression breadth of human genes suggesting that most transcription factors are positive regulators. | Hurst LD et al. | — | 2014 | → |
| Associating disease-related genetic variants in intergenic regions to the genes they impact. | Macintyre G et al. | — | 2014 | → |
| Cellular dissection of psoriasis for transcriptome analyses and the post-GWAS era. | Swindell WR et al. | — | 2014 | → |
| Characterization and identification of cis-regulatory elements in Arabidopsis based on single-nucleotide polymorphism information. | Korkuc P et al. | — | 2014 | → |
| Cis-regulatory variation: significance in biomedicine and evolution. | Friedensohn S et al. | — | 2014 | → |
| Expanding the roles of chromatin insulators in nuclear architecture, chromatin organization and genome function. | Schoborg T et al. | — | 2014 | → |
| New tools in the box: an evolutionary synopsis of chromatin insulators. | Heger P et al. | — | 2014 | → |
| No gene in the genome makes sense except in the light of evolution. | Haerty W et al. | — | 2014 | → |
| On the identification of potential regulatory variants within genome wide association candidate SNP sets. | Chen CY et al. | — | 2014 | → |
| Purifying selection in deeply conserved human enhancers is more consistent than in coding sequences. | De Silva DR et al. | — | 2014 | → |
| Simulations of enhancer evolution provide mechanistic insights into gene regulation. | Duque T et al. | — | 2014 | → |
| Spurious transcription factor binding: non-functional or genetically redundant? | Spivakov M | — | 2014 | → |
| Stochastic EM-based TFBS motif discovery with MITSU. | Kilpatrick AM et al. | — | 2014 | → |
| Systematic discovery and characterization of regulatory motifs in ENCODE TF binding experiments. | Kheradpour P et al. | — | 2014 | → |
| The functional consequences of variation in transcription factor binding. | Cusanovich DA et al. | — | 2014 | → |
| The role of the interactome in the maintenance of deleterious variability in human populations. | Garcia-Alonso L et al. | — | 2014 | → |
| A yeast one-hybrid and microfluidics-based pipeline to map mammalian gene regulatory networks. | Gubelmann C et al. | — | 2013 | → |
| Computational interrogation of cis-regulatory elements of genes that are common targets of luteotropin and luteolysin in the primate corpus luteum. | Suresh PS et al. | — | 2013 | → |
| Cooperativity and rapid evolution of cobound transcription factors in closely related mammals. | Stefflova K et al. | — | 2013 | → |
| Coordinated effects of sequence variation on DNA binding, chromatin structure, and transcription. | Kilpinen H et al. | — | 2013 | → |
| DNA sequencing methods in human genetics and disease research. | Lehrach H | — | 2013 | → |
| ENCODE: A Sourcebook of Epigenomes and Chromatin Language. | Yavartanoo M et al. | — | 2013 | → |
| Extensive divergence of transcription factor binding in Drosophila embryos with highly conserved gene expression. | Paris M et al. | — | 2013 | → |
| MCOIN: a novel heuristic for determining transcription factor binding site motif width. | Kilpatrick AM et al. | — | 2013 | → |
| Modulation of epidermal transcription circuits in psoriasis: new links between inflammation and hyperproliferation. | Swindell WR et al. | — | 2013 | → |
| Response to comment on "Evidence of abundant purifying selection in humans for recently acquired regulatory functions". | Ward LD et al. | — | 2013 | → |
| An integrated encyclopedia of DNA elements in the human genome. | ENCODE Project Consortium | — | 2012 | → |