Chunk #23 — Conclusions and perspectives

Source: JASPAR 2024: 20th anniversary of the open-access database of transcription factor binding profiles.
Embedded: yes

Text

To date, the JASPAR database has focused mostly on PFMs as the model of choice for TF-DNA interactions. We recognise that the PFMs stored in JASPAR assume independence between nucleotide positions and do not consider the methylation status of nucleotides; modelling methylation would require DNA methylation data and an expanded alphabet or a specific representation (44–46). To account for dependencies between successive nucleotides, we introduced transcription factor flexible models (TFFMs) into JASPAR for a set of profiles for which data were available to compute them (12,33).

Deep learning models, mostly based on convolutional neural networks, are now considered state-of-the-art for accurately modelling data generated from genomic assays such as ChIP-seq, ChIP-nexus, or ATAC-seq (47–49). Some deep learning models perform better when their convolutional filters are initialised with PWMs derived from JASPAR profiles (50,51), while many models assess the patterns they derive by comparing them with JASPAR profiles (47,48,52). The high quality of the modelling, together with improved methods for interpreting deep learning models, makes them attractive for deciphering the cis-regulatory code (53). With deep learning approaches becoming critical to studying TF-DNA interactions and discovering the regulatory grammar
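
To make the nucleotide-independence assumption of PFMs concrete, the following is a minimal sketch in plain Python, not JASPAR's own code. The 4-position count matrix is invented for illustration and is not a real JASPAR profile; the pseudocount of 0.8, the uniform background of 0.25, and the function names `pfm_to_pwm`, `score`, and `scan` are likewise assumptions. The sketch converts counts to log2-odds weights (a PWM) and scores a site as a sum of per-position terms, which is exactly where independence between positions enters.

```python
import math

# Hypothetical PFM: counts of each nucleotide at 4 positions (illustrative only).
PFM = {
    "A": [8, 0, 1, 10],
    "C": [1, 0, 7, 0],
    "G": [1, 10, 1, 0],
    "T": [0, 0, 1, 0],
}

def pfm_to_pwm(pfm, pseudocount=0.8, background=0.25):
    """Convert a count matrix to log2-odds weights, one column at a time."""
    width = len(pfm["A"])
    pwm = {base: [] for base in "ACGT"}
    for i in range(width):
        # Column total plus the pseudocount mass spread over the four bases.
        total = sum(pfm[b][i] for b in "ACGT") + pseudocount
        for b in "ACGT":
            freq = (pfm[b][i] + pseudocount * background) / total
            pwm[b].append(math.log2(freq / background))
    return pwm

def score(pwm, site):
    """Score a site as a sum over positions; no term couples two positions."""
    return sum(pwm[base][i] for i, base in enumerate(site))

def scan(pwm, seq):
    """Score every window of the motif width along a longer sequence."""
    width = len(pwm["A"])
    return [score(pwm, seq[i:i + width]) for i in range(len(seq) - width + 1)]

pwm = pfm_to_pwm(PFM)
# Highest-scoring site: pick the best base independently at each position.
best = "".join(max("ACGT", key=lambda b: pwm[b][i]) for i in range(4))
window_scores = scan(pwm, "TTAGCATT")
```

Because the score is additive, the best site can be found one position at a time. The sliding `scan` is the same computation as a one-dimensional convolutional filter applied to a one-hot encoded sequence, which is one plausible reading of how a PWM can seed a filter; the cited models may differ in their details.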