
Chunk #47 — Methods (full – for online materials) — Sequence motif analysis on global CAGE enhancer and promoter sets

Source
An atlas of active enhancers across human cell types and tissues.

Text

This created five sets of regions representing non-overlapping bidirectional enhancers, nonCGI promoters and CGI promoters (annotated and full sets for the latter two). Motif enrichment was analyzed using HOMER (ref. 36), version 3, a suite of tools for motif discovery and next-generation sequencing analysis (http://biowhat.ucsd.edu/homer/). Sequences of the three region sets (enhancers, nonCGI promoters and CGI promoters) were compared to equal numbers of randomly selected genomic fragments of the average region size, matched for GC content and autonormalized to remove bias from lower-order oligo sequences. After masking repeats, motif enrichment was calculated using the cumulative binomial distribution, considering the total number of target and background sequence regions containing at least one instance of the motif. For each region set, 100 motifs were searched at each motif length from 7 to 14 bp (eight lengths), giving 800 de novo motifs per set. After filtering redundant motifs, the top 50 motifs from each search were combined, remapped and ranked according to enrichment (or depletion) in the enhancer set. In parallel, we also used HOMER to calculate the enrichment of known transcription factor motifs derived from ChIP-seq data.
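
The cumulative binomial scoring described above can be illustrated with a minimal Python sketch. It is not HOMER's code: the function names, the pseudocount used to estimate the background rate, and the example counts are all assumptions made for illustration. The sketch treats the fraction of background regions containing the motif as the per-region success probability and asks how likely it is to see at least the observed number of motif-containing target regions by chance.

```python
import math


def log_binom_pmf(k, n, p):
    """Log of the binomial probability C(n, k) * p^k * (1 - p)^(n - k)."""
    return (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log1p(-p)
    )


def binomial_enrichment_pvalue(target_hits, target_total,
                               background_hits, background_total):
    """Upper-tail binomial p-value for motif enrichment (illustrative only).

    target_hits / background_hits: number of regions with at least one
    motif instance. The background rate uses a pseudocount (an assumption
    of this sketch) so that it stays strictly between 0 and 1.
    """
    if not 0 <= target_hits <= target_total:
        raise ValueError("target_hits must lie in [0, target_total]")
    if background_total <= 0 or not 0 <= background_hits <= background_total:
        raise ValueError("background_hits must lie in [0, background_total]")

    p = (background_hits + 0.5) / (background_total + 1.0)
    if target_hits == 0:
        return 1.0  # P(X >= 0) is always 1

    # Sum P(X = k) for k >= target_hits in log space to avoid underflow.
    logs = [log_binom_pmf(k, target_total, p)
            for k in range(target_hits, target_total + 1)]
    m = max(logs)
    return min(1.0, math.exp(m) * sum(math.exp(x - m) for x in logs))


if __name__ == "__main__":
    # Hypothetical counts: motif found in 300 of 1000 target regions
    # and in 100 of 1000 GC-matched background regions.
    print(binomial_enrichment_pvalue(300, 1000, 100, 1000))
```

Counting each region at most once, regardless of how many motif instances it contains, matches the "at least one instance" criterion in the text. Depletion could be scored in the same way with the lower tail of the distribution.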