Chunk #20 — ENCODE data production and initial analyses — Summary of ENCODE-identified elements

Source
An integrated encyclopedia of DNA elements in the human genome.

Text

Accounting for all these elements, a surprisingly large amount of the human genome, 80.4%, is covered by at least one ENCODE-identified element (detailed in Supplementary Table Q1). The broadest element class represents the different RNA types, covering 62% of the genome (although the majority is inside introns or near genes). Regions highly enriched for histone modifications form the next largest class (56.1%). Excluding RNA elements and broad histone elements, 44.2% of the genome is covered. Smaller proportions of the genome are occupied by regions of open chromatin (15.2%) or sites of TF binding (8.1%), with 19.4% covered by at least one DHS or TF ChIP-seq peak across all cell lines. Using our most conservative assessment, 8.5% of bases are covered by either a TF binding site motif (4.6%) or a DHS footprint (5.7%). This, however, is still about 4.5-fold higher than the amount of protein-coding exons, and about 2-fold higher than the estimated amount of pan-mammalian constraint.
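
The coverage figures above are unions of element classes, so they do not add up: motifs (4.6%) plus footprints (5.7%) would give 10.3%, while the reported union is 8.5%, which by inclusion-exclusion implies that roughly 1.8% of bases carry both. The following is a minimal sketch of that arithmetic, not the ENCODE pipeline. The functions `merge_intervals` and `covered_fraction` and the toy coordinates are hypothetical, chosen only so that a 1,000-base "genome" reproduces the reported percentages.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open intervals (start, end)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extends the previous interval: keep the farther end.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def covered_fraction(intervals, genome_length):
    """Fraction of bases covered by at least one interval."""
    covered = sum(end - start for start, end in merge_intervals(intervals))
    return covered / genome_length


# Toy data (assumption, not ENCODE coordinates): a 1,000-base "genome".
genome_length = 1000
motifs = [(0, 30), (100, 116)]       # 46 bases, i.e. 4.6%
footprints = [(20, 50), (108, 135)]  # 57 bases, i.e. 5.7%

motif_frac = covered_fraction(motifs, genome_length)
footprint_frac = covered_fraction(footprints, genome_length)
union_frac = covered_fraction(motifs + footprints, genome_length)

# Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|
overlap_frac = motif_frac + footprint_frac - union_frac

print(f"union:   {union_frac:.1%}")
print(f"overlap: {overlap_frac:.1%}")
```

The same union-of-intervals calculation applies to any of the classes in the paragraph. For example, the 19.4% figure is the union of DHS and TF ChIP-seq peaks, which is why it is smaller than 15.2% + 8.1%.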