
Chunk #2 — Introduction

Source
Pervasive transcription of the human genome produces thousands of previously unidentified long intergenic noncoding RNAs.

Text

Long intergenic noncoding RNAs (lincRNAs) are defined as transcripts longer than 200 nucleotides that are intergenic (relative to current gene annotations) and lack protein-coding capacity. LincRNAs perform myriad functions through diverse mechanisms, ranging from the regulation of epigenetic modifications and gene expression to acting as scaffolds for protein signaling complexes [8], [15]. The first attempts to generate lincRNA annotation sets either profiled lincRNAs specific to a small number of tissues or required that transcripts harbor specific structural features such as splicing and polyadenylation [16]–[18]. The GENCODE consortium (GENCODE v7) has manually curated approximately five thousand lincRNAs that are not restricted to particular tissues or structural features; however, this annotation set contains only a small fraction of all lincRNAs because it does not take advantage of RNA-seq data to identify novel transcripts [19], [20]. The limited scale of current lincRNA annotations, including GENCODE, is incompatible with the massive amount of intergenic transcription observed by the ENCODE project. The genome should therefore be expected to encode far more lincRNAs than are currently known.