Chunk #38 — Materials and Methods — Quantitation of the Transcribed Fraction of the Genome

Source: Pervasive transcription of the human genome produces thousands of previously unidentified long intergenic noncoding RNAs.
Embedded: yes

Text

The uniquely mappable human genome, defined here as the portion of the genome to which RNA-seq reads can be uniquely mapped, was obtained for hg18 from http://www.imagenix.com/uniqueome/downloads/hg18_uniqueome.unique_starts.base-space.50.2.positive.BED.gz [45]. It comprises 2,570,174,327 bp, or 83.4% of the total human genomic sequence. To determine the genomic coverage of the RNA-seq data, all aligned RNA-seq reads were combined, and read coverage at each genomic base position was computed with the BEDTools function genomeCoverageBed. Split reads (i.e., reads spanning exon-exon junctions) were counted such that the intervening intronic sequence was included as part of the read. In Figure 1A, “All genes, ESTs, cDNAs” includes GENCODE v10 genes (excluding pseudogenes), RefSeq NM and NR genes, UCSC Known Genes, spliced H-Invitational cDNAs, spliced ESTs (UCSC Genome Browser “Spliced EST” track), and previously annotated spliced lincRNAs [16]. In all cases, the intronic sequences of genes, cDNAs, and ESTs were included.
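
The counting rule described above (each split read contributes its full genomic span, introns included, and overlapping reads are counted once per base) can be illustrated with a minimal Python sketch. This is not the BEDTools implementation and not the authors' code: the function names, the toy read coordinates, and the 1,000 bp denominator are hypothetical, coordinates are assumed to be half-open BED-style intervals on a single chromosome, and dividing by the size of the uniquely mappable genome is an assumption about how the covered fraction would be normalized.

```python
from typing import Iterable, List, Tuple

# Half-open [start, end) interval on one chromosome, BED-style (assumption).
Interval = Tuple[int, int]


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or abuts the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def read_span(blocks: List[Interval]) -> Interval:
    """Collapse the aligned blocks of one read into a single span.

    For a split (junction-spanning) read this includes the intron
    between the blocks, mirroring the counting rule in the text.
    """
    return (min(s for s, _ in blocks), max(e for _, e in blocks))


def covered_bases(intervals: Iterable[Interval]) -> int:
    """Number of distinct base positions covered by at least one interval."""
    return sum(end - start for start, end in merge_intervals(intervals))


def transcribed_fraction(reads: List[List[Interval]], denominator_bp: int) -> float:
    """Covered bases divided by a chosen genome size (hypothetical helper)."""
    spans = [read_span(blocks) for blocks in reads]
    return covered_bases(spans) / denominator_bp


# Toy data (hypothetical): one unspliced read, one split read, one distant read.
toy_reads = [
    [(100, 150)],              # unspliced read
    [(120, 140), (300, 330)],  # split read; the intron 140-300 is counted
    [(500, 550)],              # separate locus
]

# 1,000 bp stands in for the mappable genome size in this toy example.
print(f"{transcribed_fraction(toy_reads, 1000):.2%}")  # prints 28.00%
```

In the toy data the split read extends coverage from position 150 to 330, so the first two reads merge into one 230 bp block; together with the 50 bp read at the separate locus, 280 of the 1,000 bases are covered. Counting only the aligned blocks of the split read would give a smaller value, which is the distinction the intron-inclusive rule makes.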