Chunk #43 — Assessing coding potential of putative models using mass spectrometry data

Source: GENCODE: the reference human genome annotation for The ENCODE Project.
Embedded: yes

Text

The pipeline mapped peptides to almost 40% of the protein-coding genes in the GENCODE 7 release: the 83,054 tryptic peptides detected at a false-discovery rate of 1% mapped unambiguously to 8098 of the 20,700 annotated protein-coding genes. We detected the translation of multiple splice isoforms for 194 genes and, within this set of genes, validated the expression of eight isoforms that were tagged as candidates for nonsense-mediated decay (NMD). We found peptide support for 33 transcripts annotated as “putative” and another 50 transcripts annotated as “novel.” With the mass spectrometry data we generated for the GENCODE 3c release, we also detected the expression of peptides that mapped to a pseudogene (MST1P9) (Pei et al. 2012).
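
As a rough illustration of what "mapped unambiguously" and the coverage figure imply, the following minimal Python sketch keeps only peptides that map to exactly one gene and then computes the fraction of annotated genes with peptide support. The function names, the toy peptide sequences, and the gene labels are hypothetical and are not taken from the GENCODE pipeline; only the counts 8098 and 20,700 come from the text above.

```python
from collections import defaultdict


def genes_with_unambiguous_support(peptide_to_genes):
    """Group peptides by gene, keeping only peptides that map to exactly one gene.

    peptide_to_genes: dict mapping a peptide sequence to the set of genes it matches.
    Hypothetical helper for illustration; not the published pipeline.
    """
    support = defaultdict(set)
    for peptide, genes in peptide_to_genes.items():
        if len(genes) == 1:  # a peptide shared by several genes is ambiguous and skipped
            (gene,) = genes
            support[gene].add(peptide)
    return dict(support)


def coverage(n_supported, n_annotated):
    """Fraction of annotated genes that have peptide support."""
    return n_supported / n_annotated


# Toy input (made-up peptides and gene labels).
toy = {
    "PEPTIDEK": {"GENE_A"},
    "SAMPLER": {"GENE_A", "GENE_B"},  # ambiguous: matches two genes
    "ANOTHERK": {"GENE_C"},
}
support = genes_with_unambiguous_support(toy)
print(sorted(support))

# Counts reported in the text: 8098 of 20,700 protein-coding genes.
reported_coverage = coverage(8098, 20700)
print(f"{reported_coverage:.1%}")
```

With the reported counts, the ratio comes to roughly 39%, consistent with the "almost 40%" stated above.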