Chunk #42 — Assessing coding potential of putative models using mass spectrometry data

Source: GENCODE: the reference human genome annotation for The ENCODE Project.
Embedded: yes

Text

A pipeline has been set up to verify the annotation of gene models with mass spectrometry data from human proteomics experiments (M Tress, P Maietta, I Ezkurdia, A Valencia, J-J Wesselink, G Lopez, A Pietrelli, and JM Rodriguez, in prep.). The data from tandem mass spectrometry experiments are stored in two large proteomics data repositories, the GPM (Craig et al. 2004) and PeptideAtlas (Desiere et al. 2006). Peptides are detected by mapping spectra from individual proteomics experiments to the gene products from the GENCODE annotation using the search engine X!Tandem (Craig and Beavis 2004). A single peptide may be detected in many different experiments, though only once per experiment. We generate a P-value for each detected peptide by combining the X!Tandem P-values of its individual peptide-spectrum matches across experiments, and we use a target-decoy approach to determine false-discovery rates.
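
The two statistical steps described above can be illustrated with a minimal sketch. The text does not say how the per-experiment P-values are combined, so Fisher's method is used here purely as an assumption, and the decoy-to-target ratio is one common form of the target-decoy estimate rather than the pipeline's confirmed formula. The function names `fisher_combine` and `decoy_fdr` and all input values are hypothetical.

```python
import math


def fisher_combine(p_values):
    """Combine independent P-values for one peptide with Fisher's method.

    Assumption: Fisher's method stands in for the unspecified combination
    rule in the text. The statistic X = -2 * sum(ln p_i) follows a
    chi-square distribution with 2k degrees of freedom under the null.
    """
    if not p_values:
        raise ValueError("need at least one P-value")
    k = len(p_values)
    # half_stat is X / 2; clamp tiny values to avoid log(0).
    half_stat = -sum(math.log(max(p, 1e-300)) for p in p_values)
    # Closed-form chi-square survival function for even degrees of freedom:
    # sf(X) = exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def decoy_fdr(target_p, decoy_p, threshold):
    """Estimate the false-discovery rate at a P-value threshold.

    Assumption: FDR is approximated as (decoy hits) / (target hits) among
    peptides passing the threshold, for equally sized target and decoy
    search spaces.
    """
    targets = sum(1 for p in target_p if p <= threshold)
    decoys = sum(1 for p in decoy_p if p <= threshold)
    if targets == 0:
        return 0.0
    return min(1.0, decoys / targets)


# Hypothetical peptide seen in three experiments, one best match per experiment.
combined = fisher_combine([0.04, 0.10, 0.02])

# Hypothetical combined P-values for target and decoy peptides.
fdr = decoy_fdr([0.001, 0.002, 0.01, 0.2], [0.005, 0.3], threshold=0.01)
```

Fisher's method assumes the combined P-values are independent, which is plausible when each comes from a separate experiment but would need checking against the actual data.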