Chunk #1 — Manual annotation process

Source
GENCODE: the reference human genome annotation for The ENCODE Project.
Embedded
yes

Text

The group's approach to manual gene annotation is to annotate transcripts aligned to the genome, taking the genomic sequence rather than the cDNAs as the reference. Currently only three vertebrate genomes (human, mouse, and zebrafish) are being fully finished and sequenced to a quality that merits manual annotation. The finished genomic sequence is analyzed using a modified Ensembl pipeline (Searle et al. 2004), and BLAST results for cDNAs/ESTs and proteins, along with various ab initio predictions, can be analyzed manually in the annotation browser tool Otterlace (http://www.sanger.ac.uk/resources/software/otterlace/). The advantage of genomic annotation over cDNA annotation is that more alternatively spliced variants can be predicted, because partial EST evidence and protein evidence can be used, whereas cDNA annotation is limited by the availability of full-length transcripts. Moreover, genomic annotation produces a more comprehensive analysis of pseudogenes. One disadvantage, however, is that if a polymorphism occurs in the reference sequence, a coding transcript cannot be annotated, whereas cDNA annotation, for example that performed by RefSeq (Pruitt et al. 2012), can select the major haplotypic form because it is not limited by a reference sequence.
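The disadvantage described in the last sentence can be illustrated with a toy example. The sketch below assumes one particular kind of polymorphism, a single-base change that introduces a premature stop codon; the source does not specify the polymorphism type. The sequences and the helper `has_intact_orf` are hypothetical and are not part of any GENCODE, Otterlace, or RefSeq tooling.

```python
# Toy illustration only: sequences and function names are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_intact_orf(cds: str) -> bool:
    """Return True if cds starts with ATG, ends with a stop codon,
    has a length divisible by 3, and has no in-frame internal stop."""
    cds = cds.upper()
    if len(cds) < 6 or len(cds) % 3 != 0:
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    if codons[0] != "ATG" or codons[-1] not in STOP_CODONS:
        return False
    # Any stop codon before the final codon truncates the protein.
    return not any(codon in STOP_CODONS for codon in codons[1:-1])


# Major haplotype, as a cDNA-based annotator could select it.
major_allele = "ATGGCTCAAGGATTGTAA"

# Assumed reference allele: C>T at 0-based index 6 turns CAA into TAA (stop).
reference_allele = "ATGGCTTAAGGATTGTAA"

# Number of positions at which the two alleles differ.
n_differences = sum(a != b for a, b in zip(major_allele, reference_allele))

print("major haplotype ORF intact:", has_intact_orf(major_allele))
print("reference allele ORF intact:", has_intact_orf(reference_allele))
```

Under these assumptions, an annotator tied to the reference sequence sees a disrupted reading frame and cannot annotate the coding transcript, while an annotator working from cDNA can use the major haplotype, whose reading frame is intact.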