
Chunk #15 — GROWTH AND STATISTICS

Source
Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation.

Text

The dramatic decrease in the number of plasmid protein records, and thus in the total number of accessions, reflects the completion of a RefSeq bacterial genome re-annotation project (http://www.ncbi.nlm.nih.gov/refseq/about/prokaryotes/reannotation/) and the adoption of the new data model for prokaryotes, including their plasmids. In this new data model, a single RefSeq non-redundant protein accession may be annotated on more than one genomic sequence record when translation of the protein-coding regions on those records yields an identical protein (see http://www.ncbi.nlm.nih.gov/refseq/about/nonredundantproteins/). Redundancy across all bacterial proteins also decreased significantly; however, this is not apparent here because the number of bacterial genomes included in the dataset has continued to increase significantly. These changes also resulted in an overall drop in the number of archaeal protein records.
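
The effect of the non-redundant data model on record counts can be illustrated with a minimal Python sketch. It is not RefSeq's actual pipeline: the function name `collapse_identical_proteins`, the `prot_0001`-style identifiers, the record names, and the toy sequences are all hypothetical and chosen only for illustration. The sketch assumes that two proteins count as the same record exactly when their amino acid sequences are identical, which is how the text above describes the model.

```python
from collections import defaultdict


def collapse_identical_proteins(annotations):
    """Group (genomic_record, protein_sequence) pairs by identical sequence.

    Returns a dict mapping a hypothetical non-redundant protein ID to a
    tuple of (sequence, sorted list of genomic records that encode it).
    """
    records_by_sequence = defaultdict(set)
    for genomic_record, protein_sequence in annotations:
        records_by_sequence[protein_sequence].add(genomic_record)

    collapsed = {}
    # Sort sequences so the illustrative IDs are assigned deterministically.
    for index, sequence in enumerate(sorted(records_by_sequence), start=1):
        protein_id = f"prot_{index:04d}"  # hypothetical ID, not a real accession
        collapsed[protein_id] = (sequence, sorted(records_by_sequence[sequence]))
    return collapsed


# Toy input: made-up record names and sequences.
annotations = [
    ("genome_A", "MKTAYIAK"),
    ("genome_B", "MKTAYIAK"),
    ("plasmid_B1", "MKTAYIAK"),
    ("genome_B", "MSLLTEVE"),
]

collapsed = collapse_identical_proteins(annotations)
for protein_id, (sequence, records) in collapsed.items():
    print(protein_id, sequence, records)
print(f"{len(annotations)} annotations -> {len(collapsed)} protein records")
```

In this toy case, four protein annotations collapse into two protein records, one of which is annotated on three genomic sequence records. The same mechanism, applied across many closely related genomes and their plasmids, would reduce the protein record count even as the number of genomes grows.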