
Chunk #10 — NEW DEVELOPMENTS IN KEGG — KEGG GENES and ortholog annotation

Source
KEGG for representation and analysis of molecular networks involving diseases and drugs.

Text

As of 3 September 2009, the KEGG GENES database contains 4.8 million genes in 1049 genomes. In comparison, the UniProt database (9) contains 9.4 million proteins from half a million species. KEGG already covers half of the known protein universe and >90% of protein sequence families (Kanehisa, M., unpublished data). As the number of complete genomes increases, coverage of the protein universe will also increase, but some protein families will remain uncovered, such as those of plant proteins and viral proteins. These protein families are useful for analyzing, for example, EST data and metagenomics data, and they will be incorporated in the KO system.
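The "half of the known protein universe" figure can be sanity-checked against the counts quoted above. The following is only a rough sketch: it assumes that KEGG gene entries and UniProt protein entries are directly comparable, which ignores redundancy and entries present in only one of the two databases, and it is not necessarily how the authors arrived at the estimate. The variable names are illustrative.

```python
# Counts quoted in the text (as of 3 September 2009).
kegg_genes = 4.8e6        # genes in KEGG GENES
kegg_genomes = 1049       # genomes in KEGG GENES
uniprot_proteins = 9.4e6  # proteins in UniProt

# Assumption: one KEGG gene entry is treated as one UniProt-style protein entry.
coverage = kegg_genes / uniprot_proteins

# Average number of genes per genome in KEGG GENES.
genes_per_genome = kegg_genes / kegg_genomes

print(f"KEGG GENES entries relative to UniProt entries: {coverage:.0%}")
print(f"Average genes per KEGG genome: {genes_per_genome:.0f}")
```

Under that assumption the ratio comes out at roughly one half, which is consistent with the statement in the text.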