The Protein Clusters database (www.ncbi.nlm.nih.gov/proteinclusters/) contains over 280 000 sets of almost identical RefSeq proteins that are encoded by complete prokaryotic, mitochondrial or chloroplast genomes; the sets are organized in a taxonomic hierarchy (33). These clusters serve as a basis for genome-wide comparisons at NCBI and support simplified BLAST searches through Concise Microbial Protein BLAST (www.ncbi.nlm.nih.gov/genomes/prokhits.cgi). Protein Clusters provides annotations, publications, domains, structures and external links, together with analysis tools that include multiple sequence alignments and phylogenetic trees.