The NCBI RefSeq database (27) is a nonredundant set of curated and computationally derived sequences for transcripts, proteins and genomic regions. The number of records in the RefSeq collection has grown by 42% over the past year; Release 36 (July 2009) contains 4.0 million nucleotide and 8.2 million protein sequences representing over 8600 organisms. RefSeq sequences can be searched and retrieved from the Entrez Nucleotide and Protein databases, and the complete RefSeq collection is available in the RefSeq directory on the NCBI FTP site.
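
As an illustration of programmatic search and retrieval, the following minimal sketch builds request URLs for the Entrez Programming Utilities (E-utilities). It rests on assumptions that are not part of the description above: that access goes through the ESearch and EFetch endpoints, and that a query is limited to RefSeq records with the `srcdb_refseq[PROP]` filter. The function names and the example gene and accession are hypothetical choices made for illustration. The code only constructs URLs and performs no network request.

```python
from urllib.parse import urlencode

# Assumed base URL of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def refseq_search_url(query, db="nucleotide", retmax=20):
    """Build an ESearch URL that limits an Entrez query to RefSeq records.

    `db` is the Entrez database to search ("nucleotide" or "protein").
    """
    # Assumption: srcdb_refseq[PROP] restricts results to RefSeq entries.
    term = f"({query}) AND srcdb_refseq[PROP]"
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def refseq_fetch_url(accession, db="nucleotide", rettype="fasta"):
    """Build an EFetch URL that retrieves one record by accession as text."""
    params = {"db": db, "id": accession, "rettype": rettype, "retmode": "text"}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical example inputs, chosen only for illustration.
    print(refseq_search_url("BRCA1[Gene Name]", db="protein"))
    print(refseq_fetch_url("NM_000546"))
```

The resulting URLs can be passed to any HTTP client. Bulk use of the whole collection is better served by the RefSeq directory on the NCBI FTP site than by repeated individual requests.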