Chunk #34 — PLANTS

Source: Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation.
Embedded: yes

Text

If INSDC transcript data are not available for the genomic cultivar, a RefSeq transcript may be generated from the assembled genomic sequence based on a combination of transcript or protein alignments, RNA-Seq, and/or published data. A second area of focus is to increase the number of supported known protein-coding transcripts and proteins, as this provides a curated reagent that can be used when annotating other plant genomes. Lastly, we are creating more RefSeq records that represent splice variants when there is sufficient supporting evidence. These efforts will significantly improve the quality of the plant RefSeq dataset and will contribute to improvements in future genome annotations. The current set of plant genomes annotated by the pipeline can be accessed at NCBI's eukaryotic genome annotation pipeline website (http://www.ncbi.nlm.nih.gov/genome/annotation_euk/all/), which links to the detailed annotation report and other resources such as species BLAST and FTP.