Chunk #45 — Accessing data via UCSC, Ensembl, and FTP

Source: GENCODE: the reference human genome annotation for The ENCODE Project.
Embedded: yes

Text

As the GENCODE gene set is now built partly through the Ensembl pipeline, the GENCODE data release cycle is coupled to the Ensembl releases, which occur every three months. Dates and release notes, as well as more details of the data sets and formats, are listed at http://www.gencodegenes.org. Each GENCODE release contains an updated gene set in which new data from the manual annotation have been integrated as described above and, in some releases, the automated gene set has additionally been rebuilt or refined. Users can view GENCODE data in the UCSC browser (Fig. 9), and it is also the default gene set shown in Ensembl. All genes and pseudogenes within the release have stable Ensembl (ENS) identifiers, and the manually annotated genes have additional Vega (OTT) identifiers (Wilming et al. 2008). All OTT identifiers are also versioned, so the user can identify when a transcript was last manually updated.
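The use of versioned identifiers can be illustrated with a minimal sketch. It assumes that the version is written as a numeric suffix after a final dot (for example `OTTHUMT00000000001.2`); this format is not stated in the text above, and the identifiers and the helper names `split_versioned_id` and `was_updated` are hypothetical, chosen for illustration only.

```python
from typing import NamedTuple, Optional


class VersionedId(NamedTuple):
    stable_id: str          # identifier without its version suffix
    version: Optional[int]  # None when no numeric suffix is present


def split_versioned_id(identifier: str) -> VersionedId:
    """Split 'STABLEID.VERSION' into its parts (assumed format)."""
    identifier = identifier.strip()
    base, sep, suffix = identifier.rpartition(".")
    if sep and suffix.isdigit():
        return VersionedId(base, int(suffix))
    # No dot, or a non-numeric suffix: treat the whole string as the ID.
    return VersionedId(identifier, None)


def was_updated(old_id: str, new_id: str) -> bool:
    """Return True if new_id is a later version of the same stable ID."""
    old, new = split_versioned_id(old_id), split_versioned_id(new_id)
    if old.stable_id != new.stable_id:
        raise ValueError("identifiers refer to different stable IDs")
    if old.version is None or new.version is None:
        raise ValueError("both identifiers must carry a version")
    return new.version > old.version


# Hypothetical transcript identifiers taken from two different releases.
print(split_versioned_id("OTTHUMT00000000001.2"))
print(was_updated("OTTHUMT00000000001.1", "OTTHUMT00000000001.2"))
```

Comparing the version suffix of the same stable identifier across two releases in this way indicates whether the transcript was changed between them, without comparing the annotation itself.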