The GENCODE gene set contains 140,066 annotated alternative transcripts at coding loci, compared with 66,612 in UCSC genes and 38,157 in RefSeq. Not all GENCODE transcripts are full length, however: a partial transcript is tagged with start_not_found or end_not_found to highlight this to the user. The GENCODE gene set also has 9640 lncRNA loci, compared with 6056 in UCSC genes and 4888 in RefSeq. The three transcript data sets (UCSC, RefSeq, and GENCODE) were compared computationally to determine how many transcripts were contained in all data sets and how many were unique to each (Fig. 7B). As expected, the majority (89%) of RefSeq CDSs matched exactly in all data sets, since both Ensembl and UCSC genes use RefSeq cDNAs in their automatic pipelines. However, GENCODE has 33,977 unique coding sequences outside RefSeq, compared with 18,712 in UCSC genes. Of these unique transcripts, only 9319 match exactly between the two sets, which reflects the different annotation methods and the ways in which they interpret EST data.
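The kind of exact-match comparison described above can be illustrated with a minimal sketch. It is not the pipeline used for Fig. 7B. It assumes that each CDS is reduced to a key made of chromosome, strand, and ordered coding exon coordinates, and that two CDSs match exactly only when their keys are identical. The function name `partition_cds` and the toy coordinates are hypothetical.

```python
from typing import Dict, FrozenSet, Set, Tuple

# Assumed representation: a CDS is identified by chromosome, strand,
# and the ordered (start, end) intervals of its coding exons.
CdsKey = Tuple[str, str, Tuple[Tuple[int, int], ...]]


def partition_cds(sets: Dict[str, Set[CdsKey]]) -> Dict[FrozenSet[str], int]:
    """Count CDSs by the exact combination of data sets that contain them."""
    counts: Dict[FrozenSet[str], int] = {}
    all_keys: Set[CdsKey] = set().union(*sets.values())
    for key in all_keys:
        # Names of every data set holding an identical CDS structure.
        members = frozenset(name for name, cds in sets.items() if key in cds)
        counts[members] = counts.get(members, 0) + 1
    return counts


# Toy data (hypothetical coordinates, for illustration only).
a = ("chr1", "+", ((100, 200), (300, 400)))
b = ("chr1", "+", ((100, 200), (300, 450)))  # differs from a by one exon end
c = ("chr2", "-", ((50, 80),))
d = ("chr3", "+", ((10, 90),))

toy = {
    "GENCODE": {a, b, c},
    "UCSC": {a, c, d},
    "RefSeq": {a},
}

counts = partition_cds(toy)
for members, n in sorted(counts.items(), key=lambda kv: sorted(kv[0])):
    print(", ".join(sorted(members)), n)
```

Each key of the result corresponds to one region of a three-way Venn diagram: the entry for all three names counts CDSs shared by every data set, and a single-name entry counts CDSs unique to that set. Because matching is exact, a CDS that differs by a single exon boundary (as `b` differs from `a`) is counted as unique.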