We define cancer genes as a collection of 736 genes that are mutated in different cancer types and that derive from two data sources. A total of 375 genes come from the Cancer Gene Census (CGC-genes, December 2008), a manually curated list of genes with at least two independent reports of mutations in primary tumors (5). The census provides information on the tumor type and on the genetic effect of the mutation, i.e. whether the mutation is dominant or recessive (Figure 1A and B). A further 396 genes derive from high-throughput mutational screenings performed in glioblastoma (7), breast and colorectal (6) and pancreatic (8) cancers (CAN-genes, Figure 1C), and in lung adenocarcinoma (TSP-genes) (9). CAN- and TSP-genes result from the effort to sequence the cancer gene repertoire on a massive scale (26), and provide the first unbiased mutational screenings in different cancer types. The lists of literature-curated and high-throughput derived cancer genes show poor overlap (Figure 1D), confirming the cancer-specificity of the mutational landscape (27). We gather the protein sequences associated with the 736 cancer genes from the RefSeq database [March 2009, (28)]. For eight genes no RefSeq entry is available and the Ensembl protein sequence (29) is used instead.
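The assembly of the gene set described above can be illustrated with a minimal sketch. It is not the pipeline used in this work: the function names `merge_gene_lists` and `pick_protein_sequence`, the gene symbols and the sequences are all hypothetical placeholders, and the two dictionaries stand in for lookups against RefSeq and Ensembl. The sketch only shows the logic of taking the union of a curated and a screened list, measuring their overlap, and falling back to a second sequence source when the first has no entry.

```python
from typing import Dict, Optional, Set, Tuple


def merge_gene_lists(curated: Set[str], screened: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (union, overlap) of two sets of gene symbols."""
    return curated | screened, curated & screened


def pick_protein_sequence(
    gene: str,
    refseq: Dict[str, str],
    ensembl: Dict[str, str],
) -> Optional[Tuple[str, str]]:
    """Prefer the RefSeq sequence; fall back to Ensembl; None if neither has one."""
    if gene in refseq:
        return ("RefSeq", refseq[gene])
    if gene in ensembl:
        return ("Ensembl", ensembl[gene])
    return None


# Toy inputs (placeholders, not real data from either source).
curated_genes = {"GENE_A", "GENE_B", "GENE_C"}   # stands in for the curated list
screened_genes = {"GENE_B", "GENE_D", "GENE_E"}  # stands in for the screened list

all_genes, shared_genes = merge_gene_lists(curated_genes, screened_genes)

# Hypothetical sequence tables; GENE_E is deliberately missing from the first one.
refseq_table = {"GENE_A": "MKT", "GENE_B": "MSS", "GENE_C": "MAA", "GENE_D": "MLL"}
ensembl_table = {"GENE_A": "MKT", "GENE_E": "MPP"}

sequences = {
    gene: pick_protein_sequence(gene, refseq_table, ensembl_table)
    for gene in sorted(all_genes)
}
```

Because genes shared by both lists are counted once in the union, the size of the combined set is smaller than the sum of the two list sizes whenever the overlap is non-empty.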