
Chunk #39 — Experimental Procedures — Hallmark generation methodology — Step 4: Define raw hallmark sets

Source
The Molecular Signatures Database (MSigDB) hallmark gene set collection.

Text

We defined a raw hallmark for each of the 50 hallmarks produced by the previous step. A raw hallmark is the union of the genes in a cluster of founder gene sets, after excluding all “unknown genes”. We considered a gene “unknown” if it had been identified exclusively by automatic computational predictions or if it represented a poorly documented sequence such as an EST. Specifically, we defined a gene as “unknown” if its official gene symbol (according to NCBI Entrez and HUGO) matched the naming conventions of an EST (e.g., “KIAA”, “LOC”, “MGC”, “FLJ”, or “DKFZp” followed by digits).
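
The filtering rule described above can be illustrated with a minimal sketch. This is not the authors' code: the function names, the regular expression, and the example gene symbols are assumptions made for illustration. In particular, the text introduces the prefixes with "e.g.", so the actual list may be longer, and the sketch treats "followed by digits" as "the symbol starts with one of the prefixes and at least one digit".

```python
import re

# Assumed pattern: one of the prefixes named in the text, followed by at least
# one digit. The source lists these prefixes as examples, so this list may be
# incomplete relative to what the authors actually used.
UNKNOWN_GENE_PATTERN = re.compile(r"^(KIAA|LOC|MGC|FLJ|DKFZp)\d+")


def is_unknown_gene(symbol: str) -> bool:
    """Return True if the symbol looks like an EST-style placeholder name."""
    return UNKNOWN_GENE_PATTERN.match(symbol) is not None


def raw_hallmark(founder_sets):
    """Union the genes of a cluster of founder gene sets, dropping unknown genes.

    `founder_sets` is an iterable of sets of gene symbols (hypothetical input
    layout; the source does not specify a data format).
    """
    union = set().union(*founder_sets)
    return {gene for gene in union if not is_unknown_gene(gene)}


# Illustrative input only; these clusters are not taken from the source.
example_cluster = [
    {"TP53", "KIAA0101", "MYC"},
    {"MYC", "LOC100129", "CDKN1A"},
]
example_raw_hallmark = raw_hallmark(example_cluster)
```

With the illustrative cluster above, the two placeholder-style symbols are dropped and the remaining genes from both founder sets are kept once each.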