Chunk #9 — TOPPGENE: FUNCTIONAL ANNOTATIONS-BASED CANDIDATE GENE PRIORITIZATION

Source: ToppGene Suite for gene list enrichment analysis and candidate gene prioritization.
Embedded: yes

Text

similarity measures of categorical (e.g. GO annotations) and numeric (i.e. gene expression) annotations. A fuzzy-based similarity measure is applied to categorical terms [see Popescu et al. (30) for additional details]; for numeric annotations, i.e. the microarray expression values, the similarity score is calculated as the Pearson correlation of the expression vectors of the two genes. The 14 similarity scores are combined into an overall score using statistical meta-analysis. A P-value for each annotation of a test gene G is derived by random sampling of the whole genome. The P-value of similarity score $S_i$ is defined as:

$$p(S_i) = \frac{\text{count of genes having a score higher than } S_i \text{ in the random sample}}{\text{count of genes in the random sample containing the annotation}}$$

Fisher's inverse chi-square method, which states that

$$-2 \sum_{i=1}^{n} \log p_i \rightarrow \chi^2(2n)$$

(assuming the $p_i$ values come from independent tests), is then applied to combine the P-values from multiple annotations into an overall P-value. The final similarity score of the test gene is then obtained as 1 minus the combined P-value. For more details, validation and comparison with other related applications, readers are referred to our previous study (10).
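The scoring steps described above can be illustrated with a minimal Python sketch. It is not the ToppGene implementation: all function names are hypothetical, the empirical P-value is assumed to be the fraction of randomly sampled genes that score higher than the test gene, and the small `floor` applied before taking logarithms is an added assumption to avoid log(0). The chi-square tail probability uses the closed form that holds for an even number of degrees of freedom (2n), so only the standard library is needed.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def empirical_p_value(score: float, random_scores: Sequence[float]) -> float:
    """Fraction of randomly sampled genes scoring higher than `score`.

    Assumption: `random_scores` holds the similarity scores of sampled
    genes that carry the annotation in question.
    """
    higher = sum(1 for s in random_scores if s > score)
    return higher / len(random_scores)


def fisher_combined_p(p_values: Sequence[float], floor: float = 1e-12) -> float:
    """Combine independent P-values with Fisher's inverse chi-square method.

    X = -2 * sum(ln p_i) follows a chi-square distribution with 2n degrees
    of freedom under the null. For even degrees of freedom the upper tail is
    exp(-X/2) * sum_{k=0}^{n-1} (X/2)^k / k!.
    `floor` (an assumption of this sketch) guards against p = 0.
    """
    n = len(p_values)
    half_stat = -sum(math.log(max(p, floor)) for p in p_values)  # X / 2
    term = 1.0
    total = 1.0
    for k in range(1, n):
        term *= half_stat / k
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def overall_similarity_score(p_values: Sequence[float]) -> float:
    """Final score of a test gene: 1 minus the combined P-value."""
    return 1.0 - fisher_combined_p(p_values)
```

For example, `overall_similarity_score([empirical_p_value(s, rand) for s, rand in per_annotation])` would give the final score for a test gene, where `per_annotation` is a hypothetical list pairing each annotation's similarity score with its random-sample scores. Fisher's method assumes independent tests, so strongly correlated annotations could make the combined P-value look more significant than it is.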