
Chunk #12 — PREDICTION PERFORMANCE

Source
KinasePhos 2.0: a web server for identifying protein kinase-specific phosphorylation sites based on sequences and coupling patterns.

Text

To find the best-performing SVM model for each kinase-specific group, models trained on different features (coupling pattern (CP), sequence, and the combination of the two) are evaluated by cross-validation. As shown in Figure 2, the average precision (Prec), sensitivity (Sn), specificity (Sp) and accuracy (Acc) of these models are calculated for phosphoserine, phosphothreonine, phosphotyrosine and phosphohistidine. Two methods are used to extract the coupling patterns: ‘CP difference’ and ‘CP ratio’. ‘CP difference’ is the coupling strength of the training set minus the coupling strength of the background set, and ‘CP ratio’ is the coupling strength of the training set divided by the coupling strength of the background set. For the sequence profile feature, several coding methods are used to encode the amino acids surrounding the phosphorylation sites, such as reduced alphabets (3, 7 and 8 classes), BLOSUM62 profile encoding and a 20-dimensional vector. Because the average predictive performance for kinase-specific phosphorylation sites with a small training set may be overestimated, the SVM models of kinase-specific groups whose data size
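
The two coupling-pattern features and the four evaluation measures named in this section can be illustrated with a minimal sketch. It is not the authors' implementation. The chunk does not say how coupling strength itself is computed, so the sketch takes per-pair coupling strengths as given inputs. The function names, the dictionary layout keyed by residue pair, and the small `eps` guard against division by zero are all assumptions made for illustration. The formulas for Prec, Sn, Sp and Acc are the standard confusion-matrix definitions.

```python
from typing import Dict, Tuple

# Hypothetical layout: a coupling strength per residue pair, e.g. ("S", "P").
Pair = Tuple[str, str]
Strengths = Dict[Pair, float]


def cp_difference(train: Strengths, background: Strengths) -> Strengths:
    """'CP difference': training-set strength minus background-set strength."""
    return {pair: train[pair] - background.get(pair, 0.0) for pair in train}


def cp_ratio(train: Strengths, background: Strengths, eps: float = 1e-9) -> Strengths:
    """'CP ratio': training-set strength divided by background-set strength.

    eps is an assumption of this sketch (not from the source) that avoids
    division by zero when a pair is absent from the background set.
    """
    return {pair: train[pair] / max(background.get(pair, 0.0), eps) for pair in train}


def evaluate(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard precision, sensitivity, specificity and accuracy."""
    return {
        "Prec": tp / (tp + fp),                   # correct among predicted positives
        "Sn": tp / (tp + fn),                     # recovered true sites
        "Sp": tn / (tn + fp),                     # correctly rejected non-sites
        "Acc": (tp + tn) / (tp + fp + tn + fn),   # overall fraction correct
    }


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    train = {("S", "P"): 0.5, ("S", "K"): 0.1}
    background = {("S", "P"): 0.2, ("S", "K"): 0.4}
    print(cp_difference(train, background))
    print(cp_ratio(train, background))
    print(evaluate(tp=8, fp=2, tn=6, fn=4))
```

In a cross-validation setting, `evaluate` would be applied to the confusion counts of each fold and the results averaged, which matches the averaged measures the text refers to.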