paperKB
coga / coga-kb
Processing
Help
Sign in

Chunk #9 — MATERIALS AND METHODS — Model creation and evaluation

Source
KinasePhos 2.0: a web server for identifying protein kinase-specific phosphorylation sites based on sequences and coupling patterns.
Embedded
yes

Text

This work incorporates support vector machine (SVM) with the protein sequences and profiles of coupling pattern for training the predictive models for kinase-specific phosphorylation site prediction. A public SVM library, namely LIBSVM (21), is applied for training the predictive models. The SVM kernel function of radial basis function (RBF) is selected. In general, the experimental kinase-specific phosphorylation sites are defined as the positive set, while all other residues (S, T, Y or H) in the phosphorylated proteins are regarded as the negative set. K-fold cross-validation is used to evaluate the predictive performance of the models trained from the large data sets including PKA, PKC and MAPK, and Jackknife cross-validation is applied for models trained from the data size smaller than 30. We balance the positive set and negative set and the sizes of positive set and negative set are equal during the cross-validation processes. The cross-validation is performed for 30 times. The following measures of predictive performance of the trained models are defined: Precision (Prec) = TP/(TP + FP), Sensitivity (Sn) = TP/(TP + FN), Specificity (Sp) = TN/(TN +