
Chunk #30 — Data additions — New spectral and chromatographic data

Source
DrugBank 6.0: the DrugBank Knowledgebase for 2024.

Text

Retention indices (RIs) are another useful set of observables for identifying molecules via GC–MS. An RI is essentially an adjusted retention time, used in gas chromatography to allow nearly universal comparison of retention times across GC platforms. The DrugBank team used a machine learning algorithm called RI-Pred (13), which has an RI error of <2% (Table 3). Using a cut-off mass of 900 Daltons (the upper mass limit for most GC–MS instruments), a total of ∼4000 compounds in DrugBank were selected as being ‘GC–MS’ compatible. These compounds were then computationally derivatized with trimethylsilyl (TMS) and tert-butyldimethylsilyl (TBDMS) groups to generate ∼100k derivatized structures. RI-Pred was then used to predict the retention indices of all ∼100k derivatized structures on three standard types of GC columns (semi-standard non-polar, standard non-polar and standard polar). This yielded ∼300k predicted column-specific retention indices (∼100k structures × 3 column types), all of which have been entered in the ‘Predicted Spectral Properties’ subsection (under the ‘Properties’ field) of every GC–MS compatible drug. These RI data have been incorporated into DrugBank's GC–MS search function, as described later.
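
To make the idea of an "adjusted retention time" concrete, here is a minimal Python sketch of how an experimental RI is commonly obtained and compared against a predicted value. It uses the linear (van den Dool and Kratz) retention index, which interpolates an analyte's retention time between the n-alkanes that bracket it in a temperature-programmed run. This illustrates the general concept only; it is not RI-Pred or DrugBank's search code. The function names, the example retention times, and the use of the <2% prediction error as a matching window are all assumptions made for this example.

```python
from bisect import bisect_right


def linear_retention_index(rt, alkane_rts):
    """Linear (van den Dool and Kratz) retention index.

    rt         -- analyte retention time (any unit, e.g. minutes)
    alkane_rts -- dict mapping n-alkane carbon number -> retention time,
                  measured in the same run (needs at least two alkanes)
    """
    carbons = sorted(alkane_rts)
    times = [alkane_rts[c] for c in carbons]
    if len(times) < 2:
        raise ValueError("need at least two n-alkane reference points")
    if not times[0] <= rt <= times[-1]:
        raise ValueError("retention time lies outside the n-alkane ladder")

    # Index of the alkane eluting at or just before the analyte.
    i = min(bisect_right(times, rt) - 1, len(times) - 2)
    n, big_n = carbons[i], carbons[i + 1]
    t_n, t_big_n = times[i], times[i + 1]

    # Interpolate between the bracketing alkanes; each carbon counts 100 units.
    return 100.0 * (n + (big_n - n) * (rt - t_n) / (t_big_n - t_n))


def within_tolerance(observed_ri, predicted_ri, rel_tol=0.02):
    """True if the observed RI is within rel_tol of the predicted RI.

    rel_tol=0.02 reuses the <2% error figure as a matching window,
    which is an assumption of this sketch.
    """
    return abs(observed_ri - predicted_ri) <= rel_tol * predicted_ri


# Hypothetical calibration run: decane, undecane and dodecane (minutes).
alkanes = {10: 5.0, 11: 6.0, 12: 7.5}
observed = linear_retention_index(6.75, alkanes)  # halfway between C11 and C12
print(observed)                          # 1150.0
print(within_tolerance(observed, 1170))  # True  (difference 20 <= 23.4)
print(within_tolerance(observed, 1200))  # False (difference 50 > 24.0)
```

Because the index is anchored to the n-alkane ladder rather than to absolute time, the same compound should give a similar RI on different instruments with the same column type, which is why predicted values are stored per column type.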