
Chunk #23 — UNIQUE CHEMICAL ANNOTATION

Source
canSAR: update to the cancer translational research and drug discovery knowledgebase.

Text

To develop our pipeline, we built a KNIME workflow using KNIME version 4.1.3 with RDKit KNIME integration version 4.0.1.v202002121352 and KNIME Python Integration version 4.1.3.v202005112253, running in a Python 3 environment with MolVS version 0.1.1 and ChemAxon version 20.18.0. The Class Tautomerization Plugin was used for canonicalization of molecules (Marvin 20.18, 2020, ChemAxon), and Standardizer was used for structure transformation (JChem 20.18.0, 2020, ChemAxon; http://www.chemaxon.com). In summary, compounds from multiple data sources (e.g. ChEMBL (28), BindingDB (29) and canSAR's own compounds) are imported, and any violations of the SDF format are corrected. Compounds are then Kekulised using the RDKit Kekulizer and standardised using the MolVS Standardizer tool. A key change is the subsequent use of the ChemAxon Class Tautomerization Plugin to generate canonical tautomers of all compounds. Salt-stripping and tautomer generation of the free base/acid are then carried out.
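The order of the stages above, and how their output can be used to give compounds from different sources one shared annotation, can be sketched in plain Python. This is a minimal illustration under stated assumptions and not the KNIME workflow itself: the chemistry-aware stages (Kekulisation, standardisation, canonical tautomer generation) are left as identity placeholders where the RDKit, MolVS and ChemAxon nodes would be called, `strip_salts_naive` is a crude string-level stand-in for salt-stripping, and every function name, source label and record identifier is hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

Step = Callable[[str], str]             # a stage maps a SMILES string to a SMILES string
Record = Tuple[str, str, str]           # (source, compound id, SMILES), all hypothetical


def placeholder(smiles: str) -> str:
    """Identity stand-in for a toolkit call (assumption: real node goes here)."""
    return smiles


def strip_salts_naive(smiles: str) -> str:
    """Crude stand-in for salt-stripping: keep the longest dot-separated
    SMILES component. Not chemistry-aware; for illustration only."""
    return max(smiles.split("."), key=len)


# Stage order as described in the text. Only the salt-stripping stage
# does any work in this sketch.
STEPS: List[Step] = [
    placeholder,        # Kekulise
    placeholder,        # standardise
    placeholder,        # canonical tautomer
    strip_salts_naive,  # salt-stripping
    placeholder,        # tautomer of the free base/acid
]


def run_pipeline(smiles: str, steps: Iterable[Step]) -> str:
    """Apply each stage in order and return the final structure key."""
    for step in steps:
        smiles = step(smiles)
    return smiles


def group_by_structure(
    records: Iterable[Record], steps: Iterable[Step]
) -> Dict[str, List[Tuple[str, str]]]:
    """Group (source, id) pairs that resolve to the same structure key."""
    steps = list(steps)  # allow reuse across records
    groups: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for source, compound_id, smiles in records:
        groups[run_pipeline(smiles, steps)].append((source, compound_id))
    return dict(groups)


if __name__ == "__main__":
    demo = [
        ("source_a", "a1", "CCN.Cl"),  # amine written with an HCl component
        ("source_b", "b7", "CCN"),     # the same amine as the free base
    ]
    print(group_by_structure(demo, STEPS))
```

In this sketch, both demo records resolve to the same key because the salt component is dropped before grouping. In a real pipeline the grouping key would be the canonical structure produced by the toolkits, not a raw input string.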