Source data
The CC capitalizes on many data sources. The following is an extensive list of resources that are worth considering in current and future versions of the CC. Inside each CC level, I list the resources in alphabetical order.
Observational data resources
A
Chemistry
- ChemoPy [paper code] - A small chemoinformatics library focused on physicochemical properties and some fingerprints.
- DeepChem [web code] - A powerful deep learning chemoinformatics library, containing a large number of featurizers.
  - Among the interesting featurizers, there are the PDB-crystal embeddings, which should, in principle, enable connectivity between crystals and small molecules.
- E3FP [paper code] - Simple representations of 3D molecular structure.
  - Integrated tightly with RDKit.
- molBLOCKS [paper code] - Decomposes small molecules into fragments (scaffolds).
- PyBioMed [paper code] - A number of physicochemical descriptors and the common fingerprints. Very similar to ChemoPy.
  - It can also featurize sequence data (protein and DNA).
- RDKit [paper code] - The standard library for chemoinformatics in Python.
  - Calculates several fingerprints and also does 3D conformational sampling.
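To give a feel for the two RDKit capabilities mentioned above (fingerprints and 3D conformational sampling), here is a minimal sketch. It is not the CC's actual pipeline: the helper name `featurize`, its parameters, the Morgan radius and bit size, and the aspirin SMILES are all illustrative assumptions.

```python
from rdkit import Chem
from rdkit.Chem import AllChem, rdFingerprintGenerator


def featurize(smiles, n_confs=5, seed=42):
    """Hypothetical helper: return a 2D fingerprint and a 3D conformer ensemble."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")

    # 2D part: Morgan (circular) fingerprint as a fixed-length bit vector.
    # Radius 2 and 2048 bits are common defaults, assumed here for illustration.
    generator = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=2048)
    fingerprint = generator.GetFingerprint(mol)

    # 3D part: add explicit hydrogens, then sample several conformers.
    mol_3d = Chem.AddHs(mol)
    conf_ids = list(
        AllChem.EmbedMultipleConfs(mol_3d, numConfs=n_confs, randomSeed=seed)
    )
    return fingerprint, mol_3d, conf_ids


# Aspirin, chosen only as a familiar example molecule.
fp, mol_3d, conf_ids = featurize("CC(=O)Oc1ccccc1C(=O)O")
print(fp.GetNumBits(), fp.GetNumOnBits(), len(conf_ids))
```

The fingerprint can be used directly for similarity searches, while the embedded conformers are the usual starting point for 3D descriptors such as those computed by E3FP.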
B
Targets
- DrugBank mode of action and metabolic genes
- ChEMBL
- BindingDB
- STITCH
- Therapeutic Target Database
- Comparative Toxicogenomics Database
- PubChem Bioassays
- HTS bioassays matrices from paper XXX
- Touchstone binding data
- Human Metabolome Database
C
Networks
- Chemical Entities of Biological Interest (ChEBI)
- MetaPhORS
- STRING
- InWEB
- Recon 1
- PathwayCommons
- Reactome
- KEGG
D
Cells
E
Clinics
- [X]