The following is a list of chemical libraries that we like to use in virtual screening exercises. This is only a small selection. The best way to browse compound collections is via ZINC Subsets and, especially, ZINC Catalogs. ZINC assigns an
abbreviation to each collection, and we stick with ZINC notation if possible.
In the SB&NB file-system, compound collections can be found in
The chemical_checker PostgreSQL database contains the compound collections, too, in the
- Included in the CC
- Not yet included in the CC
These are popular and representative compound libraries that we offer as default search libraries in the Chemical Checker similarity resource.
dbap: Approved drugs according to DrugBank [data]
dbex: Experimental drugs according to DrugBank [data]
hmbdb: Human metabolites available from the Human Metabolome Database [data]
Traditional Chinese Medicines
tcmnp: Traditional Chinese Medicines from TCM Database @ Taiwan. This is the world's largest collection of Chinese medicines [data]
lincs: Compounds, mainly from the Broad Institute Library, related to the LINCS consortium [data]
Prestwick Chemical Library
prestwick: A commercial collection of over 1.2k off-patent drugs [data]
NIH Clinical Collection
nihcc: Screening library used at the National Institutes of Health [data]
NCI Diversity Collection
ncidiv: Screening library of the Developmental Therapeutics Program [data]
toolcompounds: Tool compounds according to ZINC [data]
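Because we refer to these default libraries by their ZINC-style abbreviations, a small lookup table can help when scripting. The following is only a minimal sketch: the `LIBRARIES` dictionary and the `describe` helper are hypothetical names introduced here for illustration, not part of the Chemical Checker or ZINC, and the descriptions are simply copied from the list above.

```python
# Hypothetical lookup table; keys are the abbreviations used on this page,
# values are the short descriptions given in the list above.
LIBRARIES = {
    "dbap": "Approved drugs according to DrugBank",
    "dbex": "Experimental drugs according to DrugBank",
    "hmbdb": "Human metabolites available from the Human Metabolome Database",
    "tcmnp": "Traditional Chinese Medicines from TCM Database @ Taiwan",
    "lincs": "Compounds related to the LINCS consortium",
    "prestwick": "A commercial collection of over 1.2k off-patent drugs",
    "nihcc": "Screening library used at the National Institutes of Health",
    "ncidiv": "Screening library of the Developmental Therapeutics Program",
    "toolcompounds": "Tool compounds according to ZINC",
}


def describe(abbreviation: str) -> str:
    """Return the description of a library, matching the abbreviation case-insensitively."""
    key = abbreviation.strip().lower()
    if key not in LIBRARIES:
        raise KeyError(f"Unknown library abbreviation: {abbreviation!r}")
    return LIBRARIES[key]


if __name__ == "__main__":
    # Print the whole table in alphabetical order of abbreviation.
    for abbr in sorted(LIBRARIES):
        print(f"{abbr}: {describe(abbr)}")
```

An unknown abbreviation raises a `KeyError` rather than returning an empty string, so typos in a screening script surface early.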
There is a large number of chemogenomics databases that one may consider. I list here the ones selected in an important recent review on the matter.
- Pfizer chemogenomic library: Compound selection based on the most selective pharmacological probe for a given target, and maximal chemical and biological diversity; dominated by kinases, GPCRs and ion channels; available for external collaboration.
- Sigma library of pharmacologically active compounds (LOPAC1280): Commercially available, widely reported, well suited for GPCR biology.
- Prestwick Chemical Library (see above): Contains only approved drugs.
- GlaxoSmithKline Biologically Diverse Compound Set: Selective pharmacological probes, dominated by kinase and GPCR targets.
- Mechanism Interrogation PlatE 3.0 (NCATS): Oncology focused and dominated by kinase inhibitors. Well suited for anticancer screens.
- NIH Molecular Libraries Program Probes: Good coverage of nucleic acid-binding proteins; all bioassay data are open access.
- GlaxoSmithKline Protein Kinase Inhibitor Set: Open source kinase chemical probes.
- Pathogen Box: Approx. 400 diverse, drug-like molecules active against neglected diseases of interest; available free of charge. Formerly the Malaria Box.
- Pandemic Response Box: 400 compounds.
- Malaria Box
- Reframe: A screening library of 12,000 molecules assembled by combining three databases (Clarivate Integrity, GVK Excelra GoStar and Citeline Pharmaprojects) to facilitate drug repurposing.
- Probes&Drugs: 4,050 probes and 12,329 drugs. A public resource that joins focused libraries of bioactive compounds (probes, drugs, specific inhibitor sets, etc.) with commercially available screening libraries.
Natural product databases
We have a particular interest in natural product (NP) databases, mainly because they are likely to be useful for Global Health research.
- South African Natural Compounds Database (SANCDB): Contains about 600 NPs, all of them with some degree of bioactivity annotation. Belongs to the Research Unit in Bioinformatics (RUBi), NIH Common Fund and Rhodes University.
- AfroDB: Dataset 1 in Ntie-Kang et al. 2013. Contains 947 compounds. Coordinated from Cameroon.
- Natural Product Activity and Species Source database (NPASS): Over 35k well-annotated compounds from 25k source organisms.
- Northern African Natural Products Database (NANPDB): About 4.5k NPs from Northern Africa, mainly from plants.
- Brazilian Natural Compound Database (NUBBEdb): Contains 640 molecules mainly from plants in Brazil.
- BIOFACQUIM: A Mexican Compound Database of Natural Products.
- CMAUP: Collective Molecular Activities of Useful Plants.
- FooDB: 24,000 food chemicals.
- Chem-Space: Downloadable compound and target sets.
- Boehringer BI Miner
- IRB Barcelona library