Chemical libraries
The following is a list of chemical libraries that we like to use in virtual screening exercises. This is only a small selection. The best way to browse compound collections is via ZINC Subsets and, especially, ZINC Catalogs. ZINC assigns an abbreviation to each collection, and we stick with ZINC notation if possible.
In the SB&NB file system, compound collections can be found in /aloy/chemical_checker_repo/libraries/. The chemical_checker PostgreSQL database also contains the compound collections, in the libraries table.
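As a minimal sketch, assuming Python on a machine that mounts the SB&NB file system, the two access routes mentioned above could be explored as follows. The function names are hypothetical. Nothing is assumed about how files are named inside the directory or about the columns of the libraries table: the first function only lists directory entries, and the second reads column names back from the cursor. The connection is any DB-API 2.0 connection you already hold (for example, one opened with psycopg2 against the chemical_checker database).

```python
from pathlib import Path

# Directory quoted in the text; only reachable from inside the SB&NB file system.
LIBRARIES_DIR = Path("/aloy/chemical_checker_repo/libraries/")


def list_library_files(root=LIBRARIES_DIR):
    """Return the sorted names of the entries directly under `root`.

    An empty list is returned when `root` is not an existing directory.
    """
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(entry.name for entry in root.iterdir())


def preview_libraries_table(conn, limit=5):
    """Return (column_names, rows) for the first `limit` rows of `libraries`.

    `conn` is an open DB-API 2.0 connection. The column layout is not
    assumed: names are read from the cursor description after the query.
    """
    cur = conn.cursor()
    try:
        # int() makes the LIMIT value safe to interpolate, which avoids
        # depending on a driver-specific placeholder style.
        cur.execute(f"SELECT * FROM libraries LIMIT {int(limit)}")
        columns = [col[0] for col in cur.description]
        rows = cur.fetchall()
    finally:
        cur.close()
    return columns, rows
```

Calling `list_library_files()` with no argument uses the path given above, and `preview_libraries_table(conn)` is a quick way to inspect the table layout before writing more specific queries.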
- Included in the CC
- Not yet included in the CC
Exemplary collections
These are popular and representative compound libraries that we offer as default search libraries in the Chemical Checker similarity resource.
- Approved Drugs (dbap): Approved drugs according to DrugBank [data]
- Experimental Drugs (dbex): Experimental drugs according to DrugBank [data]
- Human Metabolites (hmbdb): Human metabolites available from the Human Metabolome Database [data]
- Traditional Chinese Medicines (tcmnp): Traditional Chinese Medicines from TCM Database @ Taiwan. This is the world's largest collection of Chinese medicines [data]
- LINCS Compounds (lincs): Compounds, mainly from the Broad Institute Library, related to the LINCS consortium [data]
- Prestwick Chemical Library (prestwick): A commercial collection of over 1.2k off-patent drugs [data]
- NIH Clinical Collection (nihcc): Screening library used at the National Institutes of Health [data]
- NCI Diversity Collection (ncidiv): Screening library of the Developmental Therapeutics Program [data]
- Tool Compounds (toolcompounds): Tool compounds according to ZINC [data]
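When scripting against these collections, it can help to keep the abbreviations in one place. The snippet below is only an illustrative sketch: the mapping is copied from the list above, while the names `EXEMPLARY_LIBRARIES` and `library_name` are hypothetical and not part of the Chemical Checker code.

```python
# Abbreviation -> display name, copied from the list of exemplary collections.
EXEMPLARY_LIBRARIES = {
    "dbap": "Approved Drugs",
    "dbex": "Experimental Drugs",
    "hmbdb": "Human Metabolites",
    "tcmnp": "Traditional Chinese Medicines",
    "lincs": "LINCS Compounds",
    "prestwick": "Prestwick Chemical Library",
    "nihcc": "NIH Clinical Collection",
    "ncidiv": "NCI Diversity Collection",
    "toolcompounds": "Tool Compounds",
}


def library_name(abbreviation):
    """Return the display name for an abbreviation (case-insensitive).

    Raises KeyError for abbreviations that are not in the mapping.
    """
    key = abbreviation.strip().lower()
    if key not in EXEMPLARY_LIBRARIES:
        raise KeyError(f"Unknown library abbreviation: {abbreviation!r}")
    return EXEMPLARY_LIBRARIES[key]
```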
Chemogenomics libraries
There is a large number of chemogenomics databases that one may consider. We list here the ones selected in an important recent review on the matter.
- Pfizer chemogenomic library: Compound selection based on the most selective pharmacological probe for a given target, and maximal chemical and biological diversity; dominated by kinases, GPCRs and ion channels; available for external collaboration.
- Sigma library of pharmacologically active compounds (LOPAC1280): Commercially available, widely reported, well suited for GPCR biology.
- Prestwick Chemical Library (see above): Contains only approved drugs.
- GlaxoSmithKline Biologically Diverse Compound Set: Selective pharmacological probes, dominated by kinase and GPCR targets.
- Mechanism Interrogation PlatE 3.0 (NCATS): Oncology focused and dominated by kinase inhibitors. Well suited for anticancer screens.
- NIH Molecular Libraries Program Probes: Good coverage of nucleic acid-binding proteins; all bioassay data are open access.
- GlaxoSmithKline Protein Kinase Inhibitor Set: Open-source kinase chemical probes.
- Pathogen Box: Approx. 400 diverse, drug-like molecules active against neglected diseases of interest, available free of charge. Formerly the Malaria Box.
- Pandemic Response Box: 400 compounds.
- Malaria Box
- ReFRAME: A screening library of 12,000 molecules assembled by combining three databases (Clarivate Integrity, GVK Excelra GoStar and Citeline Pharmaprojects) to facilitate drug repurposing.
- Probes&Drugs: 4050 probes and 12329 drugs. A public resource joining together focused libraries of bioactive compounds (probes, drugs, specific inhibitor sets, etc.) with commercially available screening libraries.
Natural product databases
We have a particular interest in natural product (NP) databases, mainly because they are likely to be useful for Global Health research.
- South African Natural Compounds Database (SANCDB): Contains about 600 NPs, all of them with some degree of bioactivity annotation. Belongs to the Research Unit in Bioinformatics (RUBi), NIH Common Fund and Rhodes University.
- AfroDB: Dataset 1 in Ntie-Kang et al. 2013. Contains 947 compounds. Coordinated from Cameroon.
- Natural Product Activity and Species Source database (NPASS): Over 35k well-annotated compounds, belonging to 25k organisms.
- Northern African Natural Products Database (NANPDB): About 4.5k NPs from Northern Africa, mainly from plants.
- Brazilian Natural Compound Database (NUBBEdb): Contains 640 molecules, mainly from plants in Brazil.
- BIOFACQUIM: A Mexican Compound Database of Natural Products.
- CMAUP: Collective Molecular Activities of Useful Plants.
- FooDB: 24,000 food chemicals.
Purchasable libraries
- Chem-Space: Downloadable compound and target sets.
Free collections
- Boehringer BI Miner
- IRB Barcelona library