Similarity and connectivity

🚧 This page is under construction.

In the context of the CC, "connectivity" is a generalization of "similarity": it offers a more flexible way to compare entities. The notion of connectivity is of special interest to unsupervised drug discovery because it allows external biological data to be mapped onto the chemical space.

In the CC pipeline, connectivity happens at the pre-processing step. The pre-processing step has two phases:

More precisely, connectivity starts with standard input files and finishes with a signature type 0.

For some datasets, this procedure can be considerably complex, and we need to design workflows that can be wrapped into a predict() method. For other datasets, the procedure is almost trivial.
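To make the idea of wrapping a workflow into a predict() method concrete, here is a minimal sketch of the "almost trivial" case. It is not the CC implementation: the class name `ConnectivityPreprocessor`, the feature vocabulary, and the record format (a dictionary from entity key to a list of feature names) are all assumptions made for illustration, and the binary matrix only stands in for a signature type 0.

```python
class ConnectivityPreprocessor:
    """Hypothetical wrapper: maps external records onto a fixed feature space."""

    def __init__(self, features):
        # Ordered feature vocabulary (e.g. gene identifiers); an assumption.
        self.features = list(features)
        self._column = {name: i for i, name in enumerate(self.features)}

    def predict(self, records):
        """Turn {key: iterable of feature names} into (keys, binary matrix)."""
        keys = sorted(records)  # deterministic row order
        matrix = []
        for key in keys:
            row = [0] * len(self.features)
            for name in records[key]:
                column = self._column.get(name)
                if column is not None:  # features outside the vocabulary are ignored
                    row[column] = 1
            matrix.append(row)
        return keys, matrix


# Usage with made-up identifiers: "GX" is not in the vocabulary and is dropped.
pre = ConnectivityPreprocessor(["G1", "G2", "G3"])
keys, sign0_like = pre.predict({"cmpd_b": ["G3"], "cmpd_a": ["G1", "G2", "GX"]})
print(keys)        # ['cmpd_a', 'cmpd_b']
print(sign0_like)  # [[1, 1, 0], [0, 0, 1]]
```

A more complex dataset would keep the same predict() interface and hide the extra steps (parsing, normalization, mapping to reference entities) behind it.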

Another important consideration is the choice of distance. The CC works with common distance metrics, such as the cosine or Euclidean distances. Sometimes, connectivity may require other types of measures (e.g. GSEA-like scores, set overlap, etc.). We might consider training Siamese networks that transform the original distances into the more standard ones. This is an unexplored avenue, though.
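The difference between the standard distances and an overlap-type measure can be illustrated with a short sketch. The function names are illustrative only and are not part of the CC code base; the overlap measure shown is one minus the overlap coefficient, chosen here as a simple example of a set-based comparison.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def overlap_distance(a, b):
    """1 - overlap coefficient: |A & B| / min(|A|, |B|) on two non-empty sets."""
    a, b = set(a), set(b)
    if not a or not b:
        raise ValueError("overlap distance is undefined for empty sets")
    return 1.0 - len(a & b) / min(len(a), len(b))


print(euclidean_distance([0, 0], [3, 4]))              # 5.0
print(cosine_distance([1, 0], [0, 1]))                 # 1.0
print(overlap_distance({"a", "b"}, {"b", "c", "d"}))   # 0.5
```

Note that the overlap-based measure returns 0 whenever one set contains the other, even if the sets differ, so it is not a proper metric. This is one reason a transformation to cosine or Euclidean space could be useful.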
