# Similarity and connectivity

:construction: This page is under construction.

In the context of the CC, "connectivity" is a generalization of "similarity": a more flexible way to compare entities. Connectivity is of special interest for *unsupervised* drug discovery because it enables mapping external biological data onto the chemical space.

In the [CC pipeline](production phase), connectivity happens at the pre-processing step, which has [two phases](datasets):

1. XX
2. XX

More precisely, connectivity starts with **standard input files** and ends with a **signature type 0**.

For some datasets, this procedure can be considerably complex, and we need to design workflows that can be wrapped into a `predict()` method. For other datasets, it will be almost trivial.
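As a rough illustration of the "almost trivial" case, the sketch below wraps a conversion from input rows to a numeric matrix behind a `predict()` method. Everything apart from the name `predict()` is an assumption made for this example: the class `TrivialConnectivity`, the helper `read_rows`, the tab-separated `(key, feature, value)` layout, and the toy data are hypothetical and do not describe the actual CC input format or code.

```python
import csv
import io

import numpy as np


def read_rows(handle):
    """Parse tab-separated (key, feature, value) lines (an assumed layout)."""
    return [(k, f, float(v)) for k, f, v in csv.reader(handle, delimiter="\t")]


class TrivialConnectivity:
    """Hypothetical wrapper that turns (key, feature, value) rows into a matrix."""

    def fit(self, rows):
        # Fix the feature order once, so that every later call to predict()
        # returns vectors with the same columns.
        self.features_ = sorted({feature for _, feature, _ in rows})
        self.index_ = {f: j for j, f in enumerate(self.features_)}
        return self

    def predict(self, rows):
        keys = sorted({key for key, _, _ in rows})
        row_of = {k: i for i, k in enumerate(keys)}
        matrix = np.zeros((len(keys), len(self.features_)))
        for key, feature, value in rows:
            j = self.index_.get(feature)
            if j is not None:  # features not seen during fit() are ignored
                matrix[row_of[key], j] = value
        return keys, matrix


# Toy input with made-up identifiers, standing in for a standard input file.
raw = "mol1\tgeneA\t1.0\nmol1\tgeneB\t-0.5\nmol2\tgeneB\t2.0\n"
rows = read_rows(io.StringIO(raw))
keys, sig0 = TrivialConnectivity().fit(rows).predict(rows)
```

Here `sig0` has one row per key and one column per feature seen in `fit()`. A more complex dataset would keep the same `predict()` entry point and replace only the body of the workflow.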

Another important matter is the distance metric. The CC works with *common* distance metrics, such as `cosine` or `euclidean`. Connectivity may sometimes require other types of metrics (e.g. GSEA-like or overlap-based). We might consider training siamese networks that transform the original distances into the more standard ones, although this avenue remains unexplored.
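To make the contrast concrete, here is a minimal sketch of the two common distances next to one possible overlap-based measure. Reading "overlap" as the overlap coefficient between two sets is an assumption of this example, and the function names and toy labels are hypothetical. GSEA-like scores and siamese networks are not covered.

```python
import numpy as np


def cosine_distance(u, v):
    """1 - cosine of the angle between u and v (both must be non-zero)."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return float(1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def euclidean_distance(u, v):
    """Straight-line distance between u and v."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return float(np.linalg.norm(u - v))


def overlap_distance(a, b):
    """1 - |A & B| / min(|A|, |B|) for two non-empty sets (assumed definition)."""
    a, b = set(a), set(b)
    return 1.0 - len(a & b) / min(len(a), len(b))


# Parallel vectors: no angle between them, but a non-zero Euclidean gap.
u, v = [1.0, 0.0, 1.0], [2.0, 0.0, 2.0]
d_cos = cosine_distance(u, v)
d_euc = euclidean_distance(u, v)

# Toy sets with made-up labels: the second is a subset of the first.
d_ovl = overlap_distance({"a", "b", "c"}, {"b", "c"})
```

With this definition a set is at distance 0 from any of its supersets, so the overlap-based measure is not a metric in the strict sense, unlike the Euclidean distance.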
## Chemistry

## Targets

## Networks

## Cells

## Diseases