@@ -68,7 +68,7 @@ Obviously, it is mandatory that the *vocabularies* used in the production phase
|
|
|
|
|
From Signature Type 0 onwards, the CC only deals with two distance metrics: the cosine distance and the Euclidean distance. These are well-accepted metrics that capture two different properties: the direction and the absolute distance, respectively.
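To illustrate how the two metrics differ, here is a minimal sketch in plain Python. The function names and toy vectors are illustrative assumptions, not part of the CC code base. It shows that the cosine distance ignores magnitude and responds only to direction, while the Euclidean distance responds to the absolute separation between vectors.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity (assumes both vectors are non-zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Return the straight-line (L2) distance between u and v."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical toy signatures, for illustration only.
sig_a = [1.0, 2.0, 0.0]
sig_b = [2.0, 4.0, 0.0]   # same direction as sig_a, twice the magnitude
sig_c = [2.0, -1.0, 0.0]  # orthogonal to sig_a, same magnitude

# Same direction: cosine distance is ~0, Euclidean distance is not.
cos_ab = cosine_distance(sig_a, sig_b)
euc_ab = euclidean_distance(sig_a, sig_b)

# Orthogonal vectors: cosine distance is 1.
cos_ac = cosine_distance(sig_a, sig_c)
euc_ac = euclidean_distance(sig_a, sig_c)
```

Here `sig_a` and `sig_b` are indistinguishable under the cosine distance but clearly separated under the Euclidean distance, which is why the choice of metric depends on whether magnitude carries meaning in a given dataset.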
|
|
|
|
|
|
- It may happen that some datasets require more advanced metrics, though. In this case, we recommend applying any required **transformation** of the data in the pre-processing, so as Signatures Type 0 are natively comparable using cosine/Euclidean distances. This can be achieved by metric learning algorithms. For example, one incorporate a Siamese network in the pre-processing:
|
|
|
+ It may happen that some datasets require more advanced metrics, though. In this case, we recommend applying any required **transformation** of the data in the pre-processing, so as Signatures Type 0 are natively comparable using cosine/Euclidean distances. This can be achieved by metric learning algorithms. For example, one can incorporate a Siamese network in the pre-processing:
|
|
|
|
|
|
![connectivity_examples-02](/uploads/4205269881f764f5e74af564838ebc10/connectivity_examples-02.png)
|
|
|
|
@@ -76,6 +76,8 @@ It may happen that some datasets require more advanced metrics, though. In this
|
|
|
|
|
The *mapping* (prediction) for new molecules/entities can be entered at one or multiple steps of the predict pipeline. The corresponding argument is `entry_point`.
|
|
|
|
|
|
Please note that, in this case, the key that is kept in the Signature Type 0 is exactly the one provided by the user.
|
|
|
|
|
|
### `A` Chemistry
|
|
|
|
|
|
#### `A1.001` 2D fingerprints
|