New Pipeline
@mbertoni To design the new pipeline, we should first define every step it will follow, including any substeps. We can also already decide which steps will run sequentially and which in parallel. This list of steps should help us choose the solution that best fits our needs.
1. Downloads
2. Mol repos
3. Load ChEMBL DB (how do we set the DB name: by parsing it, or from a config file?)
4. Signature 0 (Preprocess)
   4.1 Chemistry spaces (need to wait for the other spaces to finish)
   4.2 Network (C5.001): the following networks need to be processed before starting the production of the raws:
       - Recon
       - String
       - Inbiomap
       - Ppidb
5. Remove Near Duplicates
6. Signature 1
7. Clustering
8. NearestNeighbours
9. Signature 2
10. Projections
11. Web stuff
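
To help with the sequential vs. parallel decision, here is a rough sketch that writes the steps above as a dependency graph and groups them into stages whose members could run in parallel. The step names come from the list. The dependency edges are mostly assumptions made for illustration and need to be corrected by us: only "networks before Signature 0" and "chemistry spaces wait for the other spaces" come from the notes above. The identifiers (`DEPENDENCIES`, `parallel_stages`, `sign0_other_spaces`, etc.) are hypothetical. It uses `graphlib` from the Python standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Maps each step to the set of steps that must finish before it can start.
# ASSUMPTION: most of these edges are guesses for illustration only.
DEPENDENCIES = {
    "downloads": set(),
    "mol_repos": {"downloads"},
    "load_chembl_db": {"downloads"},
    "networks": {"downloads"},  # Recon, String, Inbiomap, Ppidb
    # From the notes: networks must be processed before the raws are produced.
    "sign0_other_spaces": {"mol_repos", "load_chembl_db", "networks"},
    # From the notes: chemistry spaces wait for the other spaces to finish.
    "sign0_chemistry_spaces": {"sign0_other_spaces"},
    "remove_near_duplicates": {"sign0_chemistry_spaces"},
    "signature1": {"remove_near_duplicates"},
    "clustering": {"signature1"},
    "nearest_neighbours": {"signature1"},
    "signature2": {"nearest_neighbours"},
    "projections": {"signature1"},
    "web": {"clustering", "signature2", "projections"},
}


def parallel_stages(dependencies):
    """Group steps into ordered stages.

    Steps inside one stage do not depend on each other, so they are
    candidates for parallel execution; the stages themselves run in order.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        stages.append(ready)
        sorter.done(*ready)
    return stages


if __name__ == "__main__":
    for number, stage in enumerate(parallel_stages(DEPENDENCIES), start=1):
        print(f"stage {number}: {', '.join(stage)}")
```

With these assumed edges, Mol repos, Load ChEMBL DB and the networks end up in the same stage after Downloads, and Clustering, NearestNeighbours and Projections end up together after Signature 1. Changing an edge in `DEPENDENCIES` is enough to see how the stages shift, which may make it easier to compare candidate solutions.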