"Entry points" in pre-processing scripts

Pre-processing scripts should have a predict() method, too. I think there was a misunderstanding here.

This is what I wrote in the wiki:

"Every dataset has a particular processing protocol, always consisting of two consecutive steps:

1. Fetching of data and conversion to a standard input file.
   - It is very important that data are minimally transformed here.
   - Data may be fetched from the downloaded files, from calculated properties, or from a user-supplied file of interest.
2. From standard input to signature type 0.
   - When adding/updating a dataset, all procedures here must be encapsulated in a fit() method.
   - Accordingly, a predict() method must be available.
   - Acceptable standard inputs include: .gmt, .h5 and .tsv. It is strongly recommended that input features are recognizable entities, e.g. those defined in the Bioteque.

It is of the utmost importance that step 2 is endowed with a predict() method. Having the ability to convert any standard input to a signature type 0 (in an automated manner) will enable the implementation of connectivity methods. This is a critical feature of the CC and I anticipate that most of our efforts will go into this particular step."
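
To make the request concrete, here is a minimal sketch of what paired fit() and predict() entry points could look like. It assumes a two-column .tsv of (key, feature) pairs as the standard input and uses a plain binary matrix as a stand-in for signature type 0. The class name, the helper methods and the file layout are all hypothetical; none of this is taken from the actual CC code.

```python
import csv
import io


class Preprocessor:
    """Hypothetical pre-processing entry points (illustration only)."""

    def __init__(self):
        self.features = None  # column order learned by fit()

    @staticmethod
    def _read_pairs(handle):
        # Assumed standard input: two-column .tsv of (key, feature) pairs.
        pairs = {}
        for key, feature in csv.reader(handle, delimiter="\t"):
            pairs.setdefault(key, set()).add(feature)
        return pairs

    def _to_matrix(self, pairs):
        # One row per key, one binary column per feature known from fit().
        keys = sorted(pairs)
        rows = [[int(f in pairs[k]) for f in self.features] for k in keys]
        return keys, rows

    def fit(self, handle):
        # Used when adding/updating a dataset: learns the feature space.
        pairs = self._read_pairs(handle)
        self.features = sorted(set().union(*pairs.values()))
        return self._to_matrix(pairs)

    def predict(self, handle):
        # Used on new input: reuses the feature space fixed by fit().
        if self.features is None:
            raise RuntimeError("call fit() before predict()")
        return self._to_matrix(self._read_pairs(handle))


pre = Preprocessor()

# Dataset update: fit() defines the columns of the matrix.
train = io.StringIO("mol1\ttargetA\nmol1\ttargetB\nmol2\ttargetB\n")
keys, sign0 = pre.fit(train)

# New input: predict() maps it onto the same columns.
new = io.StringIO("mol3\ttargetA\nmol3\ttargetC\n")
new_keys, new_sign0 = pre.predict(new)
```

The point of the sketch is that predict() must not redefine the feature space: features unseen during fit() (targetC above) are dropped, so the new rows stay comparable with the ones produced when the dataset was fitted.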