Autoencoder
I'm just writing this down so that I don't forget. Not urgent.
Once we have signature type 3 predictions for every molecule in every dataset, we can stack the signatures to get a multi-dataset description of each molecule. This could be useful, for example, as a "global" description of the molecule.
Obviously, stacking 25 datasets will yield a very long vector with substantial redundancies. We could build an autoencoder to compress this vector to e.g. 512-d.
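To make the idea concrete, here is a rough sketch of the stack-then-compress step. It is only an illustration under assumptions: the signatures are synthetic random data, the sizes are toy values (8-d per dataset, 16-d code) so that it runs in a second, and the autoencoder is a plain linear one trained with gradient descent in NumPy, not an AdaNet model. All variable names are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy sizes; the real case would use the actual signature length
# and a larger code (e.g. 512-d).
n_molecules, n_datasets, sig_dim, code_dim = 200, 25, 8, 16

# Synthetic stand-in for the signatures: each dataset is a noisy linear
# view of a few shared latent factors, which creates the redundancy.
latent = rng.normal(size=(n_molecules, 4))
signatures = []
for _ in range(n_datasets):
    mixing = rng.normal(size=(4, sig_dim))
    noise = 0.01 * rng.normal(size=(n_molecules, sig_dim))
    signatures.append(latent @ mixing + noise)

# Stack the per-dataset signatures into one long vector per molecule,
# then standardize each column.
stacked = np.hstack(signatures)  # (n_molecules, n_datasets * sig_dim)
stacked = (stacked - stacked.mean(axis=0)) / stacked.std(axis=0)

# Linear autoencoder: encoder and decoder are single weight matrices.
in_dim = stacked.shape[1]
W_enc = rng.normal(scale=0.01, size=(in_dim, code_dim))
W_dec = rng.normal(scale=0.01, size=(code_dim, in_dim))


def reconstruction_error(x):
    """Mean squared error between x and its reconstruction."""
    return float(np.mean((x @ W_enc @ W_dec - x) ** 2))


initial_error = reconstruction_error(stacked)

learning_rate = 0.2
for _ in range(1000):
    code = stacked @ W_enc
    recon = code @ W_dec
    # Gradient of the mean squared error with respect to the reconstruction.
    grad_recon = 2.0 * (recon - stacked) / stacked.size
    grad_dec = code.T @ grad_recon
    grad_enc = stacked.T @ (grad_recon @ W_dec.T)
    W_dec -= learning_rate * grad_dec
    W_enc -= learning_rate * grad_enc

final_error = reconstruction_error(stacked)
compressed = stacked @ W_enc  # (n_molecules, code_dim) "global" description

print(stacked.shape, compressed.shape)
print(f"reconstruction MSE: {initial_error:.4f} -> {final_error:.6f}")
```

A real version would probably want non-linear layers and a held-out set to check the reconstruction, but the shape of the problem is the same: long stacked vector in, short code out.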
@mbertoni: do you know of any AdaNet examples for building an autoencoder?