Speed up similarity search

I see three ways to speed up the similarity search, which is currently prohibitively slow.

Modify the similarity scripts so that they can work with cdist.
Hash/binarize the vectors and compare the resulting codes with e.g. Hamming distance. The hashing can be data-dependent (learned from the vectors) or data-independent (e.g. random projections).
Use annoy or another approximate nearest-neighbor search library.
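
To make the first two options concrete, here is a rough sketch, not a drop-in replacement for the existing scripts. It assumes "cdist" means `scipy.spatial.distance.cdist` and that the vectors fit in memory as NumPy arrays. The function names (`top_k_cdist`, `binarize`, `top_k_hamming`), the cosine metric, the array sizes, and the random data are all made up for illustration. The hashing shown is the data-independent variant (signs of random projections). The annoy option is not shown here.

```python
import numpy as np
from scipy.spatial.distance import cdist


def top_k_cdist(queries, corpus, k=5, metric="cosine"):
    """Exact top-k neighbours from a single vectorized cdist call."""
    dists = cdist(queries, corpus, metric=metric)  # shape (n_queries, n_corpus)
    idx = np.argsort(dists, axis=1)[:, :k]
    return idx, np.take_along_axis(dists, idx, axis=1)


def binarize(vectors, hyperplanes):
    """Data-independent hashing: keep only the sign of each random projection."""
    return (vectors @ hyperplanes) > 0  # boolean codes, one column per bit


def top_k_hamming(query_codes, corpus_codes, k=5):
    """Approximate top-k neighbours by Hamming distance between binary codes."""
    dists = cdist(query_codes, corpus_codes, metric="hamming")
    idx = np.argsort(dists, axis=1)[:, :k]
    return idx, np.take_along_axis(dists, idx, axis=1)


# Hypothetical data: 1000 corpus vectors of dimension 64, and 3 queries that
# are slightly perturbed copies of the first 3 corpus vectors.
rng = np.random.default_rng(0)
corpus = rng.normal(size=(1000, 64))
queries = corpus[:3] + 0.01 * rng.normal(size=(3, 64))

# 128 random hyperplanes give 128-bit codes (an arbitrary choice).
hyperplanes = rng.normal(size=(64, 128))

exact_idx, exact_dists = top_k_cdist(queries, corpus)
approx_idx, approx_dists = top_k_hamming(
    binarize(queries, hyperplanes), binarize(corpus, hyperplanes)
)
```

The Hamming variant here still goes through `cdist` for simplicity. A real speed-up would probably pack the bits (e.g. with `np.packbits`) and compare them with XOR and popcount, or use the codes only as a cheap pre-filter before exact re-ranking.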