Miquel Duran-Frigola · a363d8d5
--- a/signaturization.md
+++ b/signaturization.md
@@ -9,13 +9,13 @@ The central type of data are the signatures (one numerical vector per molecule),
 * `sign2` [Signatures type 2](#signatures-type-2): Network embedding of the similarity matrix derived from signatures. They have fixed-length, which is convenient for machine learning, and capture both explicit and implicit similarity relationships in the data.
 * `sign3` [Signatures type 3](#signatures-type-3): Network embedding of observed *and* inferred similarity networks. Their added value, compared to signatures type 2, is that they can be derived for virtually *any* molecule in *any* dataset. :warning: These signatures are not calculated yet, and won't be in the near future.  

-Besides, there are other (auxiliary) types of data that may be of interest. `*` denotes correspondence with signatures type `0`-`3`. 
+Besides, there are other (auxiliary) types of data that may be of interest. The asterisk `*` denotes correspondence with signatures type `0`-`3`. 

 * `sims*` [Similarity vectors](#similarity-vectors): Full similarities stored as light `int8` data. Each molecule receives one such similarity vector per dataset. They may be observed (`_obs`) or predicted (`_prd`) similarities. These signatures are [only applicable to exemplary datasets](production-phase). Currently, we only keep `sims1`.
-* `neig*` [Nearest-neighbors](#nearest-neighbors): . Currently, we consider the 1000-nearest neighbors, which is more than sufficient in any realistic scenario. For now, we only keep `neig1`.
-* 
+* `neig*` [Nearest neighbors](#nearest-neighbors): Currently, we consider the 1000 nearest neighbors, which is more than sufficient in any realistic scenario. For now, we only keep `neig1`.
+* `nprd*` [Predicted nearest neighbors](#predicted-nearest-neighbors)

-I consider the numbering `0`-`3` to be conceptually closed. However, further auxiliary data types may be introduced in the future. Note that all names have a four-letter code follwed by a digit. Future data should stick to this nomenclature.
+I consider the numbering `0`-`3` to be conceptually closed. However, further auxiliary data types may be introduced in the future. Note that all names have a 4-character code followed by a digit. Future data should stick to this nomenclature.

 ## Commonalities

@@ -23,17 +23,17 @@ I consider the numbering `0`-`3` to be conceptually closed. However, further aux

 I suggest that

-## Signatures type 0
+## `sign0` Signatures type 0

-## Signatures type 1
+## `sign1` Signatures type 1

-## Signatures type 2
+## `sign2` Signatures type 2

-## Signatures type 3
+## `sign3` Signatures type 3

-## Similarity vectors
+## `sims*` Similarity vectors

-## Nearest neighbors
+## `neig*` Nearest neighbors and `nprd*` predicted nearest neighbors

 Below, I list an schematic proposal of the classes: