Word composition metrics are based on L-tuple frequencies or counts. Distances between
two sequences are defined as:
- Smith-Waterman alignment algorithm (SW)
Smith-Waterman raw score, using the BLOSUM50 scoring matrix and a linear scoring
scheme with a gap penalty of 8
- W-metric (Wm)
see manuscript
- Mahalanobis (ma)

- Standard Euclidean (se)

- Euclidean (eu)

- Kullback-Leibler discrepancy (ku)

- Cosine (co)
,
where


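The Smith-Waterman (SW) raw score with a linear gap penalty can be sketched as follows. This is a minimal illustration, not the implementation used here: the function name `smith_waterman_score` and the `toy_score` substitution function (+5 match, -4 mismatch) are hypothetical stand-ins, and the BLOSUM50 matrix itself is not reproduced. Only the gap penalty of 8 is taken from the description above.

```python
def toy_score(a, b):
    """Hypothetical stand-in for a BLOSUM50 lookup: +5 match, -4 mismatch."""
    return 5 if a == b else -4


def smith_waterman_score(x, y, score=toy_score, gap=8):
    """Best local alignment score of x and y with a linear gap penalty."""
    prev = [0] * (len(y) + 1)  # previous row of the dynamic programming table
    best = 0
    for i in range(1, len(x) + 1):
        curr = [0]  # first column is always 0 for local alignment
        for j in range(1, len(y) + 1):
            h = max(
                0,                                        # start a new local alignment
                prev[j - 1] + score(x[i - 1], y[j - 1]),  # align x[i-1] with y[j-1]
                prev[j] - gap,                            # gap in y
                curr[j - 1] - gap,                        # gap in x
            )
            curr.append(h)
            best = max(best, h)
        prev = curr
    return best
```

To score protein sequences as described, `score` would be replaced by a function that looks up the residue pair in a BLOSUM50 table, for example one loaded from a file.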
For further information and definitions, see Vinga & Almeida (2003).
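As a rough illustration of how distances on L-tuple frequencies can be computed, the sketch below uses common textbook forms of four of the listed metrics. The exact formulas used on this page may differ, so the following are assumptions: the Euclidean distance is taken as the sum of squared differences, the Kullback-Leibler discrepancy uses a base-2 logarithm, and the cosine distance is taken as one minus the cosine of the angle between the frequency vectors. All function names are hypothetical. The Mahalanobis distance is omitted because it needs a covariance matrix estimated from a data set.

```python
import math
from collections import Counter
from itertools import product


def tuple_frequencies(seq, L, alphabet):
    """Relative frequencies of overlapping L-tuples, in a fixed word order."""
    words = ["".join(p) for p in product(alphabet, repeat=L)]
    counts = Counter(seq[i:i + L] for i in range(len(seq) - L + 1))
    total = sum(counts[w] for w in words)
    if total == 0:
        raise ValueError("sequence contains no L-tuples over the alphabet")
    return [counts[w] / total for w in words]


def euclidean(fx, fy):
    """Sum of squared frequency differences (assumed squared form)."""
    return sum((a - b) ** 2 for a, b in zip(fx, fy))


def standard_euclidean(fx, fy, variances):
    """Squared differences weighted by per-word variances supplied by the caller."""
    return sum((a - b) ** 2 / v for a, b, v in zip(fx, fy, variances))


def kullback_leibler(fx, fy):
    """Sum of fx * log2(fx / fy); infinite if fy is 0 where fx is not."""
    total = 0.0
    for a, b in zip(fx, fy):
        if a == 0:
            continue  # a zero term contributes nothing
        if b == 0:
            return math.inf
        total += a * math.log2(a / b)
    return total


def cosine(fx, fy):
    """One minus the cosine of the angle between the two frequency vectors."""
    dot = sum(a * b for a, b in zip(fx, fy))
    norm = math.sqrt(sum(a * a for a in fx)) * math.sqrt(sum(b * b for b in fy))
    return 1.0 - dot / norm
```

For example, `euclidean(tuple_frequencies("ACGT", 1, "ACGT"), tuple_frequencies("AACC", 1, "ACGT"))` compares the single-letter compositions of two short DNA strings. Note that the Kullback-Leibler discrepancy is not symmetric, so the order of the arguments matters.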