Word composition metrics are based on L-tuple frequencies or counts. Distances between
two sequences are defined as:
- Smith-Waterman alignment algorithm (SW)
Smith-Waterman raw score, using the BLOSUM50 scoring matrix and a linear scoring
scheme with a gap penalty of 8
- W-metric (Wm)
see manuscript
- Mahalanobis (ma)
- Standard Euclidean (se)
- Euclidean (eu)
- Kullback-Leibler discrepancy (ku)
- Cosine (co)
For further information and definitions, see Vinga & Almeida (2003).
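As a rough illustration of how distances based on L-tuple counts can be computed, here is a minimal Python sketch. It is not the implementation behind this tool. The function names, the squared form of the Euclidean distance, the "1 - cosine" form of the cosine distance, the base-2 logarithm, and the small floor used to avoid division by zero in the Kullback-Leibler discrepancy are all assumptions made for the example. The Mahalanobis and standard Euclidean distances also need variance or covariance estimates of the counts and are left out.

```python
from collections import Counter
from itertools import product
import math


def ltuple_counts(seq, L, alphabet):
    """Count overlapping L-tuples (words of length L) in seq.

    Returns a list with one count per possible word, in a fixed order,
    so that vectors from different sequences are comparable.
    """
    words = ["".join(p) for p in product(alphabet, repeat=L)]
    counts = Counter(seq[i:i + L] for i in range(len(seq) - L + 1))
    return [counts.get(w, 0) for w in words]


def frequencies(counts):
    """Convert a count vector into relative frequencies."""
    total = sum(counts)
    return [c / total for c in counts]


def euclidean_sq(cx, cy):
    """Squared Euclidean distance between two count vectors (assumed form)."""
    return sum((a - b) ** 2 for a, b in zip(cx, cy))


def cosine_distance(cx, cy):
    """One minus the cosine of the angle between two count vectors (assumed form)."""
    dot = sum(a * b for a, b in zip(cx, cy))
    norm_x = math.sqrt(sum(a * a for a in cx))
    norm_y = math.sqrt(sum(b * b for b in cy))
    return 1.0 - dot / (norm_x * norm_y)


def kl_discrepancy(fx, fy, floor=1e-9):
    """Kullback-Leibler discrepancy between two frequency vectors.

    Words absent from x contribute nothing; frequencies in y are floored
    (an assumption of this sketch) so the ratio stays finite.
    """
    total = 0.0
    for p, q in zip(fx, fy):
        if p > 0:
            total += p * math.log2(p / max(q, floor))
    return total


if __name__ == "__main__":
    # Hypothetical toy sequences over a DNA alphabet, with L = 2.
    x, y = "ACGTACGT", "ACGTTGCA"
    cx = ltuple_counts(x, 2, "ACGT")
    cy = ltuple_counts(y, 2, "ACGT")
    print("squared Euclidean:", euclidean_sq(cx, cy))
    print("cosine distance:  ", cosine_distance(cx, cy))
    print("KL discrepancy:   ", kl_discrepancy(frequencies(cx), frequencies(cy)))
```

The count vectors are built over every possible word of length L rather than only the observed ones, so both sequences map to vectors of the same length and order. For protein sequences the alphabet would be the 20 amino acids, and the vector length grows as 20^L.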