
Go back to startpage
.. Welcome in the year 1965 for time-passengers ..
Vladimir Iosifovich Levenshtein (1935 - 2017);
Vladimir Iosifovich Levenshtein was a Soviet-Russian scientist specializing in information theory.
He graduated from the Department of Mathematics and Mechanics at the Lomonosov Moscow State University in 1958 and
has been working at the Keldysh Institute of Applied Mathematics in Moscow ever since.
The Levenshtein distance (developed by him in 1965) represents the smallest number of single-character edits needed to convert one string into another. The algorithm provides a method
for calculating this distance between any two strings. In linguistics, the Levenshtein distance
is utilized as a metric to measure linguistic similarity, indicating how distinct two languages
are from each other. In bioinformatics, the Levenshtein distance and similar algorithms are
employed to measure differences between biological sequences, such as DNA or proteins. Genetic
mutations - insertions, deletions, and substitutions of nucleotides in DNA or amino acids
in proteins - are reflected as edits in these algorithms.
A smaller distance indicates a closer evolutionary relationship between the sequences.
The algorithm in a Q-like notation: d = D[u;v] = levdist(u,v) .. u,v strings
if[0 = count v;d:count u];
if[0 = count u;d:count v];
if[=[(*)u;(*)v];D[1_u;1_v]];
otherwise
d = 1 + min[ D[1_u;v];D[u;1_v];D[1_u;1_v] ]
Example:
Levenshtein distance bewteen the two strings "yelland" and "yellowly"
replacing "o" by "a"
deletion of "w" and "l"
replacing "y" by "d"
→ 4 operations
/playful intro
m:(1+(l2:count s2),l1:count s1)#0N
m[0]:0,1+til l1
m[;0]:0,1+til l2
g:{s:c:x,'y;s:s+((z,1);1 0);c[1;1]:min raze s;c[;1]}
/k as iterator
r:(k+1),last flip (g\)[first P;1_P;C k];
Levenshtein,:enlist r;
P:r,'m[l2&k+2];
Levenshtein ...
1 0 1 2 3 4 5 6
2 1 0 1 2 3 4 5
3 2 1 0 1 2 3 4
4 3 2 1 0 1 2 3
5 4 3 2 1 1 2 3
6 5 4 3 2 2 2 3
7 6 5 4 3 3 3 3
8 7 6 5 4 4 4 4
Levenshtein distance is 4

Go back to startpage