# Scripting - Fuzzy Matching in Script

Article # 3035634 - Page views: 15

#### Issue

How can a fuzzy matching calculation be performed in script?

#### Solution

The fuzzy matching algorithm is not available in the scripting environment.

KTM uses a modified __Levenshtein Distance__ to perform the matching. The Levenshtein Distance counts how many changes (deletions, insertions and substitutions) need to be made to convert one string into another.

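Below is a WinWrap implementation of fuzzy matching based on the standard Levenshtein Distance, so its scores may differ from those of KTM's built-in modified algorithm. You can customize its behaviour to your needs, for example by converting characters before comparison or by specifying characters to ignore.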

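The function **FuzzyMatch(a As String, b As String) As Single** returns a similarity score between 0.0 and 1.0: 1.0 if the two strings are identical and 0.0 if they are completely different. The score is 1 minus the Levenshtein Distance divided by the length of the longer string. Note that two empty strings return 0.0.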

    Public Function FuzzyMatch(ByVal a As String, ByVal b As String) As Single
       Dim length As Integer
       ' Use the length of the longer string as the denominator
       If Len(a) > Len(b) Then length = Len(a) Else length = Len(b)
       ' Two empty strings are scored as 0
       If length = 0 Then FuzzyMatch = 0: Exit Function
       Dim dist As Integer
       dist = LevenshteinDistance(a, b)
       FuzzyMatch = CSng(1.0 - (dist / length))
    End Function

    Public Function LevenshteinDistance(a As String, b As String) As Integer
       Dim i, j, cost, d, min1, min2, min3

       ' Avoid calculations where there are empty words
       If Len(a) = 0 Then LevenshteinDistance = Len(b): Exit Function
       If Len(b) = 0 Then LevenshteinDistance = Len(a): Exit Function

       ' Array initialization
       ReDim d(Len(a), Len(b))
       For i = 0 To Len(a)
          d(i, 0) = i
       Next
       For j = 0 To Len(b)
          d(0, j) = j
       Next

       ' Actual calculation
       For i = 1 To Len(a)
          For j = 1 To Len(b)
             If Mid(a, i, 1) = Mid(b, j, 1) Then
                cost = 0 ' cost of perfect match
             Else
                cost = 1 ' cost of substitution
             End If

             ' Since min() function is not a part of WinWrap, we'll "emulate" it below
             min1 = (d(i - 1, j) + 1)        ' cost of deletion
             min2 = (d(i, j - 1) + 1)        ' cost of insertion
             min3 = (d(i - 1, j - 1) + cost) ' cost of substitution or match

             If min1 <= min2 And min1 <= min3 Then
                d(i, j) = min1
             ElseIf min2 <= min1 And min2 <= min3 Then
                d(i, j) = min2
             Else
                d(i, j) = min3
             End If
          Next
       Next

       LevenshteinDistance = d(Len(a), Len(b))
    End Function
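As an illustration of the customization mentioned above, the following sketch converts both strings to upper case and drops a set of ignored characters before calling FuzzyMatch. The names NormalizeForMatch, FuzzyMatchNormalized and FuzzyMatchDemo, and the list of ignored characters, are hypothetical and not part of KTM; adjust them to your project. It assumes the two functions from this article are already in the same script.

```vb
' Hypothetical helper (not part of KTM): prepares a string for comparison.
Public Function NormalizeForMatch(ByVal s As String, ByVal ignoreChars As String) As String
   Dim i As Integer
   Dim ch As String
   Dim result As String
   result = ""
   s = UCase(s) ' character conversion: makes the comparison case-insensitive
   For i = 1 To Len(s)
      ch = Mid(s, i, 1)
      ' Keep the character only if it is not in the ignore list
      If InStr(ignoreChars, ch) = 0 Then result = result & ch
   Next
   NormalizeForMatch = result
End Function

' Hypothetical wrapper: normalizes both inputs, then scores them with FuzzyMatch.
Public Function FuzzyMatchNormalized(ByVal a As String, ByVal b As String) As Single
   Dim ignored As String
   ignored = " .,-/" ' assumed ignore list: space and common separators
   FuzzyMatchNormalized = FuzzyMatch(NormalizeForMatch(a, ignored), NormalizeForMatch(b, ignored))
End Function

' Hypothetical demo routine showing a few calls.
Public Sub FuzzyMatchDemo()
   ' "kitten" -> "sitting" needs 3 edits; the longer string has 7 characters
   Debug.Print LevenshteinDistance("kitten", "sitting")              ' 3
   Debug.Print FuzzyMatch("kitten", "sitting")                       ' 1 - 3/7, about 0.571
   ' Both inputs normalize to "INV2024001", so the score is 1
   Debug.Print FuzzyMatchNormalized("INV-2024/001", "inv 2024 001")
End Sub
```

Because the input is converted to upper case before filtering, any letters placed in the ignore list would need to be upper case as well.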

#### Level of Complexity

Moderate

#### Applies to

Product | Version | Build | Environment | Hardware |
---|---|---|---|---|
Kofax Transformation Module | all | | | |