3. Fuzzy Search

Fuzzy Search is a search technique that allows you to find similar words in addition to exact matches.



Fuzzy Search retrieves results that differ from the search term by only one or two letters, which makes it useful for catching misspellings and spelling variations.

To access the fuzzy search options, enter the word or phrase you're searching for in the search bar of the Global Text Search option, run the search, and then click Show advanced filters.


Within this section, you will find a Fuzziness option, where you can choose "None", "Medium", or "High" depending on how precise you want the matches to be. "None" returns exact matches only and corresponds to the Low setting described below. With "Medium" or "High", the search results will include both the exact matches and any variations found.

Fuzzy Search Settings:

  • Low: This setting provides the most precise search results. It will only find exact matches to the search term you've entered. For instance, searching for "apple" will return only "apple."

  • Medium: This setting allows for slight variations in the search term. It will find terms that differ from your search query by a single character, such as "apples" or "ample."

  • High: This setting is the most lenient, allowing for more significant variations from the search term, including additions, deletions, or substitutions of up to two characters in total. For example, a search for "apple" might also show results for "apples" (one character added), "applet" (one character added), "ample" (one character replaced), "apply" (one character replaced), and "aply" (one character removed and one replaced).

Technical Details:

The underlying mechanism of Fuzzy Search in Transkribus uses the Levenshtein distance algorithm, which calculates the minimum number of single-character edits (insertions, deletions, or substitutions) required to change one word into another. The settings correspond to the maximum number of allowed edits:

  • Low = Levenshtein distance of 0

  • Medium = Levenshtein distance of 1

  • High = Levenshtein distance of 2
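
The mapping above can be illustrated with a short Python sketch. It is not the Transkribus implementation: the names levenshtein, fuzzy_matches, and MAX_EDITS and the sample word list are assumptions made for this example, and it leaves out any special handling of short search terms.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    # previous[j] = distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep a match)
            ))
        previous = current
    return previous[-1]


# Assumed mapping, taken from the list above: setting -> maximum edits.
MAX_EDITS = {"Low": 0, "Medium": 1, "High": 2}


def fuzzy_matches(term: str, candidates: list[str], setting: str) -> list[str]:
    """Return the candidates within the edit limit of the chosen setting."""
    limit = MAX_EDITS[setting]
    return [word for word in candidates if levenshtein(term, word) <= limit]


# Hypothetical word list standing in for the indexed text.
words = ["apple", "apples", "ample", "apply", "aply", "applet", "maple", "banana"]

for setting in MAX_EDITS:
    print(setting, fuzzy_matches("apple", words, setting))
```

In this sketch, Low keeps only "apple", Medium adds the words that are one edit away ("apples", "ample", "apply", "applet"), and High also admits "aply" and "maple", which are two edits away. "banana" is excluded at every setting.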

Note on Short Search Terms: The search engine may limit the application of fuzzy logic for very short terms (fewer than about six or seven characters), because fuzzy matching on short terms can flood the search results with irrelevant matches.

Additionally, you have the option to use other advanced filters such as "Title" or "Collection ID" to further narrow down your search and enhance your results. 

Transkribus eXpert (deprecated)

Fuzzy Search is a search technique that allows you to find similar words in addition to exact matches. It retrieves results that differ from the search term by only one or two letters, which makes it useful for catching misspellings and spelling variations.

To enable it, check the Fuzzy search option and launch the Search. The results will contain both the exact matches and the variants (for example, when searching for “house”, fuzzy search also finds “howse”, “horse” and “howses”).