4. Smart Search

Smart Search lets you run a more advanced and powerful search over your automatically generated transcriptions.

The standard Fulltext search goes through the transcription as it appears in the editor. Smart Search also considers several alternatives for each recognised word of the automatic transcription. These alternatives do not appear in the text editor, but they are stored alongside the transcription.

Using Smart Search, you can find words even if the Text Recognition model has transcribed them incorrectly. It is especially useful for records and registers, though not limited to them, and it can produce valuable results even on automatic transcriptions with a high error rate (a character error rate, CER, of up to 30%).
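To make the 30% figure concrete, here is a minimal sketch of how a character error rate is commonly computed: the edit distance between a reference transcription and the automatic one, divided by the length of the reference. The function names are illustrative, and the exact way Transkribus calculates CER is an assumption here.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# One wrong character in a seven-character word gives a CER of 1/7 (about 14%).
example = cer("baptism", "baptisn")
```

With a CER of 30%, roughly three characters in ten differ from the correct text, which is why an exact-match search over the displayed transcription alone would miss many words.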

To use the Smart Search feature, tick the Smart Search checkbox before launching the Text Recognition. This stores not only the best match for each word but also the alternatives that the model considered less likely to be correct, so that they can be searched later. By default, a maximum of 100 variants is taken into account and stored.

Generating the Smart Search data during the text recognition is computationally intensive and requires additional storage space (10 times more than normal), so a 50% credit surcharge is applied. This means that instead of consuming 1 credit per page, as is usually the case, you will consume 1.5 credits per page.

Before launching the text recognition, you should therefore evaluate whether the Smart Search feature will be helpful for your documents, depending on how you intend to use the automatic transcripts. If you decide to make Smart Search available at a later stage, you will have to relaunch the text recognition on all pages, which consumes more credits in total than enabling Smart Search from the start.
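The cost difference can be worked out with a short calculation. The sketch below assumes the usual rate of 1 credit per page mentioned above; the function name and the page count are illustrative only, and your actual rate may differ.

```python
BASE_CREDITS_PER_PAGE = 1.0    # assumed usual rate, as stated above
SMART_SEARCH_SURCHARGE = 0.5   # the 50% surcharge


def recognition_credits(pages: int, smart_search: bool) -> float:
    """Credits consumed by one text recognition run over the given pages."""
    rate = BASE_CREDITS_PER_PAGE
    if smart_search:
        rate *= 1 + SMART_SEARCH_SURCHARGE
    return pages * rate


pages = 200  # illustrative collection size

# Option A: enable Smart Search on the first run.
upfront = recognition_credits(pages, smart_search=True)

# Option B: run without Smart Search first, then relaunch with it later.
later = (recognition_credits(pages, smart_search=False)
         + recognition_credits(pages, smart_search=True))

print(f"Smart Search from the start: {upfront:g} credits")
print(f"Smart Search added later:    {later:g} credits")
```

Under these assumptions, 200 pages cost 300 credits with Smart Search enabled from the start, compared with 500 credits if the recognition is run once without it and then relaunched with it.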

When the Text Recognition is finished, you can search your pages using the search bar in the top right corner, just as for the regular Fulltext Search. At this stage, you do not have to select any options: just type the term and launch the search. The search automatically goes through both the words appearing in the transcription and all the saved alternatives.

Clicking on a result loads the page where it was found. When the term was found among the variants, it appears as you typed it in the search result list. However, when you open the corresponding page, you will notice that the transcription contains a different word, i.e. the word that the model rated as best during the recognition.

By comparing the underlined word with the corresponding line in the image, you can check whether the variant found by Smart Search is the correct transcription (very likely) or an incorrect guess.
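The behaviour described above can be illustrated with a small sketch. It is a simplified model and not the actual Transkribus implementation: the RecognisedWord and Hit classes, the smart_search function and the sample data are all hypothetical, and matching is reduced to a case-insensitive comparison of whole words.

```python
from dataclasses import dataclass, field


@dataclass
class RecognisedWord:
    best: str                                              # word shown in the editor
    alternatives: list[str] = field(default_factory=list)  # stored, but not shown


@dataclass
class Hit:
    page: int
    position: int
    displayed: str         # what the transcription on the page contains
    matched: str           # what the search actually matched
    via_alternative: bool  # True if the match came from a stored variant


def smart_search(pages: dict[int, list[RecognisedWord]], term: str) -> list[Hit]:
    """Search the displayed words first, then the stored alternatives."""
    needle = term.casefold()
    hits = []
    for page_no, words in pages.items():
        for pos, word in enumerate(words):
            if word.best.casefold() == needle:
                hits.append(Hit(page_no, pos, word.best, word.best, False))
                continue
            for alt in word.alternatives:
                if alt.casefold() == needle:
                    hits.append(Hit(page_no, pos, word.best, alt, True))
                    break
    return hits


# Hypothetical recognition output: on page 1 the model's best guess is wrong,
# but the correct reading is among the stored alternatives.
pages = {
    1: [RecognisedWord("Johann", ["Johan"]),
        RecognisedWord("Mnller", ["Müller", "Muller"])],
    2: [RecognisedWord("Maria"), RecognisedWord("Müller")],
}

for hit in smart_search(pages, "Müller"):
    source = "stored variant" if hit.via_alternative else "transcription"
    print(f"page {hit.page}: found '{hit.matched}' in {source}, "
          f"page shows '{hit.displayed}'")
```

In this example the hit on page 1 comes from a stored variant, so the result list shows the search term while the page itself still contains the model's best guess. A standard full-text search over the displayed words would only have found the hit on page 2.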



Transkribus eXpert (deprecated)

This feature is not supported in Transkribus eXpert. You can use it within the browser version, as explained above.