1. Automatically transcribing your documents

Previous step: Uploading

Some of the layout and navigation features described here are part of upcoming updates to the Transkribus interface. These changes are already available in the beta version and will soon be released.

Once your documents are uploaded, you can begin automatic text recognition to generate transcriptions. This process involves selecting your pages, choosing a suitable text recognition model, and applying advanced settings if needed. Use this index to jump to the different steps:

Starting the automatic text recognition
Choosing the right model
Advanced Settings
Finished Recognition

Starting the automatic text recognition

After uploading your documents, you can start the automatic transcription by following these steps:

Select the pages or documents in your Collection that you want to transcribe. If this is your first time in Transkribus, we recommend testing the recognition on a few pages before processing the entire document.
Click on the “Process with AI” button next to the selection overview.
A slide-over panel will appear, with the “Text Recognition” option selected by default. In the search field, you can look for models by name or language. From the list, choose the text model that best matches your documents (language, time period, etc.).
Before starting the recognition, check the number of credits needed to run the recognition job and the summary of credits available to the account (Personal credits / Collection credits).
Once you have selected the model, click the “Start recognition” button to launch the recognition process.

Choosing the right model

A text model is an AI algorithm that has been trained on a specific set of data, consisting of images and their transcriptions. Its purpose is to determine the most likely sequence of characters for each segmented line of text.

There isn't a universal model that applies to all types of handwriting. Therefore, it is essential to choose the most suitable model for the script, language and time period of your documents.

Within Transkribus, you can select either the public models made available by the Transkribus community and team or the private models you have trained yourself.

Find a more detailed explanation in this Help Center article: Choosing a Model.

Advanced Settings for Text Recognition

Restrict on Structure Tags: Limits text recognition to specific regions or elements within the document.
Delete Text from Other Regions: Removes text outside specified text regions, ensuring cleaner transcriptions.
Do word Segmentation: Detects word boundaries within a line and provides precise coordinates for each word. This is useful for further processing but does not affect the transcript itself. The word coordinates are also available in the PAGE XML.
Keep Original Line Polygons: Preserves original line boundaries (polygons) during recognition, which can improve accuracy, especially when lines are too short.
Language Model: Language models are automatically created during the model training process and are based on the Training Data. The impact of language models should be tested on a case-by-case basis. In most instances, language models have been shown to enhance recognition accuracy, but there may be cases where they do not.
Smart Search: Smart Search enables you to perform a more advanced and powerful type of search of the automatically generated transcriptions. Read more about it on the Smart Search page.
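
As an illustration of how the word coordinates from "Do word Segmentation" could be used for further processing, the following is a minimal Python sketch that reads word-level boxes from a PAGE XML file. The sample document, the file name in it, and the helper names (parse_points, word_boxes) are made up for this example, and the namespace is assumed to be the 2013-07-15 PAGE schema; check the root element of your own export before relying on it.

```python
import xml.etree.ElementTree as ET

# Assumption: the export uses the 2013-07-15 PAGE namespace.
PAGE_NS = {"pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"}

# Hypothetical, hand-written sample; a real export contains many more attributes.
SAMPLE_PAGE_XML = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="page_0001.jpg" imageWidth="2000" imageHeight="3000">
    <TextRegion id="r1">
      <Coords points="100,100 900,100 900,200 100,200"/>
      <TextLine id="r1l1">
        <Coords points="100,100 900,100 900,200 100,200"/>
        <Word id="r1l1w1">
          <Coords points="100,100 300,100 300,200 100,200"/>
          <TextEquiv><Unicode>Dear</Unicode></TextEquiv>
        </Word>
        <Word id="r1l1w2">
          <Coords points="320,100 520,100 520,200 320,200"/>
          <TextEquiv><Unicode>Sir</Unicode></TextEquiv>
        </Word>
        <TextEquiv><Unicode>Dear Sir</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>
"""


def parse_points(points):
    """Turn a PAGE 'points' string ("x,y x,y ...") into a list of (x, y) tuples."""
    return [tuple(int(v) for v in pair.split(",")) for pair in points.split()]


def word_boxes(page_xml):
    """Return (text, (min_x, min_y, max_x, max_y)) for every Word element."""
    root = ET.fromstring(page_xml)
    boxes = []
    for word in root.iterfind(".//pc:Word", PAGE_NS):
        coords = word.find("pc:Coords", PAGE_NS)
        text = word.findtext("pc:TextEquiv/pc:Unicode", default="", namespaces=PAGE_NS)
        xs, ys = zip(*parse_points(coords.get("points")))
        boxes.append((text, (min(xs), min(ys), max(xs), max(ys))))
    return boxes


for text, box in word_boxes(SAMPLE_PAGE_XML):
    print(text, box)
```

The sketch reduces each word polygon to a bounding box because that is often enough for highlighting or cropping; keep the full point list if you need the exact outline.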

We recommend not combining very large models trained by the Transkribus Community with the Language Model or Smart Search options, as these options may not function properly due to the size of the models. This is also indicated in the description of these models. Likewise, such large models should not be used as base models when training custom models. Please also refer to the dedicated section on Super Models.

Finished recognition

You can check the status of the text recognition by clicking on “Processes & Activity”, where you can choose the type of activity you want to check.

When the recognition is finished, open the recognized page(s) by clicking on the title of the job or simply go to your document collection and click on the processed page. You will be directed to the editor and the automatically generated transcript will appear on the right side of the screen.

Typically, when you initiate a recognition job, the status should automatically show RUNNING. However, in cases of increased server traffic, the job may not start immediately and the status may display CREATED until it is able to run.
In such instances, the job description will also indicate the position in the jobs queue, allowing you to roughly estimate when the job will commence.
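
The rough estimate mentioned above can be sketched in a few lines of Python. This is only an illustration: the JobInfo class, the describe_job function and the average duration per job are hypothetical and not part of Transkribus; only the CREATED and RUNNING status names come from the description above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JobInfo:
    """Hypothetical container for what the job description shows."""
    status: str                            # e.g. "CREATED" or "RUNNING"
    queue_position: Optional[int] = None   # only shown while the job is waiting


def describe_job(job, avg_minutes_per_job=2.0):
    """Summarise a job; avg_minutes_per_job is a made-up guess, not a measured value."""
    if job.status == "RUNNING":
        return "Job is running."
    if job.status == "CREATED":
        if job.queue_position is None:
            return "Job is waiting to start."
        wait = job.queue_position * avg_minutes_per_job
        return (f"Job is queued at position {job.queue_position}; "
                f"roughly {wait:.0f} min until it starts.")
    return f"Job status: {job.status}"


print(describe_job(JobInfo("CREATED", queue_position=5)))
print(describe_job(JobInfo("RUNNING")))
```

Actual waiting times depend on server load and on the size of the jobs ahead of yours, so treat any such number as a rough guide only.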

In the event that the jobs queue is exceptionally long or if a large number of jobs fail, you can conveniently check the current and past availability status of Transkribus on this page.
This feature provides real-time updates on the system's uptime and any outages, allowing you to quickly verify if all essential services are running smoothly. For now, we are closely monitoring the most critical services, and we will gradually include additional services over time.

Automatic Layout Recognition

During the Text Recognition process, the images are automatically divided into sections of text and lines. If your documents have a complex layout (such as tables, newspapers, postcards, marginalia, or multiple columns), it might be beneficial to run the Layout Recognition as a separate step before Text Recognition. This allows you to check and correct any issues with the layout before proceeding. For more information on Layout Recognition, refer to the Layout Recognition section.

The following sections discuss in more detail the main aspects of Text Recognition and how to choose the best model for your documents.

Next section: Choosing a Model

Transkribus eXpert (deprecated)

To automatically transcribe your documents, go to the “Tools” tab, under the “Text Recognition” section and click on the “Run” button. In the pop-up window, choose the page(s)/document(s) to process and then click on “Select HTR model.” Here you can choose the most appropriate text model for your documents.

A text model is an AI algorithm trained on a set of data (images and transcriptions) that is able to detect the most probable sequence of characters for each segmented text line. There is no general model for all types of handwriting, so you need to choose the most appropriate one for the script and language of your documents.

You can select both the public models made available by the Transkribus community and team and the private models trained by yourself. You can filter your search by engine, language and name.

Advanced settings you can select are:

Use existing line polygons: use this option if you have corrected the line polygons manually because the computation of polygons from the baselines did not perform well on your documents.
Do polygons simplification: reduces the number of points in the line polygons.
Add estimated word coordinates: add approximate bounding boxes for each word in the line (you can then decide to show/hide the word boxes with the eye-icon in the Main bar at the top).
Restrict on structure tag: limit the Text Recognition only to the text regions tagged with the selected structural tag. You can decide if you want to keep or delete the text in the other regions.
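
To see which text regions a structure-tag restriction would apply to, you can inspect the PAGE XML of a page. The Python sketch below assumes that structure tags are stored in each region's custom attribute in the form "structure {type:...;}" and that the file uses the 2013-07-15 PAGE namespace; verify both against your own export. The sample document and the function names are invented for this example.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: the export uses the 2013-07-15 PAGE namespace.
PAGE_NS = {"pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"}

# Assumption: tags look like custom="readingOrder {index:0;} structure {type:heading;}"
STRUCTURE_TYPE = re.compile(r"structure\s*\{[^}]*?type:([^;}]+)")

# Hypothetical, hand-written sample with three regions.
SAMPLE_PAGE_XML = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="page_0001.jpg" imageWidth="2000" imageHeight="3000">
    <TextRegion id="r1" custom="readingOrder {index:0;} structure {type:heading;}">
      <Coords points="100,100 900,100 900,200 100,200"/>
    </TextRegion>
    <TextRegion id="r2" custom="readingOrder {index:1;} structure {type:marginalia;}">
      <Coords points="950,300 1200,300 1200,900 950,900"/>
    </TextRegion>
    <TextRegion id="r3" custom="readingOrder {index:2;}">
      <Coords points="100,300 900,300 900,900 100,900"/>
    </TextRegion>
  </Page>
</PcGts>
"""


def structure_tag(custom):
    """Extract the structure type from a custom attribute, or None if absent."""
    match = STRUCTURE_TYPE.search(custom or "")
    return match.group(1).strip() if match else None


def regions_with_tag(page_xml, tag):
    """Return the ids of all text regions carrying the given structure tag."""
    root = ET.fromstring(page_xml)
    return [
        region.get("id")
        for region in root.iterfind(".//pc:TextRegion", PAGE_NS)
        if structure_tag(region.get("custom")) == tag
    ]


print(regions_with_tag(SAMPLE_PAGE_XML, "marginalia"))
```

Regions without a structure tag, such as r3 in the sample, are never matched, which mirrors the idea of restricting recognition to tagged regions only.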

Once you have selected the model, click "OK" to launch the recognition. You can check the status of the text recognition by clicking on the “Jobs” button in the top Main Bar. When the recognition is finished, reload the page: the automatically generated transcript will appear in the text editor.

When you launch the Text Recognition, the images are first automatically segmented into text regions and lines. This step, called Layout Recognition, connects the text and the image. If your documents have a complex layout (e.g. tables, newspapers, postcards, marginalia, multiple columns…), it can be useful to run the Layout Recognition as a separate step in order to check and correct it before the Text Recognition. If this applies to your documents, take a look at this page.