
4. Model Setup and Training

All you need to know before training a Text Recognition model: how to select the Training and Validation Data, add a base model, and configure advanced settings

Previous step: Data Preparation


Once you have your Ground Truth pages ready, it is time to train a Text Recognition model.

Starting the training

You can start the training directly from the collection or document(s) you have been working on. Simply mark the transcribed or corrected pages, then click the "Selected" button in the bar above the images. There, click "Train Model" and select "Text Recognition Model".

You can also start training models by switching from the Desk to the Models view in the left-side navigation slide-out. This is the dedicated AI training dashboard, where you can train not only Text Recognition models but also Baselines, Field and Table models. Your trained Text Recognition models are displayed under "My Models" by default when you open this page.
Under the section "Training Lab" you can find all the models you have trained and the corresponding training data. By clicking on "+ Train New Model" you can also start the training from here.

After clicking on "Text Recognition Model", the training configuration window will open up.

If you haven't already, you need to select the collection containing your Ground Truth document(s) to mark it as the "Training Data". Via the search bar, you can type in the collection title or collection ID and select it. 

Be aware that you cannot select documents from different collections during the training. To work around this, before starting the training, you can create shortcuts from documents belonging to different collections into a single collection, as explained on the Managing Documents page.

Once you have chosen the correct document(s), the actual training configuration begins.


Step 1: Training Data



During the training, the Ground Truth pages are divided into two groups: 

  • Training Data: the set of examples used to fit the parameters of the model, i.e. the data on which the knowledge in the network is based. The model is trained on these pages.
  • Validation Data: the set of examples that provides an unbiased evaluation of the model during training. The pages of the Validation Data are set aside during the training and are used to assess the model's accuracy.
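As an illustration of this split (not Transkribus code), the two groups could be sketched like this in Python; the page names and the 10% ratio are assumptions chosen for the example:

```python
# Illustrative sketch only: dividing Ground Truth pages into Training
# and Validation Data. Page names and the 10% ratio are made up.
import random

def split_ground_truth(pages, validation_fraction=0.1, seed=42):
    """Shuffle the pages and set a fraction aside as Validation Data."""
    shuffled = pages[:]
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, round(len(shuffled) * validation_fraction))
    validation = shuffled[:n_val]  # held out, never used for fitting
    training = shuffled[n_val:]    # used to fit the model's parameters
    return training, validation

pages = [f"page_{i:03d}" for i in range(1, 101)]  # 100 transcribed pages
training, validation = split_ground_truth(pages)
print(len(training), len(validation))  # 90 training pages, 10 validation pages
```

The key point the sketch captures: the two sets are disjoint, so the Validation Data gives an unbiased picture of how the model performs on pages it has never been trained on.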

First, choose the pages to include in the Training Data.
By ticking the box next to the document title, you can select all the transcriptions available in the whole document. Alternatively, you can choose individual pages by clicking "Select Pages", ticking all the pages you want to include in the training, and then clicking "Save and go back".

Here, you can also filter by the status of the selected pages. By default, the "latest transcription" is used (regardless of the status in which the pages were saved), but you can also take only pages with the status "Ground Truth" into account. Pages that do not contain any transcription cannot be selected.
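Conceptually, this status filter behaves like the following sketch (illustrative only; the page records and status labels are assumptions, not the Transkribus data model):

```python
# Illustrative only: filtering selectable pages by transcript status.
pages = [
    {"id": 1, "status": "Ground Truth"},
    {"id": 2, "status": "In Progress"},   # latest transcription, any status
    {"id": 3, "status": "Ground Truth"},
    {"id": 4, "status": None},            # no transcription: not selectable
]

# Default behaviour: every page that has a transcription, regardless of status.
selectable = [p for p in pages if p["status"] is not None]

# Stricter filter: only pages saved with the status "Ground Truth".
ground_truth_only = [p for p in pages if p["status"] == "Ground Truth"]
```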



Once you're happy with the selection of the Training Data, click on "Next". 

Step 2: Validation Data




Select the pages to assign to the Validation Data. The pages of the Validation Data are set aside during the training and are used to assess its accuracy.

Remember that the Validation Data should be representative of the Ground Truth and include all the kinds of examples (scripts, hands, languages…) present in the Training Data. If the Validation Data is not varied enough, the measurement of the model's performance could be biased.

We recommend not skimping on the Validation Set: assign around 10% of your Ground Truth transcriptions to it.

You can select the pages manually or assign them automatically. The manual selection works as described above for the Training Data. Only the pages that contain text and have not been included in the Training Data are selectable.

With the automatic selection, 2%, 5% or 10% of the Training Data is automatically assigned to the Validation Data: simply click on the percentage you want to assign. The automatic selection is recommended because it tends to produce more varied Validation Data.

Step 3: Model Setup 




The first information you are asked to enter is the model metadata:

  • Model Name (chosen by you);
  • Description of your model and the documents on which it is trained (material, period, number of hands, how you have managed abbreviations…);
  • Image URL (optional); 
  • Language(s) of your documents;
  • Time span of your documents.

This metadata will later help you filter and find the model more easily via the search bar.

Base Model:
Existing models can be used as a starting point to train new models. When you select a base model, the training does not start from scratch but from the knowledge already learnt by the base model. With the help of a base model, it is possible to reduce the amount of new Ground Truth pages, thus speeding up the training process. The base model will likely also improve the quality of the recognition results.

The benefit of a base model, however, is not always guaranteed and has to be tested for each specific case. To use a base model, simply select the desired one. You can select one of the public models or one of your own PyLaia models. Remember that, to be beneficial, the base model must have been trained on a writing style similar to that of your material.

It is important to note that Super Models like the Text Titan cannot be used as base models for custom model training due to their extensive amount of data.
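The idea behind a base model can be pictured with a toy sketch (purely illustrative; these functions are invented for the example and do not reflect how Transkribus or PyLaia actually work): training starts from already-learnt weights instead of random ones, so it begins closer to a good solution:

```python
# Toy illustration (not Transkribus/PyLaia code): starting from a base
# model means initialising with learnt weights instead of random ones.
import random

def random_init(n_params, seed=0):
    """Training from scratch: small random starting weights."""
    rng = random.Random(seed)
    return [rng.uniform(-0.1, 0.1) for _ in range(n_params)]

def load_base_model_weights(n_params):
    """Hypothetical stand-in for the weights of an existing model
    already trained on a similar writing style."""
    return [0.5] * n_params

def train(weights, epochs):
    """Dummy 'training': each epoch nudges the weights towards a target."""
    target = 1.0
    for _ in range(epochs):
        weights = [w + 0.1 * (target - w) for w in weights]
    return weights

# With the same number of epochs, the base-model run ends up closer to
# the target, i.e. less new Ground Truth/training effort is needed.
scratch = train(random_init(4), epochs=5)
finetuned = train(load_base_model_weights(4), epochs=5)
```

This is also why the writing style matters: if the base model's "starting point" is far from your material, it offers no head start.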



Advanced Settings (optional): 

The last section contains an overview of the model configuration.

Here, at the bottom of the page, you can modify the advanced parameters: 

    • Training cycles:
      This refers to the number of "epochs", which can be understood as the number of times the learning algorithm will go through the entire Training Data and evaluate itself on both the Training and the Validation Data. The number in this section indicates the maximum number of training epochs, because the training stops automatically when the model no longer improves (i.e. it has reached its lowest possible Character Error Rate, CER).
      To begin with, it makes sense to stick to the default setting and not enter any value - this way, 100 epochs will be used as a starting point. 
    • Early Stopping:
      This value defines the minimum number of epochs for the training, meaning the model will run at least this many epochs.
      Once these epochs are completed, if the CER on the Validation Data continues to decrease, the training will continue and end automatically when the model no longer improves.
      If, on the other hand, the CER no longer goes down, the training will be stopped.
      In other words, with the Early Stopping value you force the model to train for at least that number of epochs.

      For most of the models, the default setting of 20 epochs works well, and we recommend leaving it as it is for your first training.
      When there is no or little variation in the Validation Data, the model may stop too early.
      For this reason, we recommend creating varied Validation Data containing all types of hands and document typologies present in the Training Data. Only if your Validation Data is rather small should you increase the "Early Stopping" value, in order to prevent the training from stopping before it has seen all the Training Data. Always bear in mind, however, that increasing this value makes the training take longer.
    • Reverse text (RTL):
      Use this option to reverse text during the training when the writing direction in the image is opposite to that of the transcription, e.g. the text was written right-to-left but transcribed left-to-right.
    • Use existing line polygons for training:
      If you check this box, the line polygons will not be computed during the training and instead, the existing ones will be used. When using this model for recognition afterwards, similar line polygons should be used for the best performance of the model. 
    • Train Abbreviations with expansion:
      If in the Ground Truth transcriptions, abbreviations are written in their abbreviated form and expanded in the property "expansion" (property of the Abbreviation Tag), check this option to obtain the best results. 
    • Omit lines by tag 'unclear' or 'gap': 
      If you've used the tags "unclear" or "gap" in the Ground Truth transcription to indicate words or phrases that should not be included in the training, select the respective option. Please be aware, though, that not only the tagged words/phrases will be omitted from the training data, but the whole line containing such a tag.
    • Convert images to black & white:

      Transform your image into a simple black and white version. This process, called binarisation, helps the AI focus on key aspects of the image, such as shapes and patterns, by removing colour data.

    • Train tags and Include Properties

      Your model will learn to recognize and apply tags to words similar to those you tagged in your training data. There are two ways to add tags for training:

      1. Use the tag icon – Click the icon and select from the available default tags.

      2. Enter tags manually – Type tag names as a comma-separated list.

Example: If you want to train the model on textStyle tags such as 'Strikethrough' or 'Superscript', select "Train Tag", enable "Include Properties", and enter the tag name "textStyle" in the comma-separated list.
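The interplay of "Training cycles" and "Early Stopping" described above can be sketched as a simple control loop. This is an illustration only: the CER curve is simulated, and real implementations typically use a patience window rather than stopping at the first non-improving epoch:

```python
# Hedged sketch of the training-cycles / early-stopping behaviour.
def run_training(cer_per_epoch, max_epochs=100, early_stopping=20):
    """Train for at least `early_stopping` epochs, at most `max_epochs`,
    and stop once the validation CER no longer improves."""
    best_cer = float("inf")
    for epoch in range(1, max_epochs + 1):
        cer = cer_per_epoch(epoch)      # validation CER after this epoch
        if cer < best_cer:
            best_cer = cer              # still improving: keep going
        elif epoch >= early_stopping:
            return epoch, best_cer      # minimum reached, no gain: stop
    return max_epochs, best_cer

# Simulated CER curve (in %): drops each epoch, then plateaus at 5%.
stopped_at, best = run_training(lambda e: max(5, 50 - e))
```

The sketch shows why a varied Validation Data set matters: if the simulated CER plateaued early by chance (e.g. because the Validation Data is too uniform), the loop would stop as soon as the minimum epoch count is reached, even though more training could still help.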

Step 4: Summary & Start 




By clicking "Next", you have made it to the last stage of the training configuration. 

Here, you find a final overview of the setup of your text recognition model, which you can check once again; you can also go back to change any settings or parameters.

After checking all the details, click Start to launch the training job.



You can follow the progress of the training by checking it via the “Jobs” button on the right side of the top menu bar. Click on "Open Full Jobs Table" to see the details of the job. 


The completion of every epoch will be shown in the job description, and you will receive an email once the training process is completed. Depending on the traffic on the servers and the amount of material, your training might take a while.

In the Jobs table, you can check your position in the queue (i.e. the number of training jobs ahead of yours). You can perform other jobs in Transkribus or close the platform during the training process. If the job status is still shown as "Created" or "Pending", please don't start a new training; just be patient and wait.
You can easily check the current and past availability status of Transkribus on this page. It provides real-time updates on the system's uptime and any outages, allowing you to quickly verify that all essential services are running smoothly. Initially, we are monitoring the most critical services and will gradually add more over time.

 

After the training is finished, you can use your model to recognise new documents, as explained on this page. Read the Managing Models page to learn how to manage and share models.

Go to the next step to learn how to evaluate the performance of your custom text recognition model.