3. Table Models
Trainable table models utilise AI to identify the tabular layout of your historical documents, simplifying data extraction and export into spreadsheets.
Our Table Models make the digitisation process more efficient and user-friendly. Customise your Table Models to suit your unique documents by adjusting how tables are recognised and data is extracted. This guide will first explain how to train a powerful Table Model, and then how to use the Table Model once it has been trained.
When to Train a Table Model
When dealing with tables in your document, the appropriate approach in Transkribus depends on the frequency and layout of the tables:
- Tables that are consistently present throughout the entire volume or many pages and have the same or a similar tabular structure: train a table model, as described below.
- Tables that appear sporadically or are embedded in the text: draw them manually using the editor functions (follow only the first three drawing steps under Step 1: Preparing the Training Data).
How to Train a Table Model
Step 1: Preparing the Training Data
Table Models are not an all-in-one or out-of-the-box solution. They are designed to be trained on a certain document or collection to recognise its specific table layout. However, with enough training data, table models can be trained to recognise a few different types of tables at once.
The Training Data you need depends on the type of table(s). We recommend selecting a sample from the entire volume/collection, not just the first few pages, to have more variety and train a more robust model:
- easy tables: 20 pages of Ground Truth
- difficult tables (uneven rows, skewed tables, a table across two slightly off-set pages...): 50 pages of Ground Truth
- mix of different tables: between 50 and 100 pages of Ground Truth, depending on the number of tables
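To illustrate picking a sample from across the whole volume rather than from the first few pages, here is a minimal Python sketch. It is not part of Transkribus: the function name `sample_pages` and the approach of taking the middle page of each equal slice are assumptions made for this example only.

```python
def sample_pages(total_pages: int, sample_size: int) -> list[int]:
    """Return 1-based page numbers spread evenly across the whole volume.

    Hypothetical helper: it only suggests which pages to open and
    annotate by hand; it does not talk to Transkribus.
    """
    if total_pages < 1 or sample_size < 1:
        raise ValueError("total_pages and sample_size must be positive")
    if sample_size >= total_pages:
        # Small volume: every page ends up in the sample.
        return list(range(1, total_pages + 1))
    step = total_pages / sample_size
    # Cut the volume into `sample_size` equal slices and take the
    # page in the middle of each slice.
    return [int(step * i + step / 2) + 1 for i in range(sample_size)]


# Example: 20 pages of Ground Truth for an "easy" table in a 200-page volume.
print(sample_pages(200, 20))
```

The suggested pages are only a starting point. If some of them contain no table, or an unusual one, swap them for neighbouring pages so that the sample still covers every layout the model needs to recognise.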
Table models can be trained even when the separators (that define the columns and rows) are not visible and when the height of the rows varies. However, very narrow rows and columns are more likely to be overlooked. Table models can also deal with skewed tables as long as the skew is not excessive.
After choosing the pages to use as the Training Data, draw the tables manually following these steps:
- In the Transkribus editor, select the "Add a Table" button on the left-side menu. Click on the image once to start the table and once to finish it.

- To create columns, select the table and hold V while moving the cursor across the page and clicking wherever you want to create a column.

- To create rows, select the table and hold H while moving the cursor across the page and clicking wherever you want to create a row. Keep going until all the cells are marked.
- Save the page as Ground Truth and move to the next one.
Please note: Tables created in the eXpert Client cannot be rendered in the Web App. When working with tables, it is recommended that you only make changes to them in one of the two user interfaces. As the eXpert Client is deprecated, we recommend using the Web App.
Good to know tips:
- To draw columns and rows with a custom angle, select the table, hold C and use the arrow keys to change the angle. Hold C+CTRL and use the arrow keys to draw more precise angles.
- If the table layout of several pages is similar, you can copy and paste the table structure from one page to the others. Simply select the table, press CTRL+C, navigate to the next page, and press CTRL+V. Remember to adjust the table afterwards to fit the image. We recommend performing this task after drawing the columns but before drawing the rows, as the rows tend to vary more from one page to another and adjusting them manually may take more time.
- Table models can be trained to ignore specific columns, consider multiple columns as one column, or even create columns/rows when they are only separated by white space and there are no visible separators. Consistency in creating the Ground Truth and including a sufficient number of pages, covering all the different cases that the model needs to recognise, is crucial.
To get good training results, make sure that each page contains only one table and that your Ground Truth contains no merged table cells.
Step 2: Training the Table Model
- Navigate to the Model tab in the upper right corner, click "Train a New Model", and select "Table Model."

- Choose the collection containing your Training Data.
- Select the Training Data: select the specific documents or pages where you have drawn the tables and that you want to use to train your Table Model. You have the option to use the Latest Transcriptions or only the documents and pages saved as Ground Truth.
- Select the Validation Data: by default, Transkribus selects 10% of your dataset as validation data. We recommend sticking with this setting, but you also have the option to manually set specific pages as your Validation Data.
- Model Setup: fill in the following fields:
- Model Name: give your model a name.
- Description: provide a brief description of what the model is for.
- Preview Thumbnail: optionally, you can add an image that will serve as a preview thumbnail for your model.
- Advanced Settings (optional):
For initial trainings, we recommend leaving these parameters unchanged and keeping the default advanced options, which have proven effective in most scenarios.
- Training Cycles (5000 by default; should be between 1000 and 10000): the training cycles indicate how many times the model will go through the training data to learn and adjust. More cycles could result in a more accurate model but may also risk overfitting.
- Learning Rate (0.0001 by default; should be between 0.0001 and 0.05): the learning rate affects how quickly the model adapts to the data.
- Review all the settings and data you have inputted. Once everything looks good, start the training process.
- Check the status of the training process via the "Jobs" button on the right side of the top menu bar. Click on "Open Full Jobs Table" to see the details of the job. If the job's status states "Created", you can see in the description how many trainings are in the queue before yours.
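The numeric settings from the steps above can be summarised in a short Python sketch. Transkribus handles all of this in its interface, so nothing here is a Transkribus API: the class `TableTrainingSettings`, the function `split_validation` and the random choice of validation pages are assumptions for illustration. Only the defaults, the allowed ranges and the 10% share come from this guide.

```python
import random
from dataclasses import dataclass


@dataclass
class TableTrainingSettings:
    """Hypothetical container for the two advanced settings."""
    training_cycles: int = 5000      # default from this guide
    learning_rate: float = 0.0001    # default from this guide

    def check(self) -> None:
        """Raise ValueError if a value is outside the documented range."""
        if not 1000 <= self.training_cycles <= 10000:
            raise ValueError("training_cycles should be between 1000 and 10000")
        if not 0.0001 <= self.learning_rate <= 0.05:
            raise ValueError("learning_rate should be between 0.0001 and 0.05")


def split_validation(pages, fraction=0.10, seed=0):
    """Hold out roughly `fraction` of the pages as validation data.

    Assumption: pages are picked at random; how Transkribus chooses
    its default 10% is not described in this guide.
    """
    pages = list(pages)
    if len(pages) < 2:
        raise ValueError("need at least two pages to split")
    n_val = max(1, round(len(pages) * fraction))
    validation = sorted(random.Random(seed).sample(pages, n_val))
    held_out = set(validation)
    training = [p for p in pages if p not in held_out]
    return training, validation


settings = TableTrainingSettings()
settings.check()  # the defaults are within range
training, validation = split_validation(range(1, 51))
print(len(training), "training pages,", len(validation), "validation pages")
```

With 50 pages of Ground Truth, a 10% share leaves 45 pages for training and 5 for validation, which is why very small datasets give only a rough accuracy estimate.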
Evaluate your Table Model
The accuracy of a Table Model is not measured with the CER (Character Error Rate) but with the Mean Average Precision (mAP), expressed as a percentage.
The mAP evaluates how accurately the system detects the tables: it considers whether they were detected at all and how well their size and shape match the validation data.

Models with an mAP over 60% can already deliver very satisfactory results. However, we recommend always running a recognition test with your model and visually evaluating the results yourself.
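To give a feeling for what "how well size and shape match" means, the sketch below computes intersection over union (IoU), the overlap measure that mAP scores in object detection are commonly built on. This is an illustration only: the exact way Transkribus calculates its mAP is not described in this guide, and the rectangle format `(x1, y1, x2, y2)` and the function name `iou` are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned rectangles.

    Each box is (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    Returns a value from 0.0 (no overlap) to 1.0 (identical boxes).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlapping rectangle (0 if disjoint).
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union


ground_truth_cell = (0, 0, 10, 10)
detected_cell = (5, 0, 15, 10)   # shifted half a cell to the right
print(iou(ground_truth_cell, detected_cell))
```

A detection that is shifted by half a cell overlaps its reference by only about a third, so small systematic offsets can lower the score noticeably even when every cell was found. This is one reason a visual check of the results is worthwhile.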
How to Use the Trained Table Model
Step 1: Selecting Documents for Recognition
After your Table Model has been trained, go to your Transkribus Desk and select the documents or specific pages you want to recognise.
Step 2: Table Recognition Process
- Click on the "Recognise" button.
- Go to the top of the recognition section and choose "Table."
- Under your private models, find and select your newly trained Table Model.
- Start the Recognition.
Step 3: Layout (Baselines) Recognition
Once you have recognised the table structure on all the pages, run the Automatic Layout Recognition to automatically add baselines.
Remember to use these Advanced Settings:
- "Keep existing" text regions: to detect only the baselines inside the already recognised tables.
- "Split lines on region borders": to make the baselines strictly obey the cell border and prevent close lines belonging to different cells from being merged together.

Additional adjustments to the advanced layout parameters might be required, depending on the specific documents. For a comprehensive overview of all the advanced layout configuration settings, please refer to this page.
If lines that span multiple cells have been split, you can merge the partial lines. First, move them into the same cell: open the layout tree with the "Layout" button on the right-side menu, then select, in the image, the line that belongs to the wrong cell. The corresponding line is highlighted automatically in the layout window. Within that window, move the highlighted line to the correct cell (probably the previous or following cell). Once both lines belong to the same cell, hold CTRL, select both lines and press M on your keyboard to merge them.
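The effect of the "Split lines on region borders" setting, and the reason a line crossing several cells ends up in pieces, can be pictured with a small Python sketch. It is a simplified illustration, not the algorithm Transkribus uses: a baseline is reduced to its horizontal extent, and the function name `split_line_at_borders` is hypothetical.

```python
def split_line_at_borders(x_start, x_end, column_borders):
    """Cut the horizontal extent of a line at every border inside it.

    Returns a list of (start, end) segments, one per cell crossed.
    """
    # Only borders strictly inside the line produce a cut.
    cuts = sorted(b for b in column_borders if x_start < b < x_end)
    points = [x_start, *cuts, x_end]
    return list(zip(points[:-1], points[1:]))


# A line running from x=10 to x=250 across columns bordered at 0, 100, 200, 300.
print(split_line_at_borders(10, 250, [0, 100, 200, 300]))
# → [(10, 100), (100, 200), (200, 250)]
```

Each segment then belongs to exactly one cell, which keeps neighbouring cells apart but also means that a genuinely long line is divided and may need to be merged by hand.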
Step 4: Text Recognition
Apply the most appropriate Text Recognition Model to automatically transcribe the content of your tables, as explained on this page.
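Once the cells are transcribed, their content can be arranged into spreadsheet rows. The sketch below shows the idea in Python under a strong assumption: the cells are already available as `(row, column, text)` tuples with 0-based indices. That input format and the function name `cells_to_csv` are hypothetical and do not describe a Transkribus export format.

```python
import csv
import io


def cells_to_csv(cells):
    """Arrange (row, col, text) cells into CSV text.

    Cells missing from the input are left empty in the grid.
    """
    cells = list(cells)
    if not cells:
        return ""
    n_rows = max(row for row, _, _ in cells) + 1
    n_cols = max(col for _, col, _ in cells) + 1
    grid = [[""] * n_cols for _ in range(n_rows)]
    for row, col, text in cells:
        grid[row][col] = text
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(grid)
    return buffer.getvalue()


cells = [(0, 0, "Name"), (0, 1, "Year"), (1, 0, "Anna"), (1, 1, "1850")]
print(cells_to_csv(cells))
```

The resulting text can be saved with a `.csv` extension and opened in any spreadsheet program.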