ICDAR 2019 Competition on Baseline Detection (cBAD)

The database consists of 3021 manually annotated images collected from seven European archives. It is split into train, eval, and test sets. The train and eval sets include manually annotated groundtruth baselines as PAGE XML files [1]. The groundtruth of the test set will be released after all submitted methods have been evaluated and the final results have been made public. A more detailed introduction can be found in [2].

[1] PAGE XML documentation with reference implementations
[2] Competition description
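
As an illustration of how the groundtruth baselines could be read, the following minimal sketch parses a PAGE XML document with the Python standard library. It assumes that each baseline is stored as a Baseline element whose points attribute holds space-separated x,y pairs with integer coordinates, as in the PAGE format. The function name read_baselines and the sample document are hypothetical and not part of the competition material.

```python
import xml.etree.ElementTree as ET


def read_baselines(page_xml_text):
    """Return a list of baselines, each a list of (x, y) integer tuples."""
    root = ET.fromstring(page_xml_text)
    baselines = []
    for elem in root.iter():
        # Compare local tag names only, so any PAGE schema version
        # (i.e. any XML namespace) is accepted.
        if elem.tag.rsplit("}", 1)[-1] != "Baseline":
            continue
        points = []
        for pair in elem.get("points", "").split():
            x, y = pair.split(",")
            points.append((int(x), int(y)))
        baselines.append(points)
    return baselines


# Hypothetical sample document, not taken from the dataset.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">
  <Page imageFilename="cPAS-0003.jpg" imageWidth="100" imageHeight="50">
    <TextRegion id="r1">
      <TextLine id="l1">
        <Baseline points="10,20 40,22 90,21"/>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>"""

print(read_baselines(SAMPLE))  # [[(10, 20), (40, 22), (90, 21)]]
```

Matching on the local tag name rather than a fixed namespace is a design choice of this sketch: it avoids depending on one particular PAGE schema version.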


ICDAR 2019 cBAD dataset download

Evaluation Scheme

The ICDAR 2017 cBAD evaluation scheme is used to measure baseline errors. A detailed description of the evaluation scheme is available here. The evaluation tool that will be used for the competition is available as a standalone jar.

Submission Protocol

The submitted result file has to be a compressed tar file (.tar.gz) containing a separate result file (imagename.txt or imagename.xml) for each image in the test set. The folder structure must not be changed! If the test set contains three different images:
  • cPAS-0003.jpg
  • cPAS-0004.jpg
  • cPAS-0007.jpg
then the corresponding result file should contain:
  • cPAS-0003.jpg.txt
  • cPAS-0004.jpg.txt
  • cPAS-0007.jpg.txt
  • page/cPAS-0003.xml
  • page/cPAS-0004.xml
  • page/cPAS-0007.xml
Do not include the images in the result file. Even if your algorithm does not detect any text line in a certain image, you have to provide an "empty" result file for this image. A result file for a certain image can be either a valid PAGE XML file containing the baselines detected by your method, or a txt file containing the detected baselines. If you use text files, the baselines need to be encoded as follows:
Each row corresponds to a baseline. Different points are semicolon separated, x- and y-coordinates are comma-separated: x1,y1;x2,y2;x3,y3.
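
The text encoding and the archive layout described above can be sketched as follows. This is a minimal example under stated assumptions, not an official submission tool. The function names encode_baselines and pack_results are hypothetical. Result files are named after the image file name plus ".txt", as in the listing above, and no trailing newline is written, since the protocol does not specify one.

```python
import io
import tarfile


def encode_baselines(baselines):
    """Encode baselines as text: one row per baseline, 'x1,y1;x2,y2;...'."""
    return "\n".join(
        ";".join(f"{x},{y}" for x, y in line) for line in baselines
    )


def pack_results(results, archive_path):
    """Write one <imagename>.txt per image into a .tar.gz archive.

    `results` maps image file names to lists of baselines. An image
    without detections still gets an (empty) result file.
    """
    with tarfile.open(archive_path, "w:gz") as tar:
        for image_name, baselines in sorted(results.items()):
            data = encode_baselines(baselines).encode("utf-8")
            info = tarfile.TarInfo(name=f"{image_name}.txt")
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))


# Two baselines with two points each (made-up coordinates).
print(encode_baselines([[(12, 40), (310, 44)], [(15, 90), (305, 93)]]))
```

A call such as pack_results({"cPAS-0003.jpg": [[(12, 40), (310, 44)]], "cPAS-0004.jpg": []}, "submission.tar.gz") would then produce an archive with cPAS-0003.jpg.txt and an empty cPAS-0004.jpg.txt. The images themselves are never added.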


You can submit your results multiple times using the following links.


02-04-2019 Submission is open
18-02-2019 Training data is available
24-01-2019 Competition page is online

Important Dates

  • Training set online: 1st February 2019
  • Registration deadline: 5th May 2019
  • Submission deadline: 5th May 2019
  • ICDAR 2019: 20th-25th Sep. 2019


Markus Diem

[TU Wien, Computer Vision Lab] 

Markus Diem is a senior scientist at the Computer Vision Lab, TU Wien, Austria. His research interests are Cultural Heritage Applications and Document Analysis.

Florian Kleber

[TU Wien, Computer Vision Lab] 

Florian Kleber is currently a senior scientist at the Computer Vision Lab, Institute for Computer Aided Automation, TU Wien, Austria. His research interests are Cultural Heritage Applications and Document Analysis Applications.

Basilis Gatos

[NCSR Demokritos / IIT / CIL] 

Researcher at the Institute of Informatics and Telecommunications of the National Center for Scientific Research "Demokritos", Athens, Greece.