ICDAR2017 Competition on Handwritten Text Recognition on the READ Dataset (ICDAR2017 HTR)


All submissions and results for the ICDAR2017 Competition on Handwritten Text Recognition on the READ Dataset (ICDAR2017 HTR)

Name Method Info Submitter Affiliation Result is public Track Subtrack BLEU WER CER
Baseline This system has been trained using the first 40 pages of Train-A. The system is based on Laia, a deep learning toolkit for transcribing handwritten text images. Verónica Romero Universitat Politècnica de València Train-A data set and Train-B data set Training material 64.78 29.15
CNN-LSTM CNN-LSTM using seam carving line segmentation. Test xml parse. Curtis Wigington, Brian Davis BYU Computer Science Department Test-B: Advanced Track Advanced Subtrack
SFR_without_regions Results from our paper "Start, Follow, Read: End-to-End Full-Page Handwriting Recognition". This result did not use the provided regions of interest in the test data. Code, results, and trained weights are available at https://github.com/cwig/start_follow_read. BYU Computer Science Department, Curtis Wigington BYU Computer Science Department Test-B: Advanced Track Advanced Subtrack 72.3
SFR_with_regions Results from our paper "Start, Follow, Read: End-to-End Full-Page Handwriting Recognition". This result used the provided regions of interest in the test data. Code, results, and trained weights are available at https://github.com/cwig/start_follow_read. BYU Computer Science Department, Curtis Wigington BYU Computer Science Department Test-B: Advanced Track Advanced Subtrack 72.95
cnn-lstm CNN-LSTM using seam carving line segmentation Curtis Wigington BYU Computer Science Department Test-B: Advanced Track Advanced Subtrack 71.53
cnn-lstm CNN-LSTM Curtis Wigington BYU Computer Science Department Test-A: Traditional Track Traditional subtrack 19.06 7.01
cnn-lstm2 CNN-LSTM with seam carving based line segmentation Curtis Wigington BYU Computer Science Department Test-A: Traditional Track Traditional subtrack 19.06 7.01
MS-CRNN-COMB Combination of MS trained CRNN results. ParisTech Telecom ParisTech, France, and University of Balamand, Lebanon, Chafic Mokbel Telecom ParisTech, France, and University of Balamand, Lebanon, University of Balamand Test-A: Traditional Track Traditional subtrack 21.58 7.74
MS-CRNN MS trained CRNN on 80% of the data. ParisTech Telecom ParisTech, France, and University of Balamand, Lebanon, Chafic Mokbel Telecom ParisTech, France, and University of Balamand, Lebanon, University of Balamand Test-B: Advanced Track Advanced Subtrack 48.34
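
The WER and CER columns above are word and character error rates. This page does not state how the scores were computed, so the following is only a minimal sketch under the common definition: the Levenshtein edit distance between the reference transcript and the recognized text, divided by the reference length and expressed as a percentage. The function names (edit_distance, error_rate, cer, wer) and the whitespace tokenization are assumptions made for illustration, not the competition's evaluation tool.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (each edit costs 1)."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference units into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref_units, hyp_units):
    """Edit distance normalized by reference length, as a percentage."""
    if not ref_units:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref_units, hyp_units) / len(ref_units)


def cer(reference, hypothesis):
    """Character error rate: the units are single characters."""
    return error_rate(list(reference), list(hypothesis))


def wer(reference, hypothesis):
    """Word error rate: the units are whitespace-separated tokens (an assumption)."""
    return error_rate(reference.split(), hypothesis.split())
```

Under this definition both rates can exceed 100 when the recognized text contains many insertions, and lower values are better. Real evaluation tools may differ in tokenization, case handling, and punctuation treatment.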

News

30/6/2017
Dear participants,
The test data is now available for both the traditional track and the advanced track.

28/4/2017
Dear participants,
Remember to add your email address to the followers of this competition if you want to be continuously informed of news.

28/4/2017
Dear participants,
There is a remark regarding the data provided for this competition:
In this edition, the quality (and the resolution) of the images in some batches is not as good as in previous editions. To prepare this competition, we received the same images that you have available, and the Ground-Truth (GT) was prepared for these images by taking advantage of existing GT material (transcripts).
This issue may affect both the training data and the test data. For the test data, please note that the images come from different collections, and therefore the image resolution may not be the same for all test images.
Regarding the resolution of the images, low-resolution images are very frequent in archives (thousands of images, according to the archives involved in READ). This is because many collections were scanned some time ago, and some of them are not currently being scanned again (documents not currently available, low budgets, different priorities, ...). So this is a real problem that must be addressed for many collections residing in archives.
We apologize for not providing this information in advance.

3/4/2017 The training data is now available

7/2/2017 Announcement of the ICDAR2017 Competition on Handwritten Text Recognition on the READ Dataset

Important Dates

3 April 2017: competition opens

3 April 2017: training data available

15 June 2017: registration deadline

30 June 2017: test data available

14 July 2017: deadline for submitting results on the test data

Organizers

Verónica Romero

[Universitat Politècnica de València] 

Member of the Pattern Recognition and Human Language Technology Research Center of the Universitat Politècnica de València.

Enrique Vidal

[Universitat Politècnica de València] 

PhD in Physics from the Universitat de València (Spain), 1985. Full professor of Computer Science at the Universitat Politècnica de València. Member of the IEEE and a fellow of the IAPR.

Joan Andreu Sanchez

[Universitat Politècnica de València] 

Joan Andreu Sanchez is a professor at the Universitat Politècnica de València and a researcher in the Pattern Recognition and Human Language Technologies (PRHLT) research center.