ICFHR2016 Competition on Handwritten Text Recognition on the READ Dataset


Scoreboard

Ranking for Track 1: Restricted track

Position Name Method Info Submitter Affiliation Submitted before deadline Score
1 RWTH Unscaled images, after applying the image-enhancing pipeline provided by the setup, were used. For optical modelling, they used a ROVER combination of 16 MLSTM networks, each with about 5 MLSTM and convolutional layers and 3 max-pooling operations. The networks were trained using CTC. RWTH Human Language Technology and Pattern Recognition Group, Germany Human Language Technology and Pattern Recognition Group, Germany 3
2 PRHLT-char-lm A single model composed of CNN layers followed by BLSTM layers and a fully connected linear layer. It was trained with CTC using the Laia toolkit. An 8-gram character language model was used. Pattern Recognition and Human Language Technology (PRHLT) Universitat Politècnica de València 4
3 BYU The system used a CNN and CTC, based on the network described in: B. Shi, X. Bai, and C. Yao, “An end-to-end trainable neural network for image-based sequence recognition,” 2015, http://arxiv.org/abs/1507.05717. BYU Computer Science Department BYU Computer Science Department 5
4 A2IA For optical modelling, it used an MLSTM-RNN trained with CTC, alternating LSTM layers (in four directions), convolution layers with 2×4 subsampling, feed-forward merging directions and a non-linear function. Dropout was carried out after each LSTM. The softmax output layer models 87 characters. A2IA Artificial Intelligence and Image Analysis Artificial Intelligence and Image Analysis 8
5 PRHLT-no-lm A single model composed of CNN layers followed by BLSTM layers and a fully connected linear layer. It was trained with CTC using the Laia toolkit. No language model was used. Pattern Recognition and Human Language Technology (PRHLT) Universitat Politècnica de València 10
6 LITIS A three-layer BLSTM recurrent neural network trained with RNNLIB was used. No language model was used, and only the words in the training and validation sets were used as a lexicon. Decoding was based on the combination of multiple BLSTMs (over 20). LITIS Laboratoire d’Informatique, du Traitement de l’Information et des Systèmes, France Laboratoire d’Informatique, du Traitement de l’Information et des Systèmes, France 12
7 ParisTech A bidirectional Long Short-Term Memory (BLSTM) recurrent neural network recognizer consisting of two coupled recurrent neural networks. The value of an output unit at time step t is a linear combination of the outputs of the forward and backward hidden layers at that time step. ParisTech Telecom ParisTech, France, and University of Balamand, Lebanon Telecom ParisTech, France, and University of Balamand, Lebanon 14

Ranking for Track 2: Unrestricted track

Position Name Method Info Submitter Affiliation Submitted before deadline Score
1 A2IA For optical modelling, it used an MLSTM-RNN trained with CTC, alternating LSTM layers (in four directions), convolution layers with 2×4 subsampling, feed-forward merging directions and a non-linear function. Dropout was carried out after each LSTM. The softmax output layer models 87 characters. A2IA Artificial Intelligence and Image Analysis Artificial Intelligence and Image Analysis 2

News

The competition is open on the scriptnet platform.

Important Dates

1 March 2016 Competition opens, start of registration period, training data available, baseline system available.

31 May 2016 Registration deadline (no more participants will be admitted after this date).

12 June 2016 Test data available.

24 June 2016 Deadline for systems results.

26 June 2016 Deadline for sending short description of the submitted systems.

Oct 23-26, 2016 Winners and final ranking of all teams will be made public at the ICFHR 2016 conference.

Dec 2016 The competition is open on the scriptnet platform.

Organizers

Verónica Romero

[Universitat Politècnica de València] 

Member of the Pattern Recognition and Human Language Technology Research Center of the Universitat Politècnica de València.

Enrique Vidal

[Universitat Politècnica de València] 

PhD in Physics from the Universitat de València (Spain), 1985. Full professor of Computer Science at the Universitat Politècnica de València. Member of the IEEE and a fellow of the IAPR.

Joan Andreu Sanchez

[Universitat Politècnica de València] 

Joan Andreu Sanchez is a professor at the Universitat Politècnica de València and a researcher in the Pattern Recognition and Human Language Technologies (PRHLT) research center.