ICDAR2017 Handwritten Keyword Spotting Competition (ICDAR2017 KWS)



Keyword spotting (KWS) in handwritten documents can be considered a special case of image retrieval applied to text images. It aims to make available to users all instances of a given text query in a set of document images. Interest in handwritten KWS is steadily increasing because it is considered a feasible solution for indexing and retrieving handwritten documents without explicitly transcribing them.

The present competition is proposed for benchmarking the two main KWS settings, Query by Example (QbE) and Query by String (QbS), under homogeneous criteria. The evaluation will focus on assessing the KWS capabilities needed for large-scale applications of text retrieval in document images.

This is a joint effort of the organizers of the three latest KWS contests.

Unlike previous editions of similar competitions, and aiming to approach the requirements typically raised by indexing and retrieval applications, the evaluation will not be (mainly) based on the geometric accuracy of the bounding boxes of the spotted words. Instead, the result of a query will be judged by checking whether or not it succeeds in roughly finding the larger image regions, namely lines, where the keyword searched for is actually written. This is in line with the goals pursued in current important projects such as VTM, READ and HIMANIS, as well as with many recent papers on KWS.

On the other hand, we take into account that, when indexing large collections of handwritten document images, it is very common to have moderate amounts of transcribed images available -- or, if not, the cost of producing these transcripts is often negligible with respect to the overall cost of the indexing project. Therefore, in this competition a moderate amount of training data, in the form of transcribed page images, will be provided to all entrants.

Finally, again aiming to approach realistic large-scale search scenarios, the set of query words will be large and randomly selected.

Competition Outline

ICDAR2017-KWS comprises two tracks:

  • TRACK I -- Query-by-Example (QbE). The query words will be given as word images.
  • TRACK II -- Query-by-String (QbS). The query words will be given as text (character strings).

The rules below apply to both tracks, unless specified otherwise.

The dataset is split into two main subsets: Training & Validation and Evaluation. The first subset, including the full ground truth of each page image, will be made publicly available by April 2017. The ground truth of the second subset will be partially hidden from participants until the end of the competition.

Participants will have to provide a ranked list of "spots", sorted by confidence. Each spot shall contain a word ID, a bounding box specifying the span and position of the spot with respect to the full-page image, and a confidence score.

In order to ease the participation of teams relying on line- or word-segmentation-based KWS techniques, basic state-of-the-art automatic line and word segmentation results will also be provided to all entrants.

Dataset

The dataset used in this competition is composed of around 800 page images, written by several hands in different writing styles.


This set has been split into three subsets: Training (200 images), Validation (200 images) and Evaluation (the remaining images).

Each image in each of these subsets is accompanied by a PAGE metadata container; see the PRImA tools page for details. The PAGE files of the Training and Validation subsets contain all the available ground-truth details (transcripts, text-block and line segmentation, etc.). In the PAGE files of the Evaluation images, all these details except those of text-block segmentation will be removed.
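
For teams unfamiliar with the PAGE format, the following minimal Python sketch illustrates one way the text-line outlines and transcripts might be read from a PAGE file. It assumes the 2013-07-15 schema version with the usual Coords/TextEquiv layout; the namespace URL and the function name here are illustrative, and the released files may use a different schema version.

# Minimal sketch: read text-line outlines and transcripts from a PAGE file.
# Assumes the 2013-07-15 PAGE schema; the released files may use another version.
import xml.etree.ElementTree as ET

NS = {"pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"}

def read_lines(page_xml_path):
    """Return a list of (line_id, outline_points, transcript) tuples."""
    root = ET.parse(page_xml_path).getroot()
    lines = []
    for line in root.iterfind(".//pc:TextLine", NS):
        coords = line.find("pc:Coords", NS)
        # The line outline is stored as "x1,y1 x2,y2 ..." point pairs.
        points = [tuple(map(int, p.split(","))) for p in coords.get("points").split()]
        unicode_el = line.find("pc:TextEquiv/pc:Unicode", NS)
        transcript = unicode_el.text if unicode_el is not None else None
        lines.append((line.get("id"), points, transcript))
    return lines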

Keywords

A list of query keywords will be provided for evaluation. These keywords will be randomly extracted from the set of words which appear in both the Training and Evaluation subsets.

Keywords will be identified by their upper-case spellings. For the QbE track, each keyword will be accompanied by one or more small images representative of that word.

Evaluation

Evaluation will be based on Average Precision (AP) and mean AP (mAP).

To compute these measures, a given spot of a word w is considered a "hit" (success) if its bounding box overlaps by more than 50% with the (hidden) bounding box of a line which contains the word w, according to the (hidden) ground truth. Accordingly, for a set of query keywords, participants will provide a confidence-sorted list of spots in the format illustrated by this example:

WordID1 ImageID1 123 55 123 50 0.9811
WordID2 ImageID2 55 1333 100 55 0.8955
WordID3 ImageID3 1553 897 321 44 0.7654
...

where WordID1, WordID2, etc. are word identifiers; ImageID1, ImageID2, etc. are page-image identifiers; the four integers are the x, y coordinates of the upper-left corner of the spot bounding box, followed by the width and height of the box; and the real numbers are the confidence scores.

Page image IDs are just the names of the image files, without the file suffix, and each word ID simply consists of the upper-case spelling of the word. This implicitly assumes that evaluation is case-folded; that is, a spot is considered a hit regardless of the exact upper- and/or lower-case spelling of the spotted word and the word in the image.
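
As a concrete illustration of this format, here is a minimal Python sketch of a reader for such a submission file; the Spot record and the function name are our own illustrative choices, not part of any official tooling.

# Illustrative reader for the spot-list format above: one spot per line, with
# word ID, page image ID, x, y, width, height and a confidence score.
from collections import namedtuple

Spot = namedtuple("Spot", "word_id image_id x y w h score")

def read_spots(path):
    spots = []
    with open(path) as f:
        for raw in f:
            fields = raw.split()
            if not fields:
                continue
            x, y, w, h = map(int, fields[2:6])
            spots.append(Spot(fields[0], fields[1], x, y, w, h, float(fields[6])))
    # The evaluation assumes the list is ranked by decreasing confidence.
    spots.sort(key=lambda s: s.score, reverse=True)
    return spots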

If, for a keyword w, the ranked list contains more than one spot which overlaps more than 50% with the same line, then only the first such spot will be considered (as a hit or as a false alarm, depending on whether w is written in that line or not).
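
To make the scoring rules concrete, the following Python sketch computes AP and mAP over Spot records like those produced by the reader above. It rests on two assumptions of ours: "overlaps more than 50%" is read as intersection area relative to the spot's own box, and the ground truth is given as a map from (image ID, line box) pairs to whether the keyword occurs in that line. The official evaluation tool may implement these details differently.

# Illustrative scoring sketch; not the official evaluation tool.

def overlap_ratio(spot, line_box):
    """Intersection area of the spot box with a line box, relative to the spot box."""
    lx, ly, lw, lh = line_box
    iw = max(0, min(spot.x + spot.w, lx + lw) - max(spot.x, lx))
    ih = max(0, min(spot.y + spot.h, ly + lh) - max(spot.y, ly))
    return (iw * ih) / float(spot.w * spot.h)

def average_precision(spots, gt_lines):
    """AP for one keyword.

    spots: this keyword's spots, sorted by decreasing confidence.
    gt_lines: {(image_id, (x, y, w, h)): True if the keyword is in that line}.
    """
    n_relevant = sum(gt_lines.values())
    seen, hits, rank, ap = set(), 0, 0, 0.0
    for spot in spots:
        # Match the spot to the first line it overlaps by more than 50%.
        match = next(((img, box) for (img, box) in gt_lines
                      if img == spot.image_id and overlap_ratio(spot, box) > 0.5),
                     None)
        if match is not None and match in seen:
            continue  # only the first spot on a line counts; later ones are skipped
        if match is not None:
            seen.add(match)
        rank += 1
        if match is not None and gt_lines[match]:
            hits += 1
            ap += hits / float(rank)
    return ap / n_relevant if n_relevant else 0.0

def mean_ap(spots_by_word, gt_by_word):
    """mAP: the mean of the per-keyword APs."""
    aps = [average_precision(spots_by_word.get(w, []), gt_by_word[w])
           for w in gt_by_word]
    return sum(aps) / len(aps) if aps else 0.0

Note that, in this reading, later spots on an already-matched line are skipped entirely rather than counted as extra false alarms; the official tool may handle this corner case differently.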

Evaluation Platform and Dataset availability

This competition runs on the ScriptNet platform, which is aimed at competitions related to Handwritten Text Recognition and other Document Image Analysis research areas.

Training and Validation data, including the corresponding ground truth information, as well as all the Evaluation images (without ground truth), will be provided through ScriptNet during the ICDAR 2017 competition period.

After the ICDAR 2017 conference, we intend to keep this KWS competition running for one year, after which all data, including the ground truth used for evaluation, will be made freely available through this platform.


News

Submission deadline extended - it is open until 28th July 2017.

Important Dates

28 Feb 2017 - Competition is announced

7 Apr 2017 - Training and Validation data, together with the corresponding auxiliary scripts, will be made available on the competition web page

8 Jul 2017 - Release of the sets of query words and test images

28 Jul 2017 - Deadline for submitting results on the test set. A brief description of the methods and/or systems used to obtain these results should also be provided, to be included in the ICDAR2017 paper about this competition

Organizers

Ioannis Pratikakis

[Democritus University of Thrace] 

Associate Professor at the Department of Electrical and Computer Engineering, Democritus University of Thrace, Xanthi, Greece

Konstantinos Zagoris

[Democritus University of Thrace / ECE] 

Post-Doctoral Researcher at the Department of Electrical and Computer Engineering, Democritus University of Thrace, Xanthi, Greece.

Basilis Gatos

[NCSR Demokritos / IIT / CIL] 

Researcher at the Institute of Informatics and Telecommunications of the National Center for Scientific Research "Demokritos", Athens, Greece.

Giorgos Sfikas

[NCSR Demokritos / IIT / CIL] 

PhD in Computer Science from the University of Strasbourg. MSc and BSc in Computer Science and BA in History and Archaeology from the University of Ioannina. He currently works as a researcher at IIT/NCSR "Demokritos".

Joan Puigcerver

[Universitat Politècnica de València] 

Joan Puigcerver is a PhD student in Computer Science at the Universitat Politècnica de València. His main topics of interest are Handwritten Text Recognition and Keyword Spotting.

Enrique Vidal

[Universitat Politècnica de València] 

PhD in Physics from the Universitat de València (Spain), 1985. Full Professor of Computer Science at the Universitat Politècnica de València. Member of the IEEE and Fellow of the IAPR.

Joan Andreu Sanchez

[Universitat Politècnica de València] 

Joan Andreu Sanchez is a professor at the Universitat Politècnica de València and a researcher in the Pattern Recognition and Human Language Technologies (PRHLT) research center.