Connected Digits Recognition

In the Connected Digits Recognition task, systems are required to recognize sequences of spoken Italian digits (0 to 9). Two subtasks are defined, and participants may take part in either or both:

  • Clean digits: the test digit sequences are recorded in a clean environment;
  • Noisy digits: the test digit sequences are recorded in a noisy environment. The type of noise may vary (white noise, traffic, room noise, etc.).

The evaluation is based on the Minimum Edit Distance between the transcription produced by the recognizer and the reference orthographic annotation. Accuracy will be calculated at the word and phrase levels. Training and development material extracted from wide-band (16 kHz) corpora will be provided.
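
The scoring described above can be illustrated with a minimal sketch. The task description does not give the exact formulas, so the following are assumptions: unit costs for substitutions, deletions and insertions; word accuracy computed as 1 minus total edit distance over total reference words; and phrase accuracy computed as the fraction of sequences recognized with no errors. The function names `edit_distance` and `score` are hypothetical and are not part of any official scoring tool.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Minimum edit distance between two word sequences (unit costs assumed)."""
    # prev[j] holds the distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def score(pairs: List[Tuple[Sequence[str], Sequence[str]]]) -> Tuple[float, float]:
    """Return (word accuracy, phrase accuracy) over (reference, hypothesis) pairs.

    Assumed definitions, not confirmed by the task description:
      word accuracy   = 1 - total edit distance / total reference words
      phrase accuracy = sequences with zero errors / number of sequences
    """
    total_words = sum(len(ref) for ref, _ in pairs)
    if not pairs or total_words == 0:
        raise ValueError("at least one non-empty reference is required")
    total_errors = 0
    correct_phrases = 0
    for ref, hyp in pairs:
        distance = edit_distance(ref, hyp)
        total_errors += distance
        if distance == 0:
            correct_phrases += 1
    return 1 - total_errors / total_words, correct_phrases / len(pairs)


if __name__ == "__main__":
    # Two made-up digit sequences; the second hypothesis drops "cinque".
    example = [
        ("uno due tre".split(), "uno due tre".split()),
        ("quattro cinque sei sette".split(), "quattro sei sette".split()),
    ]
    word_acc, phrase_acc = score(example)
    print(f"word accuracy: {word_acc:.3f}, phrase accuracy: {phrase_acc:.3f}")
```

Under these assumptions, insertions count against word accuracy, so a hypothesis with many spurious digits can push the value below zero.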

Task materials: Detailed Guidelines

Data Distribution: The data are freely available; no fee is required. Contact: Roberto Gretter, gretter[at]fbk.eu

  • Test data are now available.
  • Training data consist of 5348 sentences (17505 digits); development data consist of 515 sentences (3569 digits).

Organizers: Gianpaolo Coro (ABLA, Milano), Roberto Gretter (FBK-irst, Trento) and Marco Matassoni (FBK-irst, Trento)