Word Sense Disambiguation Task – All-Words

In the all-words WSD task, systems must tag almost all of the content words in a sample of running text. Participants will be provided with a corpus of about 5,000 words extracted from the Italian Syntactic Semantic Treebank. Content words (nouns, verbs, adjectives and a small set of proper nouns) are to be semantically tagged according to the sense inventory of ItalWordNet. Participants can obtain the XML file of “ItalWordNet for EVALITA” from ELDA (Evaluations and Language resources Distribution Agency). A file with groups of ItalWordNet senses will be provided to allow coarse-grained scoring.
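To illustrate how fine-grained and coarse-grained scoring can differ, here is a minimal sketch in Python. It is not the official scorer: the instance identifiers, the sense labels (such as "banca#1"), the in-memory representation of the sense groups, and the choice of precision and recall as measures are all assumptions made for the example. Under coarse-grained scoring, an answer is counted as correct if it matches the gold sense exactly or falls in the same sense group as the gold sense.

```python
from typing import Dict, Iterable, Optional, Set, Tuple


def build_group_index(groups: Iterable[Set[str]]) -> Dict[str, int]:
    """Map each sense label to the index of the group that contains it."""
    index: Dict[str, int] = {}
    for group_id, senses in enumerate(groups):
        for sense in senses:
            index[sense] = group_id
    return index


def is_correct(gold_sense: str, predicted_sense: str,
               group_index: Optional[Dict[str, int]] = None) -> bool:
    """Exact match always counts; a same-group match counts only when
    a group index is supplied (coarse-grained scoring)."""
    if predicted_sense == gold_sense:
        return True
    if group_index is None:
        return False
    gold_group = group_index.get(gold_sense)
    return gold_group is not None and gold_group == group_index.get(predicted_sense)


def score(gold: Dict[str, str], predicted: Dict[str, str],
          groups: Optional[Iterable[Set[str]]] = None) -> Tuple[float, float]:
    """Return (precision, recall).

    Precision is computed over the instances the system attempted,
    recall over all gold instances.
    """
    group_index = build_group_index(groups) if groups is not None else None
    attempted = [i for i in predicted if i in gold]
    correct = sum(is_correct(gold[i], predicted[i], group_index) for i in attempted)
    precision = correct / len(attempted) if attempted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical data: four gold instances, three of them attempted by the system.
gold = {"w1": "casa#1", "w2": "banca#2", "w3": "correre#1", "w4": "piano#3"}
predicted = {"w1": "casa#1", "w2": "banca#1", "w3": "correre#4"}
groups = [{"banca#1", "banca#2"}, {"piano#1", "piano#3"}]

fine = score(gold, predicted)            # only w1 is an exact match
coarse = score(gold, predicted, groups)  # w2 also counts: same sense group
print("fine-grained:", fine)
print("coarse-grained:", coarse)
```

In this toy example the answer for "w2" is wrong under fine-grained scoring but accepted under coarse-grained scoring, because the predicted and gold senses belong to the same group.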

Organizers

  • Nicoletta Calzolari (ILC-CNR, Pisa, Italy – glottolo[at]ilc.cnr.it)
  • Francesca Bertagna (ILC-CNR, Pisa, Italy – francesca.bertagna[at]ilc.cnr.it)

Data Distribution

Participants in the all-words task can obtain “ItalWordNet for EVALITA” from ELDA (Evaluations and Language resources Distribution Agency) by contacting Ms Valerie Mapelli at mapelli[at]elda.org, who will provide information on the licensing and delivery procedure. Test data can be obtained by contacting Francesca Bertagna at francesca.bertagna[at]ilc.cnr.it.

Detailed guidelines

Trial examples