In the Part-of-Speech Tagging task, systems are required to assign a tag to each token in a set of sentences. Each tag combines a lexical category (PoS tag) with morphological features. The task is similar to the EVALITA 2007 PoS Tagging evaluation task.
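As a rough illustration of what such a tag looks like from a system's point of view, the sketch below pairs each token with a tag made of a category and a feature string, and applies a simple most-frequent-tag baseline. Everything in it is an assumption made for illustration: the `Tag` class, the labels such as `DET` or `fem+sing`, and the helper names are hypothetical and are not taken from the Tanl tag set or from the task guidelines.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """A tag combining a lexical category with morphological features."""
    category: str  # hypothetical label, not an actual Tanl category
    features: str  # hypothetical encoding of the morphological features


def train_baseline(sentences):
    """Map each lowercased token to the tag it carries most often."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        for token, tag in sentence:
            counts[token.lower()][tag] += 1
    return {token: c.most_common(1)[0][0] for token, c in counts.items()}


def tag_sentence(model, tokens, default):
    """Tag every token, falling back to `default` for unseen tokens."""
    return [(token, model.get(token.lower(), default)) for token in tokens]


# Toy annotated data; the tags are invented for this example.
DET_FS = Tag("DET", "fem+sing")
NOUN_FS = Tag("NOUN", "fem+sing")
UNKNOWN = Tag("UNK", "")

training = [
    [("la", DET_FS), ("casa", NOUN_FS)],
    [("la", DET_FS), ("porta", NOUN_FS)],
]

model = train_baseline(training)
tagged = tag_sentence(model, ["La", "casa", "brucia"], default=UNKNOWN)
for token, tag in tagged:
    print(token, tag.category, tag.features)
```

A baseline of this kind ignores context entirely, so it only serves to show the shape of the input and output; a competitive system would need to disambiguate tokens that admit more than one tag.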

Participants will be provided with a development data set annotated with the Tanl tag set. Together, the development and test sets will comprise about 110,000 tokens from newspaper articles.

Task materials

Detailed Guidelines [02/07/2009]

Data Distribution

All participants in the EVALITA 2009 Part-of-Speech Tagging task must accept the terms of a license agreement before receiving training or evaluation data. To download the data, please refer to the following web page: http://medialab.di.unipi.it/evalita/

If you have any problems, please contact: Maria Simi, simi[at]di.unipi.it


Giuseppe Attardi and Maria Simi (Uni. Pisa)