In the Part-of-Speech Tagging task, systems are required to assign a lexical category (PoS tag) to each token in a set of sentences. The task is akin to the GRACE PoS tagging evaluation task for French. Participants will be provided with a development data set containing texts extracted from the CORIS/CODIS corpus and annotated with two different tagsets, both of which will be used in the evaluation. Together, the development and test sets will comprise about 150,000 tokens drawn from newspaper and literary texts.
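As an illustration of what assigning one tag per token means for scoring, here is a minimal sketch of a token-level accuracy computation. It rests on assumptions that are not stated in the task description: that systems are scored by per-token accuracy against a gold annotation, that data is held as lists of (token, tag) pairs, and that the tag names shown (DET, NOUN, VERB, PUNCT) are placeholders rather than either of the task's actual tagsets. The function name tagging_accuracy is hypothetical.

```python
from typing import List, Tuple

# A sentence is a list of (token, tag) pairs. This layout is an assumption
# made for illustration, not the task's official data format.
TaggedSentence = List[Tuple[str, str]]


def tagging_accuracy(gold: List[TaggedSentence],
                     predicted: List[TaggedSentence]) -> float:
    """Return the fraction of tokens whose predicted tag equals the gold tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same number of sentences")

    correct = 0
    total = 0
    for gold_sent, pred_sent in zip(gold, predicted):
        # Both annotations must cover exactly the same token sequence.
        if [tok for tok, _ in gold_sent] != [tok for tok, _ in pred_sent]:
            raise ValueError("token sequences differ between gold and predicted")
        for (_, gold_tag), (_, pred_tag) in zip(gold_sent, pred_sent):
            correct += int(gold_tag == pred_tag)
            total += 1

    if total == 0:
        raise ValueError("no tokens to evaluate")
    return correct / total


# Placeholder tags, not the official tagsets.
gold = [[("Il", "DET"), ("gatto", "NOUN"), ("dorme", "VERB"), (".", "PUNCT")]]
pred = [[("Il", "DET"), ("gatto", "NOUN"), ("dorme", "NOUN"), (".", "PUNCT")]]

print(tagging_accuracy(gold, pred))  # 0.75 (3 of 4 tokens tagged correctly)
```

Because the data is annotated with two tagsets, a system would presumably be scored once per tagset, each time against the gold annotation in that tagset.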
Fabio Tamburini (Dipartimento di Studi Linguistici e Orientali, Università di Bologna – fabio.tamburini[at]unibo.it)
All participants in the EVALITA 2007 Part-of-Speech Tagging task must accept the terms of a license agreement before receiving training or evaluation data. To do so, please contact the organizer directly: Fabio Tamburini (fabio.tamburini[at]unibo.it).