Anaphora Resolution

In the (nominal) Anaphora Resolution (AR) task, noun phrases are first classified as referring (mentions of entities) or non-referring (e.g., expletive or predicative), and the mentions of the same entity are then grouped together. Unlike traditional MUC- or ACE-style coreference tasks, where only a few types of entities are considered relevant, all entity mentions are to be considered. Also unlike the coreference task, predicative NPs are not considered to be ‘mentions’ of the entity they specify a property of: in “Rovereto is a city with 35,000 inhabitants,” only “Rovereto” is considered to be referring, whereas the NP “a city with 35,000 inhabitants” is analyzed as expressing a property of the entity “Rovereto”. This definition underlies the ARRAU and OntoNotes corpora of English, the ANCORA corpus of Catalan and Spanish, the TUBA/DZ corpus of German, and the LiveMemories corpus of Italian, among others. In this task, each document is processed separately, and entities mentioned in different documents are treated as different entities. The evaluation will be based on the Wikipedia subset of the LiveMemories Anaphora Resolution Corpus (LM-WIKI).

Task materials

Detailed Guidelines [18/08/2011]

Data Distribution

Test data [04/10/2011]
LATEST version of the training data [12/09/2011]. This version incorporates several rounds of corrections. It also contains annotations of minimal spans, which will be used for mention alignment (cf. guidelines). Minimal spans have been added to all the labels in columns 17 and 18 using BIO notation (“B-MIN”, “I-MIN” and “O”) and are encoded as the last part of each composite label. Please note that no adjusted minimal spans are provided for documents 53, 69 and 68; for these documents, minimal spans are equal to the mention boundaries.
OLD version of the training data
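
The BIO encoding of minimal spans can be decoded with a few lines of code. The following is a minimal sketch, not part of the official task tools. It assumes that the minimal-span part ("B-MIN", "I-MIN" or "O") has already been split off from each composite label as described in the guidelines. The function name bio_to_spans and the example tag sequence are hypothetical and serve only as an illustration.

```python
from typing import List, Tuple


def bio_to_spans(tags: List[str], label: str = "MIN") -> List[Tuple[int, int]]:
    """Convert a BIO tag sequence into (start, end) token spans, end exclusive.

    Assumption: `tags` holds one tag per token, already isolated from the
    composite labels; the exact label layout is defined in the guidelines.
    """
    begin, inside = f"B-{label}", f"I-{label}"
    spans: List[Tuple[int, int]] = []
    start = None  # index of the first token of the currently open span
    for i, tag in enumerate(tags):
        if tag == begin:
            # A new span starts here; close any span that is still open.
            if start is not None:
                spans.append((start, i))
            start = i
        elif tag == inside:
            # Continue the open span; tolerate an I- tag with no preceding B-.
            if start is None:
                start = i
        else:
            # "O" (or any other tag) closes the open span.
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans


# Illustrative tag sequence (not taken from the corpus).
example_tags = ["O", "B-MIN", "I-MIN", "O", "B-MIN"]
print(bio_to_spans(example_tags))  # [(1, 3), (4, 5)]
```

Spans are returned as half-open token intervals, so a span (1, 3) covers tokens 1 and 2. For documents without adjusted minimal spans, the same function would simply return the mention boundaries.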

For any problems, please contact: Olga Uryupina, uryupina[at]gmail.com

Organizers

Massimo Poesio and Olga Uryupina (University of Trento)