2011-2014 - ANCOR

Anaphore et coréférence pour la recherche d’information dans les Corpus Oraux
(Anaphora and coreference for information retrieval in spoken corpora)

Centre-Val de Loire Regional Funds

Partners:
  • LLL (Laboratoire Ligérien de Linguistique - CNRS & Orléans University)
  • LI

Leader: Jean-Yves Antoine

Description:
The main aim of the ANCOR project was to develop ANCOR_Centre, the largest corpus of spoken French annotated
with coreference and anaphoric relations. With a size approaching 500,000 words, ANCOR_Centre has no equivalent for French
and represents one of the largest coreference corpora of spontaneous speech.
It is freely distributed under a CC-BY-NC-SA Creative Commons licence.

The project was carried out jointly by an NLP team (LI-BDTLN) and a linguistics team (LLL),
which led to a detailed annotation that meets the needs of both the NLP and the corpus linguistics communities.

The ANCOR_Centre corpus has already enabled the development of coreference resolution systems (CROC) as well as linguistic investigations
based on this resource.


Web:
http://tln.li.univ-tours.fr/Tln_Corpus_Ancor.html (French)

Publications:

Muzerelle J., Lefeuvre A., Schang E., Antoine J.-Y., Pelletier A., Maurel D., Eshkol I., Villaneau J. (2014)
ANCOR_Centre, a Large Free Spoken French Coreference Corpus: Description of the Resource and Reliability Measures.
Proc. LREC’2014, Reykjavik, Iceland.
[https://hal.archives-ouvertes.fr/hal-01075679]

Schang E., Boyer A., Muzerelle J., Antoine J.-Y., Eshkol I., Maurel D. (2011)
Coreference and Anaphoric Annotations for Spontaneous Speech Corpora in French.
Proc. Discourse Anaphora and Anaphor Resolution Colloquium, DAARC’2011, Faro, Portugal.
[https://halshs.archives-ouvertes.fr/halshs-00764786]