
Using Parallel Corpora for Multilingual (Multi-Document) Summarisation Evaluation
We present a method for evaluating multilingual multi-document summarisation that saves precious annotation time and makes the evaluation results directly comparable across languages. The approach is based on manually selecting the most important sentences in a cluster of documents from a sentence-aligned parallel corpus and projecting that sentence selection to various target languages. We also present two ways of exploiting inter-annotator agreement levels, apply both to a baseline sentence extraction summariser in seven languages, and discuss the differences in results between the two evaluation versions, as well as a preliminary comparison between languages. The same method can in principle be used to evaluate single-document summarisers or information extraction tools.
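
The projection step described in the abstract can be illustrated with a minimal sketch. It assumes the parallel corpus is held as one sentence list per language, with the same index referring to the same aligned sentence in every language. The function names, the toy data and the vote-threshold rule for combining annotators are hypothetical illustrations, not the two agreement schemes defined in the publication.

```python
from collections import Counter


def votes_per_sentence(annotations):
    """Count how many annotators selected each sentence index."""
    return Counter(i for selected in annotations for i in set(selected))


def gold_by_threshold(annotations, min_votes):
    """Hypothetical agreement rule: keep sentences chosen by at least min_votes annotators."""
    votes = votes_per_sentence(annotations)
    return {i for i, count in votes.items() if count >= min_votes}


def project_selection(selected_ids, aligned_corpus):
    """Map one set of selected sentence indices onto every language of the aligned corpus."""
    return {
        lang: [sentences[i] for i in sorted(selected_ids)]
        for lang, sentences in aligned_corpus.items()
    }


# Toy sentence-aligned corpus (assumed layout: index i is the same sentence in each language).
aligned_corpus = {
    "en": ["Sentence A.", "Sentence B.", "Sentence C."],
    "fr": ["Phrase A.", "Phrase B.", "Phrase C."],
}

# Each annotator marks the indices of the sentences judged most important.
annotations = [{0, 2}, {0, 1}, {0, 2}]

gold = gold_by_threshold(annotations, min_votes=2)
projected = project_selection(gold, aligned_corpus)
```

Because the selection is made once on sentence indices, the same gold standard is reused for every language, which is what makes scores comparable across languages in this sketch.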
Year: 2010

Authors of this publication:

Josef Steinberger
E-mail: jstein@kiv.zcu.cz
Related Projects:

Automatic Text Summarisation
Authors: Josef Steinberger, Karel Ježek, Michal Campr, Jiří Hynek
Desc.: Automatic text summarisation using various text mining methods, mainly Latent Semantic Analysis (LSA).