Machine Translation for Multilingual Summary Content Evaluation

The multilingual summarization pilot task at TAC'11 exposed many of the problems we face when trying to evaluate summary quality in different languages. The additional language dimension greatly increases annotation costs. For the TAC pilot task, English articles were first translated into six other languages, model summaries were written, and submitted system summaries were evaluated. We start by discussing whether ROUGE can produce system rankings similar to those obtained from manual summary scoring, by measuring their correlation. We then study three ways of projecting summaries to a different language: projection through sentence alignment in the case of parallel corpora, simple summary translation, and summarizing machine-translated articles. Building such summaries gives the opportunity to run additional experiments and reinforce the evaluation. Finally, we examine whether machine-translated models can perform close to original models.
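The correlation check mentioned above can be illustrated with a minimal sketch (not taken from the paper): given per-system average ROUGE scores and manual content scores, we compare the system rankings they induce using Kendall's tau. The system names and score values below are hypothetical, purely for illustration.

```python
# Minimal sketch: rank correlation between ROUGE-based and manual system rankings.
# All system names and scores are made up for illustration.
from scipy.stats import kendalltau

# hypothetical per-system averages
rouge_scores  = {"sysA": 0.41, "sysB": 0.38, "sysC": 0.35, "sysD": 0.33}
manual_scores = {"sysA": 3.9,  "sysB": 4.1,  "sysC": 3.2,  "sysD": 3.0}

systems = sorted(rouge_scores)                # fixed system order
rouge   = [rouge_scores[s] for s in systems]
manual  = [manual_scores[s] for s in systems]

# Kendall's tau measures how similarly the two score lists rank the systems
tau, p_value = kendalltau(rouge, manual)
print(f"Kendall tau = {tau:.3f} (p = {p_value:.3f})")
```

A high tau would suggest that ROUGE reproduces the manual ranking reasonably well; a low or negative tau would indicate that the automatic metric and the manual scores disagree on system ordering.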

Keywords: multilingual summarization evaluation, machine translation, parallel corpora, TAC multiling

Year: 2012

Download: Full text

Authors of this publication:


Josef Steinberger


E-mail: jstein@kiv.zcu.cz

Josef is an associate professor at the Department of Computer Science and Engineering at the University of West Bohemia in Pilsen, Czech Republic. He is interested in media monitoring and analysis, mainly automatic text summarisation, sentiment analysis and coreference resolution.

Related Projects:


Automatic Text Summarisation

Authors:  Josef Steinberger, Karel Ježek, Michal Campr, Jiří Hynek
Desc.: Automatic text summarisation using various text mining methods, mainly Latent Semantic Analysis (LSA).