JRC’s Participation at TAC 2011: Guided and Multilingual Summarization Tasks

The paper describes our participation in the Guided and Multilingual Summarization Tasks at the Text Analysis Conference 2011 (TAC’11). We participated in the Guided task with the previous year's system, which combines aspect identification by an event extraction system and automatically learned lexicons with an LSA-based summarizer. This year we added temporal analysis to improve sentence ordering, the detection of update information, and the handling of the WHEN aspect. In our second run, we made a first attempt at compressing and paraphrasing sentences. Multilingual summarization is our ultimate goal, and thus all components of the system are either fully language-independent or can be adapted to other languages relatively easily. The Multilingual task provided an opportunity to test the system on languages other than English. The sentence-extractive summarizer was ranked among the top systems in readability and non-redundancy. Although the content of its summaries was not ranked at the top for English in the main Guided task, it achieved top results in the Multilingual task. The generative run suffered from worse readability, which also affected the content scores.

Keywords: Summarization, multilingual, event extraction, summarization evaluation, text analysis conference

Year: 2012

Authors of this publication:


Josef Steinberger


E-mail: jstein@kiv.zcu.cz

Josef is an associate professor at the Department of Computer Science and Engineering at the University of West Bohemia in Pilsen, Czech Republic. He is interested in media monitoring and analysis, mainly automatic text summarisation, sentiment analysis and coreference resolution.

Related Projects:


Project

Automatic Text Summarisation

Authors:  Josef Steinberger, Karel Ježek, Michal Campr, Jiří Hynek
Desc.: Automatic text summarisation using various text mining methods, mainly Latent Semantic Analysis (LSA).