
Text Summarization within the LSA Framework
This thesis deals with the development of a new text summarization method that uses latent semantic analysis (LSA). This language-independent analysis is able to capture interrelationships among terms, so that we can obtain a representation of document topics. The proposed summarization approach exploits this feature. The method is original in combining both lexical and anaphoric information. Moreover, anaphora resolution is employed to correct false references in the summary. I then describe a new sentence compression algorithm that takes advantage of the properties of LSA. Next, motivated by the ability of LSA to extract the topics of a text, I present a method that evaluates the similarity between the main topics of an original text and its summary. Using summaries in the multilingual searching system MUSE led to better user orientation in the retrieved texts and to faster searching when summaries were indexed instead of full texts.
Keywords: Summarization, latent semantic analysis, anaphora resolution, sentence compression, summary evaluation, multilingual searching
Year: 2007

Authors of this publication:

Josef Steinberger
E-mail: jstein@kiv.zcu.cz
Related Projects:

Automatic Text Summarisation
Authors: Josef Steinberger, Karel Ježek, Michal Campr, Jiří Hynek
Desc.: Automatic text summarisation using various text mining methods, mainly Latent Semantic Analysis (LSA).