In Czech: Klasifikace multilinguálních korpusů s využitím tezauru EuroWordNet

Classification of Multilingual Corpora using the EuroWordNet Thesaurus

This paper presents experimental results for multilingual document categorization. We describe a comparison of the algorithmic principles and methodologies used in our classification system. The aim of the experiments was to verify how the use of a multilingual thesaurus affects the quality of cross-language categorization. The results are presented at the end of the article.

Keywords: classification, text corpus, thesaurus, EuroWordNet

Year: 2004

Download: Full text [139 kB]

Authors of this publication:

Michal Toman


Michal graduated from UWB in 2003, specializing in software engineering. He is currently a PhD student interested in information retrieval, multilingual text processing, word sense disambiguation, and knowledge discovery.

Karel Ježek

Phone: +420 377632475

Karel is the former group coordinator and a supervisor of PhD students working on the group's research projects.