In Czech: Klasifikace multilinguálních korpusů s využitím tezauru EuroWordNet

Classification of Multilingual Corpora using the EuroWordNet Thesaurus

This paper presents experimental results for multilingual document categorization. We compare the algorithmic principles and methodologies used in our classification system. The aim of the experiments was to verify the impact of using a multilingual thesaurus on the quality of cross-language categorization. The results are presented at the end of the article.

Keywords: classification, text corpus, thesaurus, EuroWordNet

Year: 2004

Download: Full text [139 kB]

Authors of this publication:


Michal Toman


E-mail: mtoman@kiv.zcu.cz

Michal graduated from UWB in 2003, specializing in software engineering. He is currently a PhD student interested in information retrieval, multilingual text processing, word sense disambiguation, and knowledge discovery.

Karel Ježek


Phone:  +420 377632475, 377632400
E-mail: jezek_ka@kiv.zcu.cz
WWW: http://www-kiv.zcu.cz/~jezek_ka/

Karel is a group coordinator and a supervisor of PhD students working on research projects of this Group.