In Czech: Klasifikace multilinguálních korpusů s využitím tezauru EuroWordNet

Classification of Multilingual Corpora using the EuroWordNet Thesaurus

This paper presents experimental results for multilingual document categorization. We compare the algorithmic principles and methodologies used in our classification system. The aim of the experiments was to verify how the use of a multilingual thesaurus affects the quality of cross-language categorization. The results are presented at the end of the article.

Keywords: classification, text corpus, thesaurus, EuroWordNet

Year: 2004

Download: Full text [139 kB]

Authors of this publication:


Michal Toman


E-mail: mtoman@kiv.zcu.cz

Michal graduated from UWB in 2003, specializing in software engineering. He is currently a PhD student interested in information retrieval, multilingual text processing, word sense disambiguation and knowledge discovery.

Karel Ježek


Phone:  +420 377632475
E-mail: jezek_ka@kiv.zcu.cz
WWW: https://cs.wikipedia.org/wiki/Karel_Je%C5%BEek_(informatik)

Karel is the former group coordinator and a supervisor of PhD students working on research projects of this group.