
Documents Categorization in Multilingual Environment
This paper deals with various methods for multilingual document categorization and reports the results of experiments in which EuroWordNet (EWN) plays the central role and serves as the fundamental problem-solving tool. We describe both the algorithmic principles and the methodologies used in our classification system and then demonstrate their functionality with experimental results. The aim of the experiments was to verify the impact of a multilingual collection on the quality of categorization, to find out how a thesaurus can be used to improve classification, and to determine how a multilingual thesaurus can generalize the monolingual version of categorization.
Keywords: multilingual document categorization, EuroWordNet
Year: 2005

Authors of this publication:

Karel Ježek
Phone: +420 377632475
E-mail: jezek_ka@kiv.zcu.cz
WWW: https://cs.wikipedia.org/wiki/Karel_Je%C5%BEek_(informatik)

Michal Toman
E-mail: mtoman@kiv.zcu.cz
Related Projects:

Document Classification
Authors: Jiří Hynek, Karel Ježek, Michal Toman, Roman Tesař, Zdeněk Češka, Petr Grolmus
Description: Use of inductive machine learning methods in classification of short text documents.