Document Classification Using Itemsets

This paper develops a method for automating the time-consuming task of document classification in a digital library. The proposed method is based on itemsets and extends the traditional application of the Apriori algorithm.

Keywords: itemset, classification, class generation, cluster, clustering, apriori algorithm, document similarity, document categorization

Year: 2000

Download: Full text [43 kB]

Authors of this publication:


Jiří Hynek


Phone: +420 603492837
E-mail: jhynek@kiv.zcu.cz
WWW: http://www.kiv.zcu.cz/staff/osobni.php?id_osoby=147&lang=EN

Jiri, a co-founder of the Text-Mining Research Group, works as a lecturer at the Dept. of Computer Science and Engineering. His research interests include machine learning and language-related problems. His teaching focuses on good writing style and technical writing in general.

Karel Ježek


Phone: +420 377632475, 377632400
E-mail: jezek_ka@kiv.zcu.cz
WWW: http://www-kiv.zcu.cz/~jezek_ka/

Karel is the group coordinator and supervises PhD students working on the Group's research projects.

Related Projects:


Project

Document Classification

Authors: Jiří Hynek, Karel Ježek, Michal Toman, Roman Tesař, Zdeněk Češka, Petr Grolmus
Desc.: Use of inductive machine learning methods in the classification of short text documents.