A Web-Based User-Profile Generator: Foundation for a Recommender and Expert Finding System

The objective of our research is to create a universal tool that recommends interesting web pages a user has not yet visited, as well as experts working in the same field of specialty. We emphasize the practical adaptability of user profiles. User profiles are generated using the Suffix Tree Clustering (STC) algorithm, which is similar to creating an inverted list of phrases occurring in a document collection. We compute the similarity of characteristic phrases identified by STC in order to find clusters of phrases. Phrases linked by similarity relationships form a phrase association graph. The clusters of phrases generated by our tool define the interests of each user. We have tested the system on various document collections, such as Reuters Corpus Volume One (RCV1), 20Newsgroups, CTK (Czech Press Agency), and Reuters-21578. The paper presents experimental results based on our extensive simulations as well as a real-life environment. The precision of our recommender system is 85 to 95%.
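The pipeline outlined in the abstract (an inverted list of phrases, phrase similarity, a phrase association graph, and phrase clusters) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes plain word n-grams in place of a real suffix tree, and it borrows the document-overlap criterion from the standard STC merge step as the similarity measure; the paper's own measure may differ. The function names and the parameters `max_len`, `min_docs`, and `threshold` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def phrase_index(docs, max_len=3, min_docs=2):
    """Inverted list: map each word n-gram to the set of documents containing it.

    Assumption: n-grams of up to max_len words stand in for the phrases
    that a suffix tree would identify.
    """
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        words = text.lower().split()
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                index[" ".join(words[i:i + n])].add(doc_id)
    # Keep only phrases shared by at least min_docs documents.
    return {p: ids for p, ids in index.items() if len(ids) >= min_docs}


def association_graph(index, threshold=0.5):
    """Link two phrases when their document sets overlap strongly in both directions.

    Assumption: this overlap rule is taken from the usual STC merge step.
    """
    graph = {p: set() for p in index}
    for a, b in combinations(index, 2):
        shared = len(index[a] & index[b])
        if shared / len(index[a]) > threshold and shared / len(index[b]) > threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def phrase_clusters(graph):
    """Return the connected components of the phrase association graph."""
    seen, clusters = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        clusters.append(component)
    return clusters
```

Calling `phrase_clusters(association_graph(phrase_index(docs)))` on a list of page texts yields sets of related phrases. In the setting described above, each such set would correspond to one interest in a user's profile.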

Keywords: text mining, user profile, web, www, recommender system, expert search, clustering, suffix tree, phrase search, characteristic phrase, similarity, packet filter

Year: 2004

Download: Full text [143 kB]

Authors of this publication:

Petr Grolmus

E-mail: indy@civ.zcu.cz

Petr was a co-founder of the Text-Mining Research Group. His work focused mainly on identifying user profiles based on users' behavior on the Web.

Jiří Hynek

Phone: +420 603492837
E-mail: jhynek@kiv.zcu.cz
WWW: http://www.kiv.zcu.cz/staff/osobni.php?id_osoby=147&lang=EN

Jiri, a co-founder of the Text-Mining Research Group, works as a lecturer at the Dept. of Computer Science and Engineering. His research interests include machine learning and language-related problems. Jiri’s teaching activity is focused on good writing style and technical writing in general.

Karel Ježek

Phone: +420 377632475, 377632400
E-mail: jezek_ka@kiv.zcu.cz
WWW: http://www-kiv.zcu.cz/~jezek_ka/

Karel is the group coordinator and supervises PhD students working on the Group's research projects.

Related Projects:


User Profile Mining, Social Networks

Authors: Jiří Hynek, Petr Grolmus, Karel Ježek
Desc.: Identification of user profiles based on users' behavior on the web. Practical applications in various knowledge and information management projects.