Text-Mining Research Group

Cluster labeling with Linked Data

In this article, we introduce our approach to cluster labeling with Linked Data. Clustering web pages into semantically related groups promises better performance in searching the Web. Currently, only specialized semantic search engines cluster their results. In other engines, the quality of clusters is doubtful, and a dependable system for labeling them is lacking. Linked Data is a set of principles for publishing structured data in a machine-readable way that links it to other Web resources. This enables data from different sources to be connected and queried over the Internet. The information from Linked Data can be used for preliminary estimates of the topics covered by a set of documents. Topics are represented as Linked Data resources and are used to produce smooth, human-readable cluster labels.
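As a rough illustration of the idea of labeling a cluster with Linked Data resources, the sketch below assumes that each document in a cluster has already been mapped to a list of resource URIs. That mapping step is not shown, and the document-frequency heuristic, the function names `label_cluster` and `readable`, and the sample DBpedia-style URIs are all assumptions made for this example, not the method described in the article.

```python
from collections import Counter


def label_cluster(doc_resources, top_n=1):
    """Return the resource URIs shared by the most documents in a cluster.

    doc_resources: one list of resource URIs per document (hypothetical input).
    """
    counts = Counter()
    for resources in doc_resources:
        # Count each resource once per document so long documents do not dominate.
        counts.update(set(resources))
    # Rank by document frequency (descending), then by URI for deterministic ties.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [uri for uri, _ in ranked[:top_n]]


def readable(uri):
    """Derive a human-readable label from the last path segment of a URI."""
    return uri.rstrip("/").rsplit("/", 1)[-1].replace("_", " ")


# Illustrative cluster of three documents with assumed resource annotations.
cluster = [
    ["http://dbpedia.org/resource/Cluster_analysis",
     "http://dbpedia.org/resource/Semantic_Web"],
    ["http://dbpedia.org/resource/Semantic_Web",
     "http://dbpedia.org/resource/Linked_data"],
    ["http://dbpedia.org/resource/Semantic_Web"],
]

labels = [readable(uri) for uri in label_cluster(cluster)]
print(labels)
```

Counting a resource once per document favors topics that are spread across the whole cluster over topics mentioned many times in a single page; a real system would likely also weight resources by their relations in the Linked Data graph.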

Keywords: Cluster labeling, Linked Data, Clustering, Semantic web

Year: 2013

Download:

Full text

Authors of this publication:

Martin Dostal

E-mail: madostal@kiv.zcu.cz

Martin graduated from the University of West Bohemia in 2009, specializing in software engineering. He is interested in the Semantic Web, information retrieval, and question answering.

Karel Ježek

Phone: +420 377632475
E-mail: jezek_ka@kiv.zcu.cz
WWW: https://cs.wikipedia.org/wiki/Karel_Je%C5%BEek_(informatik)

Karel is the former group coordinator and a supervisor of PhD students working on the group's research projects.

Michal Nykl

E-mail: nyklm@kiv.zcu.cz
WWW: http://home.zcu.cz/~nyklm/

Michal is a researcher at the Department of Computer Science and Engineering at the University of West Bohemia in Pilsen (Czech Republic). He is a software engineer whose interests focus on graph-structure mining algorithms, which are used for social network analysis, text mining, NLP, and similar problems.

Related Projects:

Document Clustering and Linked Data
Authors:	Karel Ježek, Martin Dostal
Desc.:	Unsupervised methods for automatic tagging and clustering based on information extraction from Linked Data.