Linked Data and PageRank based classification

In this article, we present a new approach to classification with Linked Data and PageRank. Our research focuses on classification methods that are enhanced by semantic information, which can be obtained from an ontology or from Linked Data. In our case, DBpedia was used as the source of Linked Data. The feature selection method is semantically based, so the features are in a human-readable form that non-expert users can recognize and understand. PageRank is used during the feature selection and generation phase to expand basic features into more general representatives. In other words, feature selection and processing are based on the network of relations obtained from Linked Data. The resulting features can be used by standard classification algorithms. We present promising preliminary results that show how easily this approach can be applied to different datasets.
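To make the idea of PageRank-driven feature expansion more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a tiny hand-written graph of "entity to broader concept" links standing in for relations that might be retrieved from DBpedia; the node names, the function names personalized_pagerank and expand_features, and the damping value are all illustrative assumptions. Basic features found in a document act as seed nodes, and the highest-ranked non-seed nodes are taken as candidate general representatives.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, damping=0.85, iterations=50):
    """Power-iteration PageRank whose restart mass is spread over the seed nodes."""
    nodes = sorted({n for edge in edges for n in edge} | set(seeds))
    out_links = defaultdict(list)
    for source, target in edges:
        out_links[source].append(target)

    # Restart distribution: uniform over the seeds, zero elsewhere.
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)

    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * restart[n] for n in nodes}
        for node in nodes:
            targets = out_links[node]
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: hand its mass back to the seeds.
                for n in nodes:
                    new_rank[n] += damping * rank[node] * restart[n]
        rank = new_rank
    return rank


def expand_features(edges, seeds, top_k=2):
    """Return the top_k non-seed nodes by rank as candidate general features."""
    rank = personalized_pagerank(edges, seeds)
    candidates = [n for n in rank if n not in seeds]
    return sorted(candidates, key=lambda n: rank[n], reverse=True)[:top_k]


# Hypothetical toy graph (illustrative names, not real DBpedia query results).
edges = [
    ("PageRank", "Link_analysis"),
    ("PageRank", "Graph_algorithms"),
    ("HITS_algorithm", "Link_analysis"),
    ("Link_analysis", "Network_theory"),
    ("Graph_algorithms", "Network_theory"),
]
seeds = {"PageRank", "HITS_algorithm"}

if __name__ == "__main__":
    print(expand_features(edges, seeds))
```

In this sketch, a concept that is reachable from several seed features collects more rank than one reachable from a single seed, which is one plausible way to prefer shared, more general concepts. The expanded features remain readable labels and could be passed to any standard classifier alongside the basic ones.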

Keywords: Linked Data, PageRank, classification, feature selection

Year: 2013

Download: Full text [495 kB]

Authors of this publication:


Martin Dostal


E-mail: madostal@kiv.zcu.cz

Martin graduated from the University of West Bohemia in 2009, specializing in software engineering. He is interested in the Semantic Web, information retrieval, and question answering.

Michal Nykl


E-mail: nyklm@kiv.zcu.cz
WWW: http://home.zcu.cz/~nyklm/

Michal is a researcher at the Department of Computer Science and Engineering at the University of West Bohemia in Pilsen (Czech Republic). He is a software engineer whose interests focus on graph structure mining algorithms, which are used for social network analysis, text mining, NLP, and similar problems.

Karel Ježek


Phone: +420 377632475
E-mail: jezek_ka@kiv.zcu.cz
WWW: https://cs.wikipedia.org/wiki/Karel_Je%C5%BEek_(informatik)

Karel is the former group coordinator and a supervisor of PhD students working on the research projects of this group.

Dalibor Fiala


Phone: +420 377 63 2429
E-mail: dalfia@kiv.zcu.cz
WWW: http://www.kiv.zcu.cz/~dalfia/

Dalibor is the research group coordinator and an associate professor at the Department of Computer Science and Engineering at the University of West Bohemia in Pilsen, Czech Republic. He is interested in data mining, web mining, information retrieval, informetrics, and information science.

Related Projects:


Project

Document Clustering and Linked Data

Authors:  Karel Ježek, Martin Dostal
Desc.: Unsupervised methods for automatic tagging and clustering based on information extraction from Linked Data.
Project

Social Networks Analysis

Authors:  Karel Ježek, Dalibor Fiala, Michal Nykl
Desc.: Application of the PageRank algorithm and its modifications to the exploration of network structures, particularly citation and co-authorship networks.