Web Mining Methods for the Detection of Authoritative Sources

The innovative portion of this doctoral thesis deals with the definition, explanation and testing of modifications of the standard PageRank formula adapted for bibliographic networks. The new versions of PageRank take into account not only the citation graph but also the co-authorship graph. We verify the viability of the new algorithms by applying them to data from the DBLP digital library and by comparing the resulting ranks of the winners of the ACM SIGMOD E. F. Codd Innovations Award. The rankings based on both citation and co-authorship information turn out to be better than the standard PageRank ranking. In another part of the dissertation, we present a methodology and two case studies for finding authoritative researchers by analyzing academic Web sites.
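
To make the idea of ranking authors with both graphs more concrete, the following is a minimal sketch of a weighted PageRank on an author citation graph. It is not the formula from the thesis, which this page does not reproduce. The function names, the toy data, the parameter `alpha`, and the weighting rule (a citation counts less when the citing and cited authors have written papers together) are all assumptions made for illustration only.

```python
def weighted_pagerank(edges, damping=0.85, iterations=200, tol=1e-12):
    """Power iteration for PageRank on a weighted directed graph.

    edges maps (source, target) pairs to positive weights.
    """
    nodes = sorted({node for pair in edges for node in pair})
    n = len(nodes)

    # Total outgoing weight per node, used to normalise edge weights.
    out_weight = {node: 0.0 for node in nodes}
    for (src, _dst), weight in edges.items():
        out_weight[src] += weight

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0.0)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for (src, dst), weight in edges.items():
            new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        delta = sum(abs(new_rank[node] - rank[node]) for node in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


def combine_weights(citations, coauthored, alpha=1.0):
    """Hypothetical rule (an assumption, not the thesis formula):
    divide each citation count by 1 + alpha * number of joint papers."""
    combined = {}
    for (src, dst), count in citations.items():
        joint = coauthored.get(frozenset((src, dst)), 0)
        combined[(src, dst)] = count / (1.0 + alpha * joint)
    return combined


# Toy data, invented for this example: author-to-author citation counts
# and the number of papers each pair of authors has written together.
citations = {("A", "B"): 3, ("A", "C"): 1, ("B", "C"): 2, ("C", "B"): 1}
coauthored = {frozenset(("A", "B")): 2}

standard = weighted_pagerank(citations)
combined = weighted_pagerank(combine_weights(citations, coauthored))
```

Here `standard` uses citation counts only, while `combined` discounts the citations from A to B because A and B are co-authors in the toy data, which shifts some of A's influence toward C. Comparing two such rankings against an external reference list, such as award winners, is the kind of evaluation the abstract describes.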

Keywords: Web mining, Web crawling, ranking algorithms, bibliographic networks, citations, co-authorships, authorities, bibliographic PageRank

Year: 2007

Download: Full text [2641 kB]

Authors of this publication:


Dalibor Fiala


Phone: +420 377 63 2429
E-mail: dalfia@kiv.zcu.cz
WWW: http://www.kiv.zcu.cz/~dalfia/

Dalibor is the research group coordinator and an associate professor at the Department of Computer Science and Engineering at the University of West Bohemia in Pilsen, Czech Republic. He is interested in data mining, web mining, information retrieval, informetrics, and information science.

Related Projects:


Project

Social Networks Analysis

Authors:  Karel Ježek, Dalibor Fiala, Michal Nykl
Desc.: Application of the PageRank algorithm and its modifications to the exploration of network structures, particularly citation and co-authorship networks.