Web Mining Methods for the Detection of Authoritative Sources: Theory and Practice

The development of the information society in recent decades has made it possible to collect, filter, and store huge amounts of data. These data must be processed further to yield valuable information and knowledge. The scientific field concerned with extracting information and knowledge from data has evolved rapidly to cope with the extent and growth of information sources, whose number has increased geometrically since the appearance of the World Wide Web. All traditional approaches in information retrieval, knowledge acquisition, and data mining must be adapted to the dynamic, heterogeneous, and unstructured data on the Web. Web mining has emerged as a fully fledged research discipline. This book presents state-of-the-art knowledge of Web mining from the perspective of searching for authoritative sources. Besides an introduction to the theoretical concepts of Web crawling, ranking algorithms, and social networks, it also presents the results of practical experiments. In particular, a brand new algorithm for bibliographic networks is introduced. This publication will be especially useful to professionals, researchers, and students in the fields of data mining and information retrieval.

Keywords: Web mining, Web crawling, ranking algorithms, bibliographic networks, citations, co-authorships, authorities, bibliographic PageRank

Year: 2009

Dalibor Fiala

Dalibor is an associate professor at the Department of Computer Science and Engineering at the University of West Bohemia in Pilsen, Czech Republic. His research interests are Web mining, information retrieval, and information science.