Network-based statistical comparison of citation topology of bibliographic databases

Network-based statistical comparison of citation topology of bibliographic databases

Modern bibliographic databases provide the basis for scientific research and its evaluation. While their content and structure differ substantially, there exist only informal notions on their reliability. Here we compare the topological consistency of citation networks extracted from six popular bibliographic databases including Web of Science, CiteSeer and arXiv.org. The networks are assessed through a rich set of local and global graph statistics. We first reveal statistically significant inconsistencies between some of the databases with respect to individual statistics. For example, the introduced field bow-tie decomposition of DBLP Computer Science Bibliography substantially differs from the rest due to the coverage of the database, while the citation information within arXiv.org is the most exhaustive. Finally, we compare the databases over multiple graph statistics using the critical difference diagram. The citation topology of DBLP Computer Science Bibliography is the least consistent with the rest, while, not surprisingly, Web of Science is significantly more reliable from the perspective of consistency. This work can serve either as a reference for scholars in bibliometrics and scientometrics or a scientific evaluation guideline for governments and research agencies.

See the paper on the publisher's webpage here.

Keywords: Computational science, scientific data, applied physics, complex networks.

Year: 2014

Journal ISSN: 2045-2322
Download: download Full text [1362 kB]
View record in Web of Science®

Authors of this publication:


Dalibor Fiala


Phone: +420 377 63 2429
E-mail: dalfia@kiv.zcu.cz
WWW: http://www.kiv.zcu.cz/~dalfia/

Dalibor is the research group coordinator and an associate professor at the Department of Computer Science and Engineering at the University of West Bohemia in Pilsen, Czech Republic. He is interested in data mining, web mining, information retrieval, informetrics, and information science.

Related Projects:


Project

Social Networks Analysis

Authors:  Karel Ježek, Dalibor Fiala, Michal Nykl
Desc.:Application of the PageRank algorithm and its modifications to the exploration of network structures, particularly citation and co-autorship networks.