Text-Mining Research Group

From CiteSeer to CiteSeerX: Author rankings based on coauthorship networks

CiteSeer was a digital library and search engine that gathered research papers, mainly in computer science, from the World Wide Web. After a few years of stagnation, it was definitively replaced by a new version called CiteSeerX in April 2010. As both CiteSeers provide(d) freely available metadata on the articles they index(ed), it is possible to analyze the two data sets and see how CiteSeer and CiteSeerX differ. More specifically, we examined the article metadata from CiteSeer (downloaded in December 2005) and from CiteSeerX (harvested in March 2011) with a view to creating rankings of prestigious computer scientists. Since the free article metadata acquired from the CiteSeerX Web site, unlike those from CiteSeer, do not systematically include cited references, the only way to create such rankings for both data sets is to base them on the coauthorship networks in both CiteSeers. In this study, we produce these rankings using 12 different ranking methods, including PageRank and its variants, compare them with the lists of ACM A. M. Turing Award and ACM SIGMOD E. F. Codd Innovations Award winners, and conclude that the rankings generated from CiteSeerX data outperform those from CiteSeer.
The available full text is a preprint of the article.
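
As a rough illustration of the idea described in the abstract, the sketch below builds a coauthorship network from per-paper author lists and ranks the authors with standard PageRank. It is a minimal example under stated assumptions, not the method used in the article: the function names, the edge weighting (number of joint papers), the damping factor of 0.85, and the toy author labels are all hypothetical, and the 12 ranking methods of the study are not reproduced here.

```python
from collections import defaultdict
from itertools import combinations


def coauthorship_graph(papers):
    """Build an undirected weighted graph from lists of author names.

    Assumption: the weight of an edge is the number of papers two
    authors wrote together.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for authors in papers:
        unique = sorted(set(authors))
        for a in unique:
            graph[a]  # register the node, even for single-author papers
        for a, b in combinations(unique, 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def pagerank(graph, damping=0.85, iterations=100, tol=1e-10):
    """Standard PageRank by power iteration on a weighted undirected graph."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    strength = {v: sum(graph[v].values()) for v in nodes}
    for _ in range(iterations):
        # Authors without coauthors spread their rank evenly over all nodes.
        dangling = sum(rank[v] for v in nodes if strength[v] == 0.0)
        new = {}
        for v in nodes:
            incoming = sum(rank[u] * w / strength[u]
                           for u, w in graph[v].items())
            new[v] = (1.0 - damping) / n + damping * (incoming + dangling / n)
        delta = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Toy data with hypothetical author labels; one list per paper.
papers = [["A", "B"], ["A", "C"], ["A", "B", "C"], ["D"]]
ranks = pagerank(coauthorship_graph(papers))
for author in sorted(ranks, key=ranks.get, reverse=True):
    print(author, round(ranks[author], 4))
```

Because the network is undirected, the resulting scores largely follow each author's total number of collaborations; PageRank variants that weight edges differently can change this behavior.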

Keywords: CiteSeer, CiteSeerX, Coauthorships, Citations, Researchers, PageRank

Year: 2013

Journal ISSN: 1992-8645

Download:

Full text [931 kB]

Authors of this publication:

Dalibor Fiala

Phone: +420 377 63 2429
E-mail: dalfia@kiv.zcu.cz
WWW: http://www.kiv.zcu.cz/~dalfia/

Dalibor is the research group coordinator and an associate professor at the Department of Computer Science and Engineering at the University of West Bohemia in Pilsen, Czech Republic. He is interested in data mining, web mining, information retrieval, informetrics, and information science.

Related Projects:

Social Networks Analysis
Authors:	Karel Ježek, Dalibor Fiala, Michal Nykl
Desc.:	Application of the PageRank algorithm and its modifications to the exploration of network structures, particularly citation and coauthorship networks.
