From CiteSeer to CiteSeerX: Author rankings based on coauthorship networks

CiteSeer was a digital library and search engine that gathered research papers, mainly in computer science, from the World Wide Web. After a few years of stagnation, it was definitively replaced with a new version called CiteSeerX in April 2010. As both CiteSeers provide(d) freely available metadata on the articles they index(ed), it is possible to analyze two different data sets to see the differences between CiteSeer and CiteSeerX. More specifically, we examined the article metadata from CiteSeer (downloaded in December 2005) and from CiteSeerX (harvested in March 2011) with a view to creating rankings of prestigious computer scientists. Since the free article metadata acquired from the CiteSeerX Web site differ from those in CiteSeer in that they do not systematically include cited references, the only way to create such rankings from both data sets is to base them on the coauthorship networks in both CiteSeers. In this study, we produce these rankings using 12 different ranking methods, including PageRank and its variants, compare them with the lists of ACM A. M. Turing Award and ACM SIGMOD E. F. Codd Innovations Award winners, and conclude that the rankings generated from CiteSeerX data outperform those from CiteSeer.
The available full text is a preprint of the article.
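
To illustrate the general idea of ranking authors from a coauthorship network, the following is a minimal sketch, not the method of the article. It assumes an undirected graph in which the weight of an edge is the number of joint papers, and it runs plain weighted PageRank by power iteration. The toy paper list, the author labels, and the function names `coauthorship_graph` and `pagerank` are hypothetical; the 12 ranking methods evaluated in the study are not reproduced here.

```python
from collections import defaultdict
from itertools import combinations


def coauthorship_graph(papers):
    """Build an undirected weighted graph; edge weight = number of joint papers.

    Assumption: each paper is given simply as a list of author names.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for authors in papers:
        unique = sorted(set(authors))
        for a in unique:
            graph[a]  # register every author, including those without coauthors
        for a, b in combinations(unique, 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def pagerank(graph, damping=0.85, iterations=100):
    """Plain weighted PageRank by power iteration (a sketch, not the paper's variants)."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by authors without coauthors is redistributed uniformly.
        dangling = sum(rank[v] for v in nodes if not graph[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            total = sum(graph[v].values())
            for u, w in graph[v].items():
                # Each author passes rank to coauthors in proportion to joint papers.
                new_rank[u] += damping * rank[v] * w / total
        rank = new_rank
    return rank


# Hypothetical toy data: each inner list is the author list of one paper.
papers = [["A", "B"], ["A", "C"], ["A", "D"], ["B", "C"], ["E"]]
ranks = pagerank(coauthorship_graph(papers))
for author in sorted(ranks, key=ranks.get, reverse=True):
    print(author, round(ranks[author], 4))
```

In this toy network, author A collaborates with the most distinct coauthors and therefore ends up at the top of the ranking. A ranking produced this way could then be compared against a reference list such as award winners, which is the kind of evaluation the abstract describes.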

Keywords: CiteSeer, CiteSeerX, Coauthorships, Citations, Researchers, PageRank

Year: 2013

Journal ISSN: 1992-8645
Download: Full text [931 kB]

Authors of this publication:

Dalibor Fiala


Dalibor is an associate professor at the Department of Computer Science and Engineering at the University of West Bohemia in Pilsen, Czech Republic. He is interested in web mining, information retrieval, and information science.