A System for Citations Retrieval on the Web

A fundamental feature of research papers is how many times they are cited in other articles, i.e. how many later references to them there are. That is the only objective way of evaluating how important or novel a paper's ideas are. With an increasing number of articles available online, it has become possible to find these citations in a more or less automated way. This thesis first describes existing approaches to citation retrieval and indexing and then introduces CiteSeeker, a tool for fully automated citation retrieval. CiteSeeker starts crawling the World Wide Web from given start points and searches for specified authors and publications in a fuzzy manner, meaning that certain inaccuracies in the search strings are taken into account. CiteSeeker handles all common Internet file formats, including PostScript and PDF documents and archives. The project is based on .NET technology.
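To illustrate what searching "in a fuzzy manner" can mean, here is a minimal sketch of approximate substring matching based on edit distance. It is written in Python for brevity and is not CiteSeeker's actual implementation (the tool itself is built on .NET). The function name `fuzzy_find` and the `max_errors` parameter are assumptions made for this example only.

```python
def fuzzy_find(pattern: str, text: str, max_errors: int) -> bool:
    """Return True if `pattern` occurs somewhere in `text` with at most
    `max_errors` single-character insertions, deletions, or substitutions.

    Hypothetical helper for illustration; not taken from CiteSeeker.
    """
    pattern = pattern.lower()
    text = text.lower()
    m = len(pattern)

    # prev[i] is the smallest edit distance between pattern[:i] and any
    # substring of the text that ends at the current position.
    prev = list(range(m + 1))
    if prev[m] <= max_errors:
        return True

    for ch in text:
        curr = [0]  # an occurrence may start at any position in the text
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            curr.append(min(prev[i - 1] + cost,  # match or substitution
                            prev[i] + 1,         # extra character in text
                            curr[i - 1] + 1))    # missing character in text
        if curr[m] <= max_errors:
            return True
        prev = curr
    return False
```

With `max_errors=1`, a search for "Fiala" would also accept a misspelled occurrence such as "Fialla", while `max_errors=0` reduces to a case-insensitive exact substring search.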

Keywords: Citations, Retrieval, Web, Fuzzy Search, .NET, C#

Year: 2003

Download: Full text [741 kB]

Authors of this publication:


Dalibor Fiala


Phone: +420 377 63 2429
E-mail: dalfia@kiv.zcu.cz
WWW: http://www.kiv.zcu.cz/~dalfia/

Dalibor is the research group coordinator and an associate professor at the Department of Computer Science and Engineering at the University of West Bohemia in Pilsen, Czech Republic. He is interested in data mining, web mining, information retrieval, informetrics, and information science.

Related Projects:


Project

Social Networks Analysis

Authors:  Karel Ježek, Dalibor Fiala, Michal Nykl
Desc.: Application of the PageRank algorithm and its modifications to the exploration of network structures, particularly citation and co-authorship networks.