A System for Citations Retrieval on the Web

A fundamental feature of research papers is how many times they are cited in other articles, i.e. how many later references to them there are. That is the only objective way of evaluating how important or novel a paper's ideas are. With an increasing number of articles available online, it has become possible to find these citations in a more or less automated way. This thesis first describes existing possibilities of citation retrieval and indexing and then introduces CiteSeeker, a tool for fully automated citation retrieval. CiteSeeker starts crawling the World Wide Web from given start points and searches for specified authors and publications in a fuzzy manner, meaning that certain inaccuracies in the search strings are taken into account. CiteSeeker handles all common Internet file formats, including PostScript and PDF documents and archives. The project is based on the .NET technology.

Keywords: Citations, Retrieval, Web, Fuzzy Search, .NET, C#

Year: 2003

Authors of this publication:

Dalibor Fiala

Dalibor is an associate professor at the Department of Computer Science and Engineering at the University of West Bohemia in Pilsen, Czech Republic. He is interested in web mining, information retrieval, and information science.