
In Czech: Využití moderních přístupů pro detekci plagiátů (English translation: Using Modern Approaches for Plagiarism Detection)
Plagiarism is a widespread problem that can be encountered anywhere, but it is most troublesome in education. This paper deals with modern approaches to plagiarism detection. We propose a method that employs text normalization and Latent Semantic Analysis to infer latent semantic associations among documents. Further, we present preliminary experiments on our testing corpus, which consists of 950 documents about politics. These preliminary experiments indicate better results for our approach than for the other tested methods. Finally, we discuss the use of WordNet to improve the accuracy of the plagiarism detection method. Another open issue is the identification of translated documents.
Keywords: Plagiarism, Copy Detection, Phrases, N-grams, WordNet, Thesaurus, Singular Value Decomposition, Latent Semantic Analysis
Year: 2008

Authors of this publication:

Zdeněk Češka
E-mail: zceska@kiv.zcu.cz
WWW: http://www.kiv.zcu.cz/en/department/members/detail.html?login=zceska
Related Projects:

Automatic Plagiarism Detection
Authors: Zdeněk Češka
Desc.: This project focuses on the particular field of automatic plagiarism detection in written text. The main principle of this project is the application of Latent Semantic Analysis in conjunction with word N-grams.
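
The combination of text normalization, word N-grams and Latent Semantic Analysis described above can be illustrated with a minimal sketch. It is not the author's implementation: the regex-based normalization, the raw-count weighting, the bigram size, the rank k and the function names (normalize, word_ngrams, lsa_similarity) are all assumptions chosen for illustration, and the three sample sentences are invented. The sketch builds an N-gram-by-document matrix, reduces it with a truncated Singular Value Decomposition, and compares documents by cosine similarity in the reduced space.

```python
import re
import numpy as np


def normalize(text):
    """Lowercase the text and keep only alphabetic tokens.

    Assumption: a simple stand-in for the text normalization step.
    """
    return re.findall(r"[a-z]+", text.lower())


def word_ngrams(tokens, n=2):
    """Return the list of word N-grams (joined by spaces) for a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def lsa_similarity(docs, n=2, k=2):
    """Cosine similarity between documents in a rank-k LSA space.

    Assumption: raw N-gram counts are used as weights; other weighting
    schemes are possible.
    """
    grams = [word_ngrams(normalize(d), n) for d in docs]
    vocab = sorted({g for doc in grams for g in doc})
    index = {g: i for i, g in enumerate(vocab)}

    # N-gram-by-document count matrix.
    A = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(grams):
        for g in doc:
            A[index[g], j] += 1.0

    # Truncated SVD: keep the k largest singular values.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))
    doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T  # one row per document

    # Normalize rows to unit length, guarding against zero vectors.
    norms = np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    unit = doc_vecs / norms
    return unit @ unit.T


# Invented sample documents: the first two share most of their bigrams.
docs = [
    "The parliament approved the new budget after a long debate.",
    "After a long debate, the parliament approved the new budget!",
    "Heavy rain flooded several villages in the mountain region.",
]
sim = lsa_similarity(docs, n=2, k=2)
print(np.round(sim, 2))
```

In this toy example the reordered sentence pair scores much higher than the unrelated pair. A real system would need a decision threshold and a larger corpus, neither of which is specified in this record.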