Text-Mining Research Group » Research

In progress

Project

Data Mining for Computer Networks Security
Authors:	Michael Heigl, Laurin Doerr, Dalibor Fiala
Desc.:	Investigates novel data mining methods for improving computer network security by applying advanced outlier detection techniques to streaming data.

Project

Multilingual Sentiment Analysis
Authors:	Josef Steinberger
Desc.:	Sentiment analysis of news and social media in multiple languages.

Project

Social Networks Analysis
Authors:	Karel Ježek, Dalibor Fiala, Michal Nykl
Desc.:	Application of the PageRank algorithm and its modifications to the exploration of network structures, particularly citation and co-authorship networks.
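
As a rough illustration of the idea behind the Social Networks Analysis project, the sketch below runs plain PageRank by power iteration on a tiny made-up citation graph. It is not the group's implementation or any of its modifications: the function name `pagerank`, the damping factor 0.85, the uniform handling of papers without outgoing citations, and the four-paper example graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Plain PageRank by power iteration.

    links maps each node to the list of nodes it cites (hypothetical input format).
    Returns a dict of scores that sum to 1.
    """
    # Collect every node that appears as a source or as a target.
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}

    for _ in range(max_iter):
        # Rank held by nodes with no outgoing links is spread uniformly (assumption).
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}

        # Each citing node passes an equal share of its rank to every node it cites.
        for src, targets in links.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for t in targets:
                    new[t] += share

        delta = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Made-up citation graph: A and B cite C, C cites D, D cites nothing.
citations = {"A": ["C"], "B": ["C"], "C": ["D"], "D": []}
scores = pagerank(citations)

for paper in sorted(scores, key=scores.get, reverse=True):
    print(paper, round(scores[paper], 4))
```

In this toy graph the paper at the end of the citation chain (D) receives the highest score, and the two uncited papers (A and B) tie for the lowest. A co-authorship network could be fed to the same function by listing each author's co-authors in both directions.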

Finished

Project

Automatic Plagiarism Detection
Authors:	Zdeněk Češka
Desc.:	This project focuses on the particular field of automatic plagiarism detection in written text. The main principle of this project is the application of Latent Semantic Analysis in conjunction with word N-grams.

Project

Automatic Text Summarisation
Authors:	Josef Steinberger, Karel Ježek, Michal Campr, Jiří Hynek
Desc.:	Automatic text summarisation using various text mining methods, mainly Latent Semantic Analysis (LSA).

Project

Document Classification
Authors:	Jiří Hynek, Karel Ježek, Michal Toman, Roman Tesař, Zdeněk Češka, Petr Grolmus
Desc.:	Use of inductive machine learning methods in classification of short text documents.

Project

Document Clustering and Linked Data
Authors:	Karel Ježek, Martin Dostal
Desc.:	Unsupervised methods for automatic tagging and clustering based on information extraction from Linked data.

Project

Exploration of Semantic Spaces
Authors:	Karel Ježek, Lubomír Krčmář, Miloslav Konopík
Desc.:	This work focuses on semantic relations between words and the application of these relations in research fields such as information retrieval, machine translation, and document clustering.

Project

Extracting Information from Web Content and Structure
Authors:	Dalibor Fiala, Roman Tesař, Karel Ježek
Desc.:	This project deals with classification of Web documents and determination of authoritative Web sites. It was supported in part by the Ministry of Education of the Czech Republic under grant FRVS 1347/2005/G1.

Project

Internet Content Filtering
Authors:	Roman Tesař, Karel Ježek
Desc.:	This project covers processing and analyzing Web sites, classifying them by their content, and searching for other Web sites with similar content.

Project

Knowledge Extraction from Texts
Authors:	Karel Ježek, Martin Zíma
Desc.:	This work focuses on obtaining facts by extracting latent information from XML documents and processing it with a logic program.

Project

SPOT: English-Czech ICT Terminology On-line Review
Authors:	Jiří Hynek, Přemysl Brada
Desc.:	We hope that SPOT will become an open platform used for discussing controversial computer terms among professionals. The resulting on-line computer dictionary is freely available to the general public, university teachers, students, editors and professional translators.

Project

Searching and Summarizing in Multilingual Environment
Authors:	Josef Steinberger, Karel Ježek, Michal Toman
Desc.:	The project includes multilingual searching in text databases and automatic summarization of the retrieved texts. It was supported in part by the Ministry of Education of the Czech Republic under grant FRVS 1326/2005/G1.

Project

User Profile Mining, Social Networks
Authors:	Jiří Hynek, Petr Grolmus, Karel Ježek
Desc.:	Identification of user profiles based on users' behavior on the web. Practical applications in various knowledge and information management projects.