Text-Mining Research Group

Automatic Dialog Acts Recognition based on Words Clusters

This paper deals with automatic dialog act (DA) recognition in Czech. A dialog act is defined by J. L. Austin as the meaning of an utterance at the level of illocutionary force. The following four DAs are considered: statements, orders, yes/no questions and other questions. In our previous work, we proposed, implemented and evaluated two new approaches to automatic DA recognition based on sentence structure. These methods were validated on a Czech corpus that simulates a train ticket reservation task. The main goal of this paper is to propose a new approach that addresses the lack of training data for automatic DA recognition. This approach clusters the words of a sentence into several groups by maximizing the mutual information between two neighboring word classes. The classification accuracy of the unigram model (our baseline approach) is 91%. The proposed method, a clustered unigram model, reduces the DA error rate by 12%.
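
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a naive-Bayes-style unigram model with add-one smoothing, which the abstract does not specify, and it takes the word-to-class mapping as given instead of running the clustering search. The function names, the `WORD_TO_CLASS` mapping, the class labels and the English toy utterances are all hypothetical. The first function computes the average mutual information between neighboring word classes, the quantity a clustering procedure of this kind would try to maximize. The other two show how replacing words by their classes lets a unigram model score words it never saw in training.

```python
import math
from collections import Counter


def average_mutual_information(class_sequences):
    """Average mutual information between adjacent word classes,
    estimated from class bigram counts."""
    bigrams = Counter()
    for seq in class_sequences:
        bigrams.update(zip(seq, seq[1:]))
    total = sum(bigrams.values())
    if total == 0:
        return 0.0
    left, right = Counter(), Counter()
    for (c1, c2), n in bigrams.items():
        left[c1] += n
        right[c2] += n
    ami = 0.0
    for (c1, c2), n in bigrams.items():
        # p(c1, c2) * log2( p(c1, c2) / (p(c1) * p(c2)) )
        ami += (n / total) * math.log2(n * total / (left[c1] * right[c2]))
    return ami


def train_clustered_unigram(utterances, labels, word_to_class):
    """Count word classes (not words) per dialog act."""
    priors = Counter(labels)
    counts = {da: Counter() for da in priors}
    for words, da in zip(utterances, labels):
        counts[da].update(word_to_class.get(w, "<unk>") for w in words)
    vocab = set(word_to_class.values()) | {"<unk>"}
    return priors, counts, vocab


def classify(words, model, word_to_class, alpha=1.0):
    """Return the dialog act with the highest smoothed log-probability."""
    priors, counts, vocab = model
    n_utts = sum(priors.values())
    scores = {}
    for da, prior in priors.items():
        denom = sum(counts[da].values()) + alpha * len(vocab)
        score = math.log(prior / n_utts)
        for w in words:
            c = word_to_class.get(w, "<unk>")
            score += math.log((counts[da][c] + alpha) / denom)
        scores[da] = score
    return max(scores, key=scores.get)


# Hypothetical hand-made clusters standing in for the learned ones.
WORD_TO_CLASS = {
    "want": "C_WANT", "need": "C_WANT",
    "give": "C_GIVE", "show": "C_GIVE",
    "ticket": "C_OBJ", "timetable": "C_OBJ",
    "i": "C_I", "me": "C_ME",
}

model = train_clustered_unigram(
    [["i", "want", "ticket"], ["give", "me", "ticket"]],
    ["statement", "order"],
    WORD_TO_CLASS,
)

# "show" and "timetable" never occur in the training utterances,
# but their classes do, so the model can still separate the two DAs.
prediction = classify(["show", "me", "timetable"], model, WORD_TO_CLASS)
print(prediction)
```

With a plain word unigram model, "show" and "timetable" would contribute only smoothing mass to both dialog acts. Sharing counts across a class is what the abstract relies on to cope with sparse training data.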

Keywords: Dialog, Dialog Act, Mutual Information, Word Clustering, Unigram Model

Year: 2006

Authors of this publication:

Pavel Král

Phone: +420 377 632 454
E-mail: pkral@kiv.zcu.cz
WWW: http://home.zcu.cz/~pkral/

Pavel is a lecturer/researcher at the Department of Computer Science and Engineering at the University of West Bohemia in Pilsen (Czech Republic). His research is focused on automatic speech processing, dialog act recognition, syntactic parsing, punctuation annotation and document classification.

Authors of this publication:

Pavel Král

Jana Klečková

Christophe Cerisara