Automatic Dialog Acts Recognition based on Words Clusters

This paper deals with automatic dialog act (DA) recognition in Czech. J. L. Austin defines a dialog act as the meaning of an utterance at the level of illocutionary force. Four DAs are considered: statements, orders, yes/no questions and other questions. In our previous work, we proposed, implemented and evaluated two new approaches to automatic DA recognition based on sentence structure. These methods were validated on a Czech corpus that simulates a train ticket reservation task. The main goal of this paper is to propose a new approach that addresses the lack of training data for automatic DA recognition. This approach clusters the words of a sentence into several groups by maximizing the mutual information between neighboring word classes. The unigram model, our baseline approach, reaches a classification accuracy of 91%. The proposed method, a clustered unigram model, reduces the DA error rate by 12%.
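
The abstract does not spell out the clustering criterion, so the following is only a minimal sketch of the quantity it names: the average mutual information between neighboring word classes, computed for a fixed word-to-class assignment. It assumes the common class-bigram formulation; the function name, the data layout and the toy sentences are hypothetical and are not taken from the paper. A clustering procedure would search for the assignment that makes this score as large as possible.

```python
from collections import Counter
from math import log2


def average_mutual_information(sentences, word_to_class):
    """Average mutual information between adjacent word classes.

    sentences: list of tokenized sentences (lists of words).
    word_to_class: dict mapping every word to a class label.
    Both names are illustrative assumptions, not the paper's notation.
    """
    # Count class bigrams over neighboring words within each sentence.
    bigrams = Counter()
    for sentence in sentences:
        classes = [word_to_class[word] for word in sentence]
        bigrams.update(zip(classes, classes[1:]))

    total = sum(bigrams.values())
    if total == 0:
        return 0.0

    # Marginal counts of a class in the left and right bigram position.
    left, right = Counter(), Counter()
    for (c1, c2), count in bigrams.items():
        left[c1] += count
        right[c2] += count

    # Sum of p(c1, c2) * log2( p(c1, c2) / (p(c1) * p(c2)) ).
    score = 0.0
    for (c1, c2), count in bigrams.items():
        joint = count / total
        score += joint * log2(count * total / (left[c1] * right[c2]))
    return score


# Toy data: two sentence patterns, each seen twice.
toy_sentences = [["a", "b"], ["a", "b"], ["c", "d"], ["c", "d"]]
one_class_per_word = {"a": "A", "b": "B", "c": "C", "d": "D"}
single_class = {word: "X" for word in "abcd"}

informative = average_mutual_information(toy_sentences, one_class_per_word)
uninformative = average_mutual_information(toy_sentences, single_class)
```

On this toy data the assignment that keeps the four words apart scores higher than the one that merges them into a single class, because a single class carries no information about its neighbor.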

Keywords: Dialog, Dialog Act, Mutual Information, Word Clustering, Unigram Model

Year: 2006

Download: Full text [185 kB]

Authors of this publication:

Pavel Král

Phone: +420 377 632 454
E-mail: pkral@kiv,

Pavel is a lecturer/researcher at the Department of Computer Science and Engineering at the University of West Bohemia in Pilsen (Czech Republic). His research is focused on automatic speech processing, dialog act recognition, syntactic parsing, punctuation annotation and document classification.