Combination of Classifiers for Automatic Recognition of Dialog Acts

This paper deals with automatic dialog act (DA) recognition in Czech. Dialog acts are sentence-level labels that represent different states of a dialogue; the set of labels depends on the application. Our work focuses on two applications: a multimodal reservation system and an animated talking head for hearing-impaired people. In that context, we consider the following DAs: statements, orders, yes/no questions and other questions. We propose to use both lexical and prosodic information for DA recognition. The main goal of this paper is to compare different methods of combining the results of the lexical and prosodic classifiers. On a Czech corpus simulating train ticket reservations, lexical information alone gives about 92% classification accuracy, while prosody alone gives only about 45%. When both classifiers are combined with a multilayer perceptron, the lowest (lexical) word error rate further decreases by 26%. We show that this improvement is close to the optimal one, given the correlation of the lexical and prosodic features. The other combination schemes do not outperform the lexical-only results.
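
As a rough reading of these figures, the sketch below estimates the accuracy of the combined system. It assumes that the 26% is a relative reduction applied to the lexical-only error rate of about 8% (100% minus 92%); the abstract does not state this explicitly. The function name and the rounded inputs are illustrative only.

```python
def combined_accuracy(baseline_accuracy: float, relative_error_reduction: float) -> float:
    """Return the accuracy after cutting the baseline error rate by a relative fraction.

    Assumption: the reduction is relative to the baseline error, not absolute.
    """
    baseline_error = 1.0 - baseline_accuracy
    reduced_error = baseline_error * (1.0 - relative_error_reduction)
    return 1.0 - reduced_error


# Approximate values taken from the abstract: 92% lexical accuracy, 26% error reduction.
estimate = combined_accuracy(0.92, 0.26)
print(f"Estimated combined accuracy: {estimate:.1%}")
```

Under that assumption the combined accuracy comes out at roughly 94%. If the 26% were meant differently, the estimate would not hold.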

Keywords: bayesian network, dialog, dialog act, language model, lexical information, prosody

Year: 2005

Download: Full text [51 kB]

Authors of this publication:

Pavel Král

Phone: +420 377 632 454
E-mail: pkral@kiv,

Pavel is a lecturer/researcher at the Department of Computer Science and Engineering at the University of West Bohemia in Pilsen (Czech Republic). His research is focused on automatic speech processing, dialog act recognition, syntactic parsing, punctuation annotation and document classification.