Input Segmentation of Spontaneous Speech in JANUS: a Speech-to-speech Translation System

August 28, 2006

by Alon Lavie, Donna Gates, Noah Coccaro and Lori Levin (1996). ECAI Workshop on Dialogue Processing in Spoken Language Systems.

Abstract: JANUS is a multi-lingual speech-to-speech translation system designed to facilitate communication between two parties engaged in a spontaneous conversation in a limited domain. In this paper we describe how multi-level segmentation of single utterance turns improves translation quality and facilitates accurate translation in our system. We define the basic dialogue units that are handled by our system, and discuss the cues and methods employed by the system in segmenting the input utterance into such units. Utterance segmentation in our system is performed in a multi-level incremental fashion, partly prior and partly during analysis by the parser. The segmentation relies on a combination of acoustic, lexical, semantic and statistical knowledge sources, which are described in detail in the paper. We also discuss how our system is designed to disambiguate among alternative possible input segmentations.

My Notes: The system splits each input turn into semantic dialogue units (roughly, speech acts): semantically coherent pieces of information that can be translated independently.
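
To make the idea of combining cues concrete for myself, here is a toy sketch of a pre-parse segmentation pass. It is not the JANUS implementation: the `Token` type, the cue-word list, the weights, and the thresholds are all my own assumptions, and it covers only an acoustic (pause) cue and a lexical cue, not the statistical or parser-internal segmentation the abstract mentions.

```python
from dataclasses import dataclass


@dataclass
class Token:
    word: str
    pause_before: float = 0.0  # seconds of silence before the word (assumed acoustic cue)


# Hypothetical lexical cues: words that often open a new unit.
BOUNDARY_CUES = {"so", "but", "okay", "well"}


def segment(tokens, pause_threshold=0.5, score_threshold=1.0):
    """Split a turn into units wherever the combined cue score reaches the threshold."""
    units, current = [], []
    for tok in tokens:
        score = 0.0
        if tok.pause_before >= pause_threshold:
            score += 1.0  # acoustic cue: a long pause precedes this word
        if tok.word.lower() in BOUNDARY_CUES:
            score += 0.5  # lexical cue: the word tends to start a unit
        # Only split if there is already material in the current unit.
        if current and score >= score_threshold:
            units.append(current)
            current = []
        current.append(tok.word)
    if current:
        units.append(current)
    return units


turn = [
    Token("tuesday"), Token("is"), Token("bad"),
    Token("but", pause_before=0.7), Token("wednesday"), Token("works"),
]
print(segment(turn))
```

With these made-up weights a cue word alone is not enough to force a boundary, while a long pause is; each resulting unit could then be handed to translation on its own.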


