Connected Digit Recognition Experiments with the OGI Toolkit's Neural Network and HMM-Based Recognizers

This paper describes a series of experiments comparing approaches to training a speaker-independent, continuous-speech digit recognizer with the CSLU Toolkit. The comparison covers the Hidden Markov Model (HMM) and Neural Network (NN) approaches. The paper also describes the CSLU Toolkit research environment.
The CSLU Toolkit is a research and development software environment for creating and using spoken language systems for telephone and PC applications. In particular, the CSLU-HMM, CSLU-NN, and CSLU-FBNN development environments, in which our experiments were implemented, are described in detail, and their recognition results are compared.
Our speech corpus is OGI 30K-Numbers, a collection of spontaneous ordinal and cardinal numbers, continuous digit strings, and isolated digit strings. The utterances were recorded by having a large number of people recite their ZIP code, street address, or other numeric information over the telephone. This corpus represents a very noisy and difficult recognition task.
Our best results (98% word-level and 92% sentence-level recognition accuracy), obtained with the FBNN architecture, suggest that the CSLU Toolkit is effective for building real-life speech recognition systems.

Publication Type:
Contribution in conference proceedings
Author or Creator: 
Cosi P.
Hosom J.P.
Shalkwyk J.
Sutton S.
Cole R.A.
Interactive Voice Technology for Telecommunications Applications, 1998. 1998 IEEE 4th Workshop IVTTA-ETWR '98, pp. 135–140, Turin, Italy, 29-30 September, 1998
ISTC Author:
Piero Cosi