|

Speech recognizers typically
use a front-end pre-processor to extract the features to be
used to identify speech. One such pre-processor, based on
mammalian physiological and auditory models, is the Auditory
Image Model (AIM) developed by Patterson
and Holdsworth. Further speech recognition development
based on AIM has progressed slowly because AIM is computationally
intensive; a software implementation cannot run in real time.
Consequently, we at Tanner Labs performed extensive simulations
to refine the algorithm and to determine optimal parameter
values, and used the results to develop a hardware implementation
of AIM that achieves real-time performance and eliminates
the memory requirements of a software implementation.
The Tanner AIM module is a sophisticated multi-component
printed circuit board that can be mounted in a Sun workstation.
It includes reprogrammable logic (a field-programmable gate
array, or FPGA), support chips (including ADCs, DACs,
and RAM), and a state-of-the-art custom analog biquadratic
filter-bank IC, the largest monolithic vocoder (56 channels)
ever fabricated. In this chip, we achieved high precision
(60 dB of linearity) in a compact, low-power (approximately
20 mW) implementation. We also developed user interface software.
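
As a rough illustration of what a 56-channel biquadratic filter bank computes, the following is a minimal software sketch, not the Tanner chip's analog design or AIM's actual auditory filters. It builds one second-order (biquad) band-pass section per channel and runs a signal through all of them. The sample rate, the 100 Hz to 6 kHz range, the logarithmic channel spacing, the Q value, and all function names are assumptions chosen for the example; only the channel count comes from the text above.

```python
import math

# Illustrative values only; the abstract specifies none of these except
# the channel count.
SAMPLE_RATE = 16000.0   # Hz (assumed)
NUM_CHANNELS = 56       # from the abstract
F_LOW, F_HIGH = 100.0, 6000.0  # Hz (assumed range)
Q_FACTOR = 4.0          # assumed selectivity of each channel


def bandpass_coeffs(f0, fs, q):
    """Normalized biquad band-pass coefficients with unity gain at f0."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)                    # numerator
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)    # denominator
    return b, a


def biquad(signal, b, a):
    """Run one biquad section (transposed direct form II) over a signal."""
    b0, b1, b2 = b
    a1, a2 = a
    z1 = z2 = 0.0  # the two state variables of a second-order section
    out = []
    for x in signal:
        y = b0 * x + z1
        z1 = b1 * x - a1 * y + z2
        z2 = b2 * x - a2 * y
        out.append(y)
    return out


def center_frequencies(n, f_lo, f_hi):
    """Logarithmically spaced center frequencies (an assumed spacing)."""
    ratio = (f_hi / f_lo) ** (1.0 / (n - 1))
    return [f_lo * ratio ** k for k in range(n)]


def filter_bank(signal, centers, fs, q):
    """Return one band-pass-filtered copy of the signal per channel."""
    return [biquad(signal, *bandpass_coeffs(f0, fs, q)) for f0 in centers]
```

In this sketch every input sample costs a handful of multiply-adds in each of the 56 channels, before any of the later AIM processing stages. That gives a sense of why a pure software front end is expensive.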
This research has been sponsored by the U.S. Air Force under
the Small Business Innovation Research (SBIR) program.
|