DSpace e-Archivo


Please use this identifier to cite or link to this item: http://hdl.handle.net/10016/15932

Files in This Item:
cc_2012.pdf (1.24 MB, Adobe PDF)
Title: Auditory-inspired morphological processing of speech spectrograms: applications in automatic speech recognition and speech enhancement
Author(s): Cadore, Joyner
Valverde-Albacete, Francisco J.
Gallardo-Antolín, Ascensión
Peláez-Moreno, Carmen
Publisher: Springer
Issued date: Nov-2012
Citation: Cognitive Computation, November 2012, [16 p.]
URI: http://hdl.handle.net/10016/15932
ISSN: 1866-9956 (Print)
1866-9964 (Online)
DOI: 10.1007/s12559-012-9196-6
Abstract: This paper presents new auditory-inspired speech processing methods that combine spectral subtraction with two-dimensional non-linear filtering techniques originally conceived for image processing. In particular, mathematical morphology operations such as erosion and dilation are applied to noisy speech spectrograms using structuring elements specifically designed after the masking properties of the human auditory system. This filtering is complemented by a pre-processing stage comprising conventional spectral subtraction and auditory filterbanks. The methods were tested on both speech enhancement and automatic speech recognition tasks. For speech enhancement, time-frequency anisotropic structuring elements applied to grey-scale spectrograms provided better perceptual quality than isotropic ones, proving more appropriate for retaining the structure of speech while removing background noise, as judged by several perceptual quality estimation measures at several signal-to-noise ratios on the Aurora database. For speech recognition, the combination of spectral subtraction and auditory-inspired morphological filtering improved recognition rates on a noise-contaminated version of the Isolet database.
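Illustrative sketch (not taken from the paper): the pipeline outlined in the abstract, spectral subtraction followed by grey-scale morphological filtering of the spectrogram, could look roughly as follows in plain Python. Everything here is an assumption made for illustration: the function names, the flat rectangular structuring element (the paper's elements are designed after auditory masking and are not reproduced), the noise estimate taken from the first frames, the spectral floor, and the choice of an opening (erosion followed by dilation) as the way to combine the two operations.

```python
# Toy sketch: spectral subtraction + grey-scale morphological opening.
# All names and parameter values are hypothetical, not the authors' method.

def spectral_subtraction(spec, n_noise_frames=2, floor=0.01):
    """spec[f][t] is a magnitude spectrogram stored as frequency rows.

    The noise level of each frequency bin is estimated as the mean of the
    first n_noise_frames frames (assumed to contain no speech) and then
    subtracted, keeping a small spectral floor to avoid negative values.
    """
    cleaned = []
    for row in spec:
        noise = sum(row[:n_noise_frames]) / n_noise_frames
        cleaned.append([max(v - noise, floor * v) for v in row])
    return cleaned


def _morph(spec, half_f, half_t, pick):
    """Apply pick (min or max) over a flat rectangular neighbourhood.

    The neighbourhood spans 2*half_f+1 frequency bins and 2*half_t+1 time
    frames, truncated at the spectrogram borders.
    """
    n_f, n_t = len(spec), len(spec[0])
    out = []
    for f in range(n_f):
        row = []
        for t in range(n_t):
            window = [
                spec[i][j]
                for i in range(max(0, f - half_f), min(n_f, f + half_f + 1))
                for j in range(max(0, t - half_t), min(n_t, t + half_t + 1))
            ]
            row.append(pick(window))
        out.append(row)
    return out


def erode(spec, half_f, half_t):
    return _morph(spec, half_f, half_t, min)


def dilate(spec, half_f, half_t):
    return _morph(spec, half_f, half_t, max)


def opening(spec, half_f, half_t):
    """Erosion followed by dilation with the same structuring element."""
    return dilate(erode(spec, half_f, half_t), half_f, half_t)


# Toy spectrogram: 3 frequency bins x 8 frames, stationary noise at 1.0,
# a speech-like ridge (bin 1, frames 3..6) and an isolated spike (bin 0).
noisy = [[1.0] * 8 for _ in range(3)]
for t in range(3, 7):
    noisy[1][t] = 5.0
noisy[0][5] = 4.0

subtracted = spectral_subtraction(noisy)
# Anisotropic element: 1 bin in frequency, 3 frames in time.
cleaned = opening(subtracted, half_f=0, half_t=1)

for row in cleaned:
    print(row)
```

In this toy case the structuring element extends along time only, so the four-frame ridge survives the opening while the one-frame spike is reduced to the spectral floor. An isotropic element, for example `half_f=1, half_t=1`, would also erase the ridge, because the ridge is only one bin wide in frequency.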
Sponsor: This work has been partially supported by the Spanish Ministry of Science and Innovation CICYT Project No. TEC2008-06382/TEC.
Publisher version: http://dx.doi.org/10.1007/s12559-012-9196-6
Keywords: Spectral subtraction
Spectrogram
Morphological processing
Image filtering
Automatic speech recognition
Speech enhancement
Auditory-based features
Rights: © Springer
Appears in Collections: DTSC - GPM - Artículos de Revistas

Items in E-Archivo are protected by copyright, with all rights reserved, unless otherwise indicated.

 
