RT Conference Proceedings
T1 ASR Feature Extraction with Morphologically-Filtered Power-Normalized Cochleograms
A1 Calle Silos, Fernando de la
A1 Valverde Albacete, Francisco José
A1 Gallardo Antolín, Ascensión
A1 Peláez Moreno, Carmen
AB In this paper we present advances in modeling the masking behavior of the Human Auditory System to enhance the robustness of the feature extraction stage in Automatic Speech Recognition. The solution adopted is based on non-linear filtering of a spectro-temporal representation, applied simultaneously in both the frequency and time domains by processing the representation with mathematical morphology operations as if it were an image. A particularly important component of this architecture is the so-called structuring element: the present contribution addresses biologically-based considerations to design an element that closely resembles the masking phenomena taking place in the cochlea. The second feature of this contribution is the choice of the underlying spectro-temporal representation. The best results were achieved by the representation introduced as part of the Power Normalized Cepstral Coefficients (PNCC), together with a spectral subtraction step. On the Aurora 2 noisy continuous digits task, we report relative error reductions of 18.7% compared to PNCC and 39.5% compared to MFCC.
PB International Speech Communication Association
SN 9781634394352
YR 2014
FD 2014
LK https://hdl.handle.net/10016/21480
UL https://hdl.handle.net/10016/21480
LA eng
NO Proceedings of: 15th Annual Conference of the International Speech Communication Association. Singapore, September 14-18, 2014.
NO This contribution has been supported by an Airbus Defense and Space Grant (Open Innovation - SAVIER) and Spanish Government-CICYT project 2011-26807/TEC.
DS e-Archivo
RD 1 Jul. 2024