ASR Feature Extraction with Morphologically-Filtered Power-Normalized Cochleograms

Calle Silos, Fernando de la; Valverde Albacete, Francisco José; Gallardo Antolín, Ascensión; Peláez Moreno, Carmen

Identifiers

URI: https://hdl.handle.net/10016/21480

ISBN: 9781634394352

UXXI: CC/0000022423

Files

Feature_INTERSPEECH_2014.pdf (342.09 KB)

Publication date

2014

Authors

Calle Silos, Fernando de la

Valverde Albacete, Francisco José

Gallardo Antolín, Ascensión

Peláez Moreno, Carmen

Publisher

International Speech Communication Association

Abstract

In this paper we present advances in modeling the masking behavior of the Human Auditory System to make the feature extraction stage of Automatic Speech Recognition more robust. The solution is based on non-linear filtering of a spectro-temporal representation, applied simultaneously in the frequency and time domains by processing the representation with mathematical morphology operations as if it were an image. A particularly important component of this architecture is the so-called structuring element: the present contribution uses biologically based considerations to design an element that closely resembles the masking phenomena taking place in the cochlea. The second feature of this contribution is the choice of the underlying spectro-temporal representation. The best results were achieved with the representation introduced as part of the Power Normalized Cepstral Coefficients (PNCC), together with a spectral subtraction step. On the Aurora 2 noisy continuous digits task, we report relative error reductions of 18.7% compared to PNCC and 39.5% compared to MFCC.
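
To make the idea of filtering a spectro-temporal representation "as if it were an image" concrete, here is a minimal sketch of a grey-scale morphological opening (erosion followed by dilation) over a small time-frequency grid. It is an illustration under stated assumptions, not the authors' method: the flat rectangular structuring element, the choice of opening as the operation, and the names `erode`, `dilate`, `opening`, `half_f` and `half_t` are all hypothetical. The paper instead designs a structuring element shaped after cochlear masking, and this record does not give its details.

```python
# Illustrative sketch only. Assumption: a flat rectangular structuring
# element; the paper uses a biologically motivated one instead.

def _window_op(spec, half_f, half_t, op):
    """Apply op (min or max) over a (2*half_f+1) x (2*half_t+1) window,
    clipped at the borders, centred on every time-frequency cell."""
    n_f, n_t = len(spec), len(spec[0])
    out = []
    for f in range(n_f):
        row = []
        for t in range(n_t):
            window = [
                spec[i][j]
                for i in range(max(0, f - half_f), min(n_f, f + half_f + 1))
                for j in range(max(0, t - half_t), min(n_t, t + half_t + 1))
            ]
            row.append(op(window))
        out.append(row)
    return out


def erode(spec, half_f=1, half_t=1):
    """Grey-scale erosion with a flat element: local minimum."""
    return _window_op(spec, half_f, half_t, min)


def dilate(spec, half_f=1, half_t=1):
    """Grey-scale dilation with a flat element: local maximum."""
    return _window_op(spec, half_f, half_t, max)


def opening(spec, half_f=1, half_t=1):
    """Opening = erosion then dilation; removes peaks smaller than the element."""
    return dilate(erode(spec, half_f, half_t), half_f, half_t)


# Toy "cochleogram": 5 frequency channels x 7 time frames.
spec = [[0.0] * 7 for _ in range(5)]
for f in range(1, 4):
    for t in range(1, 4):
        spec[f][t] = 5.0      # a 3x3 patch of sustained energy
spec[2][5] = 9.0              # an isolated one-cell spike

opened = opening(spec)        # 3x3 flat structuring element by default
print(opened[2][2], opened[2][5])
```

In this toy grid the 3x3 patch is as large as the structuring element and is kept, while the isolated spike is smaller than the element and is suppressed. The shape of the element therefore decides which spectro-temporal structures survive, which is why the abstract singles out its design.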

Description

Proceedings of: 15th Annual Conference of the International Speech Communication Association. Singapore, September 14-18, 2014.

Keywords

Spectro-temporal processing, Morphological filtering, Automatic speech recognition, Auditory-based features, PNCC

Bibliographic citation

Li, Haizhou, et al. (eds.) (2014). INTERSPEECH 2014, 15th Annual Conference of the International Speech Communication Association, Singapore, September 14-18, 2014 (pp. 2430-2434). International Speech Communication Association.

Collections

DTSC - GPM - Comunicaciones en congresos y otros eventos
DTSC - GPM - Capítulos de Monografías
