Publication:
Auditory-inspired morphological processing of speech spectrograms: applications in automatic speech recognition and speech enhancement

dc.affiliation.dptoUC3M. Departamento de Teoría de la Señal y Comunicaciones
dc.affiliation.grupoinvUC3M. Grupo de Investigación: Procesado Multimedia
dc.contributor.authorCadore, Joyner
dc.contributor.authorValverde Albacete, Francisco José
dc.contributor.authorGallardo Antolín, Ascensión
dc.contributor.authorPeláez Moreno, Carmen
dc.date.accessioned2012-11-27T08:34:33Z
dc.date.available2012-11-27T08:34:33Z
dc.date.issued2012-11
dc.description.abstractThis paper presents new auditory-inspired speech processing methods that combine spectral subtraction with two-dimensional non-linear filtering techniques originally conceived for image processing. In particular, mathematical morphology operations such as erosion and dilation are applied to noisy speech spectrograms using specifically designed structuring elements inspired by the masking properties of the human auditory system. This is complemented by a pre-processing stage comprising conventional spectral subtraction and auditory filterbanks. The methods were tested on both speech enhancement and automatic speech recognition tasks. For speech enhancement, time-frequency anisotropic structuring elements over grey-scale spectrograms provided better perceptual quality than isotropic ones, proving more appropriate for retaining the structure of speech while removing background noise, as judged by a number of perceptual quality estimation measures at several signal-to-noise ratios on the Aurora database. For speech recognition, the combination of spectral subtraction and auditory-inspired morphological filtering improved recognition rates on a noise-contaminated version of the Isolet database.
dc.description.sponsorshipThis work has been partially supported by the Spanish Ministry of Science and Innovation CICYT Project No. TEC2008-06382/TEC.
dc.description.statusPublished
dc.format.mimetypeapplication/pdf
dc.identifier.bibliographicCitationCognitive Computation, December 2013, 5(4), pp. 426-441.
dc.identifier.doi10.1007/s12559-012-9196-6
dc.identifier.issn1866-9956 (Print)
dc.identifier.issn1866-9964 (Online)
dc.identifier.publicationfirstpage426
dc.identifier.publicationissue4
dc.identifier.publicationlastpage441
dc.identifier.publicationtitleCognitive Computation
dc.identifier.urihttps://hdl.handle.net/10016/15932
dc.language.isoeng
dc.publisherSpringer
dc.relation.publisherversionhttp://dx.doi.org/10.1007/s12559-012-9196-6
dc.rights© Springer
dc.rights.accessRightsopen access
dc.subject.ecienciaTelecommunications
dc.subject.otherSpectral subtraction
dc.subject.otherSpectrogram
dc.subject.otherMorphological processing
dc.subject.otherImage filtering
dc.subject.otherAutomatic speech recognition
dc.subject.otherSpeech enhancement
dc.subject.otherAuditory-based features
dc.titleAuditory-inspired morphological processing of speech spectrograms: applications in automatic speech recognition and speech enhancement
dc.typeresearch article
dc.type.hasVersionAM
dspace.entity.typePublication
Files
Original bundle
Name: cc_2012.pdf
Size: 1.21 MB
Format: Adobe Portable Document Format