The robustness of echoic log-surprise auditory saliency detection

e-Archivo Repository

dc.contributor.author Rodríguez Hidalgo, Antonio
dc.contributor.author Peláez Moreno, Carmen
dc.contributor.author Gallardo Antolín, Ascensión
dc.date.accessioned 2020-09-25T08:56:04Z
dc.date.available 2020-09-25T08:56:04Z
dc.date.issued 2018-11-21
dc.identifier.bibliographicCitation IEEE Access, (2018), v. 6, pp. 72083-72093
dc.identifier.issn 2169-3536
dc.identifier.uri http://hdl.handle.net/10016/30850
dc.description.abstract The concept of saliency describes how relevant a stimulus is for humans. This phenomenon has been studied under different perspectives and modalities, such as audio, visual, or both. It has been employed in intelligent systems to interact with their environment in an attempt to emulate or even outperform human behavior in tasks, such as surveillance and alarm systems or even robotics. In this paper, we focus on the aural modality and our goal consists in measuring the robustness of Echoic log-surprise in comparison with a set of auditory saliency techniques when tested on noisy environments for the task of saliency detection. The acoustic saliency methods that we have analyzed include Kalinli's saliency model, Bayesian log-surprise, and our proposed algorithm, Echoic log-surprise. This last method combines an unsupervised approach based on the Bayesian log-surprise and the biological concept of echoic or auditory sensory memory by means of a statistical fusion scheme, where the use of different distance metrics or statistical divergences, such as Rényi's or Jensen-Shannon's among others, are considered. Additionally, for comparison purposes, we have also compared some classical onset detection techniques, such as those based on voice activity detection or energy thresholding. Results show that Echoic log-surprise outperforms the detection capabilities of the rest of the techniques analyzed in this paper under a great variety of noises and signal-to-noise ratios, corroborating its robustness in noisy environments. In particular, our algorithm with the Jensen-Shannon fusion scheme produces the best F-scores. With the aim of better understanding the behavior of Echoic log-surprise, we have also studied the influence of its control parameters, depth and memory, and their influence at different noise levels.
dc.description.sponsorship This work is partially supported by the Spanish Government-MinECo projects TEC2014-53390-P and TEC2017-84395-P.
dc.format.extent 11
dc.language.iso eng
dc.publisher IEEE
dc.rights © 2018 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
dc.subject.other Acoustic saliency
dc.subject.other Echoic memory
dc.subject.other Multi-scale
dc.subject.other Statistical divergence
dc.subject.other Jensen-Shannon
dc.subject.other Acoustic event detection
dc.title The robustness of echoic log-surprise auditory saliency detection
dc.type article
dc.description.status Published
dc.subject.eciencia Telecommunications
dc.identifier.doi https://doi.org/10.1109/ACCESS.2018.2882055
dc.rights.accessRights openAccess
dc.relation.projectID Gobierno de España. TEC2014-53390-P
dc.relation.projectID Gobierno de España. TEC2017-84395-P
dc.type.version acceptedVersion
dc.identifier.publicationfirstpage 72083
dc.identifier.publicationlastpage 72093
dc.identifier.publicationtitle IEEE Access
dc.identifier.publicationvolume 6
dc.identifier.uxxi AR/0000023986
dc.contributor.funder Ministerio de Economía y Competitividad (España)