Echoic log-surprise: A multi-scale scheme for acoustic saliency detection

e-Archivo Repository

dc.contributor.author Rodríguez Hidalgo, Antonio
dc.contributor.author Peláez Moreno, Carmen
dc.contributor.author Gallardo Antolín, Ascensión
dc.date.accessioned 2020-11-30T10:54:02Z
dc.date.available 2020-12-30T00:00:07Z
dc.date.issued 2018-12-30
dc.identifier.bibliographicCitation Rodríguez-Hidalgo, A., Peláez-Moreno, C., & Gallardo-Antolín, A. (2018). Echoic log-surprise: A multi-scale scheme for acoustic saliency detection. Expert Systems with Applications, 114, 255-266.
dc.identifier.issn 0957-4174
dc.identifier.uri http://hdl.handle.net/10016/31499
dc.description.abstract Perceptual signals such as acoustic or visual cues carry a massive amount of information. Humans cope with this volume by means of cognitive mechanisms related to attention. In particular, saliency is a property of particular stimuli that makes them stand out from others, allowing the brain to make decisions about their relevance in the process of exploring the world. For artificial intelligence systems it is advantageous to mimic these mechanisms. Visual saliency algorithms have been successfully employed in tasks such as medical diagnosis, detection of violent scenes, and environment understanding by robots. In contrast, computational models of the acoustic saliency mechanisms are less widespread. In this context, we propose a novel acoustic saliency algorithm to be used by intelligent and expert systems facing tasks such as sound detection and classification, early alarm, surveillance, and robotic exploration of the surroundings, among many other applications. This technique, which we term echoic log-surprise, combines an unsupervised statistical approach based on Bayesian log-surprise with the biological concept of echoic or Auditory Sensory Memory. Our algorithm computes several independent log-surprise cues in parallel over a wide range of memory values, with the aim of leveraging saliency information from different temporal scales. We then explore several statistical metrics for combining these multi-scale signals into a single temporal saliency signal, including Rényi entropy, Jensen-Shannon divergence, and the Cramér and Bhattacharyya distances. We have adopted Acoustic Event Detection tasks as adequate proxies to test the algorithm's performance.
dc.description.sponsorship This work is partially supported by the Spanish Government-MinECo projects TEC2014-53390-P and TEC2017-84395-P.
dc.language.iso eng
dc.publisher Elsevier
dc.rights © 2018 Elsevier
dc.rights Attribution-NonCommercial-NoDerivs 3.0 Spain
dc.rights.uri http://creativecommons.org/licenses/by-nc-nd/3.0/es/
dc.subject.other Acoustic saliency
dc.subject.other Echoic memory
dc.subject.other Multi-scale
dc.subject.other Statistical divergence
dc.subject.other Jensen-Shannon
dc.subject.other Acoustic event detection
dc.title Echoic log-surprise: A multi-scale scheme for acoustic saliency detection
dc.type article
dc.subject.eciencia Telecommunications
dc.identifier.doi 10.1016/j.eswa.2018.07.018
dc.rights.accessRights openAccess
dc.relation.projectID Gobierno de España. TEC2014-53390-P
dc.relation.projectID Gobierno de España. TEC2017-84395-P
dc.type.version acceptedVersion
dc.identifier.publicationfirstpage 255
dc.identifier.publicationlastpage 266
dc.identifier.publicationtitle Expert Systems with Applications
dc.identifier.publicationvolume 114
dc.identifier.uxxi AR/0000022194
dc.contributor.funder Ministerio de Economía y Competitividad (España)
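
The abstract names several statistical metrics used to combine the multi-scale log-surprise signals but does not define them. As a reading aid, the following minimal Python sketch gives the textbook definitions of three of them (Rényi entropy, Jensen-Shannon divergence and Bhattacharyya distance) for discrete distributions. It is not taken from the paper: the function names, the choice of natural logarithms, the default order alpha = 2 and the example vectors are all assumptions made for illustration, and the record does not state which distributions the authors actually compare or how the metrics are applied over time. The Cramér distance is left out.

```python
import math


def _normalize(values):
    """Scale non-negative values so that they sum to 1."""
    total = sum(values)
    if total <= 0:
        raise ValueError("values must have a positive sum")
    return [v / total for v in values]


def renyi_entropy(p, alpha=2.0):
    """Renyi entropy of order alpha (alpha > 0, alpha != 1), in nats."""
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    p = _normalize(p)
    return math.log(sum(pi ** alpha for pi in p if pi > 0)) / (1.0 - alpha)


def _kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) for normalized inputs."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence in nats; bounded between 0 and ln 2."""
    p, q = _normalize(p), _normalize(q)
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * _kl_divergence(p, m) + 0.5 * _kl_divergence(q, m)


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance; infinite when the supports do not overlap."""
    p, q = _normalize(p), _normalize(q)
    coefficient = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return math.inf if coefficient == 0 else -math.log(coefficient)


if __name__ == "__main__":
    # Hypothetical per-scale values (one entry per memory length) at two
    # time instants; these numbers are invented for illustration only.
    quiet = [0.20, 0.30, 0.25, 0.25]
    event = [0.70, 0.15, 0.10, 0.05]
    print(renyi_entropy(event))
    print(jensen_shannon_divergence(quiet, event))
    print(bhattacharyya_distance(quiet, event))
```

All three functions normalize their inputs first, so they accept raw non-negative values as well as proper probability vectors. Intuitively, a larger divergence between two such vectors means the scales disagree more strongly, but whether and how that maps onto the saliency signal is described in the article itself, not in this record.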