Bird sound spectrogram decomposition through Non-Negative Matrix Factorization for the acoustic classification of bird species


dc.contributor.author Ludeña Choez, Jimmy Diestin
dc.contributor.author Quispe Soncco, Raisa
dc.contributor.author Gallardo Antolín, Ascensión
dc.date.accessioned 2020-11-30T12:49:28Z
dc.date.available 2020-11-30T12:49:28Z
dc.date.issued 2017-06-19
dc.identifier.bibliographicCitation Ludeña-Choez J, Quispe-Soncco R, Gallardo-Antolín A (2017) Bird sound spectrogram decomposition through Non-Negative Matrix Factorization for the acoustic classification of bird species. PLOS ONE 12(6): e0179403
dc.identifier.issn 1932-6203
dc.identifier.uri http://hdl.handle.net/10016/31501
dc.description.abstract Feature extraction for Acoustic Bird Species Classification (ABSC) tasks has traditionally been based on parametric representations that were specifically developed for speech signals, such as Mel Frequency Cepstral Coefficients (MFCC). However, the discrimination capabilities of these features for ABSC could be enhanced by accounting for the vocal production mechanisms of birds, and, in particular, the spectro-temporal structure of bird sounds. In this paper, a new front-end for ABSC is proposed that incorporates this specific information through the non-negative decomposition of bird sound spectrograms. It consists of the following two stages: short-time feature extraction and temporal feature integration. In the first stage, which aims at providing a better spectral representation of bird sounds on a frame-by-frame basis, two methods are evaluated. In the first method, cepstral-like features (NMF_CC) are extracted by using a filter bank that is automatically learned by applying Non-Negative Matrix Factorization (NMF) to bird audio spectrograms. In the second method, the features are directly derived from the activation coefficients of the spectrogram decomposition as performed through NMF (H_CC). The second stage summarizes the most relevant information contained in the short-time features by computing several statistical measures over long segments. The experiments show that the use of NMF_CC and H_CC in conjunction with temporal integration significantly improves the performance of a Support Vector Machine (SVM)-based ABSC system with respect to conventional MFCC.
dc.description.sponsorship This work was supported by the Spanish Government grant TEC2014-53390-P.
dc.language.iso eng
dc.publisher PLOS
dc.rights © 2017 Ludeña-Choez et al.
dc.rights Attribution 3.0 Spain
dc.rights.uri http://creativecommons.org/licenses/by/3.0/es/
dc.title Bird sound spectrogram decomposition through Non-Negative Matrix Factorization for the acoustic classification of bird species
dc.type article
dc.subject.eciencia Telecommunications
dc.identifier.doi https://doi.org/10.1371/journal.pone.0179403
dc.rights.accessRights openAccess
dc.relation.projectID Gobierno de España. TEC2014-53390-P
dc.type.version publishedVersion
dc.identifier.publicationissue 6
dc.identifier.publicationtitle PLoS One
dc.identifier.publicationvolume 12
dc.identifier.uxxi AR/0000020315
dc.contributor.funder Ministerio de Economía y Competitividad (España)
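
The two-stage front-end described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not give the NMF cost function, the number of basis vectors, the exact cepstral transform, or the statistics used for temporal integration, so everything below is an assumption made for illustration: Lee-Seung multiplicative updates with a Euclidean cost, a log followed by a DCT-II as the "cepstral-like" step, mean and standard deviation as the segment statistics, and a random matrix standing in for a bird sound magnitude spectrogram. All function names (nmf, dct_ii, nmf_cc, h_cc, integrate) are hypothetical. The SVM classifier is omitted.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (freq x frames) as W @ H.

    Uses Lee-Seung multiplicative updates for the Euclidean cost
    (an assumption; the abstract does not name the cost function).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps   # spectral basis vectors
    H = rng.random((rank, n_frames)) + eps  # activation coefficients
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def dct_ii(X, n_coeffs):
    """Unnormalized DCT-II along axis 0, keeping the first n_coeffs rows."""
    n = X.shape[0]
    k = np.arange(n_coeffs)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return basis @ X


def nmf_cc(V, W, n_coeffs=13, eps=1e-9):
    """Cepstral-like features using the learned basis W as a filter bank."""
    energies = W.T @ V  # (rank, frames) filter-bank outputs
    return dct_ii(np.log(energies + eps), n_coeffs)


def h_cc(H, n_coeffs=13, eps=1e-9):
    """Cepstral-like features derived directly from the activations H."""
    return dct_ii(np.log(H + eps), n_coeffs)


def integrate(features, seg_len):
    """Mean and standard deviation over non-overlapping segments of frames."""
    n_seg = features.shape[1] // seg_len
    rows = []
    for s in range(n_seg):
        seg = features[:, s * seg_len:(s + 1) * seg_len]
        rows.append(np.concatenate([seg.mean(axis=1), seg.std(axis=1)]))
    return np.array(rows)


# Synthetic stand-in for a magnitude spectrogram: 64 bins x 100 frames.
V = np.random.default_rng(1).random((64, 100))
W, H = nmf(V, rank=16)

short_nmf_cc = nmf_cc(V, W)  # (13, 100) frame-level features
short_h_cc = h_cc(H)         # (13, 100) frame-level features
segment_feats = integrate(short_nmf_cc, seg_len=25)  # (4, 26)
```

In this sketch each row of segment_feats is one segment-level vector that could be passed to a classifier such as an SVM. In practice the basis W would be learned on training spectrograms and then held fixed when extracting features from new recordings.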