Publication: Combining audio-visual features for viewers' perception classification of Youtube car commercials
dc.affiliation.dpto | UC3M. Departamento de Teoría de la Señal y Comunicaciones | es |
dc.affiliation.grupoinv | UC3M. Grupo de Investigación: Procesado Multimedia | es |
dc.contributor.author | Fernández Martínez, Fernando | es |
dc.contributor.author | Hernández García, Alejandro | es |
dc.contributor.author | Gallardo Antolín, Ascensión | es |
dc.contributor.author | Díaz de María, Fernando | es |
dc.date.accessioned | 2015-07-30T09:52:10Z | |
dc.date.available | 2015-07-30T09:52:10Z | |
dc.date.issued | 2014 | |
dc.description | Proceedings of: 2nd International Workshop on Speech, Language and Audio in Multimedia. Penang, Malaysia, 11-12 September 2014. | en |
dc.description.abstract | In this paper, we present a computational model capable of predicting the viewer perception of Youtube car TV commercials by using a set of low-level audio and visual descriptors. Our research relies on the hypothesis that these descriptors could reflect to some extent the objective value of the videos and, in turn, the average viewer's perception. To that end, and as a novel approach to this problem, we automatically annotate our video corpus, grouped into 2 classes corresponding to different satisfaction levels, by means of a regular k-means algorithm applied to the video metadata related to user feedback. Evaluation results show that simple linear logistic regression models based on the 10 best visual descriptors and on the 10 best audio descriptors individually perform reasonably well, achieving a classification accuracy of roughly 70% and 75%, respectively. Combining audio and visual descriptors yields better performance, roughly 86% for the top 20 selected from the entire descriptor set, but tips the balance in favor of the audio ones (i.e., 17 vs. 3). The greater influence of audio content in this domain is also evidenced by a side analysis of the video comments. | en |
dc.description.status | Publicado | en |
dc.format.extent | 5 | es |
dc.format.mimetype | application/pdf | |
dc.identifier.bibliographicCitation | Tien-Ping Tan et al. (eds.) (2014). Proceeding of the 2nd International Workshop on Speech, Language and Audio in Multimedia (SLAM 2014), 11-12 September 2014, Penang, Malaysia. (pp. 14-18). International Speech Communication Association. | en |
dc.identifier.isbn | 978-967-394-199-5 | |
dc.identifier.publicationfirstpage | 14 | |
dc.identifier.publicationlastpage | 18 | |
dc.identifier.publicationtitle | Proceeding of the 2nd International Workshop on Speech, Language and Audio in Multimedia (SLAM 2014), 11-12 September 2014, Penang, Malaysia. | en |
dc.identifier.uri | https://hdl.handle.net/10016/21479 | |
dc.identifier.uxxi | CC/0000022421 | |
dc.language.iso | eng | en |
dc.publisher | International Speech Communication Association. | en |
dc.relation.eventdate | 2014-09-11 | |
dc.relation.eventnumber | 2 | |
dc.relation.eventplace | Penang, Malaysia | en |
dc.relation.eventtitle | Workshop on Speech, Language and Audio in Multimedia (SLAM 2014) | en |
dc.relation.publisherversion | http://www.isca-speech.org/archive/slam_2014/papers/slm4_014.pdf | en |
dc.rights | © 2014 ISCA | en |
dc.rights.accessRights | open access | en |
dc.subject.eciencia | Telecomunicaciones | es |
dc.subject.other | Subjective assessment | en |
dc.subject.other | Video aesthetics | en |
dc.subject.other | Music Information Retrieval | en |
dc.subject.other | Video metadata | en |
dc.title | Combining audio-visual features for viewers' perception classification of Youtube car commercials | en |
dc.type | conference proceedings | * |
dc.type.hasVersion | VoR | * |
dspace.entity.type | Publication |