Funder:
Ministerio de Economía y Competitividad (España)
Sponsor:
This work has been partially supported by the National Grants TEC2014-53390-P and TEC2017-84395-P of the Spanish Ministry of Economy and Competitiveness.
Project:
Gobierno de España. TEC2014-53390-P
Gobierno de España. TEC2017-84395-P
Keywords:
CNN, Metadata, Loss function, Weak labels
Abstract:
Content-based image representation is a very challenging task if we restrict ourselves to visual content alone. However, associated metadata (such as tags or geolocation) are a valuable source of complementary information that can help to enhance current system performance. In this paper, we propose an automatic training framework that uses both the visual content of images and their metadata to fine-tune deep Convolutional Neural Networks (CNNs), providing better image descriptors adapted to certain locations, such as cities or regions. Specifically, we propose to estimate weak labels by combining visual- and location-related information and to incorporate them into a novel loss function over pairs of images. Our experiments on a landmark discovery task show that this novel training procedure improves performance by up to 55% over well-established CNN-based models and is free from overfitting.
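To make the idea of weak pairwise labels and a loss over image pairs more concrete, below is a minimal PyTorch sketch. It assumes a soft label obtained by combining a visual similarity score with geolocation proximity, and a contrastive-style pairwise loss weighted by that label; the function names, the combination rule, and the specific loss form are illustrative assumptions, not the paper's actual formulation.

```python
# Hypothetical sketch: weak pairwise labels from visual + location cues and a
# contrastive-style pairwise loss, in PyTorch. The paper's exact label
# estimation and loss function are not reproduced here.
import torch
import torch.nn.functional as F


def weak_pair_label(visual_sim, geo_dist_km, geo_radius_km=1.0):
    """Combine visual similarity (in [0, 1]) and geolocation proximity into a
    soft pairwise label in [0, 1]. The averaging rule is an assumption."""
    geo_sim = torch.exp(-geo_dist_km / geo_radius_km)  # ~1 when co-located
    return 0.5 * (visual_sim + geo_sim)


def pairwise_contrastive_loss(emb_a, emb_b, weak_label, margin=1.0):
    """Contrastive loss over pairs of image descriptors, weighted by the
    soft weak label (1 = similar pair, 0 = dissimilar pair)."""
    d = F.pairwise_distance(emb_a, emb_b)                  # Euclidean distance
    pos = weak_label * d.pow(2)                            # pull similar pairs together
    neg = (1.0 - weak_label) * F.relu(margin - d).pow(2)   # push dissimilar pairs apart
    return (pos + neg).mean()


# Toy usage with random vectors standing in for CNN image descriptors.
if __name__ == "__main__":
    emb_a, emb_b = torch.randn(8, 128), torch.randn(8, 128)
    labels = weak_pair_label(torch.rand(8), torch.rand(8) * 5.0)
    print(pairwise_contrastive_loss(emb_a, emb_b, labels).item())
```

In this sketch, the soft label lets metadata-driven supervision down-weight ambiguous pairs instead of forcing a hard similar/dissimilar decision, which is one common way weak labels are used when fine-tuning CNN descriptors.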
Description:
Proceedings of: 25th IEEE International Conference on Image Processing (ICIP 2018), 7-10 October 2018, Athens, Greece