Publication:
People Counting in Videos by Fusing Temporal Cues from Spatial Context-Aware Convolutional Neural Networks

Publication date
2016-11-03
Publisher
Springer
Abstract
We present an efficient method for counting people in video sequences from fixed cameras by utilising the responses of spatially context-aware convolutional neural networks (CNNs) in the temporal domain. For stationary cameras, the background remains fairly static, while foreground characteristics such as size and orientation may depend on their image location. Training a CNN on whole frames therefore improves the differentiation between background and foreground pixels. The foreground density, which represents the presence of people in the environment, can then be associated with people counts. Moreover, fusing the count estimates in the temporal domain can further improve the accuracy of the final count. Our methodology was tested on the publicly available Mall dataset and achieved a mean deviation error of 0.091.
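As a rough illustration of the two ideas in the abstract (fusing per-frame count estimates over time, then scoring them with a mean deviation error), a minimal Python sketch follows. It rests on several assumptions: the abstract does not say which fusion rule is used, so a trailing median window stands in for it here; the error is taken to be the mean of |true - estimate| / true per frame, as commonly reported for the Mall dataset, which may differ from the paper's exact definition; and the function names, window size, and sample counts are all hypothetical.

```python
from statistics import median


def fuse_counts(per_frame_counts, window=3):
    """Smooth per-frame count estimates with a trailing median window.

    Assumption: a median filter is only a stand-in for the temporal
    fusion step; the abstract does not specify the actual rule.
    """
    fused = []
    for i in range(len(per_frame_counts)):
        start = max(0, i - window + 1)  # shorter window at the start of the clip
        fused.append(median(per_frame_counts[start:i + 1]))
    return fused


def mean_deviation_error(true_counts, estimated_counts):
    """Mean of |true - estimate| / true over all frames.

    Assumption: this is the commonly used definition; every true
    count must be greater than zero.
    """
    if len(true_counts) != len(estimated_counts):
        raise ValueError("count sequences must have the same length")
    total = sum(abs(t - e) / t for t, e in zip(true_counts, estimated_counts))
    return total / len(true_counts)


# Hypothetical data: ground-truth counts and noisy per-frame estimates.
true_counts = [20, 20, 21, 21, 22, 22]
raw_estimates = [19.0, 25.0, 20.5, 21.5, 17.0, 22.5]

fused_estimates = fuse_counts(raw_estimates, window=3)
raw_error = mean_deviation_error(true_counts, raw_estimates)
fused_error = mean_deviation_error(true_counts, fused_estimates)
print(f"raw error: {raw_error:.3f}, fused error: {fused_error:.3f}")
```

In this toy sequence the median window damps the two outlier frames, so the fused error comes out lower than the raw one. That only illustrates why temporal fusion can help; it does not reproduce the reported result.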
Description
This paper was presented at the 14th European Conference on Computer Vision.
Keywords
People counting, Convolutional neural networks, Video analysis
Bibliographic citation
Sourtzinos, P., Velastin, S.A., Jara, M., Zegers, P. and Makris, D. (2016). People Counting in Videos by Fusing Temporal Cues from Spatial Context-Aware Convolutional Neural Networks. In European Conference on Computer Vision 2016 Workshops, Part II, LNCS 9914, pp. 655–667.