Publication:
An online classification algorithm for large scale data streams: iGNGSVM

Publication date
2017-11-01
Publisher
Elsevier
Abstract
Stream processing has recently become a commercial trend for handling huge amounts of data. However, these techniques normally require specific infrastructure and substantial resources in terms of memory and computing nodes. This paper shows how mini-batch techniques and topology extraction methods can make gigabytes of data manageable on a single server, even with computationally costly Machine Learning techniques such as Support Vector Machines. The iGNGSVM algorithm is proposed to improve the performance of Support Vector Machines on datasets where data arrives continuously. It is benchmarked against a mini-batch version of LibSVM, achieving good accuracy rates while running faster than that baseline.
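The general idea in the abstract (process the stream in mini-batches, condense each batch into a small set of representative points, and train the SVM only on those points) can be illustrated with a minimal sketch. This is not the iGNGSVM algorithm from the paper: a simple leader-style clustering stands in for Growing Neural Gas, and a Pegasos-style linear SVM stands in for LibSVM. All function names, parameters (`radius`, `lam`, `epochs`) and the synthetic two-class stream are assumptions made for illustration only.

```python
import random

def condense(points, radius):
    """Leader-style condensation (a stand-in for topology extraction, NOT GNG).
    A point joins the nearest prototype if it lies within `radius`,
    otherwise it starts a new prototype."""
    protos = []  # each entry: [coordinate sums, count]
    for p in points:
        best, best_d = None, None
        for proto in protos:
            centre = [s / proto[1] for s in proto[0]]
            d = sum((a - b) ** 2 for a, b in zip(p, centre))
            if best_d is None or d < best_d:
                best, best_d = proto, d
        if best is not None and best_d <= radius ** 2:
            best[0] = [s + a for s, a in zip(best[0], p)]
            best[1] += 1
        else:
            protos.append([list(p), 1])
    return [[s / n for s in sums] for sums, n in protos]

def train_linear_svm(samples, epochs=50, lam=0.01, seed=0):
    """Pegasos-style SGD on the hinge loss (a stand-in for LibSVM).
    `samples` holds (x, y) pairs with y in {-1, +1}; the last weight is the bias."""
    rng = random.Random(seed)
    w = [0.0] * (len(samples[0][0]) + 1)
    t = 0
    for _ in range(epochs):
        for x, y in rng.sample(samples, len(samples)):
            t += 1
            eta = 1.0 / (lam * t)
            xb = list(x) + [1.0]
            margin = y * sum(wi * xi for wi, xi in zip(w, xb))
            w = [(1 - eta * lam) * wi for wi in w]
            if margin < 1:
                w = [wi + eta * y * xi for wi, xi in zip(w, xb)]
    return w

def predict(w, x):
    score = sum(wi * xi for wi, xi in zip(w, list(x) + [1.0]))
    return 1 if score >= 0 else -1

def stream_fit(batches, radius=0.5):
    """Consume mini-batches one at a time, keeping only per-class prototypes."""
    protos = {1: [], -1: []}
    w = None
    for batch in batches:
        for label in protos:
            arrivals = [x for x, y in batch if y == label]
            # Old prototypes are re-condensed together with the new arrivals
            # (each old prototype counts as a single point in this sketch).
            protos[label] = condense(protos[label] + arrivals, radius)
        samples = [(p, y) for y, ps in protos.items() for p in ps]
        if samples:
            w = train_linear_svm(samples)  # retrain on prototypes only
    return w, protos

def make_stream(n_batches=5, batch_size=200, seed=1):
    """Hypothetical synthetic stream: two Gaussian blobs, one per class."""
    rng = random.Random(seed)
    for _ in range(n_batches):
        batch = []
        for _ in range(batch_size):
            y = rng.choice([1, -1])
            batch.append(([rng.gauss(2 * y, 0.5), rng.gauss(2 * y, 0.5)], y))
        yield batch

w, protos = stream_fit(make_stream())
test_batch = next(make_stream(n_batches=1, seed=99))
accuracy = sum(predict(w, x) == y for x, y in test_batch) / len(test_batch)
n_protos = sum(len(ps) for ps in protos.values())
print(f"prototypes kept: {n_protos}, accuracy: {accuracy:.2f}")
```

The point of the sketch is that the classifier is retrained on a prototype set whose size depends on `radius` and the spread of the data, not on the number of samples seen so far, which is what keeps memory and training cost bounded as the stream grows.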
Keywords
Data classification, Topology extraction, Online learning, Large datasets, Growing neural gas, Support vector machines
Bibliographic citation
Suárez-Cetrulo, A.L., Cervantes, A. (2017). An online classification algorithm for large scale data streams: iGNGSVM. Neurocomputing, 262, pp. 67-76.