Publication:
A gearbox model for processing large volumes of data by using pipeline systems encapsulated into virtual containers

dc.affiliation.dpto: UC3M. Departamento de Informática [es]
dc.affiliation.grupoinv: UC3M. Grupo de Investigación: Arquitectura de Computadores, Comunicaciones y Sistemas [es]
dc.contributor.author: Santiago Durán, Miguel
dc.contributor.author: González Compean, J.L.
dc.contributor.author: Brinkmann, André
dc.contributor.author: Reyes Anastacio, Hugo G.
dc.contributor.author: Carretero Pérez, Jesús
dc.contributor.author: Montella, Raffaele
dc.contributor.author: Toscano Pulido, Gregorio
dc.contributor.funder: Ministerio de Economía y Competitividad (España) [es]
dc.date.accessioned: 2021-12-20T10:56:28Z
dc.date.available: 2022-06-01T23:00:04Z
dc.date.issued: 2020-05-01
dc.description.abstract: Software pipelines enable organizations to chain applications that add value to contents (e.g., confidentiality, reliability, and integrity) before either sharing them with partners or sending them to the cloud. However, the pipeline components add overhead when processing large volumes of data, which can become critical in real-world scenarios. This paper presents a gearbox model for processing large volumes of data by using pipeline systems encapsulated into virtual containers. In this model, the gears represent applications, whereas gearboxes represent software pipelines. The model was implemented as a collaborative system that automatically performs Gear up (by using parallel patterns) and/or Gear down (by using in-memory storage) until all gears produce uniform data processing velocities. This reduces the delays and bottlenecks produced by the heterogeneous performance of the applications included in software pipelines. A new container tool has been designed to encapsulate both the collaborative system and the software pipelines into a virtual container and deploy it on IT infrastructures. We conducted case studies to evaluate the performance of the tool when processing medical images and PDF repositories. The incorporation of a capsule into a cloud storage service for pre-processing medical imagery was also studied. The experimental evaluation revealed the feasibility of applying the gearbox model to the deployment of software pipelines in real-world scenarios, as it can significantly improve the end-user service experience when pre-processing large-scale data in comparison with state-of-the-art solutions such as Sacbe and Parsl. [en]
dc.description.sponsorship: This work has been partially supported by the Spanish “Ministerio de Economía y Competitividad” under the project grant TIN2016-79637-P “Towards Unification of HPC and Big Data paradigms”. [en]
dc.identifier.bibliographicCitation: Santiago Durán, M., González Compean, J.L., Brinkmann, A., Reyes Anastacio, H.C., Carretero Pérez, J., Montella, R., Toscano Pulido, G. (2020). A gearbox model for processing large volumes of data by using pipeline systems encapsulated into virtual containers. Future Generation Computer Systems, 106, pp. 304-319. https://doi.org/10.1016/j.future.2020.01.014 [en]
dc.identifier.doi: https://doi.org/10.1016/j.future.2020.01.014
dc.identifier.issn: 1872-7115
dc.identifier.publicationfirstpage: 304
dc.identifier.publicationlastpage: 319
dc.identifier.publicationtitle: Future Generation Computer Systems [en]
dc.identifier.publicationvolume: 106
dc.identifier.uri: https://hdl.handle.net/10016/33801
dc.identifier.uxxi: AR/0000025709
dc.language.iso: eng [en]
dc.publisher: Elsevier [en]
dc.relation.projectID: Gobierno de España. TIN2016-79637-P [es]
dc.rights: © 2020 Elsevier B.V. All rights reserved. [en]
dc.rights: Atribución-NoComercial-SinDerivadas 3.0 España (Attribution-NonCommercial-NoDerivs 3.0 Spain)
dc.rights.accessRights: open access [en]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/es/
dc.subject.eciencia: Informática (Computer Science) [es]
dc.subject.other: cloud storage [en]
dc.subject.other: continuous delivery [en]
dc.subject.other: in-memory storage [en]
dc.subject.other: parallel patterns [en]
dc.subject.other: software pipelines [en]
dc.subject.other: virtual containers [en]
dc.title: A gearbox model for processing large volumes of data by using pipeline systems encapsulated into virtual containers [en]
dc.type: research article [*]
dc.type.hasVersion: AM [*]
dspace.entity.type: Publication
Files
Original bundle
Name: gearbox_FGCS_2020_ps.pdf
Size: 3.26 MB
Format: Adobe Portable Document Format