Publication:
Evaluating data caching techniques in DMCF workflows using Hercules

Identifiers
ISBN: 978-84-608-2581-4
Publication date
2015-10
Abstract
The Data Mining Cloud Framework (DMCF) is an environment for designing and executing data analysis workflows on cloud platforms. Currently, DMCF relies on the default storage of the public cloud provider for all I/O-related operations, so its I/O performance is limited by the performance of that default storage. In this work we propose using the Hercules system within DMCF as an ad-hoc storage system for temporary data produced inside workflow-based applications. Hercules is a highly scalable, easy-to-deploy distributed in-memory storage system. The proposed solution takes advantage of Hercules' scalability to avoid the bandwidth limits of the default storage. This paper presents early experimental results, which show promising performance, particularly for write operations, compared with the performance obtained using the default storage services.
Keywords
DMCF, Hercules, Data analysis, Workflows, In-memory storage, Microsoft Azure
Bibliographic citation
Carretero Pérez, Jesús; et al. (eds.) (2015). Proceedings of the Second International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2015): Krakow, Poland. Universidad Carlos III de Madrid, pp. 95-106.