Publication:
A data preparation approach for cloud storage based on containerized parallel patterns

Publication date
2019-10-10
Publisher
Springer
Abstract
In this paper, we present the design, implementation, and evaluation of an efficient data preparation and retrieval approach for cloud storage. The approach includes a deduplication subsystem that indexes the hash of each content item to identify duplicated data. Avoiding duplicated content reduces reprocessing time during uploads and other costs related to outsourced data management tasks. Our proposed data preparation scheme enables organizations to add properties such as security, reliability, and cost-efficiency to their contents before sending them to the cloud. It also creates recovery schemes that let organizations share preprocessed contents with partners and end-users. The approach also includes an engine that encapsulates preprocessing applications into virtual containers (VCs) to create parallel patterns that improve the efficiency of the data preparation and retrieval processes. In a case study, real repositories of satellite images and organizational files were prepared for migration to the cloud by using processes such as compression, encryption, encoding for fault tolerance, and access control. The experimental evaluation revealed the feasibility of using a data preparation approach for organizations to mitigate risks that could still arise in the cloud. It also revealed the efficiency of the deduplication process in reducing data preparation tasks and the efficacy of parallel patterns in improving the end-user service experience.
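The hash-indexing idea behind the deduplication subsystem can be illustrated with a minimal sketch. This is not the authors' implementation: the class `DedupIndex`, the function `prepare`, the in-memory dictionary, and the choice of SHA-256 are all assumptions made for illustration, since the abstract does not name a hash function or an index structure.

```python
import hashlib


class DedupIndex:
    """Hypothetical in-memory index from content hash to the first name seen."""

    def __init__(self):
        self._by_hash = {}

    def add(self, name, data):
        """Return (digest, is_new); is_new is False for already-indexed content."""
        # SHA-256 is an assumption; the abstract does not specify the hash.
        digest = hashlib.sha256(data).hexdigest()
        if digest in self._by_hash:
            return digest, False
        self._by_hash[digest] = name
        return digest, True


def prepare(files, index):
    """Return the names whose content still needs preprocessing."""
    pending = []
    for name, data in files:
        _, is_new = index.add(name, data)
        if is_new:
            pending.append(name)
    return pending
```

Under these assumptions, a file whose bytes match an already-indexed file is skipped, so only unique content would go on to steps such as compression or encryption.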
Keywords
deduplication systems, virtual containers, parallel patterns, content delivery, cloud storage
Bibliographic citation
Carrizales D. et al. (2019) A Data Preparation Approach for Cloud Storage Based on Containerized Parallel Patterns. In: Montella R., Ciaramella A., Fortino G., Guerrieri A., Liotta A. (eds) Internet and Distributed Computing Systems. IDCS 2019. Lecture Notes in Computer Science, vol 11874. Springer, Cham. https://doi.org/10.1007/978-3-030-34914-1_45