Authors: Carrizales, Diana; Sánchez Gallegos, Dante D.; Reyes, Hugo; González Compean, J.L.; Morales Sandoval, Miguel; Carretero Pérez, Jesús; Galaviz Mosqueda, Alejandro
Dates: 2021-12-17; 2021-12-17; 2019-10-10
Citation: Carrizales D. et al. (2019) A Data Preparation Approach for Cloud Storage Based on Containerized Parallel Patterns. In: Montella R., Ciaramella A., Fortino G., Guerrieri A., Liotta A. (eds) Internet and Distributed Computing Systems. IDCS 2019. Lecture Notes in Computer Science, vol 11874. Springer, Cham. https://doi.org/10.1007/978-3-030-34914-1_45
ISBN: 978-3-030-34913-4
Handle: https://hdl.handle.net/10016/33790
Abstract: In this paper, we present the design, implementation, and evaluation of an efficient data preparation and retrieval approach for cloud storage. The approach includes a deduplication subsystem that indexes the hash of each content to identify duplicated data. Avoiding duplicated content reduces reprocessing time during uploads and other costs related to outsourced data management tasks. Our proposed data preparation scheme enables organizations to add properties such as security, reliability, and cost-efficiency to their contents before sending them to the cloud. It also creates recovery schemes for organizations to share preprocessed contents with partners and end-users. The approach also includes an engine that encapsulates preprocessing applications into virtual containers (VCs) to create parallel patterns that improve the efficiency of the data preparation and retrieval processes. In a case study, real repositories of satellite images and organizational files were prepared for migration to the cloud by using processes such as compression, encryption, encoding for fault tolerance, and access control. The experimental evaluation revealed the feasibility of using a data preparation approach for organizations to mitigate risks that could still arise in the cloud. It also revealed the efficiency of the deduplication process to reduce data preparation tasks and the efficacy of parallel patterns to improve the end-user service experience.
Language: eng
Rights: © Springer Nature Switzerland AG 2019
Keywords: deduplication systems; virtual containers; parallel patterns; content delivery; cloud storage
Title: A data preparation approach for cloud storage based on containerized parallel patterns
Type: conference proceedings
Subject: Computer Science (Informática)
DOI: https://doi.org/10.1007/978-3-030-34914-1_45
Access: open access
Pages: 478-490
Conference: Internet and Distributed Computing Systems: 12th International Conference, IDCS 2019. Naples, Italy, October 10-12, 2019. Proceedings
Identifier: CC/0000032837
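
Illustrative note: the abstract describes a deduplication subsystem that indexes the hash of each content to identify duplicated data. The following is a minimal sketch of that general idea only, not the authors' implementation. The class name `DedupIndex`, the method `add`, and the choice of SHA-256 are assumptions made for illustration; the record does not state which hash function or data structures the paper uses.

```python
import hashlib


class DedupIndex:
    """Hypothetical in-memory index mapping a content hash to the first name stored."""

    def __init__(self):
        self._index = {}  # hex digest -> name of the first content seen with that digest

    def add(self, name, data):
        """Register content; return (is_new, name_of_stored_copy).

        SHA-256 is an assumed choice for this sketch.
        """
        digest = hashlib.sha256(data).hexdigest()
        if digest in self._index:
            # Duplicate content: skip preparation and point to the existing copy.
            return False, self._index[digest]
        self._index[digest] = name
        return True, name


if __name__ == "__main__":
    index = DedupIndex()
    print(index.add("scene_a.tif", b"same bytes"))
    print(index.add("scene_b.tif", b"same bytes"))
```

In this sketch, only contents for which `add` reports a new hash would go on to costly preparation steps such as compression or encryption; duplicates are resolved to the copy already indexed.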