RT Conference Proceedings
T1 A data preparation approach for cloud storage based on containerized parallel patterns
A1 Carrizales, Diana
A1 Sánchez Gallegos, Dante D.
A1 Reyes, Hugo
A1 González Compean, J.L.
A1 Morales Sandoval, Miguel
A1 Carretero Pérez, Jesús
A1 Galaviz Mosqueda, Alejandro
AB In this paper, we present the design, implementation, and evaluation of an efficient data preparation and retrieval approach for cloud storage. The approach includes a deduplication subsystem that indexes the hash of each content item to identify duplicated data. Avoiding duplicated content in turn reduces reprocessing time during uploads and other costs related to outsourced data management tasks. Our proposed data preparation scheme enables organizations to add properties such as security, reliability, and cost-efficiency to their contents before sending them to the cloud. It also creates recovery schemes for organizations to share preprocessed contents with partners and end-users. The approach also includes an engine that encapsulates preprocessing applications into virtual containers (VCs) to create parallel patterns that improve the efficiency of the data preparation and retrieval processes. In a case study, real repositories of satellite images and organizational files were prepared for migration to the cloud using processes such as compression, encryption, encoding for fault tolerance, and access control. The experimental evaluation revealed the feasibility of using a data preparation approach for organizations to mitigate risks that could still arise in the cloud. It also revealed the efficiency of the deduplication process in reducing data preparation tasks and the efficacy of parallel patterns in improving the end-user service experience.
PB Springer
SN 978-3-030-34913-4
YR 2019
FD 2019-10-10
LK https://hdl.handle.net/10016/33790
UL https://hdl.handle.net/10016/33790
LA eng
NO This research was supported by "Fondo Sectorial de Investigación para la Educación", SEP-CONACyT Mexico, through projects 281565 and 285276.
DS e-Archivo
RD 18 Jul. 2024