Title: A Data-Aware Scheduling Strategy for DMCF workflows over Hercules
Authors: Marozzo, Fabrizio; Carretero Pérez, Jesús; Rodrigo Duro, Francisco José; García Blas, Javier; Talia, Domenico; Trunfio, Paolo
Editors: Carretero Pérez, Jesús; García Blas, Javier; Margenov, Svetozar
Contributor: Universidad Carlos III de Madrid. Computer Architecture, Communications and Systems Group (ARCOS)
Issued: 2016-10-06
Archived: 2017-02-20
Citation: Carretero Pérez, Jesús; et al. (eds.). (2016) Proceedings of the Third International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2016): Sofia, Bulgaria. Universidad Carlos III de Madrid, pp. 37-44.
ISBN: 978-84-617-7450-0
URI: https://hdl.handle.net/10016/24234
Note: Proceedings of: Third International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2016). Sofia (Bulgaria), October 6-7, 2016.

Abstract: As data-intensive science becomes more prevalent, there is a growing need to simplify the development, deployment, and execution of complex data analysis applications. The Data Mining Cloud Framework (DMCF) is a service-oriented system that allows users to design and execute data analysis applications, defined as workflows, on cloud platforms, relying on cloud-provided storage services for I/O operations. Hercules is an in-memory I/O solution that can be deployed as an alternative to cloud storage services, providing additional performance and flexibility features. This work extends the DMCF-Hercules cooperation by applying novel data placement and task scheduling techniques for exposing and exploiting data locality in data-intensive workflows.

Pages: 37-44 (8 pages)
Format: application/pdf
Language: eng
Keywords: DMCF; Hercules; Workflows; In-memory storage; Data cache; Microsoft Azure; Data locality
Type: conference paper
Subject area: Computer Science
Access: open access
Identifier: CC/0000024925