Publication:
Combining malleability and I/O control mechanisms to enhance the execution of multiple applications

Publication date
2019-02-01
Publisher
Elsevier
Abstract
This work presents a common framework that integrates CLARISSE, a cross-layer runtime for the I/O software stack, and FlexMPI, a runtime that provides dynamic load balancing and malleability capabilities for MPI applications. The integration is performed at two levels: at the application level, as libraries executed within the application, and at the central-controller level, as external components that manage the execution of different applications. We show that cooperation between the two runtimes provides important benefits for overall system performance. First, monitoring collects the CPU, communication, and I/O performance of all executing applications, providing a holistic view of the complete platform utilization. Second, we introduce a coordinated way of using the CLARISSE and FlexMPI control mechanisms, based on two different optimization strategies, with the aim of improving both application I/O and overall system performance. Finally, we present a detailed description of this proposal, as well as an empirical evaluation of the framework on a cluster showing significant performance improvements at both the application and platform-wide levels. We demonstrate that with this proposal the overall I/O time of an application can be reduced by up to 49% and the aggregated FLOPS of all running applications can be increased by 10% with respect to the baseline case. © 2018 Elsevier Inc. All rights reserved.
Keywords
Malleability, I/O scheduling, MPI, High-performance computing, Cross-layer optimizations, Parallel I/O, Middleware, Performance
Bibliographic citation
Singh, D. E., Carretero, J. (2019). Combining malleability and I/O control mechanisms to enhance the execution of multiple applications. The Journal of Systems and Software, 148, pp. 21-36. https://doi.org/10.1016/j.jss.2018.11.006