Mapping and scheduling HPC applications for optimizing I/O

Association for Computing Machinery (ACM)
On HPC platforms, concurrent applications share the same file system. This can lead to conflicts, especially as applications become increasingly data-intensive, and the resulting I/O contention can become a performance bottleneck. Access to bandwidth can be split into two complementary yet distinct problems: the mapping problem and the scheduling problem. The mapping problem consists in selecting the set of applications that compete for the I/O resource. The scheduling problem then consists, given I/O requests on the same resource, in determining the order of these accesses so as to minimize the I/O time. In this work we propose to couple a novel bandwidth-aware mapping algorithm with I/O list-scheduling policies to develop a cross-layer optimization solution. We study this solution experimentally using an I/O middleware, CLARISSE. We show that naive policies such as FIFO perform relatively well for scheduling I/O movements, and that most of the reduction in congestion comes from the mapping. We evaluate the proposed algorithm using a simulator that we validated experimentally. This evaluation shows significant gains for the simple, bandwidth-aware mapping solution that we provide compared to its non-bandwidth-aware counterpart. The gains are in terms of both machine efficiency (makespan) and application efficiency (stretch). This further stresses the importance of designing efficient, bandwidth-aware mapping strategies to alleviate the cost of I/O congestion.
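As an illustration of the scheduling side of the problem, the following is a minimal sketch of a FIFO policy serving I/O requests on a single shared resource. It is not the paper's algorithm or the CLARISSE API. The `IORequest` type, the exclusive-access model (one request at a time at full bandwidth), the sample volumes, and the definitions used here (makespan as the latest completion time, stretch as observed I/O time divided by the I/O time without contention) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class IORequest:
    app: str        # hypothetical application name
    release: float  # time at which the request is issued (s)
    volume: float   # amount of data to transfer (GB)


def fifo_schedule(requests, bandwidth):
    """Serve requests one at a time, in release order, at full bandwidth.

    Returns a dict mapping each application to its I/O completion time.
    """
    clock = 0.0
    finish = {}
    # sorted() is stable, so ties on release time keep their input order.
    for req in sorted(requests, key=lambda r: r.release):
        start = max(clock, req.release)  # wait until the resource is free
        clock = start + req.volume / bandwidth
        finish[req.app] = clock
    return finish


def stretch(req, finish_time, bandwidth):
    """Observed I/O time divided by the I/O time without contention."""
    alone = req.volume / bandwidth
    return (finish_time - req.release) / alone


# Illustrative values only (assumed, not taken from the paper).
requests = [
    IORequest("A", release=0.0, volume=10.0),
    IORequest("B", release=0.0, volume=5.0),
    IORequest("C", release=2.0, volume=5.0),
]
bandwidth = 5.0  # GB/s

finish = fifo_schedule(requests, bandwidth)
makespan = max(finish.values())
stretches = {r.app: stretch(r, finish[r.app], bandwidth) for r in requests}

print(finish)     # completion time per application
print(makespan)   # latest completion time
print(stretches)  # slowdown per application due to contention
```

In this toy instance, application B waits behind A's larger transfer and therefore has the highest stretch. A mapping decision that avoids placing A and B on the same I/O resource would remove that wait regardless of the scheduling policy.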
I/O scheduling, I/O contention, cross-layer optimizations, MPI
Bibliographic citation
Jesus Carretero, Emmanuel Jeannot, Guillaume Pallez, David E. Singh, and Nicolas Vidal. 2020. Mapping and Scheduling HPC Applications for Optimizing I/O. In 2020 International Conference on Supercomputing (ICS ’20), June 29-July 2, 2020, Barcelona, Spain. ACM, New York, NY, USA, 12 pages.