RT Conference Proceedings
T1 Improving the Performance of the MPI_Allreduce Collective Operation through Rank Renaming
A1 Rico-Gallego, Juan-Antonio
A1 Díaz-Martín, Juan-Carlos
A2 Carretero Pérez, Jesús
A2 García Blas, Javier
A2 Barbosa, Jorge
A2 Morla, Ricardo
A2 Universidad Carlos III de Madrid. Computer Architecture, Communications and Systems Group (ARCOS)
AB Collective operations, a key factor in the overall efficiency of HPC applications, are optimized in current MPI libraries by choosing at runtime among a set of algorithms, based on platform-dependent parameters established beforehand, such as the message size or the number of processes. However, with progressively more cores per node, the cost of a collective algorithm must be attributed mainly to the process-to-processor mapping, because of its decisive influence on network traffic. Hierarchical design of collective algorithms seeks to minimize data movement through the slowest communication channels of the multi-core cluster. Nevertheless, the hierarchical implementation of some collectives becomes inefficient, and even impracticable, due to the definition of the operation itself. This paper proposes a new approach that starts from a frequently found regular mapping, either sequential or round-robin. While keeping the mapping, the rank assignment to the processes is temporarily changed prior to the execution of the collective algorithm. The new assignment makes the communication pattern adapt to the hierarchy of communication channels. We explore this technique for the Ring algorithm when used in the well-known MPI_Allreduce collective, and discuss the performance results obtained. Extensions to other algorithms and collective operations are proposed.
SN 978-84-617-2251-8
YR 2014
FD 2014-11
LK https://hdl.handle.net/10016/21978
UL https://hdl.handle.net/10016/21978
LA eng
NO Proceedings of: First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014). Porto (Portugal), August 27-28, 2014.
NO The work presented in this paper has been partially supported by EU under the COST programme Action IC1305, ‘Network for Sustainable Ultrascale Computing (NESUS)’, and by the computing facilities of Extremadura Research Centre for Advanced Technologies (CETA-CIEMAT), funded by the European Regional Development Fund (ERDF). CETA-CIEMAT belongs to CIEMAT and the Government of Spain.
DS e-Archivo
RD 17 jul. 2024