Title: Improving the Performance of the MPI_Allreduce Collective Operation through Rank Renaming
Authors: Rico-Gallego, Juan-Antonio; Díaz-Martín, Juan-Carlos
Editors: Carretero Pérez, Jesús; García Blas, Javier; Barbosa, Jorge; Morla, Ricardo
Research group: Universidad Carlos III de Madrid. Computer Architecture, Communications and Systems Group (ARCOS)
Date issued: 2014-11
Date available in repository: 2015-11-11
Citation: Carretero Pérez, Jesús, et al. (eds.) (2014). Proceedings of the First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014): Porto, Portugal. Universidad Carlos III de Madrid, pp. 1-6.
ISBN: 978-84-617-2251-8
URI: https://hdl.handle.net/10016/21978
Description: Proceedings of: First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014). Porto (Portugal), August 27-28, 2014.
Abstract: Collective operations, a key issue in the global efficiency of HPC applications, are optimized in current MPI libraries by choosing at runtime among a set of algorithms, based on platform-dependent parameters established beforehand, such as the message size or the number of processes. However, with progressively more cores per node, the cost of a collective algorithm must be attributed mainly to the process-to-processor mapping, because of its decisive influence on the network traffic. The hierarchical design of collective algorithms seeks to minimize data movement through the slowest communication channels of the multi-core cluster. Nevertheless, the hierarchical implementation of some collectives becomes inefficient, and even impracticable, due to the definition of the operation itself. This paper proposes a new approach that starts from a frequently found regular mapping, either sequential or round-robin. While keeping the mapping, the rank assignment of the processes is temporarily changed prior to the execution of the collective algorithm. The new assignment adapts the communication pattern to the hierarchy of communication channels. We explore this technique for the Ring algorithm as used in the well-known MPI_Allreduce collective, and discuss the performance results obtained. Extensions to other algorithms and collective operations are proposed.
Keywords: MPI collectives; Parallel algorithms; Message passing interface; Multi-core clusters
Type: Conference paper
Subject: Computer Science (Informática)
Rights: Open access
Format: application/pdf
Pages: 1-6 (6 pages)
Published in: Proceedings of the First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014): Porto, Portugal
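To illustrate the rank-renaming idea summarized in the abstract, the sketch below builds a temporarily reordered communicator with MPI_Comm_split (same color for all processes, so only the rank order changes) and runs MPI_Allreduce on it. This is a minimal sketch under assumptions of our own: the assumed round-robin mapping, the number of nodes, and the particular key computation are illustrative and are not the renaming scheme proposed in the paper.

```c
/* Minimal sketch (not the paper's implementation): temporarily "rename" ranks
 * by creating a communicator whose rank order groups processes by node, so
 * that most of the Ring algorithm's neighbor exchanges stay inside a node.
 * The mapping assumed here (round-robin over a fixed number of nodes) is an
 * illustrative assumption only. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int world_rank, world_size;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    /* Assumption: a round-robin process-to-processor mapping over 'nodes'
     * nodes. The new key lists all processes of a node consecutively, so
     * ring neighbors in the renamed communicator are mostly intra-node. */
    int nodes = 4;                      /* assumed number of nodes          */
    int node  = world_rank % nodes;     /* node under the round-robin map   */
    int slot  = world_rank / nodes;     /* position of the process on node  */
    int new_key = node * ((world_size + nodes - 1) / nodes) + slot;

    /* Same color for everyone: MPI_Comm_split only reorders ranks,
     * sorting them by 'new_key'. */
    MPI_Comm renamed;
    MPI_Comm_split(MPI_COMM_WORLD, 0, new_key, &renamed);

    double in = (double)world_rank, out = 0.0;
    MPI_Allreduce(&in, &out, 1, MPI_DOUBLE, MPI_SUM, renamed);

    if (world_rank == 0)
        printf("sum of ranks = %.0f\n", out);

    MPI_Comm_free(&renamed);
    MPI_Finalize();
    return 0;
}
```

A reordered communicator is semantically transparent for this collective: MPI_Allreduce delivers the same reduced value to every process regardless of rank order (for a commutative operation such as MPI_SUM), so the renaming affects only the internal communication pattern, not the result.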