Improving the Performance of the MPI_Allreduce Collective Operation through Rank Renaming

Rico-Gallego, Juan-Antonio; Díaz-Martín, Juan-Carlos


dc.affiliation.dpto	UC3M. Departamento de Informática	es
dc.affiliation.grupoinv	UC3M. Grupo de Investigación: Arquitectura de Computadores, Comunicaciones y Sistemas	es
dc.contributor.author	Rico-Gallego, Juan-Antonio
dc.contributor.author	Díaz-Martín, Juan-Carlos
dc.contributor.editor	Carretero Pérez, Jesús
dc.contributor.editor	García Blas, Javier
dc.contributor.editor	Barbosa, Jorge
dc.contributor.editor	Morla, Ricardo
dc.contributor.other	Universidad Carlos III de Madrid. Computer Architecture, Communications and Systems Group (ARCOS)
dc.date.accessioned	2015-11-11T10:30:47Z
dc.date.available	2015-11-11T10:30:47Z
dc.date.issued	2014-11
dc.description	Proceedings of: First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014). Porto (Portugal), August 27-28, 2014.	en
dc.description.abstract	Collective operations, a key factor in the overall efficiency of HPC applications, are optimized in current MPI libraries by choosing at runtime among a set of algorithms, based on platform-dependent parameters established beforehand, such as the message size or the number of processes. However, with progressively more cores per node, the cost of a collective algorithm must be attributed mainly to the process-to-processor mapping, because of its decisive influence on network traffic. The hierarchical design of collective algorithms seeks to minimize data movement through the slowest communication channels of the multi-core cluster. Nevertheless, the hierarchical implementation of some collectives becomes inefficient, and even impracticable, due to the definition of the operation itself. This paper proposes a new approach that starts from a frequently found regular mapping, either sequential or round-robin. While keeping the mapping, the rank assignment of the processes is temporarily changed prior to the execution of the collective algorithm. The new assignment makes the communication pattern adapt to the hierarchy of communication channels. We explore this technique for the Ring algorithm when used in the well-known MPI_Allreduce collective, and discuss the performance results obtained. Extensions to other algorithms and collective operations are proposed.	en
dc.description.sponsorship	The work presented in this paper has been partially supported by the EU under the COST programme Action IC1305, ‘Network for Sustainable Ultrascale Computing (NESUS)’, and by the computing facilities of the Extremadura Research Centre for Advanced Technologies (CETA-CIEMAT), funded by the European Regional Development Fund (ERDF). CETA-CIEMAT belongs to CIEMAT and the Government of Spain.	en
dc.format.extent	6
dc.format.mimetype	application/pdf
dc.identifier.bibliographicCitation	Carretero Pérez, Jesús; et al. (eds.). (2014) Proceedings of the First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014): Porto, Portugal. Universidad Carlos III de Madrid, pp. 1-6.	en
dc.identifier.isbn	978-84-617-2251-8
dc.identifier.publicationfirstpage	1
dc.identifier.publicationlastpage	6
dc.identifier.publicationtitle	Proceedings of the First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014): Porto, Portugal	en
dc.identifier.uri	https://hdl.handle.net/10016/21978
dc.language.iso	eng
dc.relation.eventdate	August 27-28, 2014	en
dc.relation.eventnumber	1
dc.relation.eventplace	Porto, Portugal	en
dc.relation.eventtitle	International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014)	en
dc.rights.accessRights	open access
dc.subject.eciencia	Informática	es
dc.subject.other	MPI Collectives	en
dc.subject.other	Parallel algorithms	en
dc.subject.other	Message passing interface	en
dc.subject.other	Multi-core clusters	en
dc.title	Improving the Performance of the MPI_Allreduce Collective Operation through Rank Renaming	en
dc.type	conference paper	*
dc.type.hasVersion	VoR	*
dspace.entity.type	Publication
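
The abstract describes renaming ranks so that the Ring communication pattern follows the hierarchy of communication channels. The following is a minimal sketch of that idea, not the authors' implementation: it assumes a round-robin mapping places rank r on node r mod N, and it uses a hypothetical renaming (`renamed_rank`) that gives processes on the same node consecutive ring positions. The function names and the renaming formula are illustrative assumptions; the sketch only counts how many ring links cross node boundaries before and after renaming.

```python
def node_of(rank, num_nodes, procs_per_node, mapping):
    """Node hosting a process under a regular mapping (assumed model)."""
    if mapping == "sequential":
        return rank // procs_per_node
    if mapping == "round-robin":
        return rank % num_nodes
    raise ValueError(f"unknown mapping: {mapping}")


def renamed_rank(rank, num_nodes, procs_per_node, mapping):
    """Hypothetical renaming: processes on one node get consecutive ranks."""
    if mapping == "sequential":
        return rank  # neighbours already share a node
    node = rank % num_nodes      # node hosting the process
    local = rank // num_nodes    # index of the process within its node
    return node * procs_per_node + local


def inter_node_ring_links(num_nodes, procs_per_node, mapping, rename):
    """Count ring links (position i -> i+1, with wrap-around) between nodes."""
    total = num_nodes * procs_per_node
    ring = [0] * total  # ring position -> original rank
    for rank in range(total):
        pos = (renamed_rank(rank, num_nodes, procs_per_node, mapping)
               if rename else rank)
        ring[pos] = rank
    crossings = 0
    for pos in range(total):
        src = ring[pos]
        dst = ring[(pos + 1) % total]
        if (node_of(src, num_nodes, procs_per_node, mapping)
                != node_of(dst, num_nodes, procs_per_node, mapping)):
            crossings += 1
    return crossings


if __name__ == "__main__":
    for rename in (False, True):
        links = inter_node_ring_links(4, 8, "round-robin", rename)
        print(f"rename={rename}: {links} inter-node ring links")
```

Under these assumptions, with 4 nodes and 8 processes per node in round-robin mapping, every one of the 32 ring links crosses nodes without renaming, while only 4 do after renaming. In an actual MPI program, a comparable reordering could be obtained by creating a communicator with `MPI_Comm_split`, passing the new rank as the `key` argument; whether the paper does it this way is not stated in this record.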

Files

Original bundle

Name: improving_NESUS_2014.pdf
Size: 551.92 KB
Format: Adobe Portable Document Format

Collections

First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014)
