Evaluating the computational performance of the Xilinx Ultrascale+ EG Heterogeneous MPSoC

dc.affiliation.dptoUC3M. Departamento de Tecnología Electrónicaes
dc.affiliation.grupoinvUC3M. Grupo de Investigación: Diseño Microelectrónico y Aplicaciones (DMA)es
dc.contributor.authorBelloch Rodríguez, José Antonio
dc.contributor.authorLeón, Germán
dc.contributor.authorBadia, José M.
dc.contributor.authorLindoso Muñoz, Almudena
dc.contributor.authorSan Millán Heredia, Enrique
dc.contributor.funderMinisterio de Economía y Competitividad (España)es
dc.description.abstractThe emerging technology of Multi-Processor System-on-Chip (MPSoC), which combines heterogeneous computing with the high performance of Field Programmable Gate Arrays (FPGAs), is a very interesting platform for a large number of applications, ranging from medical imaging and augmented reality to high-performance computing in space. In this paper, we focus on the Xilinx Zynq UltraScale+ EG Heterogeneous MPSoC, which is composed of four different processing elements (PEs): a dual-core ARM Cortex-R5, a quad-core ARM Cortex-A53, a graphics processing unit (GPU) and a high-end FPGA. Making proper use of the heterogeneity and the different levels of parallelism of this platform is a challenging task. This paper evaluates the computational performance of this platform and of each of its PEs when carrying out fundamental operations. To this end, we evaluate image-based applications and a matrix multiplication kernel. In the former, the image-based applications leverage the heterogeneity of the MPSoC and strategically distribute their tasks among both kinds of CPU cores and the FPGA. In the latter, we analyze each PE separately using different matrix multiplication benchmarks in order to assess and compare their performance in terms of MFlops. Operations of this kind are carried out, for example, in a large number of space-related applications, where MPSoCs are currently gaining momentum. The results show that different PEs can collaborate efficiently to accelerate the computationally demanding tasks of an application. Another important result is that, by leveraging the parallel OpenBLAS library, we achieve up to 12 GFlops with the four Cortex-A53 cores of the platform, which is considerable performance for this kind of device.en
dc.description.sponsorshipThis work has been supported by the Spanish Government through TIN2017-82972-R and ESP2015-68245-C4-1-P, by the Valencian Regional Government through PROMETEO/2029/109, and by the Universitat Jaume I through UJI-B2019-36. We thank Prof. L. Kosmidis and M. M. Trompouki for providing us with the OpenGL ES 2.0 implementation of the matrix multiplication.en
dc.identifier.bibliographicCitationBelloch, J. A., León, G., Badía, J. M., Lindoso, A. & San Millan, E. (2020, June 6). Evaluating the computational performance of the Xilinx Ultrascale+ EG Heterogeneous MPSoC. The Journal of Supercomputing, 77(2), 2124-2137.en
dc.identifier.publicationtitleThe Journal of Supercomputingen
dc.relation.projectIDGobierno de España. TIN2017-82972-Res
dc.relation.projectIDGobierno de España. ESP2015-68245-C4-1-Pes
dc.rights© Springer Science+Business Media, LLC, part of Springer Nature 2020en
dc.rights.accessRightsopen accessen
dc.subject.otherHeterogeneous computingen
dc.subject.otherParallel computingen
dc.subject.otherMulti-processor system-on-chip (MPSoC)en
dc.subject.otherXilinx UltraScale+en
dc.titleEvaluating the computational performance of the Xilinx Ultrascale+ EG Heterogeneous MPSoCen
dc.typeresearch article*