Sponsor:
This work was supported in part by the Valencian Regional Government under Grant PROMETEO/2019/109, in part
by Jaume I University under Project UJIB2019-36, and in part by the Spanish
Ministry of Science and Innovation under Project PID2019-106455GB-C21 and Project PID2020-113656RB-C21.
Project:
Gobierno de España. PID2019-106455GB-C21
Gobierno de España. PID2020-113656RB-C21
Keywords:
Embedded systems, Graphics processing unit (GPU), Parallelization, Proton irradiation
Commercial off-the-shelf (COTS) systems-on-chip (SoCs) are becoming widespread in embedded systems. Many of them include a multicore central processing unit (CPU) and a high-end graphics processing unit (GPU). They combine high computational performance with low power consumption and flexible multilevel parallelism. Devices of this kind are also being considered for radiation environments where large amounts of data must be processed or compute-intensive applications must be executed. In this article, we compare three different strategies for performing matrix multiplication on the GPU of a Tegra TK1 SoC. Our aim is to analyze how different uses of the GPU's resources influence not only the computational performance of the algorithm but also its radiation sensitivity. Radiation experiments with protons were performed to compare the behavior of the three strategies. Experimental results show that most of the errors force a reboot of the platform. The number of errors is directly related to how the algorithms use the internal memories of the GPU, and it increases with the matrix size. It is also related to the number of transactions with the global memory, which in our experiments is not affected by the radiation. Results show that the smallest cross section is obtained with the fastest algorithm, even though it uses the cores of the GPU more intensively.
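The abstract compares the strategies by their cross section without defining it. In radiation testing, the cross section is commonly taken as the number of observed errors divided by the particle fluence (particles per unit area) delivered to the device. The following is a minimal sketch of that calculation under this standard definition; the function name and all numeric values are hypothetical and are not taken from the experiments.

```python
def cross_section(num_errors: int, fluence: float) -> float:
    """Return the error cross section in cm^2 (errors per unit fluence)."""
    if fluence <= 0:
        raise ValueError("fluence must be positive")
    return num_errors / fluence


# Hypothetical values, for illustration only (not measured data).
flux = 1.0e8         # protons / (cm^2 * s), assumed beam flux
exposure_s = 100.0   # seconds of irradiation, assumed
fluence = flux * exposure_s  # protons / cm^2

sigma = cross_section(50, fluence)  # 50 observed errors, assumed
print(f"{sigma:.2e} cm^2")  # prints "5.00e-09 cm^2"
```

Under this definition, a strategy that finishes sooner may accumulate less fluence per run, which is one plausible reading of why a faster algorithm could show a smaller cross section per execution.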