Publication:
Evaluating the soft error sensitivity of a GPU-based SoC for matrix multiplication

dc.affiliation.dpto: UC3M. Departamento de Tecnología Electrónica [es]
dc.affiliation.grupoinv: UC3M. Grupo de Investigación: Diseño Microelectrónico y Aplicaciones (DMA) [es]
dc.contributor.author: León, Germán
dc.contributor.author: Badía, Jose M.
dc.contributor.author: Belloch Rodríguez, José Antonio
dc.contributor.author: Lindoso Muñoz, Almudena
dc.contributor.author: Entrena Arrontes, Luis Alfonso
dc.contributor.funder: Ministerio de Economía y Competitividad (España) [es]
dc.contributor.funder: Ministerio de Ciencia y Tecnología (España) [es]
dc.date.accessioned: 2022-09-01T09:44:11Z
dc.date.available: 2022-11-01T00:00:07Z
dc.date.issued: 2020-11
dc.description: Proceedings of: 31st European Symposium on Reliability of Electron Devices, Failure Physics and Analysis (ESREF 2020), Athens, Greece, 4th to 8th October 2020 (Virtual conference) [en]
dc.description.abstract: System-on-Chip (SoC) devices can be composed of low-power multicore processors combined with a small graphics accelerator (or GPU), which offers a trade-off between computational capacity and low power consumption. In this work we use the LLFI-GPU fault injection tool on one of these devices to compare the sensitivity to soft errors of two different CUDA versions of a matrix multiplication benchmark. Specifically, we perform fault injection campaigns on a Jetson TK1 development kit, a board equipped with a SoC including an NVIDIA "Kepler" Graphics Processing Unit (GPU). We evaluate the effect of modifying the size of the problem and also the thread-block size on the behaviour of the algorithms. Our results show that the block version of the matrix multiplication benchmark, which leverages the shared memory of the GPU, is not only faster than the element-wise version but also much more resilient to soft errors. We also use the cuda-gdb debugger to analyze the main causes of the crashes in the code due to soft errors. Our experiments show that most of the errors are due to accesses to invalid positions of the different memories of the GPU, which causes the block version to suffer a higher percentage of this kind of error. [en]
dc.description.sponsorship: This work has been supported by the Spanish Government through TIN2017-82972-R and ESP2015-68245-C4-1-P, and by the Valencian Regional Government through PROMETEO/2019/109. [en]
dc.format.extent: 6 [es]
dc.identifier.bibliographicCitation: León, G., et al. Evaluating the soft error sensitivity of a GPU-based SoC for matrix multiplication. Microelectronics reliability, Vol. 114, 113856 (Special issue: Proceedings of ESREF 2020, 31st European Symposium on Reliability of Electron Devices, Failure Physics and Analysis). Elsevier, 2020, 6 p. [en]
dc.identifier.doi: https://doi.org/10.1016/j.microrel.2020.113856
dc.identifier.issn: 0026-2714
dc.identifier.publicationfirstpage: 1 [es]
dc.identifier.publicationlastpage: 6 [es]
dc.identifier.publicationtitle: Microelectronics reliability (Special issue: Proceedings of ESREF 2020, 31st European Symposium on Reliability of Electron Devices, Failure Physics and Analysis) [en]
dc.identifier.publicationvolume: 114, 113856 [es]
dc.identifier.uri: https://hdl.handle.net/10016/35621
dc.identifier.uxxi: CC/0000033574
dc.language.iso: eng [en]
dc.publisher: Elsevier Ltd. [en]
dc.relation.eventdate: 2020-10-04 [es]
dc.relation.eventplace: Athens, Greece (virtual conference) [es]
dc.relation.eventtitle: 31st European Symposium on Reliability of Electron Devices, Failure Physics and Analysis (ESREF 2020) [en]
dc.relation.projectID: Gobierno de España. TIN2017-82972-R [es]
dc.relation.projectID: Gobierno de España. ESP2015-68245-C4-1-P [es]
dc.rights: © 2020 Elsevier Ltd. All rights reserved. [en]
dc.rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. [en]
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 Spain [*]
dc.rights.accessRights: open access [en]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/es/ [*]
dc.subject.eciencia: Electronics [es]
dc.subject.other: GPU [en]
dc.subject.other: Soft errors [en]
dc.subject.other: Sensitivity [en]
dc.subject.other: Fault injection [en]
dc.title: Evaluating the soft error sensitivity of a GPU-based SoC for matrix multiplication [en]
dc.type: conference paper [*]
dc.type.hasVersion: AM [*]
dspace.entity.type: Publication
Files
Original bundle
Name: evaluating_ESREF_2020_ps.pdf
Size: 258.23 KB
Format: Adobe Portable Document Format