Publication:
Strategies to parallelize a finite element mesh truncation technique on multi-core and many-core architectures

dc.affiliation.dpto: UC3M. Departamento de Tecnología Electrónica [es]
dc.affiliation.dpto: UC3M. Departamento de Teoría de la Señal y Comunicaciones [es]
dc.affiliation.grupoinv: UC3M. Grupo de Investigación: Diseño Microelectrónico y Aplicaciones (DMA) [es]
dc.affiliation.grupoinv: UC3M. Grupo de Investigación: Radiofrecuencia, Electromagnetismo, Microondas y Antenas (GREMA) [es]
dc.contributor.author: Badía, José M.
dc.contributor.author: Amor Martín, Adrián
dc.contributor.author: Belloch Rodríguez, José Antonio
dc.contributor.author: García Castillo, Luis Emilio
dc.contributor.funder: Comunidad de Madrid [es]
dc.contributor.funder: Ministerio de Ciencia e Innovación (España) [es]
dc.date.accessioned: 2023-05-23T07:53:41Z
dc.date.available: 2023-05-23T07:53:41Z
dc.date.issued: 2023-05
dc.description.abstract: Achieving maximum parallel performance on multi-core CPUs and many-core GPUs is a challenging task that depends on multiple factors. These include, for example, the number and granularity of the computations and the way the memories of the devices are used. In this paper, we assess those factors by evaluating and comparing different parallelizations of the same problem on a multiprocessor containing a CPU with 40 cores and four P100 GPUs with Pascal architecture. As a case study, we use the convolutional operation behind a non-standard finite element mesh truncation technique in the context of open-region electromagnetic wave propagation problems. A total of six parallel algorithms implemented using OpenMP and CUDA have been used to carry out the comparison, leveraging the same levels of parallelism on both types of platforms. Three of the algorithms are presented for the first time in this paper, including a multi-GPU method, and two others are improved versions of algorithms previously developed by some of the authors. This paper presents a thorough experimental evaluation of the parallel algorithms on a radar cross-section prediction problem. Results show that the performance obtained on the GPU clearly exceeds that obtained on the CPU, even more so when multiple GPUs are used to distribute both data and computations. Speedups close to 30 have been obtained on the CPU, while the multi-GPU version has achieved speedups larger than 250. [en]
dc.description.sponsorship: Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This work has been supported by the Spanish Government through PID2020-113656RB-C21 and PID2019-106455GB-C21, by the Valencian Regional Government through PROMETEO/2019/109, and by the Regional Government of Madrid through the project MIMACUHSPACE-CM-UC3M. [en]
dc.format.extent: 17
dc.identifier.bibliographicCitation: Badía, J. M., Amor-Martin, A., Belloch, J. A., & Garcia-Castillo, L. E. (2023). Strategies to parallelize a finite element mesh truncation technique on multi-core and many-core architectures. The Journal of Supercomputing, 79(7), 7648–7664. [en]
dc.identifier.doi: http://dx.doi.org/10.1007/s11227-022-04975-6
dc.identifier.issn: 0920-8542
dc.identifier.publicationfirstpage: 7648
dc.identifier.publicationissue: 7
dc.identifier.publicationlastpage: 7664
dc.identifier.publicationtitle: The Journal of Supercomputing [en]
dc.identifier.publicationvolume: 79
dc.identifier.uri: https://hdl.handle.net/10016/37333
dc.identifier.uxxi: AR/0000032923
dc.language.iso: eng
dc.publisher: Springer [en]
dc.relation.projectID: Gobierno de España. PID2019-106455GB-C21 [es]
dc.relation.projectID: Comunidad de Madrid. MIMACUHSPACE-CM-UC3M [es]
dc.relation.projectID: Gobierno de España. PID2020-113656RB-C21 [es]
dc.rights: © The Author(s) 2022. [en]
dc.rights: Attribution 3.0 Spain (Atribución 3.0 España) [*]
dc.rights.accessRights: open access [en]
dc.rights.uri: http://creativecommons.org/licenses/by/3.0/es/ [*]
dc.subject.eciencia: Electronics (Electrónica) [es]
dc.subject.eciencia: Computer Science (Informática) [es]
dc.subject.eciencia: Telecommunications (Telecomunicaciones) [es]
dc.subject.other: Parallel computing [en]
dc.subject.other: CUDA [en]
dc.subject.other: OpenMP [en]
dc.subject.other: Finite elements [en]
dc.subject.other: GPU [en]
dc.title: Strategies to parallelize a finite element mesh truncation technique on multi-core and many-core architectures [en]
dc.type: research article [*]
dc.type.hasVersion: VoR [*]
dspace.entity.type: Publication
Files
Original bundle
Name: Strategies_TJS_2023.pdf
Size: 753.55 KB
Format: Adobe Portable Document Format