Funder:
European Commission
Sponsor:
This work has received funding from the EC-funded H2020 ASPIDE project (Agreement 801091: Exascale programming models for extreme data processing). This work was supported with hardware resources by the Romanian grant BID (PN-III-P1-PFE-28: Big Data Science).
Exascale systems are a hot topic of research in computer science. These systems, in contrast to current Cloud, Big Data and HPC systems, will routinely contain hundreds of thousands of nodes generating millions of events. At this scale, hardware faults and anomalous behaviour are not only more likely but to be expected. In this paper we describe the architecture of an Exascale monitoring solution coupled with an event detection component. The latter component is essential for handling the multitude of potential events. We describe the major research that is still lacking and needs to be done to make event detection feasible in real-world Exascale systems.
Description:
Proceeding of: 2019 IEEE International Conference on Advanced Scientific Computing (ICASC)