First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016)

  • Proceedings of the First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016)


  • INDEX



  • Computational Intelligence Modeling of Pharmaceutical Properties, Hossam Zawbaa

  • Towards a Smart Selection of Multi-CPU/Multi-GPU Platforms for Image and Video Processing Algorithms, Sidi Ahmed Mahmoudi and Pierre Manneback

  • Energy aware execution environments and algorithms on low power multi-core architectures, Sandra Catalan, Rafael Rodríguez-Sánchez and Enrique S. Quintana-Ortí

  • CuDB: a Relational Database Engine Boosted by Graphics Processing Units, Samuel Cremer, Michel Bagein, Saïd Mahmoudi and Pierre Manneback

  • The analysis of parallel OpenFOAM solver for the heat transfer in electrical power cables, Andrej Bugajev and Raimondas Ciegis

  • Cloud resource management, Dimitris Tychalas and Helen Karatza

  • Techniques for Autotuning Algorithms on Heterogenous Platforms, Adrián Pérez Diéguez, Margarita Amor and Ramón Doallo

  • Resilience of Parallel Applications, Nuria Losada, María J. Martín and Patricia González

  • Beamforming filtering with real-time constraints on mobile embedded devices, Francisco Javier Alventosa Rueda, Pedro Alonso Jordá, Gemma Piñero Sipan and Antonio Manuel Vidal Macia

  • Data mining for autonomous wearable sensors used for elderly healthcare monitoring, Raluca Maria Aileni, Rodica Strungaru and Carlos Valderrama

  • Processor Model for the Instruction Mapping Tool, Roman Mego

  • Distributed Processing in Cloud Computing, Ilias Mavridis and Helen Karatza

  • The Analysis of Diachronic Variation in Romanian Print Press, Daniela Gîfu

  • Dynamic Management of Resource Allocation for OmpSs Jobs, Sergio Iserte, Antonio J. Peña, Rafael Mayo Gual, Enrique S. Quintana-Orti and Vicenç Beltran

  • Spatial and Temporal Cache Sharing Analysis in Tasks, Germán Ceballos, David Black-Schaffer

  • Application Partitioning and Mapping Techniques for Heterogeneous Parallel Platforms, Rafael Sotomayor and José Daniel García

  • A Framework for Knowledge Management using Complex Networks Methods, Alex Becheru

  • A generic I/O architecture for data-intensive applications based on in-memory distributed cache, Francisco Rodrigo Duro, Javier Garcia Blas and Jesus Carretero

  • Machine Learning Methods Applied to Biometrics, Cristina Madalina Noaica

  • Work in progress about enhancing the programmability and energy efficiency of storage in HPC and cloud environments, Pablo Llopis Sanmillan, Javier Garcia Blas and Florin Isaila


    Recent Submissions

    • Publication
      Application Partitioning and Mapping Techniques for Heterogeneous Parallel Platforms
      (2016-02) Sotomayor Fernández, Rafael; García Sánchez, José Daniel; Carretero Pérez, Jesús; García Blas, Javier; Petcu, Dana
      Parallelism has become one of the most widespread paradigms for improving performance. Legacy source code needs to be rewritten so that it can take advantage of multi-core and many-core computing devices, such as GPGPUs, FPGAs, DSPs or specific accelerators. However, this forces software developers to adapt applications and coding mechanisms in order to exploit the available computing devices. It is a time-consuming and error-prone task that usually results in expensive and sub-optimal parallel software. In this work, we describe a parallel programming model, a set of annotation techniques and a static scheduling algorithm for parallel applications. Their purpose is to simplify the task of transforming sequential legacy code into parallel code capable of making full use of several different computing devices, with the objective of increasing performance, lowering energy consumption and increasing the productivity of the developer.
    • Publication
      Towards a Smart Selection of Hybrid Platforms for Multimedia Processing
      (2016-02) Mahmoudi, Sidi Ahmed; Manneback, Pierre; Carretero Pérez, Jesús; García Blas, Javier; Petcu, Dana
      Nowadays, images and videos are present everywhere; they can come directly from cameras and mobile devices, or from other people who share their images and videos. They are used to illustrate different objects in a large number of situations. This makes image and video processing algorithms a very important tool in various domains related to computer vision, such as video surveillance, medical imaging and database (image and video) indexing methods. The performance of these algorithms has been considerably reduced by the highly intensive computation required when using new image and video standards. In this paper, we propose a new framework that allows users to select, in a smart and efficient way, the processing units (GPU and/or CPU) within heterogeneous systems when treating different kinds of multimedia objects: single image, multiple images, multiple videos and video in real time. The framework provides different image and video primitive functions implemented on GPU, such as shape (silhouette) detection, motion tracking using optical flow estimation, and edge and corner detection. We have exploited these functions in several situations, such as indexing videos, segmenting vertebrae in X-ray and MR images, and detecting and localizing events in multi-user scenarios. Experiments showed interesting accelerations ranging from 6 to 118 in comparison with sequential implementations. Moreover, the parallel and heterogeneous implementations offered lower power consumption as a result of the faster treatment.
    • Publication
      Work in progress about enhancing the programmability and energy efficiency of storage in HPC and cloud environments
      (2016-02) Llopis Sanmillán, Pablo; García Blas, Javier; Isaila, Florin Daniel; Carretero Pérez, Jesús; García Blas, Javier; Petcu, Dana
      We present the work in progress for the PhD thesis titled “Enhancing the programmability and energy efficiency of storage in HPC and cloud environments”. In this thesis, we focus on studying and optimizing data movement across different layers of the operating system’s I/O stack. We study the power consumption during I/O-intensive workloads using sophisticated software and hardware instrumentation, collecting time series data from the internal ATX power lines that feed every system component, as well as several run-time operating system metrics. Data exploration and data analysis reveal various power and performance regimes for each I/O access pattern. These regimes show how power is used by the system as data moves through the I/O stack. We use this knowledge to build I/O power models that are able to predict power consumption for different I/O workloads, and to optimize the CPU device driver that manages performance states, obtaining significant power savings (over 30%). Finally, we develop new mechanisms and abstractions that allow co-located virtual machines to share data with each other more efficiently. Our virtualized data sharing solution reduces data movement among virtual domains, leading to energy savings and I/O performance improvements.
    • Publication
      CuDB: a Relational Database Engine Boosted by Graphics Processing Units
      (2016-02) Cremer, Samuel; Bagein, Michel; Mahmoudi, Saïd; Manneback, Pierre; Carretero Pérez, Jesús; García Blas, Javier; Petcu, Dana
      GPUs offer much more computation power than CPUs for the same order of energy consumption. Thanks to their massively data-parallel architecture, GPUs can outperform CPUs, especially with the Single Program Multiple Data (SPMD) programming paradigm on large amounts of data. Database engines are now everywhere, in different sizes and complexities, for multiple usages, embedded or distributed; in 2012, an estimated 500 million SQLite instances were active worldwide. Our goal is to exploit the computation power of GPUs to improve the performance of SQLite, which is a key software component of many applications and systems. In this paper, we introduce CuDB, a GPU-boosted in-memory database engine (IMDB) based on SQLite. The SQLite API remains unchanged, allowing developers to easily upgrade the database engine from SQLite to CuDB, even in existing applications. Preliminary results show significant speedups of 70x for join queries on datasets of 1 million records. We also demonstrate the "memory-bound" character of GPU databases and show the energy efficiency of our approach.
    • Publication
      Computational Intelligence Modeling of Pharmaceutical Properties
      (2016-02) Zawbaa, Hossam M.; Carretero Pérez, Jesús; García Blas, Javier; Petcu, Dana
      In the pharmaceutical industry, a good understanding of the causal relationship between product quality and attributes of formulations is very useful in developing new products and optimizing manufacturing processes. Feature selection is mandatory due to the abundance of noisy, irrelevant, or misleading features. The selected features will improve the performance of the prediction model and will provide a faster and more cost-effective prediction than using all the features. With the big data captured in pharmaceutical product development practice, computational intelligence (CI) models and machine learning algorithms could potentially be used to identify the process parameters of formulations and manufacturing processes. This requires a deep investigation of the roller compaction process parameters of pharmaceutical formulations that affect ribbon production. In this work, we use bio-inspired optimization algorithms for feature selection (such as grey wolf, bat, flower pollination, social spider, antlion, moth-flame, genetic algorithms, and particle swarm) to predict different pharmaceutical properties.
    • Publication
      Cloud Resource Management
      (2016-02) Tychalas, Dimitrios; Karatza, Helen; Carretero Pérez, Jesús; García Blas, Javier; Petcu, Dana
      Nowadays, computational needs increase exponentially every year. We analyze, calculate and process large data sets every day, and "traditional" servers do not meet these computational demands. As a result, cloud computing was "invented", offering multiple resources at an affordable cost. Besides that, cloud computing supports scalability, fault tolerance and high availability [2] [16]. Our goal is to delve deeper into cloud computing in order to carry out independent research that studies and improves state-of-the-art load-balancing techniques.
    • Publication
      Techniques for Autotuning Algorithms on Heterogenous Platforms
      (2016-02) Diéguez, Adrián P.; Amor, Margarita; Doallo, Ramón; Carretero Pérez, Jesús; García Blas, Javier; Petcu, Dana
      Current GPUs (Graphics Processing Units) can obtain high computational performance in scientific applications. Nevertheless, programmers have to use parallel algorithms suited to these architectures and have to consider optimization techniques in the implementation in order to achieve that performance. This thesis focuses on designing and implementing parallel prefix algorithms on GPU architectures with little effort. To that end, we have developed a highly optimized library called BPLG (Tuning Butterfly Processing Library for GPUs), based on a set of building blocks that make it easy to design well-known algorithms such as FFT, tridiagonal system solvers, the scan operator, sorting or signal processing. This library is designed under a tuning methodology based on two stages, identified as GPU resource analysis and operator string manipulation. Specifically, this strategy focuses on a set of parallel prefix algorithms that can be represented according to a set of common permutations of the digits of each of their element indices [4], denoted as Index-Digit (ID) algorithms. So far, the proposed methodology has obtained very good results with respect to state-of-the-art libraries such as CUFFT, CUSPARSE, CUDPP or ModernGPU.
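For readers unfamiliar with the scan operator mentioned in this abstract, the following is a minimal sketch of the step-doubling inclusive scan often attributed to Hillis and Steele, simulated sequentially in Python. It is not BPLG code and makes no claim about how BPLG is implemented; the function name `inclusive_scan` and its parameters are hypothetical and chosen only for illustration.

```python
from operator import add
from typing import Callable, List, TypeVar

T = TypeVar("T")


def inclusive_scan(values: List[T], op: Callable[[T, T], T] = add) -> List[T]:
    """Inclusive prefix scan using the step-doubling pattern.

    Each pass combines element i with the element `stride` positions
    before it. On a GPU every element of a pass would be updated in
    parallel; here the pass is simulated with an ordinary loop.
    `op` must be associative.
    """
    result = list(values)
    stride = 1
    while stride < len(result):
        # Snapshot the previous pass so that every update reads old values,
        # as simultaneous parallel updates would.
        previous = result[:]
        for i in range(stride, len(result)):
            result[i] = op(previous[i - stride], previous[i])
        stride *= 2
    return result


prefix_sums = inclusive_scan([3, 1, 7, 0, 4, 1, 6, 3])
print(prefix_sums)
```

The number of passes grows with the logarithm of the input length, which is why this access pattern maps well onto data-parallel hardware.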
    • Publication
      Beamforming filtering with real-time constraints on mobile embedded devices
      (2016-02) Alventosa Rueda, Francisco Javier; Alonso Jordá, Pedro; Piñero Sipan, Gemma; Vidal Macia, Antonio Manuel; Carretero Pérez, Jesús; García Blas, Javier; Petcu, Dana
      Nowadays, tablets and smartphones are equipped with low-power processors. Some of them, like the NVIDIA Tegra SoC, also come with an integrated GPU, so that both the CPU and the GPU have direct access to the same RAM. On the other hand, one of the main limitations of microphone array algorithms for audio processing is the high computational cost required to reproduce real acoustic environments when real-time signal processing is absolutely required. One of these algorithms is the beamforming algorithm, which is used to recover acoustic signals from their observations when they are corrupted by noise, reverberation and other interfering signals. In order to achieve real-time processing when executing this algorithm, we have employed high-performance libraries such as OPENBLAS, LAPACK, CUBLAS, PLASMA and MAGMA, together with programming specifically tuned for these mobile devices.
    • Publication
      Spatial and Temporal Cache Sharing Analysis in Tasks
      (2016-02) Ceballos, Germán; Black-Schaffer, David; Carretero Pérez, Jesús; García Blas, Javier; Petcu, Dana
      Understanding the performance of large-scale multicore systems is crucial for obtaining faster execution times and optimizing workload efficiency, but it is becoming harder due to the increased complexity of hardware architectures. Cache sharing is a key component of performance in modern architectures, and it has been the focus of performance analysis tools and techniques in recent years. At the same time, new programming models have been introduced to help the programmer deal with the complexity of large-scale systems, simplifying the coding process and making applications more scalable regardless of resource sharing. Task-based runtime systems are one example that has recently become popular. In this work we develop models to tackle the performance analysis of shared resources in the task-based context, and to that end we study cache sharing in both temporal and spatial ways. In temporal cache sharing, the effect of data reused over time by the executed tasks is modeled to predict different scenarios, resulting in a tool called StatTask. In spatial cache sharing, the effect of tasks competing for the cache at a given point in time during their execution is quantified and used to model their behavior on arbitrary cache sizes. Finally, we explain how these tools set up a unique and solid platform for improving runtime system schedulers, maximizing the execution performance of large-scale task-based applications.
    • Publication
      Resilience of Parallel Applications
      (2016-02) Losada, Nuria; Martín, María J.; González, Patricia; Carretero Pérez, Jesús; García Blas, Javier; Petcu, Dana
      Future exascale systems are predicted to be formed by millions of cores. This is a great opportunity for HPC applications; however, it is also a hazard for the completion of their execution. Even if each computation node fails only once per century, a machine with 100,000 nodes will encounter a failure every 9 hours on average. Thus, HPC applications need to make use of fault tolerance techniques to ensure that they successfully finish their execution. This PhD thesis is focused on fault tolerance solutions for generic parallel applications, more specifically on checkpointing solutions. We have extended CPPC, an MPI application-level portable checkpointing tool developed in our research group, to work with OpenMP applications and hybrid MPI-OpenMP applications. Currently, we are working on transparently obtaining resilient MPI applications, that is, applications that are able to recover from failures without stopping their execution.
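The failure-rate figure in this abstract can be checked with a short calculation. The sketch below assumes that nodes fail independently and at a constant rate, so the mean time between failures of the whole machine is the per-node value divided by the node count; the function name `system_mtbf_hours` and the 365.25-day year are assumptions made for illustration and do not come from the paper.

```python
HOURS_PER_YEAR = 365.25 * 24  # assumed average year length, in hours


def system_mtbf_hours(node_mtbf_years: float, nodes: int) -> float:
    """Mean time between failures of the whole machine, in hours.

    Assumes independent nodes with a constant failure rate, so the
    failure rates add up and the system MTBF is node MTBF / node count.
    """
    return node_mtbf_years * HOURS_PER_YEAR / nodes


# One failure per node per century, on a machine with 100,000 nodes.
mtbf = system_mtbf_hours(node_mtbf_years=100, nodes=100_000)
print(f"{mtbf:.2f} hours between failures")
```

Under these assumptions the result is roughly 8.8 hours, consistent with the figure of about 9 hours quoted in the abstract.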
    • Publication
      Processor Model for the Instruction Mapping Tool
      (2016-02) Mego, Roman; Carretero Pérez, Jesús; García Blas, Javier; Petcu, Dana
      This paper describes the model designed for the instruction mapping tool, which can be used to generate low-level assembly code for digital signal processing algorithms. The model is based on the Very Long Instruction Word architecture. The Texas Instruments TMS320C6678 served as the reference and was finally described with the created model. The paper presents the parameters of the hardware resources and the instruction set.
    • Publication
      A generic I/O architecture for data-intensive applications based on in-memory distributed cache
      (2016-02) Rodrigo Duro, Francisco José; García Blas, Javier; Carretero Pérez, Jesús; Carretero Pérez, Jesús; García Blas, Javier; Petcu, Dana
      The evolution of scientific computing towards data-intensive applications and the increasing heterogeneity of computing resources are exposing new challenges in the requirements of the I/O layer. We propose a generic I/O architecture for data-intensive applications based on in-memory distributed caching. This solution leverages the evolution of network capacities and the drop in memory prices to improve I/O performance for I/O-bound applications, and is adaptable to existing high-performance scenarios. We have shown the potential improvements
    • Publication
      The Analysis of Diachronic Variation in Romanian Print Press
      (2016-02) Gîfu, Daniela; Carretero Pérez, Jesús; García Blas, Javier; Petcu, Dana
      The paper describes a study based on the diachronic exploration of Romanian texts, with the aim of implementing a technology for automatically detecting morpho-lexical changes from 1840 to the present day. The chosen time periods highlight the language changes and also describe the phenomena related to the evolution of the Romanian language, especially in the print press. We define a complex methodology for recovering old Romanian texts in two different spaces: Romania (until 1918, represented by three countries: Moldova, Wallachia and Transylvania) and the Republic of Moldavia, the latter being a territory lost by Romania after the historic events. The aim of this survey is to analyse the morphology and lexical semantics of the Romanian language, based on an important corpus spanning from the middle of the 19th century until today, in order to compare them, emphasizing the language differences and similarities. This work could be of interest to lexicographers and computational linguistics specialists who want to clarify the linguistic identity.
    • Publication
      The analysis of parallel OpenFOAM solver for the heat transfer in electrical power cables
      (2016-02) Bugajev, Andrej; Ciegis, Raimondas; Carretero Pérez, Jesús; García Blas, Javier; Petcu, Dana
      Here we present part of the results obtained in the PhD thesis “The investigation of efficiency of physical phenomena modelling using differential equations on distributed systems” by Andrej Bugajev. This work is dedicated to the development of mathematical modelling software. When applying a numerical method, it is important to take into account the limited computer resources, the architecture of these resources and how the methods affect software robustness. The three main aspects of this investigation are that the software implementation must be efficient, robust and able to utilize specific hardware resources. The hardware specificity in this work is related to distributed computations. The investigation applies the finite volume method (FVM) to implement efficient calculations for a very specific heat transfer problem. This makes it possible to create technological components that make a software implementation robust and efficient. The open source software OpenFOAM is selected as the basis for implementing the calculations, and a few algorithms to solve efficiency issues are proposed. The parallel FVM solver is implemented and analyzed, and it is adapted to the heterogeneous cluster Vilkas.
    • Publication
      Distributed Processing in Cloud Computing
      (2016-02) Mavridis, Ilias; Karatza, Eleni; Carretero Pérez, Jesús; García Blas, Javier; Petcu, Dana
      Cloud computing offers a wide range of resources and services through the Internet that can be used for various purposes. The rapid growth of cloud computing has freed many companies and institutions from the burden of maintaining expensive hardware and software infrastructure. With characteristics like high scalability, availability and fault tolerance, cloud computing meets the needs of the new era for massive data processing at an affordable cost. In our doctoral research we intend to study, analyze, evaluate and make proposals in order to further improve the performance of cloud computing.
    • Publication
      Energy aware execution environments and algorithms on low power multi-core architectures
      (2016-02) Catalán, Sandra; Rodríguez-Sánchez, Rafael; Quintana-Ortí, Enrique S.; Carretero Pérez, Jesús; García Blas, Javier; Petcu, Dana
      Energy consumption is a key aspect that conditions the proper functioning of today's data centers and high performance computing, as well as the launch of new services, due to its negative environmental impact and the increasing economic cost of energy. The energy efficiency of the applications used in these data centers could be improved, especially when the systems' utilization rate is low or moderate, or when targeting memory-bound applications. In this sense, energy proportionality refers to systems whose power consumption is in line with the amount of work performed at each moment. In response to these needs, the main objective of this project is to study, design, develop and analyze experimental solutions (models, programs, tools and techniques) that are aware of energy proportionality for scientific and engineering applications on low-power architectures. To show the benefits of this contribution, two applications, from the fields of image processing and molecular dynamics simulation, have been chosen.
    • Publication
      Proceedings of the First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016)
      (2016-02) Carretero Pérez, Jesús; García Blas, Javier; Petcu, Dana
      The PhD Symposium was a very good opportunity for young researchers to share information and knowledge, to present their current research, and to discuss topics with other students in order to look for synergies and common research topics. The idea was very successful and the assessment made by the PhD students was very good. It also helped to achieve one of the major goals of the NESUS Action: to establish an open European research network targeting sustainable solutions for ultrascale computing, aiming at cross-fertilization among HPC, large-scale distributed systems, and big data management, and at training. The network helps to bring together disparate researchers working across different areas and provides a meeting ground for researchers in these separate areas to exchange ideas, to identify synergies, and to pursue common activities in research topics such as sustainable software solutions (applications and system software stack), data management, energy efficiency, and resilience.
    • Publication
      Dynamic Management of Resource Allocation for OmpSs Jobs
      (2016-02) Iserte, Sergio; Peña, Antonio J.; Mayo, Rafael; Quintana-Ortí, Enrique S.; Beltran, Vicenç; Carretero Pérez, Jesús; García Blas, Javier; Petcu, Dana
      The main purpose of this thesis is to investigate the relation between task-based programming models and resource management systems in order to provide a smart, autonomous, load-balancing and fault-tolerant system. Thus, taking advantage of malleable MPI applications and execution models such as SPMD and MPMD, we will explore the principle of dynamic reconfiguration. Apart from providing an overview of the thesis idea, this paper explains our initial motivation and briefly reviews the most remarkable work done in this field.
    • Publication
      Machine Learning Methods Applied to Biometrics
      (2016-02) Noaica, Cristina Madalina; Carretero Pérez, Jesús; García Blas, Javier; Petcu, Dana
      Biometrics is a challenging field that uses the physiological and behavioral characteristics of persons in order to establish their identities. Biometrics research requires the fusion of several other fields that are in continuous development. Among these fields we find image processing, pattern recognition and machine learning. There are many research opportunities in this field, some of the most recent being cross-sensor comparisons, liveness detection (in iris recognition), behavioral biometrics and mobile biometrics. My PhD thesis will contribute by applying Machine Learning methods to at least some of the enumerated research opportunities.
    • Publication
      A Framework for Knowledge Management using Complex Networks Methods
      (2016) Becheru, Alex; Carretero Pérez, Jesús; García Blas, Javier; Petcu, Dana
      In a world where complexity is constantly increasing due to technological advancement, the large scale of available data and increased interaction between various phenomena, a field of study is needed to model and understand such complex systems. One such field of research is called Complex Networks Analysis (CNA) or Network Science. At its heart, this research field builds on Graph Theory and Computer Science. In this paper we briefly present a common framework for knowledge management using CNA methods. The power of the framework is demonstrated by extracting knowledge from various heterogeneous domains such as Tourism, E-learning, Freight Transportation, and Organisational Analysis.