DI - ATC - Journal Articles

Recent Submissions

  • Publication
    A Generic Parallel Pattern Interface for Stream and Data Processing
    (Wiley, 2017-05-01) Río Astorga, David del; Dolz Zaragoza, Manuel Francisco; Fernández Muñoz, Javier; García Sánchez, José Daniel; European Commission; Ministerio de Economía y Competitividad (España)
    Current parallel programming frameworks aid developers to a great extent in implementing applications that exploit parallel hardware resources. Nevertheless, developers require additional expertise to properly use and tune them to operate efficiently on specific parallel platforms. On the other hand, porting applications between different parallel programming models and platforms is not straightforward and demands considerable effort and specific knowledge. Apart from that, the lack of high-level parallel pattern abstractions in those frameworks further increases the complexity of developing parallel applications. To pave the way in this direction, this paper proposes GRPPI, a generic and reusable parallel pattern interface for both stream processing and data-intensive C++ applications. GRPPI accommodates a layer between developers and existing parallel programming frameworks targeting multi-core processors, such as C++ threads, OpenMP and Intel TBB, and accelerators, such as CUDA Thrust. Furthermore, thanks to its high-level C++ application programming interface and pattern composability features, GRPPI allows users to easily expose parallelism in sequential applications via standalone patterns or pattern compositions. We evaluate this interface using an image processing use case and demonstrate its benefits from the usability, flexibility, and performance points of view. Furthermore, we analyze the impact of using stream and data pattern compositions on CPUs, GPUs and heterogeneous configurations.
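Illustrative sketch (not from the paper): GRPPI itself is a C++ interface, so the snippet below only mirrors, in Python, the idea the abstract describes of composing a stream pipeline with a data-parallel map stage; all names here are hypothetical and do not reflect the GRPPI API.

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_map(fn, data, workers=4):
    """Data-parallel 'map' pattern: apply fn to every element using a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fn, data))

def pipeline(source, *stages):
    """Stream 'pipeline' pattern: push each item through the stages in order."""
    for item in source:
        for stage in stages:
            item = stage(item)
        yield item

# Composition: a pipeline whose middle stage is itself a data-parallel map
# over the pixels of each (toy) frame.
frames = [[1, 2, 3], [4, 5, 6]]                      # stand-ins for images
brighten = lambda frame: parallel_map(lambda p: p + 10, frame)
summarize = lambda frame: sum(frame)
print(list(pipeline(frames, brighten, summarize)))   # -> [36, 45]
```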
  • Publication
    Evaluation of vaccination strategies for the metropolitan area of Madrid via agent-based simulation
    (BMJ Journals, 2022-12-09) Expósito Singh, David; Omedo Lucerón, Carmen; Limia Sánchez, Aurora; Guzmán Merino, Miguel; Duran Gonzalez, Christian; Delgado Sanz, Concepción; Gómez Barroso, Diana; Carretero Pérez, Jesús; Marinescu, María Cristina; Comunidad de Madrid; European Commission; Universidad Carlos III de Madrid
    Objective We analyse the impact of different vaccination strategies on the propagation of COVID-19 within the Madrid metropolitan area, starting on 27 December 2020 and ending in the summer of 2021. Materials and methods The predictions are based on simulation using EpiGraph, an agent-based COVID-19 simulator. We first summarise the different models implemented in the simulator, then provide a comprehensive description of the vaccination model and define different vaccination strategies. The simulator, including the vaccination model, is validated by comparing its results with real data from the metropolitan area of Madrid during the third COVID-19 wave. This work considers different COVID-19 propagation scenarios for a simulated population of about 5 million. Results The main result shows that the best strategy is to vaccinate first the elderly with the two doses spaced 56 days apart; this approach reduces the final infection rate by an additional 6% and the number of deaths by an additional 3% with respect to vaccinating the elderly first at the interval recommended by the vaccine producer. The reason is the increase in the number of vaccinated individuals at any time during the simulation. Conclusion The existing level of detail and maturity of EpiGraph allowed us to evaluate complex scenarios and thus use it successfully to help guide the strategy for the COVID-19 vaccination campaign of the Spanish health authorities.
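Illustrative sketch (not part of EpiGraph): a back-of-envelope calculation of the mechanism the abstract cites, namely that spacing the two doses further apart under a fixed daily dose budget keeps more people covered by a first dose at any time; the figures below are invented.

```python
def first_dose_coverage(daily_doses, spacing_days, horizon_days):
    """Toy dose-scheduling model: each day a fixed dose budget is split between
    new first doses and the second doses that fall due `spacing_days` later.
    Returns the cumulative number of people with at least one dose per day."""
    first_given = [0] * horizon_days
    covered = []
    total = 0
    for day in range(horizon_days):
        due_second = first_given[day - spacing_days] if day >= spacing_days else 0
        new_first = max(daily_doses - due_second, 0)
        first_given[day] = new_first
        total += new_first
        covered.append(total)
    return covered

# Hypothetical numbers: 10,000 doses/day over 120 days, comparing two spacings.
for spacing in (21, 56):
    print(spacing, "days apart ->", first_dose_coverage(10_000, spacing, 120)[-1], "first doses")
```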
  • Publication
    On the efficient delivery and storage of IoT data in edge-fog-cloud environments
    (MDPI, 2022-09-02) Barron, Alfredo; Sánchez Gallegos, Dante D.; Carrizales Espinoza, Diana; Gonzalez-Compean, J. L.; Morales-Sandoval, Miguel
    Cloud storage has become a keystone for organizations to manage large volumes of data produced by sensors at the edge as well as information produced by deep and machine learning applications. Nevertheless, the latency produced by geographically distributed systems deployed on any of the edge, the fog, or the cloud leads to delays that are observed by end-users in the form of high response times. In this paper, we present an efficient scheme for the management and storage of Internet of Things (IoT) data in edge-fog-cloud environments. In our proposal, entities called data containers are coupled, in a logical manner, with nano/microservices deployed on any of the edge, the fog, or the cloud. The data containers implement a hierarchical cache file system including storage levels such as in-memory, file system, and cloud services for transparently managing the input/output data operations produced by nano/microservices (e.g., a sensor hub collecting data from sensors at the edge or machine learning applications processing data at the edge). Data containers are interconnected through a secure and efficient content delivery network, which transparently and automatically performs the continuous delivery of data through the edge-fog-cloud. A prototype of our proposed scheme was implemented and evaluated in a case study based on the management of electrocardiogram sensor data. The obtained results reveal the suitability and efficiency of the proposed scheme.
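Illustrative sketch (not the paper's implementation): a toy hierarchical cache with in-memory, file-system, and "cloud" tiers, in the spirit of the data containers described above; the class, eviction policy, and key names are hypothetical, and the cloud tier is just a dictionary stand-in.

```python
import os, tempfile

class TieredStore:
    """Minimal sketch of a hierarchical cache: memory -> local files -> 'cloud'.
    Reads promote data to the faster tiers; writes spill to disk when memory fills."""
    def __init__(self, mem_limit=2):
        self.mem, self.mem_limit = {}, mem_limit
        self.dir = tempfile.mkdtemp(prefix="container-")
        self.cloud = {}                      # stand-in for a cloud object store

    def put(self, key, data):
        self.mem[key] = data
        if len(self.mem) > self.mem_limit:   # spill the oldest entry to disk
            old, val = next(iter(self.mem.items()))
            del self.mem[old]
            with open(os.path.join(self.dir, old), "w") as f:
                f.write(val)

    def get(self, key):
        if key in self.mem:
            return self.mem[key]
        path = os.path.join(self.dir, key)
        if os.path.exists(path):
            data = open(path).read()
        else:
            data = self.cloud[key]           # slowest tier
        self.put(key, data)                  # promote on read
        return data

store = TieredStore()
store.cloud["ecg-001"] = "sensor readings..."
print(store.get("ecg-001"))                  # fetched from 'cloud', now cached
```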
  • Publication
    Neutrino interaction classification with a convolutional neural network in the DUNE far detector
    (APS, 2020-11-01) Abi, B.; Alonso Monsalve, Saúl; García Carballeira, Félix; DUNE
    The Deep Underground Neutrino Experiment is a next-generation neutrino oscillation experiment that aims to measure CP-violation in the neutrino sector as part of a wider physics program. A deep learning approach based on a convolutional neural network has been developed to provide highly efficient and pure selections of electron neutrino and muon neutrino charged-current interactions. The electron neutrino (antineutrino) selection efficiency peaks at 90% (94%) and exceeds 85% (90%) for reconstructed neutrino energies between 2 and 5 GeV. The muon neutrino (antineutrino) event selection is found to have a maximum efficiency of 96% (97%) and exceeds 90% (95%) efficiency for reconstructed neutrino energies above 2 GeV. When considering all electron neutrino and antineutrino interactions as signal, a selection purity of 90% is achieved. These event selections are critical to maximize the sensitivity of the experiment to CP-violating effects.
  • Publication
    CD/CV: Blockchain-based schemes for continuous verifiability and traceability of IoT data for edge-fog-cloud
    (Elsevier, 2023-01-01) Martinez Rendon, Cristhian; Gonzalez-Compean, Jose Luis; Sánchez Gallegos, Dante D.; Carretero Pérez, Jesús; Ministerio de Ciencia e Innovación (España)
    This paper presents a continuous delivery/continuous verifiability (CD/CV) method for IoT dataflows in edge-fog-cloud environments. A CD model, based on an extract, transform, and load (ETL) mechanism as well as a directed acyclic graph (DAG) construction, enables end-users to create efficient schemes for the continuous verification and validation of the execution of applications in edge-fog-cloud infrastructures. This scheme also verifies and validates established execution sequences and the integrity of digital assets. The CV model converts the ETL and DAG into a business model and smart contracts in a private blockchain for the automatic and transparent registration of the transactions performed by each application in the workflows/pipelines created by the CD model, without altering the applications or the edge-fog-cloud workflows. This model ensures that IoT dataflows deliver verifiable information for organizations to conduct critical decision-making processes with certainty. A containerized parallelism model solves portability issues and reduces/compensates for the overhead produced by CD/CV operations. We developed and implemented a prototype to create CD/CV schemes, which were evaluated in a case study where user mobility information is used to identify interest points, patterns, and maps. The experimental evaluation revealed the efficiency of CD/CV in registering the transactions performed in IoT dataflows through the edge-fog-cloud in a private blockchain network in comparison with state-of-the-art solutions.
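Illustrative sketch (not the paper's smart contracts): the core verifiability idea of registering each ETL-style stage of a dataflow in an append-only, hash-chained log that can later be checked; a real deployment would use contracts on a private blockchain, which is simplified here to an in-process list, and all names are hypothetical.

```python
import hashlib, json, time

class Ledger:
    """Append-only, hash-chained log standing in for a private blockchain."""
    def __init__(self):
        self.blocks = []

    def record(self, payload):
        prev = self.blocks[-1]["hash"] if self.blocks else "0" * 64
        body = json.dumps({"prev": prev, "payload": payload, "ts": time.time()}, sort_keys=True)
        self.blocks.append({"hash": hashlib.sha256(body.encode()).hexdigest(), "body": body})

    def verify(self):
        prev = "0" * 64
        for b in self.blocks:
            data = json.loads(b["body"])
            if data["prev"] != prev or hashlib.sha256(b["body"].encode()).hexdigest() != b["hash"]:
                return False
            prev = b["hash"]
        return True

# A toy extract -> transform -> load sequence whose steps are registered for later verification.
ledger = Ledger()
for stage, out in [("extract", "raw.csv"), ("transform", "clean.csv"), ("load", "warehouse")]:
    ledger.record({"stage": stage, "output": out})
print("chain valid:", ledger.verify())
```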
  • Publication
    A survey of techniques for reducing interference in real-time applications on multicore platforms
    (IEEE, 2022-02-15) Lugo, Tamara; Lozano, Santiago; Fernández Muñoz, Javier; Carretero Pérez, Jesús; Comunidad de Madrid
    This survey reviews the scientific literature on techniques for reducing interference in real-time multicore systems, focusing on the approaches proposed between 2015 and 2020. It also presents proposals that use interference reduction techniques without considering the predictability issue. The survey highlights interference sources and categorizes proposals from the perspective of the shared resource. It covers techniques for reducing contention in main memory, cache memory, and the memory bus, as well as the integration of interference effects into schedulability analysis. Every section contains an overview of each proposal and an assessment of its advantages and disadvantages.
  • Publication
    Computer-aided diagnostic for classifying chest X-ray images using deep ensemble learning
    (BMC, 2022-10-15) Visuña, Lara; García Blas, Francisco Javier; Yang, Dandi; Carretero Pérez, Jesús; European Commission
    Background: Nowadays, doctors and radiologists are overwhelmed with a huge amount of work. This has led to efforts to design different computer-aided diagnosis (CAD) systems, with the aim of accomplishing a faster and more accurate diagnosis. The current development of deep learning is a big opportunity for the development of new CADs. In this paper, we propose a novel architecture for a convolutional neural network (CNN) ensemble for classifying chest X-ray (CRX) images into four classes: viral Pneumonia, Tuberculosis, COVID-19, and Healthy. Although computed tomography (CT) is the best way to detect and diagnose pulmonary issues, CT is more expensive than CRX. Furthermore, CRX is commonly the first step in the diagnosis, so it is very important to be accurate in the early stages of diagnosis and treatment. Results: We applied the transfer learning technique and data augmentation to all CNNs to obtain better performance. We designed and evaluated two different CNN ensembles: Stacking and Voting. This system is ready to be applied in a CAD system to automate diagnosis, as a second or preliminary opinion before doctors or radiologists. Our results show a great improvement: 99% accuracy for the Stacking Ensemble and 98% accuracy for the Voting Ensemble. Conclusions: To minimize misclassifications, we included six different base CNN models in our architecture (VGG16, VGG19, InceptionV3, ResNet101V2, DenseNet121, and CheXnet); the architecture could be extended to any number of models, and we expect to extend the number of diseases detected. The proposed method has been validated using a large dataset created by mixing several public datasets with different image sizes and quality. As we demonstrate in the evaluation carried out, we reach better results and generalization compared with previous works. In addition, we make a first approach to explainable deep learning, with the objective of providing professionals with more information that may be valuable when evaluating CRXs.
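Illustrative sketch (not the paper's CNNs): the two ensemble strategies named above, majority/soft voting and stacking (a meta-classifier trained on the base models' outputs), shown with scikit-learn on toy tabular data; the base estimators below merely stand in for the six CNN models, and the dataset is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Toy stand-ins for the CNN base models (VGG16, CheXnet, ...) in the paper.
base = [("lr", LogisticRegression(max_iter=1000)), ("dt", DecisionTreeClassifier(max_depth=5))]
X, y = make_classification(n_samples=600, n_classes=4, n_informative=8, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

voting = VotingClassifier(estimators=base, voting="soft").fit(Xtr, ytr)
stacking = StackingClassifier(estimators=base, final_estimator=LogisticRegression(max_iter=1000)).fit(Xtr, ytr)
print("voting  :", voting.score(Xte, yte))
print("stacking:", stacking.score(Xte, yte))
```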
  • Publication
    Surfing the optimization space of a multiple-GPU parallel implementation of a X-ray tomography reconstruction algorithm
    (Elsevier, 2014-09-01) García Blas, Francisco Javier; Abella García, Mónica; Isaila, Florin Daniel; Carretero Pérez, Jesús; Desco Menéndez, Manuel; Comunidad de Madrid
    The increasing popularity of massively parallel architectures based on accelerators has opened up the possibility of significantly improving the performance of X-ray computed tomography (CT) applications towards achieving real-time imaging. However, achieving this goal is a challenging process, as most CT applications have not been designed to exploit the amount of parallelism existing in these architectures. In this paper we present the massively parallel implementation and optimization of Mangoose(++), a CT application for reconstructing 3D volumes from 2D images collected by scanners based on cone-beam geometry. The main contributions of this paper are the following. First, we develop a modular application design that makes it possible to exploit the functional parallelism inside the application and facilitates the parallelization of individual application phases. Second, we identify a set of optimizations that can be applied individually and in combination for optimally deploying the application on a massively parallel multi-GPU system. Third, we present a study of surfing the optimization space of the modularized application and demonstrate that a significant benefit can be obtained from employing an adequate combination of application optimizations.
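Illustrative sketch (not Mangoose(++)): "surfing the optimization space" reduced to its bare idea of enumerating combinations of independent optimization switches, timing a run under each, and keeping the fastest; the flags and the kernel below are placeholders, not the paper's actual optimizations.

```python
import itertools, time

def run_kernel(use_pinned_memory, overlap_transfers, fuse_stages):
    """Placeholder for a reconstruction run; pretend each flag shaves time off."""
    cost = 1.0
    cost *= 0.8 if use_pinned_memory else 1.0
    cost *= 0.7 if overlap_transfers else 1.0
    cost *= 0.9 if fuse_stages else 1.0
    time.sleep(cost / 100)          # simulate work
    return cost

flags = ["use_pinned_memory", "overlap_transfers", "fuse_stages"]
best = None
for combo in itertools.product([False, True], repeat=len(flags)):
    cfg = dict(zip(flags, combo))
    elapsed = run_kernel(**cfg)
    if best is None or elapsed < best[0]:
        best = (elapsed, cfg)
print("best configuration:", best[1], "relative cost:", best[0])
```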
  • Publication
    Making the case for reforming the I/O software stack of extreme-scale systems
    (Elsevier, 2017-09-01) Isaila, Florin Daniel; García Blas, Francisco Javier; Carretero Pérez, Jesús; Ross, Robert; Kimpe, Dries; European Commission
  • Publication
    Different aspects of workflow scheduling in large-scale distributed systems
    (Elsevier, 2017-01-01) Stavrinides, Georgios L.; Rodrigo Duro, Francisco José; Karatza, Helen D.; García Blas, Francisco Javier; Carretero Pérez, Jesús; European Commission
    As large-scale distributed systems gain momentum, the scheduling of workflow applications with multiple requirements in such computing platforms has become a crucial area of research. In this paper, we investigate the workflow scheduling problem in large-scale distributed systems, from the Quality of Service (QoS) and data locality perspectives. We present a scheduling approach, considering two models of synchronization for the tasks in a workflow application: (a) communication through the network and (b) communication through temporary files. Specifically, we investigate via simulation the performance of a heterogeneous distributed system, where multiple soft real-time workflow applications arrive dynamically. The applications are scheduled under various tardiness bounds, taking into account the communication cost in the first case study and the I/O cost and data locality in the second.
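Illustrative sketch (not the paper's scheduler): the soft real-time criterion used above, where a workflow result delivered after its deadline is still useful as long as its tardiness stays within a bound; the completion times, deadlines, and bound are invented.

```python
def tardiness(finish_time, deadline):
    """Soft real-time metric: lateness clipped at zero."""
    return max(0.0, finish_time - deadline)

def meets_bound(finish_times, deadlines, bound):
    """A schedule is acceptable if every job's tardiness is within the bound."""
    return all(tardiness(f, d) <= bound for f, d in zip(finish_times, deadlines))

# Hypothetical workflow completions (s) vs. deadlines (s), with a 2 s tardiness bound.
print(meets_bound([10.0, 14.5, 21.0], [10.0, 13.0, 20.0], bound=2.0))   # True
```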
  • Publication
    SkyCDS: A resilient content delivery service based on diversified cloud storage
    (Elsevier, 2015-05-01) González, J.L.; Carretero Pérez, Jesús; Sosa-Sosa, Victor J.; Sánchez García, Luis Miguel; Bergua Guerra, Borja Maximiliano; European Commission
    Cloud-based storage is a popular outsourcing solution for organizations to deliver contents to end-users. However, there is a need for contingency plans to ensure service provision when the provider either suffers outages or goes out of business. This paper presents SkyCDS: a resilient content delivery service based on a publish/subscribe overlay over diversified cloud storage. SkyCDS splits the content delivery into metadata and content storage flow layers. The metadata flow layer is based on publish/subscribe patterns for insourcing the metadata control back to the content owner. The storage layer is based on dispersing information over multiple cloud locations, with which organizations outsource content storage in a controlled manner. In SkyCDS, the content dispersion is performed on the publisher side and the content retrieving process on the end-user side (the subscriber), which reduces the load on the organization side to metadata management only. SkyCDS also lowers the overhead of the content dispersion and retrieving processes by taking advantage of multi-core technology. A new allocation strategy based on cloud storage diversification, together with failure-masking mechanisms, minimizes the side effects of temporary or permanent cloud service outages and vendor lock-in. We developed a SkyCDS prototype that was evaluated by using synthetic workloads and a case study with real traces. Publish/subscribe queuing patterns were evaluated by using a simulation tool based on characterized metrics taken from the experimental evaluation. The evaluation revealed the feasibility of SkyCDS in terms of performance, reliability and storage space profitability. It also shows a novel way to compare the storage/delivery options through risk assessment.
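Illustrative sketch (a heavy simplification of the dispersal idea above): content is chunked across several cloud stores, with a single XOR parity chunk so that any one missing chunk can be rebuilt; real information-dispersal schemes used for this purpose are more elaborate, and the provider assignment is left implicit.

```python
from functools import reduce

def disperse(data: bytes, n_chunks: int):
    """Split `data` into n_chunks pieces plus one XOR parity piece (toy redundancy:
    any single missing data chunk can be rebuilt from the others and the parity)."""
    size = -(-len(data) // n_chunks)                      # ceiling division
    chunks = [data[i * size:(i + 1) * size].ljust(size, b"\0") for i in range(n_chunks)]
    parity = reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), chunks)
    return chunks, parity

def rebuild(chunks, parity, missing_index):
    others = [c for i, c in enumerate(chunks) if i != missing_index]
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), others + [parity])

content = b"shared report destined for several cloud providers"
chunks, parity = disperse(content, 4)
print(rebuild(chunks, parity, missing_index=2) == chunks[2])   # True
```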
  • Publication
    Assessing population-sampling strategies for reducing the COVID-19 incidence
    (ELSEVIER BV, 2021-12) Guzmán Merino, Miguel; Durán, Christian; Marinescu, María Cristina; Delgado Sanz, Concepción; Gómez Barroso, Diana; Carretero Pérez, Jesús; Singh, David E.; European Commission; Ministerio de Sanidad, Consumo y Bienestar Social (España)
    As long as critical levels of vaccination have not been reached to ensure herd immunity, and new SARS-CoV-2 strains are developing, the only realistic way to reduce the infection speed in a population is to track infected individuals before they pass on the virus. Testing the population via sampling has shown good results in slowing the epidemic spread. Sampling can be implemented at different times during the epidemic and may be done either per individual or for combined groups of people at a time. The work we present here makes two main contributions. We first extend and refine our scalable agent-based COVID-19 simulator to incorporate an improved socio-demographic model which considers professions, as well as a more realistic population mixing model based on contact matrices per country. These extensions are necessary to develop and test various sampling strategies in a scenario including the 62 largest cities in Spain; this is our second contribution. As part of the evaluation, we also analyze the impact of different parameters, such as testing frequency, quarantine time, percentage of quarantine breakers, or group testing, on sampling efficacy. Our results show that the most effective strategies are pooling, rapid antigen test campaigns, and requiring negative testing for access to public areas. The effectiveness of all these strategies can be greatly increased by reducing the number of contacts of infected individuals.
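Illustrative sketch (not EpiGraph): two-stage pooled (Dorfman) group testing, the pooling strategy evaluated above, where samples are tested in pools and only members of positive pools are retested individually; the prevalence and pool size below are invented.

```python
import random

def dorfman_tests(population, prevalence, pool_size, seed=0):
    """Count tests under two-stage (Dorfman) pooling versus testing everyone."""
    random.seed(seed)
    infected = [random.random() < prevalence for _ in range(population)]
    tests = 0
    for start in range(0, population, pool_size):
        pool = infected[start:start + pool_size]
        tests += 1                              # one test for the whole pool
        if any(pool):
            tests += len(pool)                  # retest each member individually
    return tests

pop = 10_000
print("individual testing:", pop, "tests")
print("pooled (size 10)  :", dorfman_tests(pop, prevalence=0.02, pool_size=10), "tests")
```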
  • Publication
    A WoT-based method for creating digital sentinel twins of IoT devices
    (MDPI, 2021-08) López Arévalo, Iván; Gonzalez-Compean, Jose Luis; Hinojosa-Tijerina, Mariana; Martinez Rendon, Cristhian; Montella, Raffaele; Martinez-Rodriguez, Jose L.
    The data produced by sensors of IoT devices are becoming keystones for organizations to conduct critical decision-making processes. However, delivering information to these processes in real-time represents two challenges for organizations: the first is achieving a constant dataflow from the IoT to the cloud, and the second is enabling decision-making processes to retrieve data from dataflows in real-time. This paper presents a cloud-based Web of Things method for creating digital twins of IoT devices (named sentinels). The novelty of the proposed approach is that sentinels create an abstract window for decision-making processes to: (a) find data (e.g., properties, events, and data from sensors of IoT devices) or (b) invoke functions (e.g., actions and tasks) of physical devices (PDs) as well as virtual devices (VDs). In this approach, the applications and services of decision-making processes deal with sentinels instead of managing complex details associated with the PDs, VDs, and cloud computing infrastructures. A prototype based on the proposed method was implemented to conduct a case study based on a blockchain system for verifying contract violations in sensors used in product transportation logistics. The evaluation showed the effectiveness of sentinels in enabling organizations to obtain data from IoT sensors and dataflows used by decision-making processes to convert these data into useful information.
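Illustrative sketch (not the paper's API): a minimal "sentinel" twin that exposes a device's properties, events, and actions behind one object, so consumer code never touches the physical device directly; the class, fields, and threshold below are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Sentinel:
    """Toy digital twin: mirrors a device's properties, queues its events,
    and forwards action invocations to the (physical or virtual) device."""
    device_id: str
    properties: Dict[str, float] = field(default_factory=dict)
    events: List[str] = field(default_factory=list)
    actions: Dict[str, Callable[[], str]] = field(default_factory=dict)

    def observe(self, name, value):          # device pushes a new reading
        self.properties[name] = value
        if name == "temperature" and value > 8.0:
            self.events.append(f"{self.device_id}: cold-chain threshold exceeded")

    def invoke(self, action):                # consumer triggers a device action
        return self.actions[action]()

truck_sensor = Sentinel("truck-042", actions={"ping": lambda: "pong"})
truck_sensor.observe("temperature", 9.3)
print(truck_sensor.properties, truck_sensor.events, truck_sensor.invoke("ping"))
```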
  • Publication
    On the continuous contract verification using blockchain and real-time data
    (Springer, 2021-02-23) Martinez Rendon, Cristhian; Camarmas Alonso, Diego; Carretero Pérez, Jesús; González Compean, J.L.; Comunidad de Madrid
    Supply chains today play a crucial role in the success of a company's logistics. In recent years, multiple studies have focused on incorporating new technologies into supply chains, with the Internet of Things (IoT) and blockchain being two of the most recent and popular technologies applied. However, their usage currently faces considerable challenges, such as transaction performance, scalability, and near real-time contract verification. In this paper we propose a model for continuous verification of contracts in supply chains using the benefits of blockchain technology and real-time data acquisition from IoT devices for early decision-making. We propose two platform-independent optimization techniques (atomic transactions and grouped validation) that enhance the data transaction protocol and the data storage procedure, and a method for continuous verification of contracts, which allows corrective actions to be taken to reduce …
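Illustrative sketch (not the paper's protocol): the grouped-validation idea, where buffered sensor readings are checked against a contract clause as a batch and one summary transaction is emitted per group instead of one per reading; the contract threshold, group size, and field names below are stand-ins.

```python
def violates_contract(reading, max_temp=8.0):
    """Contract clause stand-in: refrigerated goods must stay at or below 8 degrees C."""
    return reading["temperature"] > max_temp

def grouped_validation(readings, group_size=5):
    """Validate readings in groups; emit one summary 'transaction' per group."""
    transactions = []
    for start in range(0, len(readings), group_size):
        group = readings[start:start + group_size]
        breaches = [r for r in group if violates_contract(r)]
        transactions.append({"from": start, "count": len(group), "breaches": len(breaches)})
    return transactions

stream = [{"temperature": t} for t in (4.5, 5.0, 9.2, 6.1, 5.8, 7.9, 8.4, 6.6)]
for tx in grouped_validation(stream, group_size=4):
    print(tx)
```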
  • Publication
    CloudBench: an integrated evaluation of VM placement algorithms in clouds
    (Springer, 2020-01-10) Gómez Rodríguez, Mario A.; Sosa-Sosa, Victor J.; Carretero Pérez, Jesús; González, José Luis; Ministerio de Economía, Industria y Competitividad (España)
    A complex and important task in cloud resource management is the efficient allocation of virtual machines (VMs), or containers, in physical machines (PMs). The evaluation of VM placement techniques in real-world clouds can be tedious, complex and time-consuming. This situation has motivated an increasing use of cloud simulators that facilitate this type of evaluation. However, most of the reported VM placement techniques based on simulations have been evaluated taking into account one specific cloud resource (e.g., CPU), whereas often unrealistic values are assumed for other resources (e.g., RAM, waiting times, application workloads, etc.). This situation generates uncertainty, discouraging their implementation in real-world clouds. This paper introduces CloudBench, a methodology to facilitate the evaluation and deployment of VM placement strategies in private clouds. CloudBench considers the integration of a cloud simulator with a real-world private cloud. Two main tools were developed to support this methodology: a specialized multi-resource cloud simulator (CloudBalanSim), which is in charge of evaluating VM placement techniques, and a distributed resource manager (Balancer), which deploys and tests in a real-world private cloud the best VM placement configurations that satisfy the user requirements defined in the simulator. Both tools generate feedback information, from the evaluation scenarios and their obtained results, which is used as a learning asset to carry out intelligent and faster evaluations. The experiments implemented with the CloudBench methodology showed encouraging results as a new strategy to evaluate and deploy VM placement algorithms in the cloud.
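Illustrative sketch (not CloudBalanSim's algorithm): a first-fit, multi-resource VM placement heuristic of the kind CloudBench is designed to evaluate, where a VM is placed on the first PM with enough CPU and RAM left; the capacities and demands below are invented.

```python
def first_fit(vms, pms):
    """Place each VM (cpu, ram) on the first PM with spare capacity in both resources."""
    placement = {}
    for vm_id, (cpu, ram) in vms.items():
        for pm_id, free in pms.items():
            if free["cpu"] >= cpu and free["ram"] >= ram:
                free["cpu"] -= cpu
                free["ram"] -= ram
                placement[vm_id] = pm_id
                break
        else:
            placement[vm_id] = None          # rejected: no PM can host it
    return placement

pms = {"pm1": {"cpu": 16, "ram": 64}, "pm2": {"cpu": 8, "ram": 32}}
vms = {"vm1": (8, 32), "vm2": (8, 16), "vm3": (8, 32), "vm4": (4, 8)}
print(first_fit(vms, pms))
```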
  • Publication
    Improving performance and capacity utilization in cloud storage for content delivery and sharing services
    (IEEE, 2020-01-21) Sosa-Sosa, Victor J.; Barrón, Alfredo; González Compean, J.L.; Carretero Pérez, Jesús; López Arévalo, Iván; Ministerio de Economía, Industria y Competitividad (España)
    Content delivery and sharing (CDS) is a popular and cost-effective cloud-based service for organizations to deliver/share contents to/with end-users, partners and insider users. This type of service improves data availability and I/O performance by producing and distributing replicas of shared contents. However, such a technique increases the storage/network resource utilization. This paper introduces a threefold methodology to improve the trade-off between I/O performance and capacity utilization of cloud storage for CDS services. This methodology includes: i) Definition of a classification model for identifying types of users and contents by analyzing their consumption/demand and sharing patterns, ii) Usage of the classification model for defining content availability and load balancing schemes, and iii) Integration of a dynamic availability scheme into a cloud-based CDS system. Our method was implemented …
  • Publication
    From the edge to the cloud: A continuous delivery and preparation model for processing big IoT data
    (Elsevier, 2020-12-01) Sánchez Gallegos, Dante D.; Carrizales Espinoza, Diana; Reyes Anastacio, Hugo G.; González Compean, J.L.; Carretero Pérez, Jesús; Morales Sandoval, Miguel; Galaviz Mosqueda, Alejandro; Comunidad de Madrid
  • Publication
    New parallel and distributed tools and algorithms for life sciences
    (Elsevier, 2020-11-01) Carretero Pérez, Jesús; Krefting, Dagmar
    Computational methods are nowadays ubiquitous in the field of bioinformatics and biomedicine. Besides established fields like molecular dynamics, genomics or neuroimaging, new emerging methods rely heavily on large-scale computational resources. These new methods need to manage Tbytes or Pbytes of data with large-scale structural and functional relationships, TFlops or PFlops of computing power for simulating highly complex models, or many-task processes and workflows for processing and analyzing data. Today, many areas in Life Sciences are facing these challenges. This special issue contains papers showing existing solutions and latest developments in Life Sciences and Computing Sciences to collaboratively explore new ideas and approaches to successfully apply distributed IT-systems in translational research, clinical intervention, and decision-making.
  • Publication
    On the continuous processing of health data in edge-fog-cloud computing by using micro/nanoservice composition
    (IEEE, 2020-06-30) Sánchez Gallegos, Dante D.; Galaviz Mosqueda, Alejandro; González Compean, J.L.; Villarreal Reyes, Salvador; Pérez Ramos, Aldo E.; Carrizales Espinoza, Diana; Carretero Pérez, Jesús; Comunidad de Madrid
    The edge, the fog, the cloud, and even end-users' devices play a key role in the management of the health-sensitive content/data lifecycle. However, the creation and management of solutions including multiple applications, executed by multiple users in multiple environments (the edge, the fog, and the cloud), to process multiple health repositories while, at the same time, fulfilling non-functional requirements (NFRs) represents a complex challenge for health care organizations. This paper presents the design, development, and implementation of an architectural model to create, on-demand, edge-fog-cloud processing structures to continuously handle big health data and, at the same time, to execute services for fulfilling NFRs. In this model, constructive and modular blocks, implemented as microservices and nanoservices, are recursively interconnected to create edge-fog-cloud processing structures as …
  • Publication
    A gearbox model for processing large volumes of data by using pipeline systems encapsulated into virtual containers
    (Elsevier, 2020-05-01) Santiago Durán, Miguel; González Compean, J.L.; Brinkmann, André; Reyes Anastacio, Hugo G.; Carretero Pérez, Jesús; Montella, Raffaele; Toscano Pulido, Gregorio; Ministerio de Economía y Competitividad (España)
    Software pipelines enable organizations to chain applications for adding value to contents (e.g., confidentiality, reliability, and integrity) before either sharing them with partners or sending them to the cloud. However, the pipeline components add overhead when processing large volumes of data, which can become critical in real-world scenarios. This paper presents a gearbox model for processing large volumes of data by using pipeline systems encapsulated into virtual containers. In this model, the gears represent applications, whereas gearboxes represent software pipelines. This model was implemented as a collaborative system that automatically performs Gear up (by using parallel patterns) and/or Gear down (by using in-memory storage) until all gears produce uniform data processing velocities. This model reduces the delays and bottlenecks produced by the heterogeneous performance of the applications included in software pipelines. The new container tool has been designed to encapsulate both the collaborative system and the software pipelines into a virtual container and deploy it on IT infrastructures. We conducted case studies to evaluate its performance when processing medical images and PDF repositories. The incorporation of a capsule into a cloud storage service for pre-processing medical imagery was also studied. The experimental evaluation revealed the feasibility of applying the gearbox model to the deployment of software pipelines in real-world scenarios, as it can significantly improve the end-user service experience when pre-processing large-scale data in comparison with state-of-the-art solutions such as Sacbe and Parsl.
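Illustrative sketch (not the paper's collaborative system): a gearbox-style balancing step that assigns workers to pipeline stages in proportion to each stage's service time, so all stages reach a similar processing velocity, in the spirit of Gear up/Gear down; the stages, timings, and worker budget below are invented.

```python
def balance_workers(service_times, total_workers):
    """Gearbox-style balancing sketch: assign workers proportionally to each
    stage's service time so all stages reach a similar processing velocity."""
    total_time = sum(service_times.values())
    workers = {s: max(1, round(total_workers * t / total_time))
               for s, t in service_times.items()}
    throughput = {s: workers[s] / t for s, t in service_times.items()}
    return workers, throughput

# Hypothetical per-item service times (s) for three pipeline stages.
stages = {"anonymize": 0.2, "compress": 0.05, "encrypt": 0.15}
workers, throughput = balance_workers(stages, total_workers=8)
print("workers per stage:", workers)
print("items/s per stage:", {s: round(v, 1) for s, v in throughput.items()})
```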