DI - GCERN - Capítulos de Monografías

Recent Submissions

  • Publication
    Exploring the Application of Hybrid Evolutionary Computation Techniques to Physical Activity Recognition
    (Association for Computing Machinery, 2016-07) Baldominos Gómez, Alejandro; Barrio Cerro, María del Carmen del; Sáez Achaerandio, Yago; European Commission; Ministerio de Economía y Competitividad (España)
    This paper focuses on the problem of physical activity recognition, i.e., the development of a system which is able to learn patterns from data in order to detect which physical activity (e.g. running, walking, ascending stairs, etc.) a certain user is performing. While this field is broadly explored in the literature, there are few works that face the problem with evolutionary computation techniques. In this case, we propose a hybrid system which combines particle swarm optimization for clustering features and genetic programming combined with evolutionary strategies for evolving a population of classifiers, shaped in the form of decision trees. This system would run the segmentation, feature extraction and classification stages of the activity recognition chain. For this paper, we have used the PAMAP2 dataset with a basic preprocessing. This dataset is publicly available at the UCI ML repository. Then, we have evaluated the proposed system using three different modes: a user-independent, a user-specific and a combined one. The results in terms of classification accuracy were poor for the first and the last mode, but it performed significantly well for the user-specific case. This paper aims to describe work in progress, to share early results and discuss them. There are many things that could be improved in this proposed system, but overall results were interesting, especially because no manual data transformation took place.
  • Publication
    Predicting Global Irradiance Combining Forecasting Models Through Machine Learning
    (Springer, 2018-06-08) Huertas Tato, Javier; Aler, Ricardo; Rodríguez Benítez, F. J.; Arbizu Barrena, C.; Galván, Inés M.; Ministerio de Economía y Competitividad (España)
    Predicting solar irradiance is an active research problem, with many physical models having been designed to accurately predict Global Horizontal Irradiance. However, some of the models are better at short time horizons, while others are more accurate for medium and long horizons. The aim of this research is to automatically combine the predictions of four different models (Smart Persistence, Satellite, Cloud Index Advection and Diffusion, and Solar Weather Research and Forecasting) by means of a state-of-the-art machine learning method (Extreme Gradient Boosting). For this purpose, the four models are used as inputs to the machine learning model, so that the output is an improved Global Irradiance forecast. A 2-year dataset of predictions and measurements at one radiometric station in Seville has been gathered to validate the proposed method. Three approaches are studied: a general model, a model for each horizon, and models for groups of horizons. Experimental results show that the machine learning combination of predictors is, on average, more accurate than the predictors themselves.
  • Publication
    Studying the Effect of Measured Solar Power on Evolutionary Multi-objective Prediction Intervals
    (Springer, 2018-11-09) Martin Vazquez, Ruben; Huertas Tato, Javier; Aler, Ricardo; Galván, Inés M.; Ministerio de Economía y Competitividad (España)
    While it is common to make point forecasts for solar energy generation, estimating the forecast uncertainty has received less attention. In this article, prediction intervals are computed within a multi-objective approach in order to obtain an optimal coverage/width tradeoff. In particular, it is studied whether using measured power as an additional input, in addition to the meteorological forecast variables, is able to improve the properties of prediction intervals for short time horizons (up to three hours). Results show that the intervals tend to be narrower (i.e. less uncertain), and the ratio between coverage and width is larger. The method has been shown to obtain intervals with better properties than baseline Quantile Regression.
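    The two interval properties mentioned in this abstract, coverage and width, can be illustrated with a minimal sketch. The function names and the plain averaging used here are assumptions made for illustration; the chapter's exact definitions may differ.

```python
from typing import Sequence


def interval_coverage(y_true: Sequence[float],
                      lower: Sequence[float],
                      upper: Sequence[float]) -> float:
    """Fraction of observations that fall inside their prediction interval."""
    inside = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return inside / len(y_true)


def mean_interval_width(lower: Sequence[float],
                        upper: Sequence[float]) -> float:
    """Average distance between the upper and lower interval bounds."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)


# Toy values, invented for illustration only (not data from the chapter).
observed = [1.0, 2.0, 3.0, 4.0]
lower_bounds = [0.0, 0.0, 4.0, 3.0]
upper_bounds = [2.0, 1.0, 5.0, 5.0]

coverage = interval_coverage(observed, lower_bounds, upper_bounds)
width = mean_interval_width(lower_bounds, upper_bounds)
ratio = coverage / width  # larger is better: more coverage per unit of width
```

    A multi-objective method treats coverage (to be maximized) and width (to be minimized) as competing objectives, since widening every interval trivially raises coverage.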
  • Publication
    Feature set optimization for physical activity recognition using genetic algorithms
    (Association For Computing Machinery (ACM), 2015-07) Baldominos Gómez, Alejandro; Sáez Achaerandio, Yago; Isasi, Pedro
    Physical activity is recognized as one of the key factors for a healthy life due to its beneficial effects. The range of physical activities is very broad, and not all of them require the same effort to be performed nor have the same effects on health. For this reason, automatically recognizing the physical activity performed by a user (or patient) turns out to be an interesting research field, mainly for two reasons: (1) it increases personal awareness about the activity being performed and its consequences on health, allowing the user to receive proper credit (e.g. social recognition) for the effort; and (2) it allows doctors to perform continuous remote patient monitoring. This paper proposes a new approach for improving activity recognition by describing an activity recognition chain (ARC) that is optimized by means of genetic algorithms. This optimization process determines the most suitable and informative set of features, which results in higher recognition accuracy while reducing the total number of sensors required to track the user activity. These improvements can be translated into lower costs in hardware and less intrusive devices for the patients. In this work, for the assessment of the proposed approach versus other techniques and for replication purposes, a publicly available dataset on physical activity (PAMAP2) has been used. Experiments are designed and conducted to evaluate the proposed ARC by using leave-one-subject-out cross validation and results are encouraging, reaching an average classification accuracy of about 94%.
  • Publication
    Monte Carlo schemata searching for physical activity recognition
    (IEEE Computer Society, 2015-11-02) Baldominos Gómez, Alejandro; Isasi, Pedro; Sáez Achaerandio, Yago; Manderick, Bernard
    Medical literature has recognized physical activity as a key factor for a healthy life due to its remarkable benefits. However, there is a great variety of physical activities and not all of them have the same effects on health nor require the same effort. As a result, and due to the ubiquity of commodity devices able to track users' motion, there is an increasing interest in performing activity recognition in order to detect the type of activity carried out by the subjects and to be able to credit them for their effort, which has been identified as a key requirement to promote physical activity. This paper proposes a novel approach for performing activity recognition using Monte Carlo Schemata Search (MCSS) for feature selection and random forests for classification. To validate this approach we have carried out an evaluation over PAMAP2, a public dataset on physical activity available in the UCI Machine Learning repository, enabling replication and assessment. The experiments are conducted using leave-one-subject-out cross validation and attain classification accuracies of over 93% by using roughly one third of the total set of features. Results are promising, as they outperform those obtained in other works on the same dataset and significantly reduce the set of features used, which could translate into a decrease in the number of sensors required to perform activity recognition and, as a result, a reduction of costs.
  • Publication
    Learning Levels of Mario AI Using Genetic Algorithms
    (Springer, 2015-11-11) Baldominos Gómez, Alejandro; Sáez Achaerandio, Yago; Recio, Gustavo; Calle Gómez, Francisco Javier
    This paper introduces an approach based on Genetic Algorithms to learn levels from the Mario AI simulator, based on the Infinite Mario Bros. game (which is, in turn, based on the Super Mario World game from Nintendo). In this approach, an autonomous agent playing Mario is able to learn a sequence of actions in order to maximize the score, without looking at the current state of the game at each time step. Different parameters for the Genetic Algorithm are explored, and two different stages are executed: in the first, domain-independent genetic operators are used, while in the second, knowledge about the domain is incorporated into these operators in order to improve the results. Results are encouraging, as Mario is able to complete very difficult levels full of enemies, resembling the behavior of an expert human player.
  • Publication
    An Efficient and Scalable Recommender System for the Smart Web
    (IEEE. Computer Society, 2015-11-01) Baldominos Gómez, Alejandro; Sáez Achaerandio, Yago; Albacete García, Esperanza; Marrero, Ignacio
    This work describes the development of a web recommender system implementing both collaborative filtering and content-based filtering. Moreover, it supports two different working modes, either sponsored or related, depending on whether websites are to be recommended based on a list of ongoing ad campaigns or on the user preferences. Novel recommendation algorithms are proposed and implemented, which fully rely on set operations such as union and intersection in order to compute the set of recommendations to be provided to end users. The recommender system is deployed over a real-time big data architecture designed to work with the Apache Hadoop ecosystem, thus supporting horizontal scalability, and is able to provide recommendations as a service by means of a RESTful API. The performance of the recommender is measured, showing that the system is able to provide dozens of recommendations in a few milliseconds in a single-node cluster setup.
  • Publication
    A Scalable Machine Learning Online Service for Big Data Real-Time Analysis
    (Ieee - The Institute Of Electrical And Electronics Engineers, Inc, 2014-12) Baldominos Gómez, Alejandro; Albacete García, Esperanza; Sáez Achaerandio, Yago; Isasi, Pedro
    This work describes a proposal for developing and testing a scalable machine learning architecture able to provide real-time predictions or analytics as a service over domain-independent big data, working on top of the Hadoop ecosystem and exposed through a RESTful API. Systems implementing this architecture could provide companies with on-demand tools facilitating the tasks of storing, analyzing, understanding and reacting to their data, either in batch or stream fashion; and could turn into a valuable asset for improving business performance and be a key market differentiator in this fast-paced environment. In order to validate the proposed architecture, two systems are developed, each one providing classical machine-learning services in different domains: the first one involves a recommender system for web advertising, while the second consists of a prediction system which learns from gamers' behavior and tries to predict future events such as purchases or churning. An evaluation is carried out on these systems, and results show that both services are able to provide fast responses even when a number of concurrent requests are made. In the particular case of the second system, results show that computed predictions significantly outperform those obtained by random guessing.
  • Publication
    A Study of Machine Learning Techniques for Daily Solar Energy Forecasting using Numerical Weather Models
    (Springer International Publishing, 2015) Aler, Ricardo; Martín, Ricardo; Valls, José M.; Galván, Inés M.
    Forecasting solar energy is becoming an important issue in the context of renewable energy sources, and Machine Learning algorithms play an important role in this field. The prediction of solar energy can be addressed as a time series prediction problem using historical data. Also, solar energy forecasting can be derived from numerical weather prediction (NWP) models. Our interest is focused on the latter approach. We focus on the problem of predicting solar energy from NWP computed from GEFS, the Global Ensemble Forecast System, which predicts meteorological variables for points in a grid. In this context, it can be useful to know how prediction accuracy improves depending on the number of grid nodes used as input for the machine learning techniques. However, using the variables from a large number of grid nodes can result in many attributes, which might degrade the generalization performance of the learning algorithms. In this paper both issues are studied using data supplied by Kaggle for the State of Oklahoma, comparing Support Vector Machines and Gradient Boosted Regression. Also, three different feature selection methods have been tested: Linear Correlation, the ReliefF algorithm, and a new method based on local information analysis.
  • Publication
    Analysing the Advantages of Using Exploration and Exploitation Strategies in an Adaptive and Intelligent Educational System
    (Junta De Extremadura. Consejería de Educación, Ciencia y Tecnología, 2003-12-03) Iglesias Maqueda, Ana María; Martínez Fernández, Paloma; Aler, Ricardo; Fernández Rebollo, Fernando
    One of the most important issues in Adaptive and Intelligent Educational Systems (AIES) is to define pedagogical strategies for tutoring students according to their needs. In previous papers we have proposed to use a pedagogical knowledge representation based on a Reinforcement Learning (RL) model. Using the reinforcement learning model, the system is able to automatically learn the best pedagogical way to teach each student individually, based only on experience acquired with other students with similar learning characteristics, like a human tutor does. In this paper we study the viability of the application of the RL model in a DataBase Design (DBD) AIES, using simulated students in this study. The viability is measured on three important issues. First, we are going to check that the system converges to a pedagogical policy when it interacts with simulated students with different learning characteristics. Second, we are going to prove that the system learns an optimal pedagogical strategy, measured in the number of actions that the system must execute to teach all the contents to the student. And third, we are going to prove that the system does not need many students to learn to teach optimally. Choosing a good exploration and exploitation strategy is a determining factor for the three elements defined above, so two typical exploration/exploitation policies in RL problems have been used for the experiments in order to analyze the differences between them when the system teaches simulated students: the e-greedy and the Boltzmann exploration strategies.
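    The two exploration/exploitation policies named in this abstract can be sketched in their textbook form. The function names, the list of action values and the parameter values below are illustrative assumptions, not details taken from the paper.

```python
import math
import random
from typing import List


def epsilon_greedy(q_values: List[float], epsilon: float,
                   rng: random.Random) -> int:
    """With probability epsilon pick a random action, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def boltzmann_probabilities(q_values: List[float],
                            temperature: float) -> List[float]:
    """Softmax over action values; a higher temperature means more exploration."""
    best = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - best) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def boltzmann(q_values: List[float], temperature: float,
              rng: random.Random) -> int:
    """Sample an action index according to the Boltzmann distribution."""
    probs = boltzmann_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


# Hypothetical action values for three tutoring actions in one state.
rng = random.Random(0)
q = [0.1, 0.9, 0.3]
greedy_action = epsilon_greedy(q, epsilon=0.0, rng=rng)
probs = boltzmann_probabilities(q, temperature=0.5)
sampled_action = boltzmann(q, temperature=0.5, rng=rng)
```

    With e-greedy, every non-greedy action is equally likely during exploration, whereas Boltzmann exploration favors actions whose estimated value is closer to the best one.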
  • Publication
    Optimizing the DFCN Broadcast Protocol with a Parallel Cooperative Strategy of Multi-Objective Evolutionary Algorithms
    (Springer, 2009) Segura, Carlos; Cervantes, Alejandro; Nebro, Antonio J.; Jaraíz-Simón, María Dolores; Segredo, Eduardo; García-Rodríguez, Sandra; Luna, Francisco; Gómez-Pulido, Juan A.; Miranda, Gara; Luque, Cristóbal; Alba, Enrique; Vega-Rodríguez, Miguel A.; León, Coromoto; Galván, Inés M.
    This work presents the application of a parallel cooperative optimization approach to the broadcast operation in mobile ad-hoc networks (MANETs). The optimization of the broadcast operation implies satisfying several objectives simultaneously, so a multi-objective approach has been designed. The optimization lies on searching the best configurations of the DFCN broadcast protocol for a given MANET scenario. The cooperation of a team of multi-objective evolutionary algorithms has been performed with a novel optimization model. Such model is a hybrid parallel algorithm that combines a parallel island-based scheme with a hyperheuristic approach. Results achieved by the algorithms in different stages of the search process are analyzed in order to grant more computational resources to the most suitable algorithms. The obtained results for a MANETs scenario, representing a mall, demonstrate the validity of the new proposed approach.
  • Publication
    Evolving spatial and frequency selection filters for brain-computer interfaces
    (2010-07) Aler, Ricardo; Galván, Inés M.; Valls, José M.
    Machine Learning techniques are routinely applied to Brain Computer Interfaces in order to learn a classifier for a particular user. However, research has shown that classification techniques perform better if the EEG signal is first preprocessed to provide high quality attributes to the classifier. Spatial and frequency-selection filters can be applied for this purpose. In this paper, we propose to automatically optimize these filters by means of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES). The technique has been tested on data from the BCI-III competition, because both raw and manually filtered datasets were supplied, allowing a comparison between them. Results show that the CMA-ES is able to obtain higher accuracies than the datasets preprocessed by manually tuned filters.
  • Publication
    Improving classification for brain computer interfaces using transitions and a moving window
    (2009-01-14) Aler, Ricardo; Galván, Inés M.; Valls, José M.
    The context of this paper is the brain-computer interface (BCI), and in particular the classification of signals with machine learning methods. In this paper we intend to improve classification accuracy by taking advantage of a feature of BCIs: instances run in sequences belonging to the same class. In that case, the classification problem can be reformulated into two subproblems: detecting class transitions and determining the class for sequences of instances between transitions. We detect a transition when the Euclidean distance between the power spectra at two different times is larger than a threshold. To tackle the second problem, instances are classified by taking into account not just the prediction for that instance, but a moving window of predictions for previous instances. Experimental results show that our transition detection method improves results for datasets of two out of three subjects of the BCI III competition. If the moving window is used, classification accuracy is further improved, depending on the window size.
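    The two steps described in this abstract can be sketched as follows. The threshold test on the Euclidean distance follows the abstract; combining the window of predictions by majority vote is an assumption made for illustration, since the abstract does not say how the window is aggregated. All names and values are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence


def is_transition(spectrum_a: Sequence[float], spectrum_b: Sequence[float],
                  threshold: float) -> bool:
    """Flag a class transition when two power spectra are far apart."""
    distance = math.sqrt(sum((a - b) ** 2
                             for a, b in zip(spectrum_a, spectrum_b)))
    return distance > threshold


def smooth_predictions(predictions: List[str], window_size: int) -> List[str]:
    """Replace each prediction with the majority label of a trailing window.

    Majority voting is an assumed aggregation rule, not the paper's.
    """
    smoothed = []
    for i in range(len(predictions)):
        window = predictions[max(0, i - window_size + 1): i + 1]
        smoothed.append(Counter(window).most_common(1)[0][0])
    return smoothed


# Toy inputs, invented for illustration only.
far_apart = is_transition([1.0, 0.0], [0.0, 0.0], threshold=0.5)
close = is_transition([1.0, 0.0], [1.0, 0.1], threshold=0.5)
raw = ["left", "left", "right", "left", "left"]
smoothed = smooth_predictions(raw, window_size=3)
```

    In this sketch the isolated "right" prediction is outvoted by its neighbors. A fuller version would presumably restart the window whenever a transition is detected, so that predictions from the previous class do not leak into the new sequence.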
  • Publication
    Using evolutionary multiobjective techniques for imbalanced classification data
    (Springer, 2010-09) García-Rodríguez, Sandra; Aler, Ricardo; Galván, Inés M.
    The aim of this paper is to study the use of Evolutionary Multiobjective Techniques to improve the performance of Neural Networks (NN). In particular, we will focus on classification problems where classes are imbalanced. We propose an evolutionary multiobjective approach where the accuracy rate of all the classes is optimized at the same time. Thus, all classes will be treated equally independently of their presence in the training data set. The chromosome of the evolutionary algorithm encodes only the weights of the training patterns misclassified by the NN, instead of all the parameters of the NN as in other approaches. Results show that the multiobjective approach is able to consider all classes at the same time, disregarding to some extent their abundance in the training set or other biases that prevent some of the classes from being learned properly.
  • Publication
    Transition detection for brain computer interface classification
    (Springer, 2010) Aler, Ricardo; Galván, Inés M.; Valls, José M.
    This paper deals with the classification of signals for brain-computer interfaces (BCI). We take advantage of the fact that thoughts last for a period, and therefore EEG samples run in sequences belonging to the same class (thought). Thus, the classification problem can be reformulated into two subproblems: detecting class transitions and determining the class for sequences of samples between transitions. The method detects transitions when the L1 norm between the power spectra at two different times is larger than a threshold. To tackle the second problem, samples are classified by taking into account a window of previous predictions. Two types of windows have been tested: a constant-size moving window and a variable-size growing window. In both cases, results are competitive with those obtained in the BCI III competition.
  • Publication
    Optimizing linear and quadratic data transformations for classification tasks
    (IEEE, 2009) Valls, José M.; Aler, Ricardo
    Many classification algorithms use the concept of distance or similarity between patterns. Previous work has shown that it is advantageous to optimize general Euclidean distances (GED). In this paper, we optimize data transformations, which is equivalent to searching for GEDs, but can be applied to any learning algorithm, even if it does not use distances explicitly. Two optimization techniques have been used: a simple Local Search (LS) and the Covariance Matrix Adaptation Evolution Strategy (CMA-ES). CMA-ES is an advanced evolutionary method for optimization in difficult continuous domains. Both diagonal and complete matrices have been considered. The method has also been extended to a quadratic non-linear transformation. Results show that in general, the transformation methods described here either outperform or match the classifier working on the original data.
  • Publication
    Optimizing data transformations for classification tasks
    (Springer, 2009-09) Valls, José M.; Aler, Ricardo
    Many classification algorithms use the concept of distance or similarity between patterns. Previous work has shown that it is advantageous to optimize general Euclidean distances (GED). In this paper, data transformations are optimized instead. This is equivalent to searching for GEDs, but can be applied to any learning algorithm, even if it does not use distances explicitly. Two optimization techniques have been used: a simple Local Search (LS) and the Covariance Matrix Adaptation Evolution Strategy (CMA-ES). CMA-ES is an advanced evolutionary method for optimization in difficult continuous domains. Both diagonal and complete matrices have been considered. Results show that in general, complete matrices found by CMA-ES either outperform or match both Local Search, and the classifier working on the original untransformed data.
  • Publication
    An experimental study on fitness distributions of tree shapes in GP with one-point crossover
    (Springer, 2009) Estébanez Tascón, César; Aler, Ricardo; Valls, José M.; Alonso, Pablo J.
    In Genetic Programming (GP), One-Point Crossover is an alternative to the destructive properties and poor performance of Standard Crossover. One-Point Crossover acts in two phases, first making the population converge to a common tree shape, then looking for the best individual within that shape. So, we understand that One-Point Crossover is making an implicit evolution of tree shapes. We want to know if making this evolution explicit could lead to any improvement in the search power of GP. But we first need to define how this evolution could be performed. In this work we made an exhaustive study of fitness distributions of tree shapes for 6 different GP problems. We were able to identify common properties on distributions, and we propose a method to explicitly evaluate tree shapes. Based on this method, in the future, we want to implement a new genetic operator and a novel representation system for GP.
  • Publication
    Genetic programming for predicting protein networks
    (Springer, 2008-10) García Jiménez, Beatriz; Aler, Ricardo; Ledezma Espino, Agapito Ismael; Sanchis de Miguel, María Araceli
    One of the main unsolved problems in molecular biology is the protein-protein functional association prediction problem. Genetic Programming (GP) is applied to this domain. GP evolves an expression, equivalent to a binary classifier, which predicts whether a given pair of proteins interacts. We take advantage of GP's flexibility, particularly the possibility of defining new operations. In this paper, the missing values problem benefits from the definition of if-unknown, a new operation which is more appropriate to the domain data semantics. In addition, in order to improve the solution size and the computational time, we use the Tarpeian method, which controls the bloat effect of GP. The obtained results verify the feasibility of using GP in this domain, as well as the enhancement in search efficiency and interpretability of solutions due to the Tarpeian method.
  • Publication
    Correcting and improving imitation models of humans for Robosoccer agents
    (IEEE, 2005-09) Aler, Ricardo; García, Oscar; Valls, José M.
    The Robosoccer simulator is a challenging environment, where a human introduces a team of agents into a football virtual environment. Typically, agents are programmed by hand, but it would be a great advantage to transfer human experience into football agents. The first aim of this paper is to use machine learning techniques to obtain models of humans playing Robosoccer. These models can be used later to control a Robosoccer agent. However, models did not play as smoothly and optimally as the human. To solve this problem, the second goal of this paper is to incrementally correct models by means of evolutionary techniques, and to adapt them against more difficult opponents than the ones beatable by the human.