DI - GCERN - Comunicaciones en Congresos y otros eventos
Recent Submissions
Exploring the Application of Hybrid Evolutionary Computation Techniques to Physical Activity Recognition (Association for Computing Machinery, 2016-07)
Baldominos Gómez, Alejandro; Barrio Cerro, María del Carmen del; Sáez Achaerandio, Yago; European Commission; Ministerio de Economía y Competitividad (España)
This paper focuses on the problem of physical activity recognition, i.e., the development of a system that learns patterns from data in order to detect which physical activity (e.g., running, walking, ascending stairs) a given user is performing. While this field is broadly explored in the literature, few works address the problem with evolutionary computation techniques. We propose a hybrid system that combines particle swarm optimization for clustering features with genetic programming combined with evolutionary strategies for evolving a population of classifiers shaped as decision trees. This system would run the segmentation, feature extraction and classification stages of the activity recognition chain. For this paper, we have used the PAMAP2 dataset with basic preprocessing. This dataset is publicly available at the UCI ML repository. We have evaluated the proposed system in three modes: user-independent, user-specific and combined. Classification accuracy was poor for the first and last modes, but the system performed considerably better in the user-specific case. This paper aims to describe work in progress, to share early results and discuss them.
There are many things that could be improved in the proposed system, but the overall results were interesting, especially because no manual data transformation took place.

Wind Energy Forecasting at Different Time Horizons with Individual and Global Models (Springer, 2018-05)
Martin Vazquez, Ruben; Aler, Ricardo; Galván, Inés M.; Ministerio de Economía y Competitividad (España)
In this work, two machine learning approaches have been studied to predict wind power for different time horizons: individual and global models. The individual approach constructs a model for each horizon, while the global approach obtains a single model that can be used for all horizons. Both approaches have advantages and disadvantages. Each individual model is trained with data pertaining to a single horizon, so it can be specific to that horizon, but it has less training data than the global model, which is constructed with data belonging to all horizons. Support Vector Machines have been used to construct the individual and global models. The study has been tested on energy production data obtained from the Sotavento wind farm and meteorological data from the European Centre for Medium-Range Weather Forecasts, for a 5 × 5 grid around Sotavento. Also, given the large number of variables involved, a feature selection algorithm (Sequential Forward Selection) has been used to improve the performance of the models. Experimental results show that the global model is more accurate than the individual ones, especially when feature selection is used.

Predicting Global Irradiance Combining Forecasting Models Through Machine Learning (Springer, 2018-06-08)
Huertas Tato, Javier; Aler, Ricardo; Rodríguez Benítez, F. J.; Arbizu Barrena, C.; Galván, Inés M.; Ministerio de Economía y Competitividad (España)
Predicting solar irradiance is an active research problem, and many physical models have been designed to accurately predict Global Horizontal Irradiance. However, some of the models are better at short time horizons, while others are more accurate for medium and long horizons. The aim of this research is to automatically combine the predictions of four different models (Smart Persistence, Satellite, Cloud Index Advection and Diffusion, and Solar Weather Research and Forecasting) by means of a state-of-the-art machine learning method (Extreme Gradient Boosting). To this end, the predictions of the four models are used as inputs to the machine learning model, so that the output is an improved Global Irradiance forecast. A 2-year dataset of predictions and measurements at one radiometric station in Seville has been gathered to validate the proposed method. Three approaches are studied: a general model, a model for each horizon, and models for groups of horizons. Experimental results show that the machine learning combination of predictors is, on average, more accurate than the predictors themselves.

Studying the Effect of Measured Solar Power on Evolutionary Multi-objective Prediction Intervals (Springer, 2018-11-09)
Martin Vazquez, Ruben; Huertas Tato, Javier; Aler, Ricardo; Galván, Inés M.; Ministerio de Economía y Competitividad (España)
While it is common to make point forecasts for solar energy generation, estimating the forecast uncertainty has received less attention. In this article, prediction intervals are computed within a multi-objective approach in order to obtain an optimal coverage/width tradeoff. In particular, the article studies whether using measured power as an additional input, alongside the meteorological forecast variables, improves the properties of prediction intervals for short time horizons (up to three hours).
Results show that the intervals tend to be narrower (i.e., less uncertain), and the ratio between coverage and width is larger. The method has been shown to obtain intervals with better properties than baseline Quantile Regression.

Feature set optimization for physical activity recognition using genetic algorithms (Association for Computing Machinery (ACM), 2015-07)
Baldominos Gómez, Alejandro; Sáez Achaerandio, Yago; Isasi, Pedro
Physical activity is recognized as one of the key factors for a healthy life due to its beneficial effects. The range of physical activities is very broad, and not all of them require the same effort or have the same effects on health. For this reason, automatically recognizing the physical activity performed by a user (or patient) is an interesting research field, mainly for two reasons: (1) it increases personal awareness of the activity being performed and its consequences on health, allowing the user to receive proper credit (e.g., social recognition) for the effort; and (2) it allows doctors to perform continuous remote patient monitoring. This paper proposes a new approach for improving activity recognition by describing an activity recognition chain (ARC) that is optimized by means of genetic algorithms. This optimization process determines the most suitable and informative set of features, which yields higher recognition accuracy while reducing the total number of sensors required to track the user's activity. These improvements can translate into lower hardware costs and less intrusive devices for patients. In this work, to assess the proposed approach against other techniques and to allow replication, a publicly available dataset on physical activity (PAMAP2) has been used.
Experiments are designed and conducted to evaluate the proposed ARC using leave-one-subject-out cross-validation, and the results are encouraging, reaching an average classification accuracy of about 94%.

Monte Carlo schemata searching for physical activity recognition (IEEE Computer Society, 2015-11-02)
Baldominos Gómez, Alejandro; Isasi, Pedro; Sáez Achaerandio, Yago; Manderick, Bernard
The medical literature has recognized physical activity as a key factor for a healthy life due to its remarkable benefits. However, there is a great variety of physical activities, and not all of them have the same effects on health or require the same effort. As a result, and due to the ubiquity of commodity devices able to track users' motion, there is increasing interest in performing activity recognition in order to detect the type of activity carried out by the subjects and to credit them for their effort, which has been identified as a key requirement to promote physical activity. This paper proposes a novel approach for performing activity recognition using Monte Carlo Schemata Search (MCSS) for feature selection and random forests for classification. To validate this approach, we have carried out an evaluation on PAMAP2, a public dataset on physical activity available in the UCI Machine Learning repository, enabling replication and assessment. The experiments are conducted using leave-one-subject-out cross-validation and attain classification accuracies of over 93% using roughly one third of the total set of features. Results are promising, as they outperform those obtained in other works on the same dataset and significantly reduce the set of features used, which could translate into a decrease in the number of sensors required to perform activity recognition and, as a result, a reduction in costs.

An Efficient and Scalable Recommender System for the Smart Web (IEEE Computer Society, 2015-11-01)
Baldominos Gómez, Alejandro; Sáez Achaerandio, Yago; Albacete García, Esperanza; Marrero, Ignacio
This work describes the development of a web recommender system implementing both collaborative filtering and content-based filtering. Moreover, it supports two different working modes, either sponsored or related, depending on whether websites are to be recommended based on a list of ongoing ad campaigns or on the user preferences. Novel recommendation algorithms are proposed and implemented, which rely entirely on set operations such as union and intersection to compute the set of recommendations to be provided to end users. The recommender system is deployed over a real-time big data architecture designed to work with the Apache Hadoop ecosystem, thus supporting horizontal scalability, and is able to provide recommendations as a service by means of a RESTful API. The performance of the recommender is measured, showing that the system is able to provide dozens of recommendations in a few milliseconds in a single-node cluster setup.

An approach to physical rehabilitation using state-of-the-art virtual reality and motion tracking technologies (Elsevier, 2015)
Baldominos Gómez, Alejandro; Sáez Achaerandio, Yago; García del Pozo, María Cristina

A Scalable Machine Learning Online Service for Big Data Real-Time Analysis (IEEE - The Institute of Electrical and Electronics Engineers, Inc, 2014-12)
Baldominos Gómez, Alejandro; Albacete García, Esperanza; Sáez Achaerandio, Yago; Isasi, Pedro
This work describes a proposal for developing and testing a scalable machine learning architecture able to provide real-time predictions or analytics as a service over domain-independent big data, working on top of the Hadoop ecosystem and exposing them through a RESTful API.
Systems implementing this architecture could provide companies with on-demand tools facilitating the tasks of storing, analyzing, understanding and reacting to their data, either in batch or stream fashion, and could become a valuable asset for improving business performance and a key market differentiator in this fast-paced environment. In order to validate the proposed architecture, two systems are developed, each one providing classical machine learning services in a different domain: the first is a recommender system for web advertising, while the second is a prediction system that learns from gamers' behavior and tries to predict future events such as purchases or churning. An evaluation is carried out on these systems, and results show that both services are able to provide fast responses even when a number of concurrent requests are made. In the particular case of the second system, results show that the computed predictions significantly outperform those obtained by random guessing.

A Study of Machine Learning Techniques for Daily Solar Energy Forecasting using Numerical Weather Models (Springer International Publishing, 2015)
Aler, Ricardo; Martín, Ricardo; Valls, José M.; Galván, Inés M.
Forecasting solar energy is becoming an important issue in the context of renewable energy sources, and machine learning algorithms play an important role in this field. The prediction of solar energy can be addressed as a time series prediction problem using historical data. Solar energy forecasts can also be derived from numerical weather prediction (NWP) models. Our interest is focused on the latter approach. We focus on the problem of predicting solar energy from NWP computed from GEFS, the Global Ensemble Forecast System, which predicts meteorological variables for points in a grid.
In this context, it can be useful to know how prediction accuracy improves depending on the number of grid nodes used as input for the machine learning techniques. However, using the variables from a large number of grid nodes can result in many attributes, which might degrade the generalization performance of the learning algorithms. In this paper, both issues are studied using data supplied by Kaggle for the State of Oklahoma, comparing Support Vector Machines and Gradient Boosted Regression. Also, three different feature selection methods have been tested: Linear Correlation, the ReliefF algorithm, and a new method based on local information analysis.

Learning from non-stationary data using a growing network of prototypes (IEEE, 2013)
Cervantes, Alejandro; Isasi, Pedro; Gagné, Christian; Parizeau, Marc
Learning from non-stationary data requires methods that are able to deal with a continuous stream of data instances, possibly of infinite size, where the class distributions are potentially drifting over time. For handling such datasets, we propose a new method that incrementally creates and adapts a network of prototypes for classifying complex data received in an online fashion. The algorithm includes both accuracy-based and time-based forgetting mechanisms that ensure that the model size does not grow indefinitely with large datasets. We have performed tests on seven benchmark datasets to compare our proposal with several approaches found in the literature, including ensemble algorithms associated with two different base classifiers. The results obtained show that our algorithm is comparable to the best of the ensemble classifiers in terms of accuracy/time trade-off.
Moreover, our approach appears to have significant advantages for dealing with data that has a complex, non-linearly separable topology.

Analysing the Advantages of Using Exploration and Exploitation Strategies in an Adaptive and Intelligent Educational System (Junta de Extremadura. Consejería de Educación, Ciencia y Tecnología, 2003-12-03)
Iglesias Maqueda, Ana María; Martínez Fernández, Paloma; Aler, Ricardo; Fernández Rebollo, Fernando
One of the most important issues in Adaptive and Intelligent Educational Systems (AIES) is to define pedagogical strategies for tutoring students according to their needs. In previous papers we have proposed using a pedagogical knowledge representation based on a Reinforcement Learning (RL) model. Using the reinforcement learning model, the system is able to automatically learn the best pedagogical way to teach each student individually, based only on experience acquired with other students with similar learning characteristics, as a human tutor does. In this paper we study the viability of applying the RL model in a DataBase Design (DBD) AIES, using simulated students. The viability is measured on three important issues. First, we check that the system converges to a pedagogical policy when it interacts with simulated students with different learning characteristics. Second, we show that the system learns an optimal pedagogical strategy, measured in the number of actions that the system must execute to teach all the contents to the student. And third, we show that the system does not need many students to learn to teach optimally.
Choosing a good exploration and exploitation strategy is decisive for the three issues defined above, so two typical exploration/exploitation policies in RL problems have been used in the experiments in order to analyze the differences between them when the system teaches simulated students: the ε-greedy and the Boltzmann exploration strategies.

Learning Pedagogical Policies from Few Training Data (2006-08-01)
Iglesias Maqueda, Ana María; Martínez Fernández, Paloma; Aler, Ricardo; Fernández Rebollo, Fernando
Learning a pedagogical policy in an Adaptive and Intelligent Educational System (AIES) fits as a Reinforcement Learning (RL) problem. However, learning pedagogical policies requires acquiring a huge amount of experience interacting with the students, so applying RL to the AIES from scratch is infeasible. In this paper we describe RLATES, an AIES that uses RL to learn an accurate pedagogical policy to teach a course on Data Base Design. To reduce the experience required to learn the pedagogical policy, we propose using an initial value function learned with simulated students, whose model is provided by an expert as a Markov Decision Process. Empirical results demonstrate that the value function learned with the simulated students and transferred to the AIES is a very accurate initial pedagogical policy. The evaluation is based on the interaction of more than 70 Computer Science undergraduate students, and demonstrates that an efficient guide through the contents of the educational system is obtained.

Comparing Multi-objective and Threshold-moving ROC Curve Generation for a Prototype-based Classifier (ACM, 2013-07)
Aler, Ricardo; Handl, Julia; Knowles, Joshua D.
Receiver Operating Characteristics (ROC) curves represent the performance of a classifier for all possible operating conditions, i.e., for all preferences regarding the tradeoff between false positives and false negatives.
The generation of a ROC curve generally involves the training of a single classifier for a given set of operating conditions, with the subsequent use of threshold-moving to obtain a complete ROC curve. Recent work has shown that the generation of ROC curves may also be formulated as a multi-objective optimization problem in ROC space: the goals to be minimized are the false positive and false negative rates. This technique also produces a single ROC curve, but the curve may derive from operating points for a number of different classifiers. This paper aims to provide an empirical comparison of the performance of both of the above approaches, for the specific case of prototype-based classifiers. Results on synthetic and real domains show a performance advantage for the multi-objective approach.

Optimizing the DFCN Broadcast Protocol with a Parallel Cooperative Strategy of Multi-Objective Evolutionary Algorithms (Springer, 2009)
Segura, Carlos; Cervantes, Alejandro; Nebro, Antonio J.; Jaraíz-Simón, María Dolores; Segredo, Eduardo; García-Rodríguez, Sandra; Luna, Francisco; Gómez-Pulido, Juan A.; Miranda, Gara; Luque, Cristóbal; Alba, Enrique; Vega-Rodríguez, Miguel A.; León, Coromoto; Galván, Inés M.
This work presents the application of a parallel cooperative optimization approach to the broadcast operation in mobile ad-hoc networks (MANETs). The optimization of the broadcast operation implies satisfying several objectives simultaneously, so a multi-objective approach has been designed. The optimization consists of searching for the best configurations of the DFCN broadcast protocol for a given MANET scenario. The cooperation of a team of multi-objective evolutionary algorithms has been performed with a novel optimization model. This model is a hybrid parallel algorithm that combines a parallel island-based scheme with a hyperheuristic approach.
Results achieved by the algorithms in different stages of the search process are analyzed in order to grant more computational resources to the most suitable algorithms. The results obtained for a MANET scenario representing a mall demonstrate the validity of the newly proposed approach.

Hibridación de dos algoritmos evolutivos para la optimización de funciones multiobjetivo: MOPSO y ESN (2009-02)
García-Rodríguez, Sandra; Galván, Inés M.
This research work aims to study the hybridization of two multi-objective algorithms: particle swarms (MOPSO) and a multi-objective algorithm based on the combination of NSGA-II with Evolution Strategies (ESN). The goal is to analyze whether hybridization yields better Pareto fronts than those obtained by the algorithms individually since, in previous studies of these algorithms, it was observed that, for certain problems, one algorithm can help the other (and vice versa) to obtain better fronts. One way to approach this hybridization is to use the population obtained by one algorithm to initialize the other. To this end, experiments were run under homogeneous conditions for each of the approaches as well as for their hybridization, with four theoretical functions (ZDT1, ZDT2, ZDT3 and ZDT4) and one real problem: MANETs.

Multiobjective Algorithms Hybridization to Optimize Broadcasting Parameters in Mobile Ad-Hoc Networks (Bio-Inspired Systems: Computational and Ambient Intelligence, 2009)
García-Rodríguez, Sandra; Luque, Cristóbal; Cervantes, Alejandro; Galván, Inés M.
The aim of this paper is to study the hybridization of two multi-objective algorithms in the context of a real problem, the MANETs problem. The algorithms studied are Particle Swarm Optimization (MOPSO) and a new multi-objective algorithm based on the combination of NSGA-II with Evolution Strategies (ESN).
This work analyzes the improvement that hybridization produces in the Pareto fronts compared with the non-hybridized algorithms. The purpose of this work is to validate how the hybridization of two evolutionary algorithms of different families may help to solve certain problems together in the context of the MANETs problem. The hybridization used in this work consists of a sequential execution of the two algorithms, using the final population of the first algorithm as the initial population of the second one.

Portfolio Optimization Using SPEA2 with Resampling (Springer, 2011)
García-Rodríguez, Sandra; Quintana, David; Galván, Inés M.; Isasi, Pedro
Financial portfolio optimization under real-world constraints is a difficult problem that can be tackled using multi-objective evolutionary algorithms. One of the most problematic issues is the dependence of the results on the estimates for a set of parameters, that is, the robustness of solutions. These estimates are often inaccurate, and this may result in solutions that, in theory, offer an appropriate risk/return balance and, in practice, turn out to be very poor. In this paper we suggest that using a resampling mechanism may filter out the most unstable solutions. We test this idea on real data using SPEA2 as the optimization algorithm, and the results show that the use of resampling significantly increases the reliability of the resulting portfolios.

Evolving spatial and frequency selection filters for brain-computer interfaces (2010-07)
Aler, Ricardo; Galván, Inés M.; Valls, José M.
Machine learning techniques are routinely applied to brain-computer interfaces in order to learn a classifier for a particular user. However, research has shown that classification techniques perform better if the EEG signal is preprocessed beforehand to provide high-quality attributes to the classifier. Spatial and frequency-selection filters can be applied for this purpose.
In this paper, we propose to automatically optimize these filters by means of the Covariance Matrix Adaptation Evolution Strategy (CMA-ES). The technique has been tested on data from the BCI-III competition, because both raw and manually filtered datasets were supplied, allowing them to be compared. Results show that CMA-ES obtains higher accuracies than those obtained on the datasets preprocessed with manually tuned filters.

Improving classification for brain computer interfaces using transitions and a moving window (2009-01-14)
Aler, Ricardo; Galván, Inés M.; Valls, José M.
The context of this paper is the brain-computer interface (BCI), and in particular the classification of signals with machine learning methods. In this paper we intend to improve classification accuracy by taking advantage of a feature of BCIs: instances run in sequences belonging to the same class. In that case, the classification problem can be reformulated into two subproblems: detecting class transitions and determining the class for sequences of instances between transitions. We detect a transition when the Euclidean distance between the power spectra at two different times is larger than a threshold. To tackle the second problem, instances are classified by taking into account not just the prediction for that instance, but a moving window of predictions for previous instances. Experimental results show that our transition detection method improves results for the datasets of two out of three subjects of the BCI III competition. If the moving window is used, classification accuracy is further improved, depending on the window size.
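
The two steps named in the last abstract above (flagging a transition when the Euclidean distance between two power spectra exceeds a threshold, and classifying with a moving window of predictions) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the `lag` parameter, the majority vote over the window, and the choice to clear the window at each detected transition are all assumptions made here for illustration; the abstract does not specify them.

```python
from collections import Counter, deque
from math import sqrt


def detect_transitions(spectra, threshold, lag=1):
    """Return the indices t at which the power spectrum differs from the
    spectrum `lag` steps earlier by more than `threshold` (Euclidean distance).

    `spectra` is a sequence of equal-length lists of spectral power values.
    The `lag` parameter is a hypothetical knob, not taken from the abstract.
    """
    transitions = []
    for t in range(lag, len(spectra)):
        dist = sqrt(sum((a - b) ** 2 for a, b in zip(spectra[t], spectra[t - lag])))
        if dist > threshold:
            transitions.append(t)
    return transitions


def smooth_predictions(predictions, window_size, transitions=()):
    """Replace each raw prediction by the majority label in a moving window
    holding the current and the most recent previous predictions.

    Assumption: the window is cleared at every detected transition so that
    votes from the previous class do not leak into the new sequence.
    """
    transition_set = set(transitions)
    window = deque(maxlen=window_size)
    smoothed = []
    for t, label in enumerate(predictions):
        if t in transition_set:
            window.clear()
        window.append(label)
        smoothed.append(Counter(window).most_common(1)[0][0])
    return smoothed


# Toy data: the spectrum changes sharply at index 3.
toy_spectra = [[1.0, 0.0], [1.1, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 1.0], [0.0, 0.9]]
toy_predictions = ["A", "A", "B", "B", "B", "A"]

toy_transitions = detect_transitions(toy_spectra, threshold=0.5)
toy_smoothed = smooth_predictions(toy_predictions, window_size=3, transitions=toy_transitions)
```

On this toy input, the isolated misclassifications at indices 2 and 5 are outvoted by their neighbours. A larger `window_size` smooths more aggressively but reacts more slowly when a transition is missed, which is consistent with the abstract's remark that the improvement depends on the window size.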