Departamento de Estadística
http://hdl.handle.net/10016/12
2017-04-28T23:59:00Z

Clustering Big Data by Extreme Kurtosis Projections
http://hdl.handle.net/10016/24522
2017-04-28T16:33:14Z
2017-04-27T00:00:00Z
Peña Sánchez de Rivera, Daniel; Prieto Fernández, Francisco Javier; Rendon Aguirre, Janeth Carolina
Universidad Carlos III de Madrid. Departamento de Estadística
Clustering Big Data is an important problem because large samples of many variables are usually heterogeneous and include mixtures of several populations. Often only some of a large set of variables are useful for clustering, and working with all of them would be very inefficient and could make the clusters harder to identify. Searching for lower-dimensional spaces that contain all the relevant information about the clusters is therefore a sensible way to proceed. Peña and Prieto (2001) showed that the extreme kurtosis directions of projected data are optimal when the data have been generated by mixtures of two normal distributions. We generalize this result to mixtures with any number of components and show that the extreme kurtosis directions of the projected data are linear combinations of the discriminant directions that would be optimal if the centers of the mixture components were known. To separate the groups we want directions that split the data into two groups, each corresponding to different components of the mixture. We prove that these directions can be found from extreme kurtosis projections. This result suggests a new procedure for dealing with many groups that works in a binary fashion, deciding at each step whether the data should be split into two groups or the procedure should stop. The decision is based on comparing a single distribution with a mixture of two distributions. The performance of the algorithm is analyzed through a simulation study.
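As a rough illustration of the kurtosis idea in this abstract, the sketch below simulates a balanced mixture of two bivariate normals and scans unit directions for the one whose projection has the smallest kurtosis coefficient. The grid scan, the sample size, the centers, and all function names are assumptions made for illustration; they are not the authors' algorithm, which the abstract does not spell out.

```python
import numpy as np

def kurtosis(z):
    """Kurtosis coefficient: fourth central moment over squared variance."""
    zc = z - z.mean()
    return np.mean(zc**4) / np.mean(zc**2) ** 2

def min_kurtosis_direction(x, n_angles=360):
    """Scan unit directions in the plane (a brute-force stand-in for a
    proper optimizer) and return the one with minimum projected kurtosis."""
    angles = np.linspace(0.0, np.pi, n_angles, endpoint=False)
    dirs = np.column_stack([np.cos(angles), np.sin(angles)])
    kurts = np.array([kurtosis(x @ d) for d in dirs])
    best = int(np.argmin(kurts))
    return dirs[best], float(kurts[best])

# Hypothetical data: two equally likely normal components whose centers
# lie at -3 and +3 along true_dir, each with identity covariance.
rng = np.random.default_rng(0)
true_dir = np.array([1.0, 1.0]) / np.sqrt(2.0)
n = 5000
labels = rng.integers(0, 2, size=n)
centers = np.where(labels[:, None] == 1, 3.0, -3.0) * true_dir
x = centers + rng.standard_normal((n, 2))

best_dir, best_kurt = min_kurtosis_direction(x)
print("direction:", best_dir, "kurtosis:", best_kurt)
```

In this balanced setting the projection along the line joining the two centers is bimodal, so its kurtosis falls well below the value of 3 that a single normal would give, and the scan should recover a direction close to `true_dir` up to sign.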

Evaluating significant effects from alternative seeding systems : a Bayesian approach, with an application to the UEFA Champions League
http://hdl.handle.net/10016/24521
2017-04-28T13:47:11Z
2017-04-01T00:00:00Z
Corona, Francisco; Forrest, David; Tena, Juan de Dios; Wiper, Michael Peter
Universidad Carlos III de Madrid. Departamento de Estadística
The paper discusses how to evaluate alternative seeding systems in sports competitions. Prior papers have developed an approach which uses a forecasting model at the level of the individual match and then applies Monte Carlo simulation of the whole tournament to estimate the probabilities associated with various outcomes or combinations of outcomes. This allows, for example, a measure of outcome uncertainty to be attached to each proposed seeding regime. However, this established approach takes no note of the uncertainty surrounding the parameter estimates in the underlying match forecasting model and this precludes testing for statistically significant differences between probabilities or outcome uncertainty measures under alternative regimes. We propose a Bayesian approach which resolves this weakness in standard methodology and illustrate its potential by examining the effect of seeding rule changes implemented in the UEFA Champions League, a major football tournament, in 2015. The reform appears to have increased outcome uncertainty. We identify which clubs and which sorts of clubs were favourably or unfavourably affected by the reform, distinguishing effects on probabilities of progression to different phases of the competition.

BIAS correction for dynamic factor models
http://hdl.handle.net/10016/24029
2017-04-18T13:52:47Z
2017-01-01T00:00:00Z
Alonso Fernández, Andrés Modesto; Bastos, Guadalupe; García-Martos, Carolina
Universidad Carlos III de Madrid. Departamento de Estadística
In this paper we work with multivariate time series that follow a Dynamic Factor Model. In particular, we consider the setting where the factors are dominated by highly persistent AutoRegressive (AR) processes and the samples are rather small. The factors' AR models are therefore estimated using small-sample bias correction techniques. A Monte Carlo study reveals that bias-correcting the AR coefficients of the factors yields better results in terms of prediction interval coverage. As expected, the simulation shows that bias correction is more successful for smaller samples. Results are reported both when the AR order and the number of factors are known and when they are unknown. We also study the advantages of this technique for a set of Industrial Production Indexes of several European countries.
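The abstract does not say which bias correction is used. As one possible illustration, the sketch below applies Kendall's first-order approximation for an AR(1) with estimated mean, under which the least-squares coefficient is biased by roughly -(1 + 3φ)/n, and checks in a small Monte Carlo experiment that adding this term back moves the average estimate closer to the true value. The choice of correction, the function names, and the settings (φ = 0.9, n = 30, 2000 replications) are assumptions, and the sketch looks only at coefficient bias, not at prediction interval coverage.

```python
import numpy as np

def ar1_ols(x):
    """Least-squares AR(1) coefficient computed on the demeaned series."""
    xc = x - x.mean()
    return float(np.sum(xc[1:] * xc[:-1]) / np.sum(xc[:-1] ** 2))

def kendall_corrected(phi_hat, n):
    """First-order correction: add back the approximate bias -(1 + 3*phi)/n."""
    return phi_hat + (1.0 + 3.0 * phi_hat) / n

def simulate_ar1(phi, n, rng):
    """Stationary Gaussian AR(1) with unit innovation variance."""
    x = np.empty(n)
    x[0] = rng.standard_normal() / np.sqrt(1.0 - phi**2)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.standard_normal()
    return x

rng = np.random.default_rng(1)
phi, n, reps = 0.9, 30, 2000  # persistent process, small sample (assumed)
ols = np.array([ar1_ols(simulate_ar1(phi, n, rng)) for _ in range(reps)])
corrected = kendall_corrected(ols, n)

ols_bias = float(ols.mean() - phi)
corrected_bias = float(corrected.mean() - phi)
print("OLS bias:", ols_bias, "corrected bias:", corrected_bias)
```

Because the correction term shrinks as n grows, the gain is largest for short, highly persistent series, which is consistent with the abstract's remark that bias correction helps most in smaller samples.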

Electricity prices forecasting by averaging dynamic factor models
http://hdl.handle.net/10016/24028
2017-04-18T13:52:47Z
2017-01-01T00:00:00Z
Alonso Fernández, Andrés Modesto; Bastos, Guadalupe; García-Martos, Carolina
Universidad Carlos III de Madrid. Departamento de Estadística
In the context of the liberalization of electricity markets, forecasting prices is essential. With this aim, research has evolved to model the particularities of electricity prices. In particular, Dynamic Factor Models have been quite successful in the task, both in the short and long run. However, specifying a single model for the unobserved factors is difficult, and it cannot be guaranteed that such a model exists. In this paper, Model Averaging is employed to overcome this difficulty, with the expectation that electricity prices would be better forecast by a combination of models for the factors than by a single model. Although our procedure is applicable in other markets, it is illustrated with applications to forecasting spot prices of the Iberian Electricity Market (MIBEL) and the Italian Market. Three combinations of forecasts are successful in providing improved results for alternative forecasting horizons.
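The abstract does not specify which three forecast combinations are used. The sketch below shows two generic schemes, equal weights and weights inversely proportional to each model's past mean squared error, purely to illustrate what combining forecasts from several candidate models can look like. All numbers and function names are made up for illustration and do not come from the paper.

```python
import numpy as np

def inverse_mse_weights(past_errors):
    """Weights proportional to 1/MSE per model (columns), normalized to sum to 1."""
    mse = np.mean(np.asarray(past_errors, dtype=float) ** 2, axis=0)
    inv = 1.0 / mse
    return inv / inv.sum()

def combine(forecasts, weights):
    """Weighted average of the individual model forecasts."""
    return float(np.dot(forecasts, weights))

# Hypothetical past forecast errors: rows are periods, columns are models.
past_errors = np.array([[1.0, 2.0, 4.0],
                        [-1.0, 2.0, -4.0],
                        [1.0, -2.0, 4.0]])
# Hypothetical price forecasts from the three models for the next period.
forecasts = np.array([50.0, 52.0, 47.0])

weights = inverse_mse_weights(past_errors)
equal_weight = combine(forecasts, np.full(3, 1.0 / 3.0))
weighted = combine(forecasts, weights)
print("weights:", weights, "equal:", equal_weight, "inverse-MSE:", weighted)
```

Equal weights need no estimation, while inverse-MSE weights lean toward models that have recently been more accurate; which is preferable at a given horizon is an empirical question.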