Departamento de Estadística
http://hdl.handle.net/10016/12
Wed, 24 May 2017 00:19:00 GMT
http://hdl.handle.net/10016/24585
Estimating non-stationary common factors : Implications for risk sharing
Corona, Francisco; Poncela, Pilar; Ruiz Ortega, Esther
Universidad Carlos III de Madrid. Departamento de Estadística
In this paper, we analyze and compare the finite sample properties of alternative factor extraction procedures in the context of non-stationary Dynamic Factor Models (DFMs). Beyond procedures already available in the literature, we extend the hybrid method, based on combining principal components with Kalman filter and smoothing algorithms, to non-stationary models. We show that, unless the idiosyncratic noise is non-stationary, procedures that extract the factors from the non-stationary original series work better than those based on differenced variables. The results are illustrated in an empirical application fitting a non-stationary DFM to aggregate GDP and consumption of a set of 21 OECD industrialized countries. The goal is to check whether international risk sharing is a short-run or long-run issue.
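The first step of the hybrid method described above is factor extraction by principal components applied directly to the levels of the series. A minimal sketch of that extraction step (not the authors' full algorithm, which also involves Kalman filtering and smoothing) for a single common non-stationary factor:

```python
import numpy as np

def extract_factor_pca(X):
    """Extract one common factor from a T x N panel by principal
    components on the (possibly non-stationary) levels of the series.
    Returns the estimated factor (T,) and loadings (N,)."""
    Xc = X - X.mean(axis=0)                 # demean each series
    cov = Xc.T @ Xc / Xc.shape[0]           # N x N sample covariance
    vals, vecs = np.linalg.eigh(cov)
    loadings = vecs[:, -1]                  # leading eigenvector
    factor = Xc @ loadings                  # factor scores
    return factor, loadings

# Toy example: one I(1) common factor plus stationary idiosyncratic noise
rng = np.random.default_rng(0)
T, N = 200, 10
f = np.cumsum(rng.standard_normal(T))       # random-walk common factor
lam = rng.uniform(0.5, 1.5, N)              # true loadings
X = np.outer(f, lam) + rng.standard_normal((T, N))

fhat, _ = extract_factor_pca(X)
# The estimated factor should be highly correlated (up to sign) with f
print(abs(np.corrcoef(fhat, f)[0, 1]))
```

Because the idiosyncratic noise here is stationary, extraction on the levels recovers the factor accurately, consistent with the finding stated in the abstract.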
Mon, 01 May 2017 00:00:00 GMT
http://hdl.handle.net/10016/24552
Parallel Bayesian Inference for High Dimensional Dynamic Factor Copulas
Nguyen, Hoang; Ausín Olivera, María Concepción; Galeano San Miguel, Pedro
Universidad Carlos III de Madrid. Departamento de Estadística
Copula densities are widely used to model the dependence structure of financial time series. However, the number of parameters involved becomes explosive in high dimensions, which is why most models in the literature are static. Factor copula models have recently been proposed to tackle the curse of dimensionality by describing the behaviour of return series in terms of a few common latent factors. To account for asymmetric dependence in extreme events, we propose a class of dynamic one-factor copula models in which the factor loadings are modelled as generalized autoregressive score (GAS) processes. We perform Bayesian inference in different specifications of the proposed class of dynamic one-factor copula models. Conditional on the latent factor, the components of the return series become independent, which allows the algorithm to run in parallel and reduces the computational cost of obtaining the conditional posterior distributions of the model parameters. We illustrate our approach with the analysis of a simulated data set and of the returns of 150 companies listed in the S&P 500 index.
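The conditional-independence property that enables parallelism can be sketched concretely. The following illustration uses a static Gaussian one-factor copula (a simplification: the paper's models have GAS-driven, time-varying loadings and non-Gaussian links); conditioning on the latent factor Z makes the log-likelihood a sum of per-series terms, each of which could be evaluated on a separate worker:

```python
import numpy as np
from scipy.stats import norm

def simulate_one_factor_copula(lam, T, rng):
    """Simulate uniforms from a Gaussian one-factor copula:
    X_i = lam_i * Z + sqrt(1 - lam_i^2) * eps_i,  U_i = Phi(X_i)."""
    N = len(lam)
    Z = rng.standard_normal(T)                        # common latent factor
    eps = rng.standard_normal((T, N))                 # idiosyncratic shocks
    X = Z[:, None] * lam + np.sqrt(1 - lam**2) * eps
    return norm.cdf(X), Z

def conditional_loglik(U, Z, lam):
    """Conditional on Z, the components are independent, so the copula
    log-likelihood splits into one independent term per series --
    these addends are what a parallel sampler distributes across workers."""
    X = norm.ppf(U)
    s = np.sqrt(1 - lam**2)
    # log density of X_i | Z minus the standard-normal marginal
    ll = norm.logpdf(X, loc=Z[:, None] * lam, scale=s) - norm.logpdf(X)
    return ll.sum(axis=0)   # one term per series

rng = np.random.default_rng(1)
lam = np.array([0.3, 0.6, 0.9])
U, Z = simulate_one_factor_copula(lam, 500, rng)
print(conditional_loglik(U, Z, lam))
```

At the true loadings the summed conditional log-likelihood is positive (it exceeds the value at zero loadings, which is exactly zero), reflecting the dependence captured by the factor.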
Mon, 01 May 2017 00:00:00 GMT
http://hdl.handle.net/10016/24534
Robust and sparse estimation of high-dimensional precision matrices via bivariate outlier detection
Lafit, Ginette; Nogales Martín, Francisco Javier
Robust estimation of Gaussian graphical models in the high-dimensional setting is becoming increasingly important, since large real data sets may contain outlying observations. These outliers can lead to drastically wrong inference about the intrinsic graph structure. Several procedures apply univariate transformations to make the data Gaussian distributed. However, these transformations do not work well in the presence of structural bivariate outliers. We propose a precision matrix estimator, under the cellwise contamination mechanism, that is robust against structural bivariate outliers. This estimator exploits robust pairwise weighted correlation coefficient estimates, where the weights are computed from the Mahalanobis distance with respect to an affine equivariant robust correlation coefficient estimator. We show that the convergence rate of the proposed estimator is the same as that of the correlation coefficient estimator used to compute the Mahalanobis distances. We conduct numerical simulations under different contamination settings to compare the graph recovery performance of different robust estimators. Finally, the proposed method is applied to the classification of tumors using gene expression data. We show that our procedure can effectively recover the true graph under cellwise data contamination.
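The core building block described above, a pairwise correlation that downweights cells flagged by a bivariate Mahalanobis distance, can be sketched as follows. This is an illustrative simplification: it uses a crude median/MAD-based initial fit and hard 0/1 weights, whereas the paper uses an affine equivariant robust correlation estimator:

```python
import numpy as np
from scipy.stats import chi2

def weighted_pairwise_corr(X, cutoff=0.99):
    """For each pair of variables, drop observations whose bivariate
    Mahalanobis distance (w.r.t. an initial robust fit) exceeds a
    chi-square cutoff, then recompute the correlation on the rest."""
    T, N = X.shape
    R = np.eye(N)
    thr = chi2.ppf(cutoff, df=2)
    for i in range(N):
        for j in range(i + 1, N):
            Y = X[:, [i, j]]
            med = np.median(Y, axis=0)                       # robust location
            mad = 1.4826 * np.median(np.abs(Y - med), axis=0)  # robust scale
            Z = (Y - med) / mad
            r0 = np.clip(np.median(Z[:, 0] * Z[:, 1]), -0.99, 0.99)
            Sinv = np.linalg.inv(np.array([[1.0, r0], [r0, 1.0]]))
            d2 = np.einsum('ti,ij,tj->t', Z, Sinv, Z)        # Mahalanobis dist.
            keep = d2 <= thr                                 # hard 0/1 weights
            c = np.corrcoef(Y[keep].T)[0, 1]
            R[i, j] = R[j, i] = c
    return R

# Toy example: true correlation 0.8, with 5% cellwise outliers in variable 0
rng = np.random.default_rng(3)
T = 500
z = rng.standard_normal((T, 2))
X = np.column_stack([z[:, 0], 0.8 * z[:, 0] + 0.6 * z[:, 1]])
X[rng.choice(T, 25, replace=False), 0] = 10.0
print(weighted_pairwise_corr(X)[0, 1], np.corrcoef(X.T)[0, 1])
```

The contaminated cells inflate the variance of the first variable and drag the sample correlation toward zero, while the distance-weighted estimate stays close to the true value of 0.8.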
Mon, 01 May 2017 00:00:00 GMT
http://hdl.handle.net/10016/24522
Clustering Big Data by Extreme Kurtosis Projections
Peña Sánchez de Rivera, Daniel; Prieto Fernández, Francisco Javier; Rendon Aguirre, Janeth Carolina
Universidad Carlos III de Madrid. Departamento de Estadística
Clustering big data is an important problem because large samples of many variables are usually heterogeneous and include mixtures of several populations. It often happens that only some of a large set of variables are useful for clustering, and working with all of them would be very inefficient and may make the identification of the clusters more difficult. Thus, searching for spaces of lower dimension that include all the relevant information about the clusters seems a sensible way to proceed in these situations. Peña and Prieto (2001) showed that the extreme kurtosis directions of the projected data are optimal when the data have been generated by a mixture of two normal distributions. We generalize this result to any number of components and show that the extreme kurtosis directions of the projected data are linear combinations of the discriminant directions that would be optimal if the centers of the components of the mixture were known. In order to separate the groups, we want directions that split the data into two groups, each corresponding to different components of the mixture. We prove that these directions can be found from extreme kurtosis projections. This result suggests a new procedure to deal with many groups, working in a binary-decision fashion and deciding at each step whether the data should be split into two groups or the procedure should stop. The decision is based on comparing a single distribution with a mixture of two distributions. The performance of the algorithm is analyzed through a simulation study.
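The key idea, that an extreme-kurtosis projection reveals the separating direction of a mixture, can be sketched numerically. The code below is an illustrative sketch only (brute-force multistart optimization, not the paper's projection-pursuit algorithm): for a balanced mixture of two well-separated normals, the projection onto the direction joining the means is bimodal and has kurtosis well below 3, so minimizing kurtosis recovers that direction:

```python
import numpy as np
from scipy.optimize import minimize

def kurtosis(u):
    z = (u - u.mean()) / u.std()
    return np.mean(z**4)

def min_kurtosis_direction(X, seed=0):
    """Find a unit direction whose projection has minimum kurtosis,
    by direct numerical optimization with random multistarts."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    def obj(v):
        v = v / np.linalg.norm(v)        # scale-invariant objective
        return kurtosis(X @ v)
    best = None
    for _ in range(10):                  # multistart to avoid local optima
        res = minimize(obj, rng.standard_normal(d), method='Nelder-Mead')
        if best is None or res.fun < best.fun:
            best = res
    return best.x / np.linalg.norm(best.x)

# Balanced mixture of two normals separated along the first coordinate
rng = np.random.default_rng(2)
X = np.vstack([rng.standard_normal((200, 3)) + [4, 0, 0],
               rng.standard_normal((200, 3)) - [4, 0, 0]])
v = min_kurtosis_direction(X)
print(np.abs(v))   # first coordinate should dominate
```

For this mixture the population kurtosis of the separating projection is about 1.23 versus 3 for any orthogonal direction, which is why the minimizer aligns with the first axis.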
Thu, 27 Apr 2017 00:00:00 GMT