DES - Working Papers. Statistics and Econometrics. WS

Recent Submissions

Now showing 1 - 20 of 643
  • Publication
    Clustering and forecasting of day-ahead electricity supply curves using a market-based distance
    (2024) Li, Zehang; Alonso Fernández, Andrés Modesto; Elías, Antonio; Morales, Juan M.; Universidad Carlos III de Madrid. Departamento de Estadística; European Commission; Ministerio de Ciencia e Innovación (España)
    Gathering knowledge of supply curves in electricity markets is critical to both energy producers and regulators. Indeed, power producers strategically plan their generation of electricity considering various scenarios to maximize profit, leveraging the characteristics of these curves. For their part, regulators need to forecast the supply curves to monitor the market’s performance and identify market distortions. However, the prevailing approaches in the technical literature for analyzing, clustering, and predicting these curves are based on structural assumptions that electricity supply curves do not satisfy in practice, namely, boundedness and smoothness. Furthermore, any attempt to satisfactorily cluster the supply curves observed in a market must take into account the market’s specific features. Against this background, this article introduces a hierarchical clustering method based on a novel weighted distance that is specially tailored to unbounded and non-smooth supply curves and embeds information on the price distribution of offers, thus overcoming the drawbacks of conventional clustering techniques. Once the clusters have been obtained, a supervised classification procedure is used to characterize them as a function of relevant market variables. Additionally, the proposed distance is used in a learning procedure by which explanatory information is exploited to forecast the supply curves in a day-ahead electricity market. This procedure combines the idea of nearest neighbors with a machine-learning method. The prediction performance of our proposal is extensively evaluated and compared against two nearest-neighbor benchmarks and existing competing methods. To this end, supply curves from the markets of Spain, Pennsylvania-New Jersey-Maryland (PJM), and West Australia are considered.
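
    As a rough illustration of the clustering step described above, the sketch below builds a price-weighted distance between toy step-shaped supply curves and feeds it to standard hierarchical clustering; the price grid, weight function, curves, and linkage choice are invented placeholders, not the paper's market-based distance.
```python
# A minimal sketch (not the paper's method): hierarchical clustering of toy
# step-shaped supply curves under a price-weighted L1 distance.  The price grid,
# the weight function and the curves themselves are invented placeholders.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

rng = np.random.default_rng(0)
price_grid = np.linspace(0, 180, 200)                        # common evaluation grid (assumed)
weights = np.exp(-0.5 * ((price_grid - 50.0) / 20.0) ** 2)   # proxy for the density of offer prices
weights /= weights.sum()

# Toy supply curves: cumulative offered quantity, a non-decreasing step-like function of price.
curves = np.cumsum(rng.exponential(1.0, size=(30, price_grid.size)), axis=1)

def weighted_distance(f, g, w):
    """Price-weighted L1 distance between two curves evaluated on the same grid."""
    return np.sum(w * np.abs(f - g))

n = curves.shape[0]
D = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        D[i, j] = D[j, i] = weighted_distance(curves[i], curves[j], weights)

Z = linkage(squareform(D), method="average")      # hierarchical clustering on the custom distance
labels = fcluster(Z, t=4, criterion="maxclust")   # cut the dendrogram into 4 clusters
print(labels)
```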
  • Publication
    Observability analysis for structural system identification based on state estimation
    (2023-11-23) Alahmad, Ahmad; Mínguez Solana, Roberto; Porras, Rocío; Lozano Galant, José Antonio; Turmo, José; Universidad Carlos III de Madrid. Departamento de Estadística
    The concept of observability analysis (OA) has garnered substantial attention in the field of Structural System Identification. Its primary aim is to identify a specific set of structural characteristics, such as Young's modulus, area, inertia, and possibly their combinations (e.g., flexural or axial stiffness). These characteristics can be uniquely determined when provided with a suitable subset of deflections, forces, and/or moments at the nodes of the structure. This problem is particularly intricate within the realm of Structural System Identification, mainly due to the presence of nonlinear unknown variables, such as the product of vertical deflection and flexural stiffness, in accordance with modern methodologies. Consequently, the mechanical and geometrical properties of the structure are intricately linked with node deflections and/or rotations. The paper at hand serves a dual purpose: firstly, it introduces the concept of State Estimation (SE), specially tailored for the identification of structural systems; and secondly, it presents a novel OA method grounded in SE principles, designed to overcome the aforementioned challenges. Computational experiments shed light on the algorithm's potential for practical Structural System Identification applications, demonstrating significant advantages over the existing state-of-the-art methods found in the literature. It is noteworthy that these advantages could potentially be further amplified by addressing the SE problem, which constitutes a subject for future research. Solving this problem would help address the additional challenge of developing efficient techniques that can accommodate redundancy and uncertainty when estimating the current state of the structure.
  • Publication
    Deep Learning and Bayesian Calibration Approach to Hourly Passenger Occupancy Prediction in Beijing Metro: A Study Exploiting Cellular Data and Metro Conditions
    (2023-11-07) Sun, He; Cabras, Stefano; Universidad Carlos III de Madrid. Departamento de Estadística
    In burgeoning urban landscapes, the proliferation of the populace necessitates swift and accurate urban transit solutions to cater to the citizens' commuting requirements. A pivotal aspect of fostering optimized traffic management and ensuring resilient responses to unanticipated passenger surges is precisely forecasting hourly occupancy levels within urban subway systems. This study embarks on delineating a two-tiered model designed to address this imperative adeptly: 1. Preliminary Phase - Employing a Feed Forward Neural Network (FFNN): In the initial phase, a Feed Forward Neural Network (FFNN) is employed to gauge the occupancy levels across various subway stations. The FFNN, a class of artificial neural networks, is well-suited for this task because it can learn from the data and make predictions or decisions without being explicitly programmed to perform the task. Through a series of interconnected nodes, known as neurons, arranged in layers, the FFNN processes the input data, adjusts its weights based on the error of its predictions, and optimizes the network for accurate forecasting. For the random process of occupation levels in time and space, this phase encapsulates the so-called process filtration, wherein the underlying patterns and dynamics of subway occupancy are captured and represented in a structured format, ready for subsequent analysis. The estimates garnered from this phase are pivotal and form the foundation for the subsequent modelling stage. 2. Subsequent Phase - Implementing a Bayesian Proportional-Odds Model with Hourly Random Effects: With the estimates from the FFNN at hand, the study transitions to the subsequent phase wherein a Bayesian Proportional-Odds Model is utilized. This model is particularly well suited to scenarios where the response variable is ordinal, as in the case of occupancy levels (Low, Medium, High). The Bayesian framework, underpinned by the principles of probability, facilitates the incorporation of prior probabilities on model parameters and updates this knowledge with observed data to make informed predictions. The unique feature of this model is the incorporation of a random effect for hours, which acknowledges the inherent variability across different hours of the day. This is paramount in urban transit systems where passenger influx varies significantly with the hour. The synergy of these two models facilitates calibrated estimations of occupancy levels, both conditionally (relative to the sample) and unconditionally (on a detached test set). This dual-phase methodology furnishes analysts with a robust and reliable insight into the quality of predictions propounded by this model. This, in turn, avails a data-driven foundation for making informed decisions in real-time traffic management, emergency response planning, and overall operational optimization of urban subway systems. The model expounded in this study is presently under scrutiny for potential deployment by the Beijing Metro Group Ltd. This initiative reflects a practical stride towards embracing sophisticated analytical models to ameliorate urban transit management, thereby contributing to the broader objective of fostering sustainable and efficient urban living environments amidst the surging urban populace.
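
    The sketch below mimics the two-tiered pipeline in heavily simplified form: a feed-forward network produces occupancy scores, and an ordinal (proportional-odds) model with hour-of-day dummies calibrates them. It is a frequentist stand-in for the paper's Bayesian model with hourly random effects, and all data are simulated placeholders.
```python
# A simplified two-stage sketch: an FFNN produces occupancy scores, and an ordinal
# (proportional-odds) model with hour-of-day dummies calibrates them.  This is a
# frequentist stand-in for the Bayesian model with hourly random effects described
# above; all data below are simulated placeholders.
import numpy as np
import pandas as pd
from sklearn.neural_network import MLPClassifier
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(1)
n = 2000
hour = rng.integers(0, 24, n)
cell_activity = rng.gamma(2.0, 2.0, n) + 0.3 * np.sin(2 * np.pi * hour / 24)
latent = 0.8 * cell_activity + rng.normal(0, 1, n)
occupancy = pd.Series(pd.cut(latent, bins=[-np.inf, 1.5, 3.5, np.inf],
                             labels=["Low", "Medium", "High"]))

X = np.column_stack([cell_activity, hour])

# Stage 1: feed-forward network learns a first mapping from covariates to occupancy.
ffnn = MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=500, random_state=0)
ffnn.fit(X, occupancy)
score = ffnn.predict_proba(X)[:, list(ffnn.classes_).index("High")]

# Stage 2: proportional-odds model calibrates the FFNN score; hour dummies play the
# role of the hourly effects (fixed rather than random in this simplified version).
exog = pd.get_dummies(pd.Series(hour), prefix="h", drop_first=True).astype(float)
exog["ffnn_score"] = score
fit = OrderedModel(occupancy, exog, distr="logit").fit(method="bfgs", disp=False)
print(fit.params.tail())
```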
  • Publication
    Risk management in solar-based power plants with storage: a comparative study
    (2023-09-18) Oliveira, Fernando S.; Ruiz Mora, Carlos; Universidad Carlos III de Madrid. Departamento de Estadística
    Investment in solar generation is essential to achieve EU climate neutrality by 2050. Using stochastic programming, we study the management of solar power plants considering trading in the spot and future markets, weather derivatives based on solar radiation, storage, and risk management. We provide a comparative study of two technologies: a concentrated solar power plant with thermal storage and a photovoltaic power plant with electrical batteries. The significant managerial contributions can be classified into four levels. First, regarding trading and generation decisions, we proved that: a) plants sell energy in the spot market during the night and store energy in the morning; b) storage happens at the same time as electricity is purchased in the spot market; c) in the Summer the plants sell more in the futures market; d) storage, in both types of technology, increases trading in futures and spot markets and creates value for generators. Second, regarding the use of options on solar radiation, we show that a) the value of put and call options depends on the expected solar radiation; b) the radiation option prices are correlated with generation and storage levels and with the anticipated trading in spot and futures markets; c) the optimal strategy is to sell calls and buy put options; d) generators with a storage system sell significantly more call options. Third, regarding risk aversion, we proved that: a) the higher the risk aversion, the more the generator sells in the futures market and the higher the number of purchased put contracts; b) the risk-adjusted profit from options trading is zero. Finally, in comparing both technologies, even though the operation and financial management patterns are similar, the photovoltaic power plant is more profitable, and the batteries create more value.
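
    A minimal two-stage stochastic program in the spirit of the trading problem described above is sketched below: a futures sale is committed before generation is known, and the remainder is sold in the spot market under each scenario. Prices, probabilities, and generation scenarios are invented, and storage, radiation options, and risk aversion are omitted.
```python
# A minimal two-stage stochastic LP: commit a futures sale before generation is
# known, then sell the remainder in the spot market under each scenario.  Prices,
# probabilities and generation levels are invented; storage, radiation options and
# risk aversion (which drive the paper's results) are omitted.
import numpy as np
from scipy.optimize import linprog

p_fut = 60.0                                  # futures price (assumed, EUR/MWh)
spot = np.array([40.0, 55.0, 90.0])           # spot price scenarios
prob = np.array([0.3, 0.4, 0.3])
gen = np.array([80.0, 100.0, 120.0])          # solar generation per scenario (MWh)

# Decision vector: [F, q_1, q_2, q_3] = futures sale and spot sale in each scenario.
c = -np.concatenate(([p_fut], prob * spot))     # maximize expected revenue (linprog minimizes)
A_ub = np.hstack([np.ones((3, 1)), np.eye(3)])  # F + q_s <= gen_s in every scenario
res = linprog(c, A_ub=A_ub, b_ub=gen, bounds=[(0, None)] * 4, method="highs")

F, q = res.x[0], res.x[1:]
print(f"futures commitment: {F:.1f} MWh, spot sales per scenario: {q.round(1)}")
```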
  • Publication
    Economic activity and CO2 emissions in Spain
    (2023-07-24) Juan, Aranzazu de; Poncela, Maria Pilar; Ruiz Ortega, Esther; Universidad Carlos III de Madrid. Departamento de Estadística
    Carbon dioxide (CO2) emissions, largely by-products of energy consumption, account for the largest share of greenhouse gases (GHG). The addition of GHG to the atmosphere disturbs the earth's radiative balance, leading to an increase in the earth's surface temperature and to related effects on climate, sea level rise, ocean acidification and world agriculture, among other effects. Forecasting and designing policies to curb CO2 emissions globally is gaining interest. In this paper, we look at the relationship between CO2 emissions and economic activity using Spanish data from 1964 to 2020. We consider a structural (contemporaneous) equation between selected indicators of economic activity and CO2 emissions, that we further augment with dynamic common factors extracted from a large macroeconomic database. We show that the way the common factors are extracted is crucial to exploit their information content. In particular, when using standard methods to extract the common factors from large data sets, once private consumption and maritime transportation are considered, the information contained in the macroeconomic data set has only negligible explanatory power for emissions. However, if we extract the common factors oriented towards CO2 emissions, they add valuable information not contained in the individual economic indicators.
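
    The following sketch contrasts standard principal-component factors with factors oriented towards the target variable, using partial least squares as a simple stand-in for the target-oriented extraction discussed above; the macro panel and emissions series are simulated.
```python
# A sketch contrasting unsupervised principal-component factors with factors
# oriented towards the target, using partial least squares as a simple stand-in
# for the target-oriented extraction discussed above.  The macro panel and the
# emissions series are simulated placeholders.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cross_decomposition import PLSRegression
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
T, N = 57, 40                                   # years x macro indicators (toy sizes)
common = rng.normal(size=(T, 2))                # latent common factors
panel = common @ rng.normal(size=(2, N)) + rng.normal(scale=0.5, size=(T, N))
consumption = common[:, 0] + rng.normal(scale=0.2, size=T)
co2 = 0.8 * consumption + 0.5 * common[:, 1] + rng.normal(scale=0.3, size=T)

Z = (panel - panel.mean(0)) / panel.std(0)      # standardized macro panel
pca_factors = PCA(n_components=2).fit_transform(Z)
pls_factors = PLSRegression(n_components=2).fit(Z, co2).transform(Z)

for name, F in [("PCA", pca_factors), ("PLS (CO2-oriented)", pls_factors)]:
    X = np.column_stack([consumption, F])
    r2 = LinearRegression().fit(X, co2).score(X, co2)
    print(f"{name} factor-augmented regression R^2: {r2:.3f}")
```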
  • Publication
    Effects of extreme temperature on the European equity market
    (2023-07-24) Bellocca, Gian Pietro Enzo; Alessi, Lucia; Poncela Blanco, Maria Pilar; Ruiz Ortega, Esther; Universidad Carlos III de Madrid. Departamento de Estadística
    The increasing frequency and severity of extreme temperatures are potential threats to financial stability. Indeed, physical risk related to these extreme phenomena can affect the whole financial system and, in particular, the equity market. In this study, we analyze the impact of extreme temperature exposure on firms' performance in Europe over the 21st century. We show that extreme temperatures can affect firms' profitability depending on their industry and the quarter of the year. Our results are of interest both for investors operating in the equity market and for regulators in charge of securing financial stability.
  • Publication
    Modelling intervals of minimum/maximum temperatures in the Iberian Peninsula
    (2023-07-24) González-Rivera, Gloria; Rodríguez Caballero, Carlos Vladimir; Ruiz Ortega, Esther; Universidad Carlos III de Madrid. Departamento de Estadística
    In this paper, we propose to model intervals of minimum/maximum temperatures observed at a given location by fitting unobserved component models to bivariate systems of center and log-range temperatures. In doing so, the center and log-range temperature are decomposed into potentially stochastic trends, seasonal and transitory components. We contribute to the debate on whether the trend and seasonal components are better represented by stochastic or deterministic components. The methodology is applied to intervals of minimum/maximum temperatures observed monthly in four locations in the Iberian Peninsula, namely, Barcelona, Coruña, Madrid and Seville. We show that, at each location, the center temperature can be represented by a smooth integrated random walk with time-varying slope, while the log-range seems to be better represented by a stochastic level. We also show that the center and log-range temperature are unrelated. The methodology is then extended to model simultaneously minimum/maximum temperatures observed at several locations. We fit a multi-level dynamic factor model to extract potential commonalities among center (log-range) temperatures while also allowing for heterogeneity across different areas. The model is fitted to intervals of minimum/maximum temperatures observed at a large number of locations in the Iberian Peninsula.
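
    A univariate sketch of this kind of unobserved-components decomposition is given below, using statsmodels' structural time series model on a simulated monthly center series; the paper's bivariate specification and multi-level dynamic factor extension are not reproduced.
```python
# A univariate sketch of the unobserved-components decomposition: a smooth trend
# (integrated random walk) plus a stochastic seasonal, fitted to a simulated
# monthly "center" temperature series with statsmodels.  The bivariate and
# multi-level specifications of the paper are not reproduced.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 240                                          # 20 years of monthly observations
t = np.arange(n)
center = 15 + 0.01 * t + 8 * np.sin(2 * np.pi * t / 12) + rng.normal(scale=1.0, size=n)

model = sm.tsa.UnobservedComponents(center,
                                    level="smooth trend",   # integrated random walk level
                                    seasonal=12)            # monthly seasonal component
fit = model.fit(disp=False)
print(fit.summary().tables[1])
trend = fit.level.smoothed                       # smoothed trend extracted from the series
```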
  • Publication
    Penalized function-on-function partial least-squares regression
    (2023-07-05) Hernandez Roig, Harold Antonio; Aguilera Morillo, María del Carmen; Aguilera, Ana M.; Preda, Cristian; Universidad Carlos III de Madrid. Departamento de Estadística
    This paper deals with the "function-on-function" or "fully functional" linear regression problem. We address the problem by proposing a novel penalized Function-on-Function Partial Least-Squares (pFFPLS) approach that imposes smoothness on the PLS weights. Our proposal introduces an appropriate finite-dimensional functional space with an associated set of bases on which to represent the data and controls smoothness with a roughness penalty operator. Penalizing the PLS weights imposes smoothness on the resulting coefficient function, improving its interpretability. In a simulation study, we demonstrate the advantages of pFFPLS compared to non-penalized FFPLS. Our comparisons indicate a higher accuracy of pFFPLS when predicting the response and estimating the true coefficient function from which the data were generated. We also illustrate the advantages of our proposal with two case studies involving two well-known datasets from the functional data analysis literature. In the first one, we predict log precipitation curves from the yearly temperature profiles recorded in 35 weather stations in Canada. In the second case study, we predict the hip angle profiles during a gait cycle of children from their corresponding knee angle profiles.
  • Publication
    Tall big data time series of high frequency: stylized facts and econometric modelling
    (2023-07-04) Espasa, Antoni; Carlomagno Real, Guillermo; Universidad Carlos III de Madrid. Departamento de Estadística
    The paper starts by commenting on the demanding tasks of data treatment (mainly cleaning, classification, and aggregation) that are required at the beginning of any analysis with big data. Subsequently, it focuses on non-financial big data time series of high frequency that for many problems are aggregated at daily, hourly, or higher frequency levels of several minutes. Then, the paper discusses possible stylized facts present in these data. In this respect, it studies the relevant seasonal cycles (daily, weekly, monthly, and annual) and analyses how, for the data in question, these cycles could be affected by weather variables and by factors due to the annual composition of the calendar. Consequently, the paper investigates the possible main characteristics of the mentioned cycles and the types of responses to the exogenous weather and calendar factors that the data could show. The shorter cycles could change along the annual cycle and interact with the exogenous variables. The modelling strategy could require regime-switching, dynamic, non-linear structures, and interactions between the factors considered. The paper then analyses the construction of explanatory variables that could be useful for capturing all the above peculiarities. We propose the use of the automated procedure Autometrics to discover, in the words of Prof. Hendry, a parsimonious model not dominated by any other that is able to explain all the characteristics of the data. The model can be used for structural analysis, forecasting and, where appropriate, to build real-time quantitative macroeconomic leading indicators. Finally, the paper includes an application to the daily series of jobless claims in Chile.
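
    The sketch below illustrates the construction of calendar and weather regressors for a daily series, with a penalized regression acting as a crude stand-in for the general-to-specific selection that Autometrics performs; the series and the retained variables are purely illustrative.
```python
# A sketch of calendar and weather regressor construction for a daily series, with
# a penalized regression acting as a crude stand-in for the general-to-specific
# selection carried out by Autometrics (a different, dedicated algorithm).  The
# daily series and the weather variable are simulated.
import numpy as np
import pandas as pd
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(4)
idx = pd.date_range("2015-01-01", periods=1500, freq="D")
doy = idx.dayofyear.to_numpy()
temp = 20 + 8 * np.sin(2 * np.pi * doy / 365) + rng.normal(0, 2, len(idx))
y = (100 + 5 * (idx.dayofweek.to_numpy() == 0)           # weekly cycle (Monday spike)
         + 3 * np.sin(2 * np.pi * doy / 365)             # annual cycle
         - 0.4 * temp + rng.normal(0, 1, len(idx)))      # weather response plus noise

X = pd.DataFrame({"temp": temp,
                  "sin_annual": np.sin(2 * np.pi * doy / 365),
                  "cos_annual": np.cos(2 * np.pi * doy / 365)}, index=idx)
X = X.join(pd.get_dummies(pd.Series(idx.dayofweek), prefix="dow", drop_first=True).set_axis(idx))
X = X.join(pd.get_dummies(pd.Series(idx.month), prefix="month", drop_first=True).set_axis(idx))

selector = LassoCV(cv=5).fit(X.values.astype(float), np.asarray(y, dtype=float))
print("retained regressors:", list(X.columns[np.abs(selector.coef_) > 1e-8]))
```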
  • Publication
    Adaptive posterior distributions for covariance matrix learning in Bayesian inversion problems for multioutput signals
    (2023-05-30) Curbelo Benitez, Ernesto Angel; Martino, Luca; Llorente Fernandez, Fernando; Delgado Gómez, David; Universidad Carlos III de Madrid. Departamento de Estadística
    In this work, we propose an adaptive importance sampling (AIS) scheme for multivariate Bayesian inversion problems, which is based on two main ideas: the inference procedure is divided into two parts and the variables of interest are split into two blocks. We assume that the observations are generated from a complex multivariate non-linear function perturbed by correlated Gaussian noise. We estimate both the unknown parameters of the multivariate non-linear model and the covariance matrix of the noise. In the first part of the proposed inference scheme, a novel AIS technique called adaptive target AIS (ATAIS) is designed, which alternates iteratively between an IS technique over the parameters of the non-linear model and a frequentist approach for the covariance matrix of the noise. In the second part of the proposed inference scheme, a prior density over the covariance matrix is considered and the cloud of samples obtained by ATAIS is recycled and re-weighted to obtain a complete Bayesian analysis of the model parameters and the covariance matrix. Two numerical examples are presented that show the benefits of the proposed approach.
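
    A one-parameter toy version of the alternating scheme described above is sketched below: importance sampling over the model parameter with the noise variance held fixed, followed by a frequentist update of the variance from the residuals; the non-linear model and tuning choices are assumptions, not the paper's ATAIS implementation.
```python
# A one-parameter toy version of the alternating scheme: importance sampling over
# the model parameter with the noise variance fixed, then a frequentist update of
# the variance from the residuals at the current estimate.  The non-linear model
# and tuning choices are assumptions, not the paper's ATAIS implementation.
import numpy as np

rng = np.random.default_rng(5)
x = np.linspace(0, 4, 60)
theta_true, sigma_true = 1.7, 0.4
y = np.exp(-theta_true * x) + rng.normal(0, sigma_true, x.size)   # non-linear observation model

def loglik(theta, sigma2):
    resid = y - np.exp(-theta * x)
    return -0.5 * np.sum(resid ** 2) / sigma2 - 0.5 * x.size * np.log(2 * np.pi * sigma2)

mu, tau, sigma2 = 1.0, 1.0, 1.0          # proposal mean/scale and initial noise variance
for _ in range(30):
    # Importance sampling step over theta, with the noise variance held fixed.
    samples = rng.normal(mu, tau, size=500)
    logw = np.array([loglik(s, sigma2) for s in samples])
    w = np.exp(logw - logw.max())
    w /= w.sum()
    mu = np.sum(w * samples)                                   # adapt the proposal location
    tau = max(np.sqrt(np.sum(w * (samples - mu) ** 2)), 1e-3)  # ...and its scale
    # Frequentist update of the noise variance at the current point estimate.
    sigma2 = np.mean((y - np.exp(-mu * x)) ** 2)

print(f"theta estimate: {mu:.3f}, noise std estimate: {np.sqrt(sigma2):.3f}")
```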
  • Publication
    Modelling physical activity profiles in COPD patients: a new approach to variable-domain functional regression models
    (2023-05-05) Hernandez Amaro, Pavel; Durbán Reguera, María Luz; Aguilera Morillo, María del Carmen; Esteban Gonzalez, Cristobal; Arostegui, Inma; Universidad Carlos III de Madrid. Departamento de Estadística
    Motivated by the increasingly common technologies for collecting data, such as cellphones and smartwatches, functional data analysis has been intensively studied in recent decades, and along with it, functional regression models. However, the majority of functional data methods in general, and functional regression models in particular, assume that the observed data share the same domain. When the data have a variable domain, they need to be aligned or registered before the usual modelling techniques can be applied, which adds computational burden. To avoid this, a model that accommodates the variable-domain nature of the data is needed, but such models are scarce and their estimation methods present some limitations. In this article, we propose a new scalar-on-function regression model for variable-domain functional data that eludes the need for alignment, together with a new estimation methodology that we extend to other variable-domain regression models.
  • Publication
    Data cloning for a threshold asymmetric stochastic volatility model
    (2023-02-14) Marín Díazaraque, Juan Miguel; Lopes Moreira Da Veiga, María Helena; Universidad Carlos III de Madrid. Departamento de Estadística
    In this paper, we propose a new asymmetric stochastic volatility model whose asymmetry parameter can change depending on the intensity of the shock and is modeled as a threshold function whose threshold depends on past returns. We study the model in terms of leverage and propagation using a new concept that has recently appeared in the literature. We find that the new model can generate more leverage and propagation than a well-known asymmetric volatility model. We also propose to estimate the parameters of the model by data cloning, a general technique for computing maximum likelihood estimators and their asymptotic variances using a Markov chain Monte Carlo (MCMC) method. We compare the estimates in finite samples of data cloning and a Bayesian approach and find that data cloning is often more accurate. The empirical application shows that the new model often improves the fit compared to the benchmark model. Finally, the new proposal together with data cloning estimation often leads to more accurate 1-day and 10-day volatility forecasts, especially for return series with high volatility.
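
    The following toy example shows the data cloning mechanics on a simple Poisson-rate model rather than the paper's threshold asymmetric stochastic volatility model: the likelihood is raised to the power K, a random-walk Metropolis sampler is run, and the posterior mean and K times the posterior variance approximate the MLE and its asymptotic variance.
```python
# Data cloning on a toy Poisson-rate model (not the paper's stochastic volatility
# model): the likelihood is raised to the power K, a random-walk Metropolis sampler
# is run, and the posterior mean and K times the posterior variance approximate the
# MLE and its asymptotic variance.
import numpy as np

rng = np.random.default_rng(6)
y = rng.poisson(3.2, size=200)
K = 20                                           # number of clones of the data

def log_post(lmbda):
    if lmbda <= 0:
        return -np.inf
    loglik = np.sum(y * np.log(lmbda) - lmbda)   # Poisson log-likelihood (up to a constant)
    return K * loglik                            # cloned likelihood, flat prior

chain, cur = [], 1.0
for _ in range(20000):
    prop = cur + rng.normal(0, 0.05)
    if np.log(rng.uniform()) < log_post(prop) - log_post(cur):
        cur = prop
    chain.append(cur)
draws = np.array(chain[5000:])                   # discard burn-in

mle_dc = draws.mean()                            # data-cloning estimate of the MLE
var_dc = K * draws.var()                         # approximate asymptotic variance of the MLE
print(f"MLE estimate {mle_dc:.3f} (sample mean {y.mean():.3f}), asymptotic variance {var_dc:.4f}")
```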
  • Publication
    Risk Management of Energy Communities with Hydrogen Production and Storage Technologies
    (2023-01-16) Feng, Wenxiu; Ruiz Mora, Carlos; Universidad Carlos III de Madrid. Departamento de Estadística
    The distributed integration of renewable energy sources plays a central role in the decarbonization of economies. In this regard, energy communities arise as a promising entity to coordinate groups of proactive consumers (prosumers) and incentivize investment in clean technologies. However, the uncertain nature of renewable energy generation, residential loads, and trading tariffs poses important challenges, both at the operational and economic levels. We study how the management of such a community can be directly undertaken by an arbitrageur that, making use of an adequate price-tariff system, serves as an intermediary with the central electricity market to coordinate different types of prosumers under risk aversion. In particular, we consider a sequential futures and spot market where the aggregated shortage or excess of energy within the community can be traded. We aim to study the impact of the integration of hydrogen production and storage systems, together with a parallel hydrogen market, on the community operation. These interactions are modeled as a game-theoretical setting in the form of a stochastic two-stage bilevel optimization problem, which is later reformulated without approximation as a single-level mixed-integer linear problem (MILP). An extensive set of numerical experiments based on real data is performed to study the operation of the energy community under different technical and economic conditions. Results indicate that the optimal involvement in futures and spot markets is highly conditioned by the community's risk aversion and self-sufficiency levels. Moreover, the external hydrogen market has a direct effect on the community's internal price-tariff system and, depending on the market conditions, may worsen the utility of individual prosumers.
  • Publication
    Ignoring cross-correlated idiosyncratic components when extracting factors in dynamic factor models
    (2022-12-12) Fresoli, Diego Eduardo; Poncela Blanco, Maria Pilar; Ruiz Ortega, Esther; Universidad Carlos III de Madrid. Departamento de Estadística; Ministerio de Ciencia y Tecnología (España)
    In economics, Principal Components, its generalized version that takes heteroscedasticity into account, and Kalman filter and smoothing procedures are among the most popular tools for factor extraction in the context of Dynamic Factor Models. This paper analyses the consequences for point and interval factor estimation of using these procedures when the idiosyncratic components are wrongly assumed to be cross-sectionally uncorrelated. We show that not taking into account the presence of cross-sectional dependence increases the uncertainty of point estimates of the factors. Furthermore, the Mean Square Errors computed using the usual expressions based on asymptotic approximations are underestimated and may lead to prediction intervals with extremely low coverage.
  • Publication
    Measuring efficiency of Peruvian universities: a stochastic frontier analysis
    (2023-01-10) Orosco Gavilán, Juan Carlos; Lopes Moreira Da Veiga, María Helena; Wiper, Michael Peter; Universidad Carlos III de Madrid. Departamento de Estadística
    In comparison with other regions such as Europe or the USA, there have been relatively few studies of efficiency in the higher education sector in South America. The main objective of this paper is to examine the teaching efficiency of Peruvian public universities over the period 2011-2018, using stochastic frontier analysis. Our results suggest that efficiency depends both on how long the university has been operating and on its scientific production. We also show that the majority of the universities studied maintain their efficiency levels over time, whereas most of the young universities started off very inefficient but have improved their efficiency over time.
  • Publication
    A Neural Network-Based Distributional Constraint Learning Methodology for Mixed-Integer Stochastic Optimization
    (2022-11-21) Alcántara Mata, Antonio; Ruiz Mora, Carlos; Universidad Carlos III de Madrid. Departamento de Estadística
    The use of machine learning methods helps to improve decision making in different fields. In particular, the idea of bridging predictions (machine learning models) and prescriptions (optimization problems) is gaining attention within the scientific community. One of the main ideas to address this trade-off is the so-called Constraint Learning (CL) methodology, where the structure of the machine learning model can be treated as a set of constraints to be embedded within the optimization problem, establishing the relationship between a direct decision variable x and a response variable y. However, most CL approaches have focused on making point predictions for a certain variable, not taking into account the statistical and external uncertainty faced in the modeling process. In this paper, we extend the CL methodology to deal with uncertainty in the response variable y. The novel Distributional Constraint Learning (DCL) methodology makes use of a piece-wise linearizable neural network-based model to estimate the parameters of the conditional distribution of y (dependent on decisions x and contextual information), which can be embedded within mixed-integer optimization problems. In particular, we formulate a stochastic optimization problem by sampling random values from the estimated distribution through a linear set of constraints. In this sense, DCL combines both the high predictive performance of the neural network method and the possibility of generating scenarios to account for uncertainty within a tractable optimization model. The behavior of the proposed methodology is tested in a real-world problem in the context of electricity systems, where a Virtual Power Plant seeks to optimize its operation, subject to different forms of uncertainty, and with price-responsive consumers.
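
    The sketch below covers only the distributional-learning half of the DCL idea: a small ReLU network (hence piecewise-linearizable) maps decisions and context to the mean and standard deviation of the response, and scenarios are then sampled from the estimated conditional distribution. Embedding the trained network as constraints of a mixed-integer program is not shown, and the data and architecture are toy choices.
```python
# A sketch of the distributional-learning half of DCL: a small ReLU network (hence
# piecewise-linearizable) maps decisions and context to the mean and standard
# deviation of the response, and scenarios are sampled from the estimated
# conditional distribution.  Embedding the trained network as constraints of a
# mixed-integer program is not shown; data and architecture are toy choices.
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.rand(1000, 2) * 2                               # decisions plus context (toy)
y = 3 * x[:, :1] - x[:, 1:] + (0.2 + 0.3 * x[:, :1]) * torch.randn(1000, 1)

net = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))   # outputs (mu, raw sigma)
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
for _ in range(500):
    out = net(x)
    mu, sigma = out[:, :1], nn.functional.softplus(out[:, 1:]) + 1e-3
    loss = (0.5 * ((y - mu) / sigma) ** 2 + torch.log(sigma)).mean()  # Gaussian NLL
    opt.zero_grad()
    loss.backward()
    opt.step()

# Sample response scenarios for a candidate decision/context pair, to be used as
# stochastic inputs of a downstream optimization problem.
with torch.no_grad():
    out = net(torch.tensor([[1.0, 0.5]]))
    mu, sigma = out[:, :1], nn.functional.softplus(out[:, 1:]) + 1e-3
    scenarios = mu + sigma * torch.randn(200, 1)
print(scenarios.mean().item(), scenarios.std().item())
```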
  • Publication
    Contagion in sequential financial markets: an experimental analysis
    (2022-10-04) Peeters, Ronald; Lopes Moreira Da Veiga, María Helena; Vorstaz, Marc; Universidad Carlos III de Madrid. Departamento de Estadística; Ministerio de Ciencia e Innovación (España)
    Within an experimental financial market, we study how information about the true dividend of an asset, which is available to some traders, is absorbed in the asset’s price when all traders have access to prices of another different asset. We consider two treatments: in one, the dividends of the two assets are independent; in the other, the dividend of the own asset depends positively on the dividend of the other asset. Since there is no aggregate uncertainty in the own market, observed prices in the other market should not affect own prices according to the rational expectations equilibrium. We find that own prices reasonably converge in both treatments towards the rational expectations equilibrium if the dividend of the own asset is high. In contrast, if the dividend of the own asset is low, we find that own prices are substantially higher (and therefore further away from rational expectations equilibrium) when asset prices are correlated. The prior information equilibrium predicts this treatment effect. Hence, a correlated asset structure can potentially obstruct the information transmission from the informed to the uninformed traders.
  • Publication
    Prescriptive selection of machine learning hyperparameters with applications in power markets: retailer's optimal trading
    (2022-10-03) Corredera, Alberto; Ruiz Mora, Carlos; Universidad Carlos III de Madrid. Departamento de Estadística
    We present a data-driven framework for optimal scenario selection in stochastic optimization with applications in power markets. The proposed methodology relies on the existence of auxiliary information and the use of machine learning techniques to narrow the set of possible realizations (scenarios) of the variables of interest. In particular, we implement a novel validation algorithm that allows optimizing each machine learning hyperparameter to further improve the prescriptive power of the resulting set of scenarios. Supervised machine learning techniques are examined, including kNN and decision trees, and the validation process is adapted to work with time-dependent datasets. Moreover, we extend the proposed methodology to work with unsupervised techniques, with promising results. We test the proposed methodology in a realistic power market application: the optimal trading strategy in forward and spot markets for an electricity retailer under uncertain spot prices. Results indicate that the retailer can greatly benefit from the proposed data-driven methodology and improve its market performance. Moreover, we perform an extensive set of numerical simulations to analyze under which conditions the best machine learning hyperparameters, in terms of prescriptive performance, differ from those that provide the best predictive accuracy.
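
    A small sketch of the scenario-selection idea follows: given auxiliary information for the target day, the k nearest historical days are retrieved and their observed spot prices are used as scenarios for a simple forward-versus-spot purchasing decision. The data, the decision rule, and the value of k are illustrative; the paper tunes such hyperparameters by their prescriptive performance.
```python
# A small sketch of kNN-based scenario selection: pick the k historical days whose
# auxiliary information is closest to today's, and use their observed spot prices
# as scenarios for a simple forward-versus-spot purchase decision.  Data, the
# decision rule and k are illustrative; the paper tunes such hyperparameters by
# prescriptive (decision) performance rather than predictive accuracy.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(7)
aux_hist = rng.normal(size=(500, 3))                       # auxiliary features (e.g., forecasts)
spot_hist = 50 + 10 * aux_hist[:, 0] + rng.normal(0, 5, 500)

aux_today = np.array([[0.8, -0.2, 0.1]])
k = 25
idx = NearestNeighbors(n_neighbors=k).fit(aux_hist).kneighbors(aux_today, return_distance=False)[0]
scenarios = spot_hist[idx]                                 # data-driven price scenarios

demand, p_forward = 100.0, 52.0                            # MWh to serve and forward price
buy_forward = p_forward <= scenarios.mean()                # buy forward if it beats expected spot
cost = demand * (p_forward if buy_forward else scenarios.mean())
print(f"buy forward: {buy_forward}, expected procurement cost: {cost:.1f}")
```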
  • Publication
    Multivariate Functional Outlier Detection using the FastMUOD Indices
    (2022-09-09) Ojo, Oluwasegun Taiwo; Fernández Anta, Antonio; Genton, Marc G.; Lillo Rodríguez, Rosa Elvira; Universidad Carlos III de Madrid. Departamento de Estadística
    We present definitions and properties of the fast massive unsupervised outlier detection (FastMUOD) indices, used for outlier detection (OD) in functional data. FastMUOD detects outliers by computing, for each curve, an amplitude, magnitude and shape index meant to target the corresponding types of outliers. Some methods adapting FastMUOD to outlier detection in multivariate functional data are then proposed. These include applying FastMUOD on the components of the multivariate data and using random projections. Moreover, these techniques are tested on various simulated and real multivariate functional datasets. Compared with the state of the art in multivariate functional OD, the use of random projections showed the most effective results with similar, and in some cases improved, OD performance.
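
    The following computation is in the spirit of the FastMUOD indices: each curve is related to a reference curve (here the point-wise mean), and shape, amplitude, and magnitude indices are derived from the correlation, slope, and intercept of that relation. The exact definitions, cut-offs, and multivariate extensions (component-wise application and random projections) follow the paper and are not reproduced here.
```python
# An illustrative computation in the spirit of the FastMUOD indices: each curve is
# related to a reference curve (here the point-wise mean), and shape, amplitude and
# magnitude indices are derived from the correlation, slope and intercept of that
# relation.  Cut-offs and the multivariate extensions (component-wise application,
# random projections) are not reproduced; the exact definitions follow the paper.
import numpy as np

rng = np.random.default_rng(8)
t = np.linspace(0, 1, 100)
curves = np.sin(2 * np.pi * t) + rng.normal(0, 0.1, size=(50, t.size))
curves[0] += 3.0                          # magnitude outlier
curves[1] *= 2.5                          # amplitude outlier
curves[2] = np.cos(2 * np.pi * t)         # shape outlier

ref = curves.mean(axis=0)                 # reference curve
shape, amplitude, magnitude = [], [], []
for y in curves:
    r = np.corrcoef(y, ref)[0, 1]
    beta = np.cov(y, ref, bias=True)[0, 1] / np.var(ref)
    alpha = y.mean() - beta * ref.mean()
    shape.append(1 - r)                   # low correlation with the reference
    amplitude.append(abs(beta - 1))       # slope far from one
    magnitude.append(abs(alpha))          # large vertical shift

for name, index in [("shape", shape), ("amplitude", amplitude), ("magnitude", magnitude)]:
    print(name, "index flags curve", int(np.argmax(index)))
```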
  • Publication
    Revisiting Granger Causality of CO2 on Global Warming: a Quantile Factor Approach
    (2013-07-22) Chen, Liang; Dolado, Juan José; Gonzalo, Jesús; Ramos Ramirez, Andrey David; Universidad Carlos III de Madrid. Departamento de Estadística; Ministerio de Economía y Competitividad (España); Comunidad de Madrid
    The relationship between global warming and CO2 is a long-standing question in the climate change literature. In this paper we revisit this topic through the lens of a new class of factor models for high-dimensional panel data, labeled Quantile Factor Models (QFM). This technique allows us to extract quantile-dependent factors from the distributions of changes in temperatures across a wide range of stable weather stations in the Northern and Southern Hemispheres over a century (1917-2018). In particular, we test whether CO2 emissions/concentrations Granger-cause the underlying factors of the different quantiles of the distribution of changes in temperature, and find that they exhibit much higher predictive power on large negative and medium changes (lower and middle quantiles) than on large positive changes (upper quantiles). These findings are novel in this literature and complement recent results by Gadea and Gonzalo (2020), who document the existence of steeper trends in lower temperature levels than in other parts of the distribution.
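
    The sketch below illustrates only the final testing step: a factor is extracted from a simulated temperature panel (an ordinary principal component standing in for the quantile-dependent factors of a QFM) and a Granger-causality test of a CO2 series on that factor is run.
```python
# A sketch of the final testing step only: a factor is extracted from a simulated
# temperature panel (an ordinary principal component stands in for the
# quantile-dependent factors of a QFM), and a Granger-causality test of a CO2
# series on that factor is run.  All series are simulated placeholders.
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(9)
T, N = 100, 60                                     # years x weather stations (toy sizes)
co2 = np.cumsum(rng.normal(0.3, 0.5, T))           # upward-trending forcing series
factor_true = 0.05 * np.roll(co2, 1) + rng.normal(0, 0.3, T)
panel = np.outer(factor_true, rng.uniform(0.5, 1.5, N)) + rng.normal(0, 0.5, (T, N))

factor_hat = PCA(n_components=1).fit_transform(panel - panel.mean(0)).ravel()

# First column: the variable being predicted (the factor); second: the candidate cause.
data = pd.DataFrame({"factor": factor_hat, "co2": co2}).diff().dropna()
results = grangercausalitytests(data[["factor", "co2"]], maxlag=2, verbose=False)
print({lag: round(r[0]["ssr_ftest"][1], 4) for lag, r in results.items()})
```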