Time series segmentation procedures to detect, locate and estimate change-points

This thesis deals with the problem of modeling a univariate nonstationary time series by a set of approximately stationary processes. The observed period is segmented into intervals, also called partitions, blocks or segments, in which the time series behaves as approximately stationary. By segmenting a time series, we aim to obtain the periods of stability and homogeneity in the behavior of the process; identify the moments of change, called change-points; represent the regularities and features of each block; and use this information to determine the pattern in the nonstationary time series. When the time series exhibits multiple change-points, detecting, locating and estimating them efficiently is a more intricate and difficult problem. The main goal of the thesis is therefore to describe, compare on simulated data, and apply to real data a number of segmentation and change-point detection procedures. These procedures combine different types of statistics, which indicate when the data exhibit a potential break, with search algorithms that locate multiple changes in pattern.
The thesis is structured in five chapters. Chapter 1 introduces the main concepts involved in the segmentation problem in the context of time series. First, a summary of the main statistics used to detect a single change-point is presented. Second, we review the search algorithms for multiple change-points available in the literature and the linear models for representing time series, in both the parametric and the non-parametric approach. Third, we introduce locally stationary and piecewise stationary processes. Finally, we show examples of piecewise and locally stationary time series, both simulated and real, in which change-point detection and segmentation appear to be important.
Chapter 2 deals with the problem of detecting, locating and estimating single or multiple changes in the parameters of a stationary process.
We consider changes in the marginal mean, in the marginal variance, and in both the mean and the variance, for both uncorrelated and serially correlated processes. The main contributions of this chapter are: a) introducing a modification of the theoretical model proposed by Al Ibrahim et al. (2003) that is useful for detecting changes in the mean and the autoregressive coefficients of piecewise autoregressive processes, using a procedure based on the Bayesian information criterion (BIC), where we also allow for changes in the variance of the perturbation term; b) comparing this procedure with several procedures available in the literature, which are based on cusum methods (Inclán and Tiao (1994), Lee et al. (2003)), the minimum description length principle (Davis et al. (2006)), the time-varying spectrum (Ombao et al. (2002)) and the likelihood ratio test (Killick et al. (2012)), by computing their empirical size and power in several scenarios; and c) applying them to neurology and speech recognition datasets.
Chapter 3 studies processes with constant conditional mean and dynamic behavior in the conditional variance, which are also affected by structural changes. The goal is to explore, analyse and apply change-point detection and estimation methods when the conditional variance of a univariate process is heteroskedastic and exhibits change-points. Procedures based on the informational approach, cusum statistics, minimum description length and the spectrum, all assuming a heteroskedastic time series, are presented. We propose a method to detect and locate change-points using the BIC, as an extension of its application in linear models. We compare the size and power of the procedures presented in single and multiple change-point scenarios and illustrate their performance with the S&P 500 returns.
Chapter 4 analyses the problem of detecting and estimating smooth change-points in the data, where the Linear Trend Change-Point (LTCP) model is used to represent a smooth change. We propose a procedure based on the Bayesian information criterion to distinguish a smooth change-point from an abrupt one. The likelihood function of the LTCP model is obtained, as well as the conditional maximum likelihood estimator of the model parameters. The proposed procedure is compared with outlier analysis techniques (Fox (1972), Chang (1982), Chen and Liu (1993), Kaiser (1999), among others) through simulation experiments. We also present an iterative procedure to detect multiple smooth and abrupt change-points. This procedure is illustrated with the number of deaths in traffic accidents on Spanish motorways. Finally, Chapter 5 summarizes the main results of the thesis and proposes some extensions for future research.
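As an illustration of the BIC-based comparison that recurs in Chapters 2 to 4, the following is a minimal sketch of how a single change in mean and variance could be located by comparing a "no change" model with a "one change" model. It is not the procedure developed in the thesis: it assumes independent Gaussian observations rather than piecewise autoregressive or heteroskedastic processes, and the function names, the minimum segment length `min_len` and the choice to count the change-point location as a fifth parameter are all assumptions made for this example.

```python
import math
import random


def gaussian_neg2_loglik(segment):
    """Return -2 times the maximized Gaussian log-likelihood of an i.i.d. segment."""
    n = len(segment)
    mean = sum(segment) / n
    var = sum((x - mean) ** 2 for x in segment) / n  # maximum likelihood variance
    var = max(var, 1e-12)  # guard against log(0) on constant segments
    return n * (math.log(2 * math.pi * var) + 1)


def bic_single_change(series, min_len=5):
    """Compare 'no change' with 'one change in mean and variance' using the BIC.

    Returns (change_index or None, bic_no_change, best_bic_one_change).
    A change_index of k means the second segment starts at series[k].
    """
    n = len(series)
    # No-change model: 2 parameters (mean, variance).
    bic0 = gaussian_neg2_loglik(series) + 2 * math.log(n)
    best_k, best_bic = None, math.inf
    for k in range(min_len, n - min_len + 1):
        # One-change model: 2 means + 2 variances + the location
        # (counting the location as a parameter is an assumption here).
        bic1 = (gaussian_neg2_loglik(series[:k])
                + gaussian_neg2_loglik(series[k:])
                + 5 * math.log(n))
        if bic1 < best_bic:
            best_k, best_bic = k, bic1
    if best_bic < bic0:
        return best_k, bic0, best_bic
    return None, bic0, best_bic


if __name__ == "__main__":
    # Hypothetical simulated series: the mean shifts from 0 to 3 at index 100.
    rng = random.Random(0)
    series = ([rng.gauss(0, 1) for _ in range(100)]
              + [rng.gauss(3, 1) for _ in range(100)])
    k, bic0, bic1 = bic_single_change(series)
    print("estimated change-point:", k)
    print("BIC without change:", round(bic0, 1), "BIC with one change:", round(bic1, 1))
```

A change-point is reported only when the best two-segment model has a lower BIC than the single-segment model. Multiple change-points could then be sought by applying the same comparison recursively to each resulting segment, which is one of several possible search strategies.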
Time series analysis, Probability