On optimal multiple changepoint algorithms for large data
TL;DR: Empirical results show that FPOP is substantially faster than existing dynamic programming methods, and unlike the existing methods its computational efficiency is robust to the number of changepoints in the data.
Abstract: Many common approaches to detecting changepoints, for example based on statistical criteria such as penalised likelihood or minimum description length, can be formulated in terms of minimising a cost over segmentations. We focus on a class of dynamic programming algorithms that can solve the resulting minimisation problem exactly, and thus find the optimal segmentation under the given statistical criteria. Standard implementations of these dynamic programming methods have a computational cost that scales at least quadratically in the length of the time series. Recently, pruning ideas have been suggested that can speed up the dynamic programming algorithms whilst still guaranteeing optimality, in that they find the true minimum of the cost function. Here we extend these pruning methods and introduce two new algorithms for segmenting data: FPOP and SNIP. Empirical results show that FPOP is substantially faster than existing dynamic programming methods and that, unlike the existing methods, its computational efficiency is robust to the number of changepoints in the data. We evaluate the method for detecting copy number variations and observe that FPOP has a computational cost that is competitive even with that of binary segmentation, while giving much more accurate segmentations.
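To make the idea of "minimising a cost over segmentations" concrete, here is a minimal sketch of the kind of dynamic programme the abstract describes. It is not FPOP or SNIP. It assumes a change-in-mean model with squared-error segment cost and a fixed penalty `beta` per segment, and it uses inequality-based pruning of candidate changepoints in the style of PELT, which is taken here as one example of the earlier pruning ideas mentioned above. The function name `optimal_partitioning` and all parameter names are hypothetical.

```python
def optimal_partitioning(y, beta, prune=True):
    """Exact penalised segmentation by dynamic programming (illustrative sketch).

    Assumptions (not from the paper): squared-error cost around the segment
    mean, penalty `beta` per segment, PELT-style inequality pruning.
    Returns (minimum penalised cost, sorted list of changepoint positions).
    """
    n = len(y)
    # Prefix sums so that any segment cost is available in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(y):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def cost(a, b):
        # Residual sum of squares of y[a:b] around its own mean (requires a < b).
        d = s1[b] - s1[a]
        return (s2[b] - s2[a]) - d * d / (b - a)

    F = [0.0] * (n + 1)   # F[t]: optimal penalised cost of y[:t]
    F[0] = -beta          # so that k changepoints are charged k * beta in total
    last = [0] * (n + 1)  # last[t]: most recent changepoint before t
    candidates = [0]
    tol = 1e-9            # guards the pruning test against rounding error

    for t in range(1, n + 1):
        vals = [(F[s] + cost(s, t) + beta, s) for s in candidates]
        F[t], last[t] = min(vals)
        if prune:
            # Keep s only if F[s] + cost(s, t) <= F[t]; otherwise s can never
            # be the optimal last changepoint at any later time.
            candidates = [s for v, s in vals if v - beta <= F[t] + tol]
        candidates.append(t)

    # Backtrack through `last` to recover the changepoints.
    cps = []
    t = n
    while last[t] > 0:
        t = last[t]
        cps.append(t)
    return F[n], sorted(cps)


# Noise-free toy series with mean shifts after positions 20 and 40.
y = [0.0] * 20 + [5.0] * 20 + [0.0] * 20
print(optimal_partitioning(y, beta=5.0))
```

With `prune=False` the loop considers every earlier position at each step, which gives the quadratic cost the abstract refers to. With `prune=True` the candidate set shrinks whenever a change is detected, so the saving depends on how many changepoints the data contain.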
Citations
Selective review of offline change point detection methods
TL;DR: In this article, the authors present a selective survey of algorithms for the offline detection of multiple change points in multivariate time series, and a general yet structuring methodological strategy is adopted to organize this vast body of work.
953
Changepoint Detection in the Presence of Outliers
Paul Fearnhead, Guillem Rigaill +1 more
TL;DR: It is argued that loss functions that are bounded, such as the classical biweight loss, are particularly suitable for changepoint detection—as it is shown that only bounded loss functions are robust to arbitrarily extreme outliers.
152
Computationally efficient changepoint detection for a range of penalties
TL;DR: This work presents a method that enables us to find the solution path for all choices of penalty values across a continuous range and permits an evaluation of the various segmentations to identify a suitable penalty choice.
An Evaluation of Change Point Detection Algorithms.
TL;DR: This study shows that binary segmentation and Bayesian online change point detection are among the best performing methods.
122
Univariate Mean Change Point Detection: Penalization, CUSUM and Optimality.
TL;DR: It is demonstrated that two computationally-efficient change point estimators, one based on the solution to an $\ell_0$-penalized least squares problem and the other on the popular wild binary segmentation algorithm, are both consistent and achieve a localization rate of the order $\frac{\sigma^2}{\kappa^2} \log(n)$.
References
A new look at the statistical model identification
TL;DR: In this article, a new estimate, the minimum information theoretical criterion (AIC) estimate (MAICE), is introduced for the purpose of statistical identification; it is free from the ambiguities inherent in the application of conventional hypothesis testing procedures.
Estimating the dimension of a model
Gideon Schwarz
- 01 Jan 2005
TL;DR: In this paper, the problem of selecting one of a number of models of different dimensions is treated by finding its Bayes solution, and evaluating the leading terms of its asymptotic expansion.
40.6K
A Cluster Analysis Method for Grouping Means in the Analysis of Variance
A. J. Scott, M. Knott +1 more
TL;DR: In this paper, the authors used the techniques of cluster analysis to split the treatments into reasonably homogeneous groups and developed a likelihood ratio test for judging the significance of differences among the resulting groups.
3K
Circular binary segmentation for the analysis of array-based DNA copy number data.
TL;DR: A modification of binary segmentation, called circular binary segmentation, is developed to translate noisy intensity measurements into regions of equal DNA copy number.