Seeded intervals and noise level estimation in change point detection: a discussion of Fryzlewicz (2020)
ISSN: 1226-3192; ISSN: 2005-2863
Optimistic Search: Change Point Estimation for Large-scale Data via Adaptive Logarithmic Queries
Model-based Boosting 2.0
We describe version 2.0 of the R add-on package mboost. The package implements boosting for optimizing general risk functions using component-wise (penalized) least squares estimates or regression trees as base-learners for fitting generalized linear, additive and interaction models to potentially high-dimensional data.
Seeded Binary Segmentation: A general methodology for fast and optimal change point detection
In recent years, there has been an increasing demand for efficient algorithms
for large scale change point detection problems. To this end, we propose seeded
binary segmentation, an approach relying on a deterministic construction of
background intervals, called seeded intervals, in which single change points
are searched. The final selection of change points based on the candidates from
seeded intervals can be done in various ways, adapted to the problem at hand.
Thus, seeded binary segmentation is easy to adapt to a wide range of change
point detection problems, whether univariate, multivariate or even
high-dimensional.
We consider the univariate Gaussian change in mean setup in detail. For this
specific case we show that seeded binary segmentation leads to a near-linear
time approach (i.e. linear up to a logarithmic factor) independent of the
underlying number of change points. Furthermore, using appropriate selection
methods, the methodology is shown to be asymptotically minimax optimal. While
computationally more efficient, the finite sample estimation performance
remains competitive compared to state of the art procedures. Moreover, we
illustrate the methodology for high-dimensional settings with an inverse
covariance change point detection problem where our proposal leads to massive
computational gains while still exhibiting good statistical performance.
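The deterministic interval construction described in this abstract can be illustrated with a minimal Python sketch. It assumes a layered scheme in which layer k holds evenly shifted intervals of length roughly T * decay**(k-1), and it scores each split with the standard CUSUM statistic for a change in mean. The function names, the `decay` and `min_length` parameters, and the exact rounding are illustrative assumptions, not taken from the abstract, and the final selection step is left out.

```python
import math


def seeded_intervals(T, decay=0.5, min_length=2):
    """Layered intervals (start, end] on 0..T (assumed construction)."""
    intervals = {(0, T)}  # layer 1: the whole series
    n_layers = math.ceil(math.log(T, 1.0 / decay))
    for k in range(2, n_layers + 1):
        length = T * decay ** (k - 1)
        if length < min_length:
            break
        n_k = 2 * math.ceil((1.0 / decay) ** (k - 1)) - 1  # intervals in layer k
        shift = (T - length) / (n_k - 1)  # even spacing of the left end points
        for i in range(n_k):
            start = math.floor(i * shift)
            end = min(T, math.ceil(i * shift + length))
            intervals.add((start, end))
    return sorted(intervals)


def best_split(prefix, s, e):
    """Best CUSUM split b in (s, e): compares mean of x[s:b] with mean of x[b:e]."""
    best_b, best_stat = s + 1, -1.0
    for b in range(s + 1, e):
        left_mean = (prefix[b] - prefix[s]) / (b - s)
        right_mean = (prefix[e] - prefix[b]) / (e - b)
        stat = math.sqrt((b - s) * (e - b) / (e - s)) * abs(left_mean - right_mean)
        if stat > best_stat:
            best_b, best_stat = b, stat
    return best_b, best_stat


def candidates(x, decay=0.5):
    """One candidate (stat, split, start, end) per seeded interval, strongest first."""
    prefix = [0.0]
    for v in x:
        prefix.append(prefix[-1] + v)
    out = []
    for s, e in seeded_intervals(len(x), decay):
        b, stat = best_split(prefix, s, e)
        out.append((stat, b, s, e))
    return sorted(out, reverse=True)


# Noise-free toy series with a single jump after the 50th observation.
x = [0.0] * 50 + [1.0] * 50
top_stat, top_split, _, _ = candidates(x)[0]
```

Because the total length of the intervals in each layer is a bounded multiple of T and there are about log T layers, scanning all of them costs on the order of T log T operations in this sketch, in line with the near-linear running time claimed above. A complete procedure would still need a selection rule (for example, greedy selection above a threshold) to turn the candidates into a final set of change points.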
Optimistic search strategy: Change point detection for large-scale data via adaptive logarithmic queries
As a classical and ever reviving topic, change point detection is often formulated as a search for the maximum of a gain function describing improved fits when segmenting the data. Searching through all candidate split points on the grid for finding the best one requires $O(T)$ evaluations of the gain function for an interval with $T$ observations. If each evaluation is computationally demanding (e.g. in high-dimensional models), this can become infeasible. Instead, we propose optimistic search strategies with $O(\log T)$ evaluations exploiting specific structure of the gain function. Towards solid understanding of our strategies, we investigate in detail the classical univariate Gaussian change in mean setup. For some of our proposals we prove asymptotic minimax optimality for single and multiple change point scenarios. Our search strategies generalize far beyond the theoretically analyzed univariate setup. We illustrate, as an example, massive computational speedup in change point detection for high-dimensional Gaussian graphical models. More generally, we demonstrate empirically that our optimistic search methods lead to competitive estimation performance while heavily reducing run-time.
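To make the contrast between a full grid search and a logarithmic number of gain evaluations concrete, here is a small Python sketch. It is not the authors' optimistic search. It substitutes a plain integer ternary search, which is only guaranteed to find the maximizer when the gain is strictly unimodal, as it is for the CUSUM gain of a noise-free series with a single jump. All names (`make_gain`, `ternary_split_search`) are hypothetical.

```python
import math


def make_gain(x):
    """CUSUM gain for splitting x into x[:b] and x[b:], plus an evaluation counter."""
    prefix = [0.0]
    for v in x:
        prefix.append(prefix[-1] + v)
    T = len(x)
    calls = {"n": 0}

    def gain(b):
        calls["n"] += 1
        left_mean = prefix[b] / b
        right_mean = (prefix[T] - prefix[b]) / (T - b)
        return math.sqrt(b * (T - b) / T) * abs(left_mean - right_mean)

    return gain, calls


def ternary_split_search(gain, T):
    """Locate the maximizer of a unimodal gain on 1..T-1 with O(log T) evaluations."""
    lo, hi = 1, T - 1
    while hi - lo >= 3:
        third = (hi - lo) // 3
        m1, m2 = lo + third, hi - third
        if gain(m1) < gain(m2):
            lo = m1 + 1  # the maximizer lies to the right of m1
        else:
            hi = m2 - 1  # the maximizer lies to the left of m2
    return max(range(lo, hi + 1), key=gain)  # brute force the few remaining points


# Noise-free toy series with a single jump after the 700th observation.
x = [0.0] * 700 + [2.0] * 300
gain, calls = make_gain(x)
split = ternary_split_search(gain, len(x))  # 700 for this series
```

On this series the search uses a few dozen gain evaluations instead of the 999 needed by a full scan. With noisy data or several change points the gain is no longer unimodal, so a search of this kind becomes a heuristic, which is the regime in which the abstract's theoretical analysis matters.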
