TL;DR: Novel algorithms are suggested which are, in senses to be made precise, provably stable and yet designed to avoid the degree of interaction which hinders parallelization of standard algorithms.
Abstract: We introduce a general form of sequential Monte Carlo algorithm defined in terms of a parameterized resampling mechanism. We find that a suitably generalized notion of the Effective Sample Size (ESS), widely used to monitor algorithm degeneracy, appears naturally in a study of its convergence properties. We are then able to phrase sufficient conditions for time-uniform convergence in terms of algorithmic control of the ESS, in turn achievable by adaptively modulating the interaction between particles. This leads us to suggest novel algorithms which are, in senses to be made precise, provably stable and yet designed to avoid the degree of interaction which hinders parallelization of standard algorithms. As a byproduct, we prove time-uniform convergence of the popular adaptive resampling particle filter.
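As a rough illustration of ESS-controlled interaction, the sketch below uses the standard ESS, $(\sum_i w_i)^2/\sum_i w_i^2$, not the generalized notion studied in the paper, together with a plain multinomial resampling step that is triggered only when the ESS drops below a fraction of the particle count. The function names and the `threshold` parameter are hypothetical choices made for this example.

```python
import random


def effective_sample_size(weights):
    """Standard ESS: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def adaptive_resample(particles, weights, threshold=0.5, rng=random):
    """Resample only when the ESS falls below threshold * N.

    `threshold` is an illustrative tuning parameter, not a value
    taken from the paper.
    """
    n = len(particles)
    if effective_sample_size(weights) >= threshold * n:
        # Weights are still balanced enough: no interaction needed.
        return particles, weights
    # Multinomial resampling, then reset to uniform weights.
    resampled = rng.choices(particles, weights=weights, k=n)
    return resampled, [1.0] * n
```

With uniform weights the ESS equals the number of particles and no resampling occurs; with all mass on one particle the ESS equals 1 and every particle is replaced by a copy of it.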
TL;DR: In this paper, the multiplier resampling scheme proposed by Bucher and Ruppert (2013) is generalized along two directions: it is made genuinely sequential, which allows all existing multiplier tests on the unknown copula (including nonparametric tests for change-point detection) to be transposed to the strongly mixing setting, and it is made fully automatic through a data-adaptive choice of the bandwidth parameter.
Abstract: Two key ingredients to carry out inference on the copula of multivariate observations are the empirical copula process and an appropriate resampling scheme for the latter. Among the existing techniques used for i.i.d. observations, the multiplier bootstrap of Remillard and Scaillet (2009) frequently appears to lead to inference procedures with the best finite-sample properties. Bucher and Ruppert (2013) recently proposed an extension of this technique to strictly stationary strongly mixing observations by adapting the dependent multiplier bootstrap of Buhlmann (1993, Section 3.3) to the empirical copula process. The main contribution of this work is a generalization of the multiplier resampling scheme proposed by Bucher and Ruppert (2013) along two directions. First, the resampling scheme is now genuinely sequential, thereby allowing one to transpose to the strongly mixing setting all of the existing multiplier tests on the unknown copula, including nonparametric tests for change-point detection. Second, the resampling scheme is now fully automatic as a data-adaptive procedure is proposed which can be used to estimate the bandwidth (block length) parameter. A simulation study is used to investigate the finite-sample performance of the resampling scheme and provides suggestions on how to choose several additional parameters. As by-products of this work, the weak convergence of the sequential empirical copula process is obtained under many serial dependence conditions, and the validity of a sequential version of the dependent multiplier bootstrap for empirical processes of Buhlmann is obtained under weaker conditions on the strong mixing coefficients and the multipliers.
TL;DR: The main result of this paper states that the new algorithm, appropriately rescaled, converges weakly to a second order Langevin diffusion on Hilbert space; as a consequence the algorithm explores the approximate target measures on R^N in a number of steps which is independent of N.
Abstract: We describe a new MCMC method optimized for the sampling of probability measures on Hilbert space which have a density with respect to a Gaussian; such measures arise in the Bayesian approach to inverse problems, and in conditioned diffusions. Our algorithm is based on two key design principles: (i) algorithms which are well defined in infinite dimensions result in methods which do not suffer from the curse of dimensionality when they are applied to approximations of the infinite dimensional target measure on R^N; (ii) nonreversible algorithms can have better mixing properties compared to their reversible counterparts. The method we introduce is based on the hybrid Monte Carlo algorithm, tailored to incorporate these two design principles. The main result of this paper states that the new algorithm, appropriately rescaled, converges weakly to a second order Langevin diffusion on Hilbert space; as a consequence the algorithm explores the approximate target measures on R^N in a number of steps which is independent of N. We also present the underlying theory for the limiting nonreversible diffusion on Hilbert space, including characterization of the invariant measure, and we describe numerical simulations demonstrating that the proposed method has favourable mixing properties as an MCMC algorithm.
TL;DR: In this article, the authors show that ridge regression is asymptotically minimax and derive new closed-form expressions for its asymptotic risk under squared-error loss.
Abstract: We study asymptotic minimax problems for estimating a $d$-dimensional regression parameter over spheres of growing dimension ($d\to\infty$). Assuming that the data follows a linear model with Gaussian predictors and errors, we show that ridge regression is asymptotically minimax and derive new closed form expressions for its asymptotic risk under squared-error loss. The asymptotic risk of ridge regression is closely related to the Stieltjes transform of the Marcenko–Pastur distribution and the spectral distribution of the predictors from the linear model. Adaptive ridge estimators are also proposed (which adapt to the unknown radius of the sphere) and connections with equivariant estimation are highlighted. Our results are mostly relevant for asymptotic settings where the number of observations, $n$, is proportional to the number of predictors, that is, $d/n\to\rho\in(0,\infty)$.
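For reference, the textbook form of the ridge estimator in a linear model $y=X\beta+\varepsilon$ is recalled below. The symbols $X$, $y$ and the tuning parameter $\lambda$ are notation assumed here; the abstract does not state how the paper parametrizes the penalty.

```latex
\[
\widehat{\beta}_{\lambda}
  = \operatorname*{arg\,min}_{b\in\mathbb{R}^{d}}
    \left\{ \Vert y - Xb\Vert_{2}^{2} + \lambda \Vert b\Vert_{2}^{2} \right\}
  = \left(X^{\top}X + \lambda I_{d}\right)^{-1} X^{\top} y,
  \qquad \lambda > 0.
\]
```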
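TL;DR: In this paper, the authors provide a detailed asymptotic analysis of a class of smoothed rank-based cross-periodograms associated with the copula spectral density kernels introduced in Dette et al. (2015), showing that properly scaled and centered versions converge weakly to Gaussian processes.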
Abstract: Quantile- and copula-related spectral concepts recently have been considered by various authors. Those spectra, in their most general form, provide a full characterization of the copulas associated with the pairs $(X_{t},X_{t-k})$ in a process $(X_{t})_{t\in\mathbb{Z}}$, and account for important dynamic features, such as changes in the conditional shape (skewness, kurtosis), time-irreversibility, or dependence in the extremes that their traditional counterparts cannot capture. Despite various proposals for estimation strategies, only quite incomplete asymptotic distributional results are available so far for the proposed estimators, which constitutes an important obstacle for their practical application. In this paper, we provide a detailed asymptotic analysis of a class of smoothed rank-based cross-periodograms associated with the copula spectral density kernels introduced in Dette et al. [Bernoulli 21 (2015) 781–831]. We show that, for a very general class of (possibly nonlinear) processes, properly scaled and centered smoothed versions of those cross-periodograms, indexed by couples of quantile levels, converge weakly, as stochastic processes, to Gaussian processes. A first application of those results is the construction of asymptotic confidence intervals for copula spectral density kernels. The same convergence results also provide asymptotic distributions (under serially dependent observations) for a new class of rank-based spectral methods involving the Fourier transforms of rank-based serial statistics such as the Spearman, Blomqvist or Gini autocovariance coefficients.
TL;DR: In this paper, two different approaches to stochastic integration in frictionless model free financial mathematics are presented: the first is in the spirit of Ito's integral and based on a certain topology induced by the outer measure corresponding to the minimal superhedging price; the second one is based on the controlled rough path integral.
Abstract: We present two different approaches to stochastic integration in frictionless model free financial mathematics. The first one is in the spirit of Ito's integral and based on a certain topology which is induced by the outer measure corresponding to the minimal superhedging price. The second one is based on the controlled rough path integral. We prove that every "typical price path" has a naturally associated Ito rough path, and justify the application of the controlled rough path integral in finance by showing that it is the limit of non-anticipating Riemann sums, a new result in itself. Compared to the first approach, rough paths have the disadvantage of severely restricting the space of integrands, but the advantage of being a Banach space theory. Both approaches are based entirely on financial arguments and do not require any probabilistic structure.
TL;DR: In this paper, the adaptive estimation of copula correlation matrix (Sigma) for the semi-parametric elliptical copula model is studied, where the correlations are connected to Kendall's tau through a sine function transformation.
Abstract: We study the adaptive estimation of copula correlation matrix $\Sigma$ for the semi-parametric elliptical copula model. In this context, the correlations are connected to Kendall’s tau through a sine function transformation. Hence, a natural estimate for $\Sigma$ is the plug-in estimator $\widehat{\Sigma}$ with Kendall’s tau statistic. We first obtain a sharp bound on the operator norm of $\widehat{\Sigma}-\Sigma$. Then we study a factor model of $\Sigma$, for which we propose a refined estimator $\widetilde{\Sigma}$ by fitting a low-rank matrix plus a diagonal matrix to $\widehat{\Sigma}$ using least squares with a nuclear norm penalty on the low-rank matrix. The bound on the operator norm of $\widehat{\Sigma}-\Sigma$ serves to scale the penalty term, and we obtain finite sample oracle inequalities for $\widetilde{\Sigma}$. We also consider an elementary factor copula model of $\Sigma$, for which we propose closed-form estimators. All of our estimation procedures are entirely data-driven.
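The plug-in step described above can be sketched as follows, assuming the usual sine transformation $\Sigma_{jk}=\sin(\frac{\pi}{2}\tau_{jk})$ for elliptical copulas. The function names are hypothetical, Kendall's tau is computed by a naive $O(n^2)$ pair count (the tau-a form, which ignores tie corrections), and the refined low-rank estimator $\widetilde{\Sigma}$ is not covered.

```python
import math


def kendall_tau(x, y):
    """Naive O(n^2) Kendall's tau: (concordant - discordant) / (n choose 2)."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            if prod > 0:
                score += 1
            elif prod < 0:
                score -= 1
    return 2.0 * score / (n * (n - 1))


def plug_in_correlation(columns):
    """Plug-in estimate: entry (j, k) is sin(pi/2 * tau_jk); diagonal is 1."""
    d = len(columns)
    sigma = [[1.0] * d for _ in range(d)]
    for j in range(d):
        for k in range(j + 1, d):
            tau = kendall_tau(columns[j], columns[k])
            sigma[j][k] = sigma[k][j] = math.sin(math.pi * tau / 2.0)
    return sigma
```

Each element of `columns` holds the observations of one coordinate; perfectly concordant coordinates give an entry of 1 and perfectly discordant ones give -1.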
TL;DR: In this paper, approximations to the excursion probability of a centered Gaussian random field on the $N$-dimensional unit sphere are obtained: in the locally isotropic, non-smooth case the asymptotics are similar to Pickands' approximation on the Euclidean space, and in the isotropic, twice differentiable case the expected Euler characteristic method yields a super-exponentially small error.
Abstract: Let $X=\{X(x)\colon\ x\in\mathbb{S}^{N}\}$ be a real-valued, centered Gaussian random field indexed on the $N$-dimensional unit sphere $\mathbb{S}^{N}$. Approximations to the excursion probability $\mathbb{P}\{\sup_{x\in\mathbb{S}^{N}}X(x)\ge u\}$, as $u\to\infty$, are obtained for two cases: (i) $X$ is locally isotropic and its sample functions are non-smooth; and (ii) $X$ is isotropic and its sample functions are twice differentiable. For case (i), the excursion probability can be studied by applying the results in Piterbarg (Asymptotic Methods in the Theory of Gaussian Processes and Fields (1996) Amer. Math. Soc.), Mikhaleva and Piterbarg (Theory Probab. Appl. 41 (1997) 367–379) and Chan and Lai (Ann. Probab. 34 (2006) 80–121). It is shown that the asymptotics of $\mathbb{P}\{\sup_{x\in\mathbb{S}^{N}}X(x)\ge u\}$ is similar to Pickands’ approximation on the Euclidean space which involves Pickands’ constant. For case (ii), we apply the expected Euler characteristic method to obtain a more precise approximation such that the error is super-exponentially small.
TL;DR: In this article, a numerical scheme for solving a dynamic programming equation with Malliavin weights arising from the time-discretization of backward stochastic differential equations with the integration by parts-representation of the Z-component was proposed.
Abstract: We design a numerical scheme for solving a Dynamic Programming equation with Malliavin weights arising from the time-discretization of backward stochastic differential equations with the integration by parts-representation of the Z-component by [Ma-Zhang 2002]. When the sequence of conditional expectations is computed using empirical least-squares regressions, we establish, under general conditions, tight error bounds as the time-average of local regression errors only (up to logarithmic factors). We compute the algorithm complexity by a suitable optimization of the parameters, depending on the dimension and the smoothness of value functions, in the limit as the number of grid times goes to infinity. The estimates take into account the regularity of the terminal function.
TL;DR: In this article, the authors introduce a framework for simulating finite dimensional representations of (jump) diffusion sample paths over finite intervals, without discretisation error (exactly), in such a way that the sample path can be restored at any finite collection of time points.
Abstract: This paper introduces a framework for simulating finite dimensional representations of (jump) diffusion sample paths over finite intervals, without discretisation error (exactly), in such a way that the sample path can be restored at any finite collection of time points. Within this framework we extend existing exact algorithms and introduce novel adaptive approaches. We consider an application of the methodology developed within this paper which allows the simulation of upper and lower bounding processes which almost surely constrain (jump) diffusion sample paths to any specified tolerance. We demonstrate the efficacy of our approach by showing that with finite computation it is possible to determine whether or not sample paths cross various irregular barriers, simulate to any specified tolerance the first hitting time of the irregular barrier and simulate killed diffusion sample paths.
TL;DR: In this paper, the authors proposed a plug-in method based on a deconvolution density estimator, which is minimax optimal under minimal and natural conditions, and obtained optimal adaptive estimation by a data-driven bandwidth choice.
Abstract: Quantile estimation in deconvolution problems is studied comprehensively. In particular, the more realistic setup of unknown error distributions is covered. Our plug-in method is based on a deconvolution density estimator and is minimax optimal under minimal and natural conditions. This closes an important gap in the literature. Optimal adaptive estimation is obtained by a data-driven bandwidth choice. As a side result, we obtain optimal rates for the plug-in estimation of distribution functions with unknown error distributions. The method is applied to a real data example.
TL;DR: In this article, the notions of scaling transition and distributional long-range dependence are introduced for stationary random fields on $\mathbb{Z}^{2}$ whose normalized partial sums on rectangles with sides growing at rates $O(n)$ and $O(n^{\gamma})$ tend to an operator scaling random field.
Abstract: We introduce the notions of scaling transition and distributional long-range dependence for stationary random fields $Y$ on $\mathbb{Z}^{2}$ whose normalized partial sums on rectangles with sides growing at rates $O(n)$ and $O(n^{\gamma})$ tend to an operator scaling random field $V_{\gamma}$ on $\mathbb{R}^{2}$, for any $\gamma>0$. The scaling transition is characterized by the fact that there exists a unique $\gamma_{0}>0$ such that the scaling limits $V_{\gamma}$ are different and do not depend on $\gamma$ for $\gamma>\gamma_{0}$ and $\gamma<\gamma_{0}$. The existence of scaling transition together with anisotropic and isotropic distributional long-range dependence properties is demonstrated for a class of $\alpha$-stable $(1<\alpha\le2)$ aggregated nearest-neighbor autoregressive random fields on $\mathbb{Z}^{2}$ with a scalar random coefficient $A$ having a regularly varying probability density near the “unit root” $A=1$.
TL;DR: In this article, the authors studied the asymptotic expansion in total variation in the central limit theorem when the law of the basic random variable is locally lower-bounded by the Lebesgue measure.
Abstract: The aim of this paper is to study the asymptotic expansion in total variation in the central limit theorem when the law of the basic random variable is locally lower-bounded by the Lebesgue measure (or equivalently, has an absolutely continuous component): we develop the error in powers of $n^{-1/2}$ and give an explicit formula for the approximating measure.
TL;DR: In this paper, it is shown that a stationary, stochastically continuous, sum- or max-i.d. random process on $\mathbb{R}^{d}$ can be generated by a measure-preserving flow on a $\sigma$-finite Borel measure space and that this flow is unique.
Abstract: Introduced is the notion of minimality for spectral representations of sum- and max-infinitely divisible processes and it is shown that the minimal spectral representation on a Borel space exists and is unique. This fact is used to show that a stationary, stochastically continuous, sum- or max-i.d. random process on $\mathbb{R}^{d}$ can be generated by a measure-preserving flow on a $\sigma$-finite Borel measure space and that this flow is unique. This development makes it possible to extend the classification program of Rosinski (Ann. Probab. 23 (1995) 1163–1187) with a unified treatment of both sum- and max-infinitely divisible processes. As a particular case, a characterization of stationary, stochastically continuous, union-infinitely divisible random measurable subsets of $\mathbb{R}^{d}$ is obtained. Introduced and classified are several new max-i.d. random field models including fields of Penrose type and fields associated to Poisson line processes.
TL;DR: In this article, the authors investigate the problem of optimal estimation in weak and strong topologies by choosing $\mathcal{F}$ to be a unit ball in a reproducing kernel Hilbert space, a choice that is of both theoretical and computational interest.
Abstract: Given random samples drawn i.i.d. from a probability measure $\mathbb{P}$ (defined on say, $\mathbb{R}^{d}$), it is well-known that the empirical estimator is an optimal estimator of $\mathbb{P}$ in weak topology but not even a consistent estimator of its density (if it exists) in the strong topology (induced by the total variation distance). On the other hand, various popular density estimators such as kernel and wavelet density estimators are optimal in the strong topology in the sense of achieving the minimax rate over all estimators for a Sobolev ball of densities. Recently, it has been shown in a series of papers by Gine and Nickl that these density estimators on $\mathbb{R}$ that are optimal in strong topology are also optimal in $\Vert\cdot\Vert_{\mathcal{F} }$ for certain choices of $\mathcal{F}$ such that $\Vert\cdot\Vert_{\mathcal{F} }$ metrizes the weak topology, where $\Vert\mathbb{P} \Vert_{\mathcal{F} }:=\sup\{\int f\,\mathrm{d}\mathbb{P} \colon\ f\in\mathcal{F} \}$. In this paper, we investigate this problem of optimal estimation in weak and strong topologies by choosing $\mathcal{F}$ to be a unit ball in a reproducing kernel Hilbert space (say $\mathcal{F}_{H}$ defined over $\mathbb{R}^{d}$), where this choice is both of theoretical and computational interest. Under some mild conditions on the reproducing kernel, we show that $\Vert\cdot\Vert_{\mathcal{F}_{H}}$ metrizes the weak topology and the kernel density estimator (with $L^{1}$ optimal bandwidth) estimates $\mathbb{P}$ at dimension independent optimal rate of $n^{-1/2}$ in $\Vert\cdot\Vert_{\mathcal{F}_{H}}$ along with providing a uniform central limit theorem for the kernel density estimator.
TL;DR: In this paper, the authors prove functional central and non-central limit theorems for generalized variations of the anisotropic d-parameter fractional Brownian sheet (fBs) for any natural number d.
Abstract: We prove functional central and non-central limit theorems for generalized variations of the anisotropic d-parameter fractional Brownian sheet (fBs) for any natural number d. Whether the central or the non-central limit theorem applies depends on the Hermite rank of the variation functional and on the smallest component of the Hurst parameter vector of the fBs. The limiting process in the former result is another fBs, independent of the original fBs, whereas the limit given by the latter result is an Hermite sheet, which is driven by the same white noise as the original fBs. As an application, we derive functional limit theorems for power variations of the fBs and discuss what is a proper way to interpolate them to ensure functional
TL;DR: In this article, the authors proposed an alternative class of stochastic volatility models with heavy-tailed volatilities and examined their extreme value behavior, which allows for a much more flexible extremal dependence between consecutive observations and can thus describe the observed clustering of financial returns more realistically.
Abstract: Stochastic volatility processes with heavy-tailed innovations are a well-known model for financial time series. In these models, the extremes of the log returns are mainly driven by the extremes of the i.i.d. innovation sequence which leads to a very strong form of asymptotic independence, that is, the coefficient of tail dependence is equal to $1/2$ for all positive lags. We propose an alternative class of stochastic volatility models with heavy-tailed volatilities and examine their extreme value behavior. In particular, it is shown that, while lagged extreme observations are typically asymptotically independent, their coefficient of tail dependence can take on any value between $1/2$ (corresponding to exact independence) and 1 (related to asymptotic dependence). Hence, this class allows for a much more flexible extremal dependence between consecutive observations than classical SV models and can thus describe the observed clustering of financial returns more realistically. The extremal dependence structure of lagged observations is analyzed in the framework of regular variation on the cone $(0,\infty)^{d}$. As two auxiliary results which are of interest on their own we derive a new Breiman-type theorem about regular variation on $(0,\infty)^{d}$ for products of a random matrix and a regularly varying random vector and a statement about the joint extremal behavior of products of i.i.d. regularly varying random variables.
TL;DR: In this article, a functional limit theorem for the partial maxima of a long memory stable sequence produces a limiting process that can be described as a β-power time change in the classical Frechet extremal process, for β in a subinterval of the unit interval.
Abstract: A functional limit theorem for the partial maxima of a long memory stable sequence produces a limiting process that can be described as a β-power time change in the classical Frechet extremal process, for β in a subinterval of the unit interval. Any such power time change in the extremal process for 0 < β < 1 produces a process with stationary max-increments. This deceptively simple time change hides the much more delicate structure of the resulting process as a self-affine random sup measure. We uncover this structure and show that in a certain range of the parameters this random measure arises as a limit of the partial maxima of the same long memory stable sequence, but in a different space. These results open a way to construct a whole new class of self-similar Frechet processes with stationary max-increments.
TL;DR: In this paper, the large-sample distribution of Wald statistics at parameter points at which the gradient of the tested constraint vanishes was studied, and it was shown that when based on an asymptotically normal estimator, the Wald statistic converges to a rational function of a normal vector.
Abstract: Motivated by the problem of testing tetrad constraints in factor analysis, we study the large-sample distribution of Wald statistics at parameter points at which the gradient of the tested constraint vanishes. When based on an asymptotically normal estimator, the Wald statistic converges to a rational function of a normal random vector. The rational function is determined by a homogeneous polynomial and a covariance matrix. For quadratic forms and bivariate monomials of arbitrary degree, we show unexpected relationships to chi-square distributions that explain conservative behavior of certain Wald tests. For general monomials, we offer a conjecture according to which the reciprocal of a certain quadratic form in the reciprocals of dependent normal random variables is chi-square distributed.
TL;DR: In this article, it is shown that, under regularity conditions, integral approximation in which each $\varphi(X_i)$ is weighted by the reciprocal of a leave-one-out kernel density estimate converges in probability faster than the usual $n^{-1/2}$ rate; the resulting bounds depend on the regularity of $\varphi$ and $f$, the dimension $d$, and the bandwidth of the kernel estimator.
Abstract: Let $(X_1,\ldots,X_n)$ be an i.i.d. sequence of random variables in $\mathbb{R}^d$, $d\geq 1$. We show that, for any function $\varphi:\mathbb{R}^d\to \mathbb{R}$, under regularity conditions, \begin{align*} n^{1/2} \left(n^{-1} \sum_{i=1}^n \frac{\varphi(X_i)}{\widehat{f}^{(i)}(X_i)}-\int \varphi(x)\,dx \right) \overset{\mathbb{P}}{\longrightarrow} 0, \end{align*} where $\widehat{f}^{(i)}$ is the classical leave-one-out kernel estimator of the density of $X_1$. This result is striking because it speeds up traditional rates, in root $n$, derived from the central limit theorem when $\widehat{f}^{(i)}=f$. Although this paper highlights some applications, we mainly address theoretical issues related to the latter result. In particular, we derive upper bounds for the rate of convergence in probability. Those bounds depend on the regularity of the functions $\varphi$ and $f$, the dimension $d$ and the bandwidth of the kernel estimator. Moreover, those bounds are shown to be accurate since they are used as renormalizing sequences in two central limit theorems, each reflecting different degrees of smoothness of $\varphi$. In addition, as an application to regression modelling with random design, we provide the asymptotic normality of the estimation of the linear functionals of a regression function. Because of the above result, the asymptotic variance does not depend on the regression function. Finally, we debate the choice of the bandwidth for integral approximation and we highlight the good behaviour of our procedure through simulations.
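The estimator in the display above can be sketched in one dimension as follows. This is only an illustration under assumptions not taken from the paper: a Gaussian kernel, a standard normal sample, the test function $\varphi(x)=\max(0,1-x^2)$ (whose integral is $4/3$), and the ad hoc bandwidth $n^{-1/5}$. The function names are hypothetical.

```python
import math
import random


def gaussian_kernel(u):
    """Standard normal density, used as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def loo_integral_estimate(sample, phi, bandwidth):
    """Average of phi(X_i) / f_hat^{(i)}(X_i), with f_hat^{(i)} the
    leave-one-out kernel density estimate evaluated at X_i."""
    n = len(sample)
    total = 0.0
    for i, xi in enumerate(sample):
        value = phi(xi)
        if value == 0.0:
            continue  # such points contribute nothing to the sum
        density = sum(
            gaussian_kernel((xi - xj) / bandwidth)
            for j, xj in enumerate(sample)
            if j != i
        ) / ((n - 1) * bandwidth)
        total += value / density
    return total / n


# Illustrative run: all settings below are assumptions for the example.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(400)]
estimate = loo_integral_estimate(
    sample, lambda x: max(0.0, 1.0 - x * x), bandwidth=400 ** (-0.2)
)
```

The value of `estimate` should land near $4/3$, up to smoothing bias and sampling noise; the bandwidth here is not tuned in the way the paper discusses.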
TL;DR: In this paper, a comprehensive family of d-variate exogenous shock models is characterized via multivariate distribution functions that arise from ordering, idiosyncratically distorting, and finally multiplying the arguments; necessary and sufficient conditions on the distortions are given.
Abstract: We characterize a comprehensive family of d-variate exogenous shock models. Analytically, we consider a family of multivariate distribution functions that arises from ordering, idiosyncratically distorting, and finally multiplying the arguments. Necessary and sufficient conditions on the involved distortions to yield a multivariate distribution function are given. Probabilistically, the attainable set of distribution functions corresponds to a large class of exchangeable exogenous shock models. Besides, the vector of exceedance times of an increasing additive stochastic process across independent exponential trigger variables is shown to constitute an interesting subclass of the considered distributions and yields a second probabilistic model. The alternative construction is illustrated in terms of two examples.
TL;DR: In this paper, the authors consider the hypercube $\{0,1\}^{L}$ where each node carries an independent random variable uniformly distributed on $[0,1]$, and study the number of paths from one vertex to the opposite vertex along which the values on the nodes form an increasing sequence.
Abstract: Motivated by an evolutionary biology question, we study the following problem: we consider the hypercube $\{0,1\}^{L}$ where each node carries an independent random variable uniformly distributed on $[0,1]$, except $(1,1,\ldots,1)$ which carries the value $1$ and $(0,0,\ldots,0)$ which carries the value $x\in[0,1]$. We study the number $\Theta$ of paths from vertex $(0,0,\ldots,0)$ to the opposite vertex $(1,1,\ldots,1)$ along which the values on the nodes form an increasing sequence. We show that if the value on $(0,0,\ldots,0)$ is set to $x=X/L$ then $\Theta/L$ converges in law as $L\to\infty$ to $\mathrm{e}^{-X}$ times the product of two standard independent exponential variables. As a first step in the analysis, we study the same question when the graph is that of a tree where the root has arity $L$, each node at level 1 has arity $L-1$, …, and the nodes at level $L-1$ have only one offspring which are the leaves of the tree (all the leaves are assigned the value 1, the root the value $x\in[0,1]$).
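For small $L$, the number $\Theta$ of increasing paths can be counted exactly by dynamic programming over the vertices of the hypercube, encoded as bitmasks. The sketch below is an illustration rather than the paper's method of analysis. The function names are hypothetical, and the code assumes that each step of a path flips one coordinate from 0 to 1.

```python
import random


def random_landscape(L, x, rng=random):
    """Uniform[0,1) values on the hypercube {0,1}^L, with the value x
    fixed at (0,...,0) and the value 1 fixed at (1,...,1)."""
    values = [rng.random() for _ in range(1 << L)]
    values[0] = x
    values[(1 << L) - 1] = 1.0
    return values


def count_increasing_paths(values, L):
    """Count paths from (0,...,0) to (1,...,1) that flip one coordinate
    from 0 to 1 per step and see strictly increasing node values."""
    paths = [0] * (1 << L)
    paths[0] = 1
    # A predecessor has one bit fewer, hence a smaller integer encoding,
    # so visiting masks in increasing order processes predecessors first.
    for mask in range(1, 1 << L):
        count = 0
        for bit in range(L):
            if mask & (1 << bit):
                prev = mask ^ (1 << bit)
                if values[prev] < values[mask]:
                    count += paths[prev]
        paths[mask] = count
    return paths[(1 << L) - 1]
```

When the node values increase with the number of ones in the vertex, every one of the $L!$ paths is increasing. A random landscape from `random_landscape(L, x)` gives one draw of $\Theta$, at a cost of order $L\,2^{L}$ operations.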
TL;DR: In this paper, the authors considered the regularity of solutions to backward stochastic differential equations with Lipschitz generators driven by a Brownian motion and a Poisson random measure associated with a Levy process.
Abstract: We consider the $L_{2}$-regularity of solutions to backward stochastic differential equations (BSDEs) with Lipschitz generators driven by a Brownian motion and a Poisson random measure associated with a Levy process $(X_{t})_{t\in[0,T]}$. The terminal condition may be a Borel function of finitely many increments of the Levy process which is not necessarily Lipschitz but only satisfies a fractional smoothness condition. The results are obtained by investigating how the special structure appearing in the chaos expansion of the terminal condition is inherited by the solution to the BSDE.
TL;DR: In this paper, central limit theorems for the sum, over a spatial region, of observations from a linear process on a $d$-dimensional lattice are established for the cases of positive strong dependence, short range dependence, and negative dependence.
Abstract: Central limit theorems are established for the sum, over a spatial region, of observations from a linear process on a $d$-dimensional lattice. This region need not be rectangular, but can be irregularly-shaped. Separate results are established for the cases of positive strong dependence, short range dependence, and negative dependence. We provide approximations to asymptotic variances that reveal differential rates of convergence under the three types of dependence. Further, in contrast to the one dimensional (i.e., the time series) case, it is shown that the form of the asymptotic variance in dimensions $d>1$ critically depends on the geometry of the sampling region under positive strong dependence and under negative dependence and that there can be non-trivial edge-effects under negative dependence for $d>1$. Precise conditions for the presence of edge effects are also given.
TL;DR: In this article, the authors considered greedy algorithms to solve the problem of constructing a prediction in the presence of potentially large estimation error and showed that the resulting estimators are consistent under weak conditions.
Abstract: In many prediction problems, it is not uncommon that the number of variables used to construct a forecast is of the same order of magnitude as the sample size, if not larger. We then face the problem of constructing a prediction in the presence of potentially large estimation error. Control of the estimation error is achieved either by selecting variables or by combining all the variables in some special way. This paper considers greedy algorithms to solve this problem. It is shown that the resulting estimators are consistent under weak conditions. In particular, the derived rates of convergence are either minimax or improve on the ones given in the literature allowing for dependence and unbounded regressors. Some versions of the algorithms provide fast solutions to problems such as Lasso.
TL;DR: The Forward Search is an iterative algorithm for avoiding outliers in a regression analysis, suggested by Hadi and Simonoff (J. Amer. Statist. Assoc. 88 (1993) 1264–1272); see also Atkinson and Riani (Robust Diagnostic Regression Analysis (2000) Springer). The sequences of regression estimators and forward residuals it produces are shown to converge to Gaussian processes.
Abstract: The Forward Search is an iterative algorithm for avoiding outliers in a regression analysis suggested by Hadi and Simonoff (J. Amer. Statist. Assoc. 88 (1993) 1264–1272), see also Atkinson and Riani (Robust Diagnostic Regression Analysis (2000) Springer). The algorithm constructs subsets of “good” observations so that the size of the subsets increases as the algorithm progresses. It results in a sequence of regression estimators and forward residuals. Outliers are detected by monitoring the sequence of forward residuals. We show that the sequences of regression estimators and forward residuals converge to Gaussian processes. The proof involves a new iterated martingale inequality, a theory for a new class of weighted and marked empirical processes, the corresponding quantile process theory, and a fixed point argument to describe the iterative aspect of the procedure.
TL;DR: The combinatorial structure of conditionally-i.i.d. sequences of negative binomial processes with a common beta process base measure is characterized and the key Markov kernels needed to use a NB-IBP representation in a Markov Chain Monte Carlo algorithm targeting a posterior distribution are described.
Abstract: We characterize the combinatorial structure of conditionally-i.i.d. sequences of negative binomial processes with a common beta process base measure. In Bayesian nonparametric applications, such processes have served as models for latent multisets of features underlying data. Analogously, random subsets arise from conditionally-i.i.d. sequences of Bernoulli processes with a common beta process base measure, in which case the combinatorial structure is described by the Indian buffet process. Our results give a count analogue of the Indian buffet process, which we call a negative binomial Indian buffet process. As an intermediate step toward this goal, we provide a construction for the beta negative binomial process that avoids a representation of the underlying beta process base measure. We describe the key Markov kernels needed to use a NB-IBP representation in a Markov Chain Monte Carlo algorithm targeting a posterior distribution.
TL;DR: In this paper, distributional identities for a Lévy process, its quadratic variation process, and its maximal jump processes are derived and used to make small-time (as $t\downarrow0$) asymptotic comparisons between them.
Abstract: Distributional identities for a Lévy process $X_{t}$, its quadratic variation process $V_{t}$ and its maximal jump processes are derived, and used to make “small time” (as $t\downarrow0$) asymptotic comparisons between them. The representations are constructed using properties of the underlying Poisson point process of the jumps of $X$. Apart from providing insight into the connections between $X$, $V$, and their maximal jump processes, they enable investigation of a great variety of limiting behaviours. As an application, we study “self-normalised” versions of $X_{t}$, that is, $X_{t}$ after division by $\sup_{0
TL;DR: In this paper, the authors present a central limit theorem for a pre-averaged version of the realized covariance estimator for the quadratic covariation of a discretely observed semimartingale with noise.
Abstract: This paper presents a central limit theorem for a pre-averaged version of the realized covariance estimator for the quadratic covariation of a discretely observed semimartingale with noise. The semimartingale possibly has jumps, while the observation times show irregularity, non-synchronicity, and some dependence on the observed process. It is shown that the observation times’ effect on the asymptotic distribution of the estimator is only through two characteristics: the observation frequency and the covariance structure of the noise. This is completely different from the case of the realized covariance in a pure semimartingale setting.
TL;DR: In this paper, the authors study Bayes factor consistency for nonnested linear models with a growing number of parameters, and compare the asymptotic behavior of the proposed Bayes factor with that of the intrinsic Bayes factor in the literature.
Abstract: Zellner’s $g$-prior is a popular prior choice for model selection problems in the context of normal regression models. Wang and Sun [J. Statist. Plann. Inference 147 (2014) 95–105] recently adopted this prior and placed a special hyper-prior on $g$, which results in a closed-form expression of the Bayes factor for nested linear model comparisons. They have shown that, under very general conditions, the Bayes factor is consistent when the two competing models are of order $O(n^{\tau})$ for $\tau <1$, and for $\tau=1$ is almost consistent except for a small inconsistency region around the null hypothesis. In this paper, we study Bayes factor consistency for nonnested linear models with a growing number of parameters. Some of the proposed results generalize those for the Bayes factor in the case of nested linear models. Specifically, we compare the asymptotic behavior of the proposed Bayes factor with that of the intrinsic Bayes factor in the literature.