TL;DR: In this article, a survey of efficient techniques for estimating the probabilities of certain rare events in queueing and reliability models is presented, where the rare events of interest are long waiting times or buffer overflows in queuing systems and system failure events in reliability models of highly dependable computing systems.
Abstract: This paper surveys efficient techniques for estimating, via simulation, the probabilities of certain rare events in queueing and reliability models. The rare events of interest are long waiting times or buffer overflows in queueing systems, and system failure events in reliability models of highly dependable computing systems. The general approach to speeding up such simulations is to accelerate the occurrence of the rare events by using importance sampling. In importance sampling, the system is simulated using a new set of input probability distributions, and unbiased estimates are recovered by multiplying the simulation output by a likelihood ratio. Our focus is on describing asymptotically optimal importance sampling techniques. Using asymptotically optimal importance sampling, the number of samples required to get accurate estimates grows slowly compared to the rate at which the probability of the rare event approaches zero. In practice, this means that run lengths can be reduced by many orders of magnitude, compared to standard simulation. In certain cases, asymptotically optimal importance sampling results in estimates having bounded relative error. With bounded relative error, only a fixed number of samples are required to get accurate estimates, no matter how rare the event of interest is. The queueing systems studied include simple queues (e.g., GI/GI/1), Jackson networks, discrete time queues with multiple autocorrelated arrival processes that arise in the analysis of Asynchronous Transfer Mode communications switches, and tree structured networks of such switches. Both Markovian and non-Markovian reliability models are treated.
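The likelihood-ratio reweighting described in this abstract can be illustrated on a toy problem that is not taken from the paper: estimating the Gaussian tail probability P(Z > a) for a standard normal Z. The sketch below assumes a mean-shifted sampling distribution N(a, 1). The function name `gaussian_tail_is` and all parameter values are hypothetical choices made for illustration.

```python
import math
import random


def gaussian_tail_is(a, n_samples, seed=0):
    """Importance-sampling estimate of P(Z > a) for Z ~ N(0, 1).

    Samples are drawn from N(a, 1), under which the event {X > a} is no
    longer rare, and each hit is multiplied by the likelihood ratio
    phi(x) / phi(x - a) = exp(-a * x + a**2 / 2) to keep the estimate unbiased.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(a, 1.0)  # draw from the new input distribution
        if x > a:
            total += math.exp(-a * x + 0.5 * a * a)  # likelihood ratio
    return total / n_samples


if __name__ == "__main__":
    exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))
    print(gaussian_tail_is(4.0, 50000), exact)
```

Standard simulation would need on the order of 1/P(Z > a) samples just to observe the event once, whereas roughly half of the samples drawn from the shifted distribution land in the rare region.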
TL;DR: In this paper, the role of leave-one-out cross-validation for bandwidth selection in nonparametric smoothing problems is discussed and a plug-in estimator of the asymptotically optimal bandwidth is proposed.
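As a rough illustration of leave-one-out cross-validation for bandwidth selection, the sketch below scores each candidate bandwidth of a Nadaraya-Watson smoother by its leave-one-out prediction error and keeps the best one. The Gaussian kernel, the candidate grid, and the function names are assumptions made for this example. The plug-in estimator proposed in the paper is not reproduced here.

```python
import math


def loo_cv_score(xs, ys, h):
    """Mean squared leave-one-out prediction error for bandwidth h."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        num = 0.0
        den = 0.0
        for j in range(n):
            if j != i:  # leave observation i out of its own prediction
                w = math.exp(-0.5 * ((xs[i] - xs[j]) / h) ** 2)
                num += w * ys[j]
                den += w
        if den == 0.0:  # all kernel weights underflowed: h is unusably small
            return float("inf")
        total += (ys[i] - num / den) ** 2
    return total / n


def select_bandwidth(xs, ys, grid):
    """Return the bandwidth in grid with the smallest leave-one-out score."""
    return min(grid, key=lambda h: loo_cv_score(xs, ys, h))


if __name__ == "__main__":
    xs = [i / 49 for i in range(50)]
    ys = [math.sin(2.0 * math.pi * x) for x in xs]
    print(select_bandwidth(xs, ys, [0.02, 0.05, 0.1, 0.5, 5.0]))
```

A grid search is used here only for simplicity. Any one-dimensional minimiser could replace it.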
TL;DR: In this article, an asymptotic formula for the mean integrated squared error (MISE) of nonlinear wavelet-based density estimators is provided, which is available for densities which are smooth in only a piecewise sense.
Abstract: We provide an asymptotic formula for the mean integrated squared error (MISE) of nonlinear wavelet-based density estimators. We show that, unlike the analogous situation for kernel density estimators, this MISE formula is relatively unaffected by assumptions of continuity. In particular, it is available for densities which are smooth in only a piecewise sense. Another difference is that in the wavelet case the classical MISE formula is valid only for sufficiently small values of the bandwidth. For larger bandwidths MISE assumes a very different form and hardly varies at all with changing bandwidth. This remarkable property guarantees a high level of robustness against oversmoothing, not encountered in the context of kernel methods. We also use the MISE formula to describe an asymptotically optimal empirical bandwidth selection rule.
TL;DR: The problem of performing a global combine (summation) operation on a distributed memory computer using a two-dimensional mesh interconnect with wormhole routing using algorithms that are asymptotically optimal for short vectors and for long vectors is considered.
TL;DR: It is proved in a general probabilistic framework that the value of the optimal solution and the value of the worst solution are asymptotically almost surely the same provided as N and m become large.
Abstract: Consider a class of optimization problems for which the cardinality of the set of feasible solutions is m and the size of every feasible solution is N. We prove in a general probabilistic framework that the value of the optimal solution and the value of the worst solution are asymptotically almost surely (a.s.) the same provided as N and m become large. This result implies that for such a class of combinatorial optimization problems almost every algorithm finds an asymptotically optimal solution! The quadratic assignment problem, a location problem on graphs, and a pattern matching problem fall into this class.
TL;DR: A proof of the lower bound is presented, which is based on Alexander's technique but is technically simpler and more accessible, and three variants of the proof are presented to provide more intuitive insight into the “large-discrepancy” phenomenon.
Abstract: For each d ≥ 2, it is possible to place n points in d-space so that, given any two-coloring of the points, a half-space exists within which one color outnumbers the other by as much as c·n^(1/2 − 1/(2d)), for some constant c > 0 depending on d. This result was proven in a slightly weaker form by Beck and the bound was later tightened by Alexander. It was recently shown to be asymptotically optimal by Matousek. We present a proof of the lower bound, which is based on Alexander's technique but is technically simpler and more accessible. We present three variants of the proof, for three different cases, to provide more intuitive insight into the "large-discrepancy" phenomenon. We also give geometric and probabilistic interpretations of the technique.
TL;DR: In this article, the central limit theorem for a class of stochastic difference equations established under weak conditions on disturbances and observations is used for tracking random drifting parameters of a linear regression system.
Abstract: This paper addresses the problem of tracking random drifting parameters of a linear regression system. The asymptotic properties of several estimation algorithms in the limit of slow drift are studied. The basic tool is the central limit theorem for a class of stochastic difference equations established under weak conditions on disturbances and observations. The estimates of the rate of convergence obtained in the paper allow the asymptotically optimal algorithms to be developed.
TL;DR: In this article, the discrepancy principle and generalised maximum likelihood (GML) methods for choosing the crucial regularization parameter λ were considered and the asymptotic properties as n → ∞ of the expected estimates λD and λM were investigated.
Abstract: Let fnλ be the regularised solution of a general, linear operator equation, K f0 = g, from discrete, noisy data yi = g(xi ) + ei, i = 1, …, n, where ei are uncorrelated random errors with variance σ2. In this paper, we consider the two well–known methods – the discrepancy principle and generalised maximum likelihood (GML), for choosing the crucial regularisation parameter λ. We investigate the asymptotic properties as n → ∞ of the “expected” estimates λD and λM corresponding to these two methods respectively. It is shown that if f0 is sufficiently smooth, then λD is weakly asymptotically optimal (ao) with respect to the risk and an L2 norm on the output error. However, λD oversmooths for all sufficiently large n and also for all sufficiently small σ2. If f0 is not too smooth relative to the regularisation space W, then λD can also be weakly ao with respect to a whole class of loss functions involving stronger norms on the input error. For the GML method, we show that if f0 is smooth relative to W (for example f0 ∈ Wθ, 2, θ > m, if W = Wm, 2), then λM is asymptotically sub-optimal and undersmoothing with respect to all of the loss functions above.
TL;DR: In this paper, two new criteria of optimality for detecting and isolating abrupt changes in systems with random disturbances have been established, and two classes of change detection and isolation algorithms have been defined.
Abstract: Addresses the problem of detecting and isolating abrupt changes in systems with random disturbances. In this paper the author establishes two new criteria of optimality for this problem. The author defines two classes of change detection and isolation algorithms, gives an asymptotically optimal solution to the above problems, investigates the statistical properties of the proposed algorithm and proves optimality theorems for each case.
TL;DR: In this paper, the authors compare two parameterizations of a commonly used instrumental variables estimator (Hansen (1982)) to one that is asymptotically optimal in a class of estimators that includes the conventional one.
Abstract: Using a dynamic linear equation that has a conditionally homoskedastic moving average disturbance, we compare two parameterizations of a commonly used instrumental variables estimator (Hansen (1982)) to one that is asymptotically optimal in a class of estimators that includes the conventional one (Hansen (1985)). We find that for some plausible data generating processes, the optimal one is distinctly more efficient asymptotically. Simulations indicate that in samples of size typically available, asymptotic theory describes the distribution of the parameter estimates reasonably well, but that test statistics sometimes are poorly sized.
TL;DR: In this paper, a class of designs that are locally parallelogram lattices but whose densities can vary is proposed, and the asymptotic variance of the cubature error for these designs is obtained for a set of isotropic random fields and a sequence of cubature rules within this class is found.
Abstract: For predicting ∫_G v(x)Z(x) dx, where v is a fixed known function and Z is a stationary random field, a good sampling design should have a greater density of observations where v is relatively large in absolute value. Designs using this idea when G = [0, 1] have been studied for some time. For G a region in two dimensions, very little is known about the statistical properties of cubature rules based on designs with varying density. This work proposes a class of designs that are locally parallelogram lattices but whose densities can vary. The asymptotic variance of the cubature error for these designs is obtained for a class of isotropic random fields and an asymptotically optimal sequence of cubature rules within this class is found. I conjecture that this sequence of cubature rules is asymptotically optimal with respect to all cubature rules.
TL;DR: Following Rissanen the authors consider the statistical model {P θ | as a code-book, θ indexing the codes, to obtain a single code, by encoding some θ and then encoding data x with the code corresponding to this θ.
Abstract: Following Rissanen we consider the statistical model {P θ | as a code-book, θ indexing the codes. To obtain a single code, we first encode some θ and then encode our data x with the code corresponding to this θ. Rissanen's minimum description length principle recommends using the value of θ minimizing the total code length as an estimate of θ given the data x. For some standard statistical models we find easily computable estimators which respect this principle when θ is encoded with the asymptotically optimal coding scheme due to Levin and Chaitin.
TL;DR: The optimal algorithm proposed in the present paper addresses the problems of the initial situation and the infinite depth tree from a new point of view.
Abstract: The optimal universal code for FSMX sources (Rissanen 1981) with respect to Bayes redundancy criterion (Davison 1973) is deduced under the condition that the model, the probabilistic parameters and the initial state are unknown. The algorithm is not only Bayes optimal for FSMX sources but also asymptotically optimal for stationary ergodic sources. Moreover the algorithm is regarded as a generalisation of the Ziv-Lempel algorithm. In the basic CTW algorithm, the initial context x_{1-d} x_{2-d} ... x_0, where the finite constant d is the depth of the context tree, is needed for calculating the coding probability of x_1. Extensions of the CTW algorithm that address the problem of the initial situation and the infinite depth tree have been proposed in Willems (1994). The optimal algorithm proposed in the present paper addresses these problems from a new point of view.
TL;DR: It is shown that as rate increases the problem of asymptotically optimal scalar quantization has polynomial-time (or space) encoding complexity if the distribution function corresponding to the one-third power of the source density is polynomial-time (or space) computable in the Turing sense.
Abstract: It is shown that as rate increases the problem of asymptotically optimal scalar quantization has polynomial-time (or space) encoding complexity if the distribution function corresponding to the one-third power of the source density is polynomial-time (or space) computable in the Turing sense.
TL;DR: This thesis studies and compares the performance of several distributed channel assignment algorithms (CAAs) in a cellular system and shows that the Timid Algorithm is asymptotically optimal, in the limiting case of a large number of channels.
Abstract: In this thesis, we study and compare the performance of several distributed channel assignment algorithms (CAAs) in a cellular system. The CAA which is used to assign a channel to a new call greatly influences the amount of traffic the system can support. We are interested in the design and analysis of algorithms which perform well, but at the same time are relatively easy to implement. In this thesis, we have analyzed the performance of a very simple CAA which we call the Timid Algorithm, in the limiting case of a large number of channels. We have been able to show that, under a plausible mathematical hypothesis, the algorithm is asymptotically optimal, where "asymptotically" refers to a system with a large number of channels. This is very surprising as there are algorithms of much higher complexity which provably do not have this property.
The Timid Algorithm is asymptotically optimal, but it requires a large number of channels for a satisfactory performance. We looked at some algorithms which retain the simplicity of the Timid algorithm but which can be expected to give a good performance even with a smaller number of channels. We called one such algorithm the Modified DCAA. We present some simulation results which show that this algorithm gives a reasonably good performance even when the number of channels is small. One of the ways to increase the capacity of a cellular system is through the use of micro-cells. The Modified DCAA, because of its distributed nature and low complexity, is particularly suitable for such microcellular systems.
We also present a method for computing the upper bound on the performance of any CAA in a cellular system with adjacent channel constraints. The method, although computationally intensive, may be useful for determining how close an algorithm's performance is to the optimal performance.
Finally, we discuss ways of obtaining the set of "allowable" states for a system. We also present some "measurement-based" algorithms and compare their performance with "prediction-based" algorithms.
TL;DR: In this article, the problem of minimizing Egc(Zt) is considered asymptotically, where Z0,Z1,… is a perturbed random walk and gc are convex functions which are minimized at values that approach ∞ as c ↓ 0.
Abstract: The problem of minimizing Egc(Zt) is considered asymptotically, where Z0,Z1,… is a perturbed random walk and gc are convex functions which are minimized at values that approach ∞ as c ↓ 0. It is shown that a first passage time is asymptotically optimal, and the boundary for this time is characterized in terms of gc and the limiting distribution of the excess over the boundary. Applications to change point problems and power one tests are presented.
TL;DR: It was proved in a recent sequence of papers that in the worst case a greedy algorithm produces a superstring that is at most β times worse than optimal, and it is shown that with high probability lim_{n→∞} O_n^opt/(n log n) = lim_{n→∞} O_n^gr/(n log n) = 1/H, where n is the number of original strings, and H is the entropy of the underlying alphabet.
Abstract: Wojciech Szpankowski, Department of Computer Science, Purdue University, W. Lafayette, IN 47907, U.S.A. (spa@cs.purdue.edu). There has recently been a resurgence of interest in the shortest common superstring problem due to important applications in molecular biology (e.g., recombination of DNA) and data compression. The problem is NP-hard, but it has been known for some time that greedy algorithms work well for this problem. More precisely, it was proved in a recent sequence of papers that in the worst case a greedy algorithm produces a superstring that is at most β times (2 ≤ β ≤ 3) worse than optimal. We analyze the problem in a probabilistic framework, and consider the total overlaps O_n^opt and O_n^gr produced respectively by the optimal algorithm and a greedy one, which turn out to be asymptotically equivalent. More precisely, we show that with high probability lim_{n→∞} O_n^opt/(n log n) = lim_{n→∞} O_n^gr/(n log n) = 1/H, where n is the number of original strings, and H is the entropy of the underlying alphabet. Our result holds under a condition that the lengths of all strings are not too short. Finally, we provide several generalizations and extensions of our basic result.
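A greedy superstring heuristic of the kind this abstract refers to can be sketched as follows: repeatedly merge the pair of strings with the largest suffix-prefix overlap until one string remains. This is a minimal textbook version, not necessarily the exact variant analysed in the paper. It assumes a non-empty input in which no string is a substring of another, and the function names are hypothetical.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_superstring(strings):
    """Merge the pair with maximal overlap until a single string is left."""
    pool = list(strings)
    while len(pool) > 1:
        best_k, best_i, best_j = -1, 0, 1
        for i, a in enumerate(pool):
            for j, b in enumerate(pool):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        # Glue pool[best_j] onto pool[best_i], sharing the overlapping part.
        merged = pool[best_i] + pool[best_j][best_k:]
        pool = [s for idx, s in enumerate(pool) if idx not in (best_i, best_j)]
        pool.append(merged)
    return pool[0]


if __name__ == "__main__":
    print(greedy_superstring(["abc", "bcd", "cde"]))
```

The total overlap collected by the merges is the sum of the input lengths minus the length of the result, which is the quantity the abstract compares between the greedy and optimal algorithms.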
TL;DR: In this article, a two-parameter generalization of Jaccard's index of similarity is proposed as a class of measures for testing the homogeneity of two independent multinomial samples.
Abstract: A two-parameter generalization of Jaccard's index of similarity is proposed as a class of measures for testing the homogeneity of two independent multinomial samples. The power approach used in modern asymptotic theory of decomposable statistics is applied to the asymptotic analysis of these measures. The asymptotic analysis is amplified by numerical tabulation yielding an asymptotically optimal similarity test in this class of measures.
TL;DR: In this paper, the optimal number of runs for experiments with many binary factors is determined using a Bayesian decision theoretic formulation in which factors are independent with finite Fisher information, and the authors derived the Stein's identity for posterior risks.
TL;DR: This work synthesized asymptotically optimal detectors of weak and strong signals with random noninformative parameters against a background of non-Gaussian passive clutter to find an appropriate structure of an optimal detector of signals with an unknown initial phase.
Abstract: We have synthesized asymptotically optimal detectors of weak and strong signals with random noninformative parameters against a background of non-Gaussian passive clutter. A possibility of using an adaptive approach to synthesis has been considered and an appropriate structure of an optimal detector of signals with an unknown initial phase has been found.
TL;DR: In this scheme, each user is split into M subusers and the decoder consists of M stripping stages using only single-user decoders, making it suitable for a bursty user environment where the set of active users is not known in advance.
Abstract: We introduce a new multiaccess coding strategy called stripping CDMA for a LOOK AWGN channel. Stripping CDMA can be viewed as a generalization of the regular CDMA scheme to one that can achieve arbitrarily high bandwidth efficiency. It can also be viewed as a modification of the pure stripping scheme to one that is suited to a multiaccess channel with bursty users. In this scheme, each user is split into M subusers and the decoder consists of M stripping stages using only single-user decoders. All transmitters of all users are identical, making it suitable for a bursty user environment where the set of active users is not known in advance. No synchronization between users at the phase, symbol, or frame level is needed. Although the scheme is only optimal asymptotically in M, it is close to optimal for small M, depending on the bandwidth efficiency. Performance of this scheme using existing non-optimal convolutional code is also investigated.
TL;DR: In this article, a sequence of empirical Bayes procedures for selecting the good populations among ∏1,…,∏k is presented, and it is shown that these procedures are asymptotically optimal and that the order of associated convergence rates is O(n^(−r/4)).
Abstract: Let ∏1,…,∏k denote k independent populations, where a random observation from population ∏i has a uniform distribution over the interval (0, θi) and θi is a realization of a random variable having an unknown prior distribution Gi. Population ∏i is said to be a good population if θi ≥ θ0, where θ0 is a given, positive number. This paper provides a sequence of empirical Bayes procedures for selecting the good populations among ∏1,…,∏k. It is shown that these procedures are asymptotically optimal and that the order of associated convergence rates is O(n^(−r/4)) for some r, 0
TL;DR: In this paper, a necessary condition for the asymptotic normality of the sample quantile estimator is considered, i.e., f(Q(p)) = F′(Q(p)) > 0, where Q(p) is the p-th quantile of the distribution function F(x).
Abstract: A necessary condition for the asymptotic normality of the sample quantile estimator is f(Q(p)) = F′(Q(p)) > 0, where Q(p) is the p-th quantile of the distribution function F(x). In this paper, we estimate a quantile by a kernel quantile estimator when this condition is violated. We have shown that the kernel quantile estimator is asymptotically normal in some nonstandard cases. The optimal convergence rate of the mean squared error for the kernel estimator is obtained with respect to the asymptotically optimal bandwidth. A law of the iterated logarithm is also established.
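For orientation, a kernel quantile estimator is commonly written as a weighted sum of order statistics, where the weight on the i-th order statistic is the integral of a scaled kernel centred at p over ((i−1)/n, i/n]. The sketch below assumes that form with a Gaussian kernel and renormalised weights. The exact estimator, kernel, and bandwidth used in the paper may differ, and the function names are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def kernel_quantile(data, p, h):
    """Kernel-smoothed estimate of the p-th quantile with bandwidth h.

    The weight on the i-th order statistic is the Gaussian kernel mass on
    ((i - 1) / n, i / n]; weights are renormalised to sum to one, which is
    a choice made for this sketch.
    """
    xs = sorted(data)
    n = len(xs)
    weights = [
        normal_cdf((i / n - p) / h) - normal_cdf(((i - 1) / n - p) / h)
        for i in range(1, n + 1)
    ]
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, xs)) / total


if __name__ == "__main__":
    print(kernel_quantile(range(1, 102), 0.5, 0.1))
```

As h shrinks towards zero the weights concentrate on the order statistics nearest to p, so the estimate approaches the ordinary sample quantile.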
TL;DR: In this paper, an empirical Bayes testing approach is proposed to classify the observed variables as coming from one population or the other, where the support of the unknown prior distribution is the union of two unknown intervals.
Abstract: Let X1, X2, … be i.i.d. observations from a mixture density. The support of the unknown prior distribution is the union of two unknown intervals. The paper deals with an empirical Bayes testing approach (θ ≤ c against θ > c, where c is an unknown parameter to be estimated) in order to classify the observed variables as coming from one population or the other, according to whether θ belongs to one or the other unknown interval. Two methods are proposed in which asymptotically optimal decision rules are constructed avoiding the estimation of the unknown prior. The first method deals with the case of exponential families and is a generalization of the method of Johns and Van Ryzin (1971, 1972) whereas the second one deals with families that are closed under convolution and is a Fourier method. The application of the Fourier method to some densities (i.e. contaminated Gaussian distributions, exponential distribution, double-exponential distribution) which are interesting in view of applications and which cannot be studied by means of the...
TL;DR: The average-case performance of linear-space search algorithms, including depth-first branch-and-bound (DFBnB), iterative-deepening (ID), and recursive best-first search (RBFS) is studied, and analytic results are used to explain a surprising anomaly in the performance of these algorithms, and to predict the existence of a complexity transition in the Asymmetric Traveling Salesman Problem.
TL;DR: In this paper, the authors address the empirical bandwidth choice problem in cases where the range of dependence may be virtually arbitrarily long and provide surprising evidence that, even for some strongly dependent data sequences, the asymptotically optimal bandwidth for independent data is a good choice.
Abstract: We address the empirical bandwidth choice problem in cases where the range of dependence may be virtually arbitrarily long. Assuming that the observed data derive from an unknown function of a Gaussian process, it is argued that, unlike more traditional contexts of statistical inference, in density estimation there is no clear role for the classical distinction between short- and long-range dependence. Indeed, the "boundaries" that separate different modes of behaviour for optimal bandwidths and mean squared errors are determined more by kernel order than by traditional notions of strength of dependence, for example, by whether or not the sum of the covariances converges. We provide surprising evidence that, even for some strongly dependent data sequences, the asymptotically optimal bandwidth for independent data is a good choice. A plug-in empirical bandwidth selector based on this observation is suggested. We determine the properties of this choice for a wide range of different strengths of dependence. Properties of cross-validation are also addressed.
TL;DR: In this paper, the authors establish local asymptotic normality (LAN) of the loglikelihood process pertaining to the vector (Z_{n−i+1:n}) of the upper k order statistics in the sample, if the family {F_β} is in a neighborhood of the family of generalized Pareto distributions.
Abstract: Consider an iid sample Z_1, ..., Z_n with common distribution function F on the real line, whose upper tail belongs to a parametric family {F_β : β ∈ Θ}. We establish local asymptotic normality (LAN) of the loglikelihood process pertaining to the vector (Z_{n−i+1:n}) of the upper k = k(n) → ∞ (as n → ∞) order statistics in the sample, if the family {F_β : β ∈ Θ} is in a neighborhood of the family of generalized Pareto distributions. It turns out that, except in one particular location case, the kth-largest order statistic Z_{n−k+1:n} is the central sequence generating LAN. This implies that Z_{n−k+1:n} is asymptotically sufficient and that asymptotically optimal tests for the underlying parameter β can be based on the single order statistic Z_{n−k+1:n}. The rate at which Z_{n−k+1:n} becomes asymptotically sufficient is however quite poor.
TL;DR: It is shown that the proposed algorithm is asymptotically optimal in this class of sequential change detection/isolation algorithms and the theoretical results are applied to the case of additive changes in linear stochastic models.
Abstract: The purpose of this paper is to give a new statistical approach to the change diagnosis (detection/isolation) problem. The change detection problem has received extensive research attention; however, the change isolation problem has, for the most part, been ignored. We consider a stochastic dynamical system with abrupt changes and investigate the multiple hypotheses extension of Lorden's (1971) results. We introduce a joint criterion of optimality for the detection/isolation problem and then design a change detection/isolation algorithm. We also investigate the statistical properties of this algorithm. We prove a lower bound for the criterion in a class of sequential change detection/isolation algorithms. It is shown that the proposed algorithm is asymptotically optimal in this class. The theoretical results are applied to the case of additive changes in linear stochastic models.
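As background for the detection half of this problem, the sketch below shows the classical one-sided CUSUM rule for a single known change in the mean of Gaussian observations, the single-hypothesis setting that Lorden's results concern. It does not implement the paper's detection/isolation algorithm or its multiple-hypothesis criterion. The function name, parameters, and data are illustrative assumptions.

```python
def cusum_alarm(samples, mu0, mu1, sigma, threshold):
    """Index of the first CUSUM alarm, or None if no alarm is raised.

    Assumes independent N(mu0, sigma^2) observations before the change and
    N(mu1, sigma^2) after it. The statistic accumulates log-likelihood
    ratios and is reset to zero whenever it would become negative.
    """
    g = 0.0
    for t, x in enumerate(samples):
        # Log-likelihood ratio of one observation: post-change vs pre-change.
        llr = (mu1 - mu0) / sigma ** 2 * (x - 0.5 * (mu0 + mu1))
        g = max(0.0, g + llr)
        if g >= threshold:
            return t
    return None


if __name__ == "__main__":
    data = [0.0] * 50 + [1.0] * 50  # noiseless toy signal with a jump at index 50
    print(cusum_alarm(data, 0.0, 1.0, 1.0, 5.0))
```

Raising the threshold lowers the false-alarm rate at the cost of a longer detection delay, which is the trade-off that optimality criteria of this type formalise.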
TL;DR: The performance of an importance sampling estimator for a rare-event probability in tandem Jackson networks is analyzed, showing that in certain parameter regions the estimator has an asymptotic efficiency property, but that in other regions it does not.
Abstract: We analyze the performance of an importance sampling estimator for a rare-event probability in tandem Jackson networks. The rare event we consider corresponds to the network population reaching K before returning to 0, starting from 0, with K large. The estimator we study is based on interchanging the arrival rate and the smallest service rate and is therefore a generalization of the asymptotically optimal estimator for an M/M/1 queue. We examine its asymptotic performance for large K, showing that in certain parameter regions the estimator has an asymptotic efficiency property, but that in other regions it does not. The setting we consider is perhaps the simplest case of a rare-event simulation problem in which boundaries on the state space play a significant role.
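The single-queue special case mentioned in this abstract can be sketched directly. For an M/M/1 queue with arrival rate `lam` and service rate `mu` (with `lam < mu`), the code below simulates the embedded jump chain with the two rates interchanged and reweights each step by the likelihood ratio. It covers only the M/M/1 case, not the tandem networks analysed in the paper, and the function names and parameter values are hypothetical. The closed-form check is the standard gambler's-ruin formula.

```python
import random


def mm1_overflow_is(lam, mu, K, n_runs, seed=0):
    """Estimate P(queue length reaches K before returning to 0), starting empty."""
    rng = random.Random(seed)
    p_up = lam / (lam + mu)      # original probability that the next event is an arrival
    p_up_is = mu / (lam + mu)    # same probability after interchanging lam and mu
    total = 0.0
    for _ in range(n_runs):
        # From the empty state the first event is always an arrival,
        # so each cycle starts at level 1 with likelihood ratio 1.
        level, weight = 1, 1.0
        while 0 < level < K:
            if rng.random() < p_up_is:
                level += 1
                weight *= p_up / p_up_is
            else:
                level -= 1
                weight *= (1.0 - p_up) / (1.0 - p_up_is)
        if level == K:
            total += weight  # paths that empty the queue contribute zero
    return total / n_runs


def mm1_overflow_exact(lam, mu, K):
    """Gambler's-ruin value of the same probability (requires lam != mu)."""
    r = mu / lam
    return (r - 1.0) / (r ** K - 1.0)


if __name__ == "__main__":
    print(mm1_overflow_is(1.0, 2.0, 20, 20000), mm1_overflow_exact(1.0, 2.0, 20))
```

In this special case every path that reaches K has the same likelihood ratio, (lam/mu)^(K−1), so the estimator's relative error does not grow with K.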