Top 109 papers published in the topic of Asymptotically optimal algorithm in 2010

Showing papers on "Asymptotically optimal algorithm" published in 2010

Posted Content•

Optimal Distributed Online Prediction using Mini-Batches

Ofer Dekel¹, Ran Gilad-Bachrach¹, Ohad Shamir¹, Lin Xiao¹•Institutions (1)

07 Dec 2010-arXiv: Learning

TL;DR: In this paper, the authors present the distributed mini-batch algorithm, a method of converting many serial gradient-based online prediction algorithms into distributed algorithms that is asymptotically optimal for smooth convex loss functions and stochastic inputs.

Abstract: Online prediction methods are typically presented as serial algorithms running on a single processor. However, in the age of web-scale prediction problems, it is increasingly common to encounter situations where a single processor cannot keep up with the high rate at which inputs arrive. In this work, we present the distributed mini-batch algorithm, a method of converting many serial gradient-based online prediction algorithms into distributed algorithms. We prove a regret bound for this method that is asymptotically optimal for smooth convex loss functions and stochastic inputs. Moreover, our analysis explicitly takes into account communication latencies between nodes in the distributed environment. We show how our method can be used to solve the closely-related distributed stochastic optimization problem, achieving an asymptotically linear speed-up over multiple processors. Finally, we demonstrate the merits of our approach on a web-scale online prediction problem.
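
The core idea of averaging gradients over a mini-batch that is split across workers can be sketched as follows. This is a minimal single-process illustration, not the authors' implementation: the function names, the toy squared loss, the fixed step size, and the simulation of workers as plain lists are all assumptions made here, and the communication latencies analyzed in the paper are left out.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def average(vectors: Sequence[Vector]) -> Vector:
    """Coordinate-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def distributed_minibatch_step(
    w: Vector,
    shards: Sequence[Sequence[float]],
    grad: Callable[[Vector, float], Vector],
    lr: float,
) -> Vector:
    """One update of a simulated distributed mini-batch step (hypothetical helper).

    Each simulated worker averages the gradients of its own shard at the
    shared point w; the per-worker averages are averaged again, and a single
    serial gradient step is applied with the result.
    """
    worker_grads = [average([grad(w, z) for z in shard]) for shard in shards]
    # For equal-size shards this equals the mean gradient over the whole batch.
    g = average(worker_grads)
    return [wi - lr * gi for wi, gi in zip(w, g)]


def squared_loss_grad(w: Vector, z: float) -> Vector:
    """Gradient of the toy smooth convex loss 0.5 * (w[0] - z) ** 2."""
    return [w[0] - z]


w = [0.0]
shards = [[1.0, 3.0], [2.0, 2.0]]  # two simulated workers, two inputs each
for _ in range(50):
    w = distributed_minibatch_step(w, shards, squared_loss_grad, lr=0.5)
```

With this toy loss the iterate moves toward the mean of all inputs (2.0 here), which is the minimizer of the averaged loss.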

416 citations

Proceedings Article•

An Asymptotically Optimal Bandit Algorithm for Bounded Support Models.

Junya Honda¹, Akimichi Takemura¹•Institutions (1)

University of Tokyo¹

1 Jan 2010

TL;DR: The Deterministic Minimum Empirical Divergence (DMED) policy is proposed and proved to achieve the asymptotic bound; the index DMED uses to choose an arm can be computed easily by a convex optimization technique.

Abstract: Multiarmed bandit problem is a typical example of a dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playing a slot machine with multiple arms. We study stochastic bandit problem where each arm has a reward distribution supported in a known bounded interval, e.g. [0, 1]. In this model, Auer et al. (2002) proposed practical policies called UCB and derived finite-time regret of UCB policies. However, policies achieving the asymptotic bound given by Burnetas and Katehakis (1996) have been unknown for the model. We propose Deterministic Minimum Empirical Divergence (DMED) policy and prove that DMED achieves the asymptotic bound. Furthermore, the index used in DMED for choosing an arm can be computed easily by a convex optimization technique. Although we do not derive a finite-time regret, we confirm by simulations that DMED achieves a regret close to the asymptotic bound in finite time.
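
For orientation, the UCB baseline that the abstract contrasts with DMED can be sketched as below. This is a UCB1-style index policy (empirical mean plus a square-root exploration bonus), not DMED itself; the function name `ucb1`, the `pull` callback, and the deterministic toy arms are assumptions introduced for illustration.

```python
import math
from typing import Callable, List


def ucb1(pull: Callable[[int], float], n_arms: int, horizon: int) -> List[int]:
    """Run a UCB1-style policy and return how often each arm was played.

    Rewards are assumed to lie in [0, 1]. `pull` is a hypothetical callback
    that returns the reward of the chosen arm.
    """
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # initialization: play every arm once
        else:
            # Index = empirical mean + exploration bonus.
            arm = max(
                range(n_arms),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = pull(arm)
        counts[arm] += 1
        sums[arm] += reward
    return counts


# Toy arms whose reward always equals their mean (an assumption that keeps
# the example deterministic; real bandit rewards are random).
means = [0.2, 0.8]
counts = ucb1(lambda arm: means[arm], n_arms=2, horizon=1000)
```

The suboptimal arm is still played occasionally because its bonus grows with log t, which is the exploration cost that asymptotic regret bounds quantify.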

162 citations

Posted Content•

An Optimal Lower Bound on the Communication Complexity of Gap-Hamming-Distance

Amit Chakrabarti¹, Oded Regev•Institutions (1)

Dartmouth College¹

17 Sep 2010-arXiv: Computational Complexity

TL;DR: An optimal $\Omega(n)$ lower bound on the randomized communication complexity of the much-studied gap-hamming-distance problem is proved and essentially optimal multipass space lower bounds in the data stream model are obtained for a number of fundamental problems, including the estimation of frequency moments.

Abstract: We prove an optimal $\Omega(n)$ lower bound on the randomized communication complexity of the much-studied Gap-Hamming-Distance problem. As a consequence, we obtain essentially optimal multi-pass space lower bounds in the data stream model for a number of fundamental problems, including the estimation of frequency moments. The Gap-Hamming-Distance problem is a communication problem, wherein Alice and Bob receive $n$-bit strings $x$ and $y$, respectively. They are promised that the Hamming distance between $x$ and $y$ is either at least $n/2+\sqrt{n}$ or at most $n/2-\sqrt{n}$, and their goal is to decide which of these is the case. Since the formal presentation of the problem by Indyk and Woodruff (FOCS, 2003), it had been conjectured that the naive protocol, which uses $n$ bits of communication, is asymptotically optimal. The conjecture was shown to be true in several special cases, e.g., when the communication is deterministic, or when the number of rounds of communication is limited. The proof of our aforementioned result, which settles this conjecture fully, is based on a new geometric statement regarding correlations in Gaussian space, related to a result of C. Borell (1985). To prove this geometric statement, we show that random projections of not-too-small sets in Gaussian space are close to a mixture of translated normal variables.
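
The naive protocol mentioned in the abstract, in which Alice simply sends all $n$ bits, can be sketched as follows. The function names and the return convention are assumptions made here; the sketch only illustrates the upper bound that the paper proves cannot be beaten asymptotically.

```python
from typing import List, Tuple


def hamming_distance(x: List[int], y: List[int]) -> int:
    """Number of positions where the two bit strings differ."""
    return sum(a != b for a, b in zip(x, y))


def naive_ghd_protocol(x: List[int], y: List[int]) -> Tuple[bool, int]:
    """Alice sends her whole string; Bob decides (hypothetical helper).

    Returns (far, bits_sent), where `far` is True when the distance is above
    n/2. Under the promise (distance >= n/2 + sqrt(n) or <= n/2 - sqrt(n))
    this answer is always correct.
    """
    n = len(x)
    message = list(x)  # Alice -> Bob: n bits of communication
    d = hamming_distance(message, y)
    return d > n / 2, len(message)


n = 16  # sqrt(n) = 4, so the promise means distance >= 12 or <= 4
x = [0] * n
y_far = [1] * 12 + [0] * 4
y_near = [1] * 4 + [0] * 12
```

Calling `naive_ghd_protocol(x, y_far)` reports the "far" case and `naive_ghd_protocol(x, y_near)` the "near" case, each at a cost of 16 bits.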

107 citations

Journal Article•10.1007/S00446-010-0097-1•

An optimal maximal independent set algorithm for bounded-independence graphs

Johannes Schneider¹, Roger Wattenhofer¹•Institutions (1)

ETH Zurich¹

10 Mar 2010-Distributed Computing

TL;DR: A novel deterministic distributed algorithm for the maximal independent set problem is presented; it finishes in O(log* n) time on bounded-independence graphs and also solves the connected dominating set problem for unit disk graphs in O(log* n) time, exponentially faster than the state-of-the-art algorithm.

Abstract: We present a novel distributed algorithm for the maximal independent set problem (This is an extended journal version of Schneider and Wattenhofer in Twenty-seventh annual ACM SIGACT-SIGOPS symposium on principles of distributed computing, 2008). On bounded-independence graphs our deterministic algorithm finishes in O(log* n) time, n being the number of nodes. In light of Linial’s Ω(log* n) lower bound our algorithm is asymptotically optimal. Furthermore, it solves the connected dominating set problem for unit disk graphs in O(log* n) time, exponentially faster than the state-of-the-art algorithm. With a new extension our algorithm also computes a δ + 1 coloring and a maximal matching in O(log* n) time, where δ is the maximum degree of the graph.

84 citations

Journal Article•10.1214/09-AOS775•

On optimality of the Shiryaev–Roberts procedure for detecting a change in distribution

Aleksey S. Polunchenko, Alexander G. Tartakovsky

01 Dec 2010-Annals of Statistics

TL;DR: In this article, the authors provide a counterexample which shows that Pollak's procedure is not optimal and that there is a strictly optimal procedure which is nothing but the Shiryaev-Roberts procedure that starts with a specially designed deterministic point.

Abstract: In 1985, for detecting a change in distribution, Pollak introduced a specific minimax performance metric and a randomized version of the Shiryaev–Roberts procedure where the zero initial condition is replaced by a random variable sampled from the quasi-stationary distribution of the Shiryaev–Roberts statistic. Pollak proved that this procedure is third-order asymptotically optimal as the mean time to false alarm becomes large. The question of whether Pollak’s procedure is strictly minimax for any false alarm rate has been open for more than two decades, and there were several attempts to prove this strict optimality. In this paper, we provide a counterexample which shows that Pollak’s procedure is not optimal and that there is a strictly optimal procedure which is nothing but the Shiryaev–Roberts procedure that starts with a specially designed deterministic point.
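
The Shiryaev–Roberts statistic itself follows the standard recursion R_n = (1 + R_{n-1}) Λ_n, where Λ_n is the likelihood ratio of the n-th observation, and an alarm is raised once R_n crosses a threshold. The sketch below assumes a unit-variance Gaussian whose mean shifts from 0 to `theta`; the function name, the parameter `r0` for the starting point, and the threshold value are illustrative assumptions, and the specially designed starting point from the paper is not computed here.

```python
import math
from typing import Iterable, Optional


def shiryaev_roberts_alarm(
    xs: Iterable[float], theta: float, threshold: float, r0: float = 0.0
) -> Optional[int]:
    """Return the 1-based index of the first alarm, or None if none occurs.

    Assumes observations are N(0, 1) before the change and N(theta, 1) after.
    r0 = 0 is the classical zero initial condition; other deterministic or
    random starting points give the variants discussed in the abstract.
    """
    r = r0
    for n, x in enumerate(xs, start=1):
        # Likelihood ratio of N(theta, 1) against N(0, 1) at x.
        likelihood_ratio = math.exp(theta * x - 0.5 * theta * theta)
        r = (1.0 + r) * likelihood_ratio
        if r >= threshold:
            return n
    return None


# Noise-free toy data (an assumption for reproducibility): the mean jumps
# from 0 to 1 after the 20th observation.
xs = [0.0] * 20 + [1.0] * 20
alarm = shiryaev_roberts_alarm(xs, theta=1.0, threshold=50.0)
```

On this toy input the statistic stays small before the change and grows geometrically afterwards, so the alarm falls a few steps after observation 20.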

80 citations

Proceedings Article•10.1109/CCC.2010.27•

The Gaussian Surface Area and Noise Sensitivity of Degree-d Polynomial Threshold Functions

Daniel M. Kane¹•Institutions (1)

Stanford University¹

9 Jun 2010

TL;DR: Asymptotically optimal bounds on the Gaussian noise sensitivity of degree-d polynomial threshold functions are proved; these translate into optimal bounds on the Gaussian surface area of such functions, and therefore imply new bounds on the running time of agnostic learning algorithms.

Abstract: We prove asymptotically optimal bounds on the Gaussian noise sensitivity of degree-d polynomial threshold functions. These bounds translate into optimal bounds on the Gaussian surface area of such functions, and therefore imply new bounds on the running time of agnostic learning algorithms.

71 citations

Posted Content•

An Optimal Family of Exponentially Accurate One-Bit Sigma-Delta Quantization Schemes

Percy Deift, C. Sinan Güntürk, Felix Krahmer

28 Jan 2010-arXiv: Information Theory

TL;DR: In this paper, the authors study the minimization problem that corresponds to optimizing the error decay rate for this class of feedback filters and find asymptotically optimal solutions of the original problem.

Abstract: Sigma-Delta modulation is a popular method for analog-to-digital conversion of bandlimited signals that employs coarse quantization coupled with oversampling. The standard mathematical model for the error analysis of the method measures the performance of a given scheme by the rate at which the associated reconstruction error decays as a function of the oversampling ratio $\lambda$. It was recently shown that exponential accuracy of the form $O(2^{-r\lambda})$ can be achieved by appropriate one-bit Sigma-Delta modulation schemes. By general information-entropy arguments $r$ must be less than 1. The current best known value for $r$ is approximately 0.088. The schemes that were designed to achieve this accuracy employ the "greedy" quantization rule coupled with feedback filters that fall into a class we call "minimally supported". In this paper, we study the minimization problem that corresponds to optimizing the error decay rate for this class of feedback filters. We solve a relaxed version of this problem exactly and provide explicit asymptotics of the solutions. From these relaxed solutions, we find asymptotically optimal solutions of the original problem, which improve the best known exponential error decay rate to $r \approx 0.102$. Our method draws from the theory of orthogonal polynomials; in particular, it relates the optimal filters to the zero sets of Chebyshev polynomials of the second kind.
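
To make the "greedy" quantization rule concrete, here is a sketch of the simplest case, a first-order one-bit Sigma-Delta quantizer. It is not one of the high-order schemes with minimally supported feedback filters studied in the paper; the function name and the constant test signal are assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple


def first_order_sigma_delta(samples: Sequence[float]) -> Tuple[List[int], float]:
    """Quantize samples in [-1, 1] to bits in {-1, +1} (hypothetical helper).

    State recursion: u_n = u_{n-1} + x_n - q_n. The greedy rule picks the bit
    q_n that minimizes |u_n|, which is the sign of u_{n-1} + x_n. Returns the
    bits and the final state.
    """
    u = 0.0
    bits: List[int] = []
    for x in samples:
        v = u + x
        q = 1 if v >= 0 else -1  # greedy choice keeps |u| <= 1
        u = v - q
        bits.append(q)
    return bits, u


x_const = 0.3
n_samples = 1000
bits, final_state = first_order_sigma_delta([x_const] * n_samples)
mean_bit = sum(bits) / n_samples  # running average approximates x_const
```

Because the state stays bounded, the average of the bits differs from the input by at most about 1/N after N samples. That is only polynomial decay; the exponential rates discussed in the abstract require the higher-order filters.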

36 citations

Journal Article•10.1016/J.PEVA.2009.09.007•

Analysis of SITA policies

Eitan Bachmat¹, Hagit Sarfati¹•Institutions (1)

Ben-Gurion University of the Negev¹

01 Feb 2010-Performance Evaluation

TL;DR: This work analyzes the performance of Size Interval Task Assignment (SITA) policies for multi-host assignment in a non-preemptive environment, determines asymptotically optimal cutoff values, and provides asymptotic formulas for average waiting time and slowdown.

33 citations

Journal Article•10.1007/S00184-008-0227-Y•

One-step ahead adaptive D-optimal design on a finite design space is asymptotically optimal

Luc Pronzato¹•Institutions (1)

University of Nice Sophia Antipolis¹

01 Mar 2010-Metrika

TL;DR: In this paper, the consistency of parameter estimators in adaptive designs generated by a one-step ahead D-optimal algorithm is studied. It is shown that when the design space is finite, under mild conditions the least-squares estimator in a nonlinear regression model is strongly consistent and the information matrix evaluated at the current estimated value of the parameters strongly converges to the D-optimal matrix for the unknown true value of the parameters.

Abstract: We study the consistency of parameter estimators in adaptive designs generated by a one-step ahead D-optimal algorithm. We show that when the design space is finite, under mild conditions the least-squares estimator in a nonlinear regression model is strongly consistent and the information matrix evaluated at the current estimated value of the parameters strongly converges to the D-optimal matrix for the unknown true value of the parameters. A similar property is shown to hold for maximum-likelihood estimation in Bernoulli trials (dose-response experiments). Some examples are presented.

32 citations

Journal Article•10.1109/TNET.2010.2043368•

Path diversity over packet switched networks: performance analysis and rate allocation

Shervan Fashandi¹, Shahab Oveis Gharan¹, Amir K. Khandani¹•Institutions (1)

University of Waterloo¹

01 Oct 2010-IEEE ACM Transactions on Networking

TL;DR: Forward error correction (FEC) is applied across multiple independent paths to enhance the end-to-end reliability and it is proved that the probability of irrecoverable loss (PE) decays exponentially with the number of paths.

Abstract: Path diversity works by setting up multiple parallel connections between the endpoints using the topological path redundancy of the network. In this paper, forward error correction (FEC) is applied across multiple independent paths to enhance the end-to-end reliability. We prove that the probability of irrecoverable loss (PE) decays exponentially with the number of paths. Furthermore, the rate allocation (RA) problem across independent paths is studied. Our objective is to find the optimal RA, i.e., the allocation that minimizes PE. The RA problem is solved for a large number of paths. Moreover, it is shown that in such asymptotically optimal RA, each path is assigned a positive rate iff its quality is above a certain threshold. Finally, using a memoization technique, a heuristic suboptimal algorithm with polynomial runtime is proposed for RA over a finite number of paths. This algorithm converges to the asymptotically optimal RA when the number of paths is large. For a practical number of paths, the simulation results demonstrate the close-to-optimal performance of the proposed algorithm.

31 citations

Posted Content•

Asymptotically Optimal Randomized Rumor Spreading

Benjamin Doerr¹, Mahmoud Fouz²•Institutions (2)

Max Planck Society¹, Saarland University²

08 Nov 2010-arXiv: Data Structures and Algorithms

TL;DR: To the best of the authors' knowledge, this is the first randomized push algorithm that achieves an asymptotically optimal running time; it can be extended to also deal with adversarial node failures.

Abstract: We propose a new protocol solving the fundamental problem of disseminating a piece of information to all members of a group of $n$ players. It builds upon the classical randomized rumor spreading protocol and several extensions. The main achievements are the following: Our protocol spreads the rumor to all other nodes in the asymptotically optimal time of $(1 + o(1)) \log_2 n$. The whole process can be implemented in a way such that only $O(n f(n))$ calls are made, where $f(n) = \omega(1)$ can be arbitrary. In contrast to other protocols suggested in the literature, our algorithm only uses push operations, i.e., only informed nodes take active actions in the network. To the best of our knowledge, this is the first randomized push algorithm that achieves an asymptotically optimal running time.
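
The classical randomized push protocol that this work builds on can be simulated in a few lines. The sketch below is that classical baseline, not the improved protocol of the paper; the function name, the seed, and the group size are assumptions chosen for illustration.

```python
import math
import random


def push_rounds(n: int, rng: random.Random) -> int:
    """Simulate classical push rumor spreading and return the round count.

    In every round each node that was informed at the start of the round
    calls one other node chosen uniformly at random and passes on the rumor.
    """
    informed = [False] * n
    informed[0] = True
    count = 1
    rounds = 0
    while count < n:
        callers = [i for i in range(n) if informed[i]]  # snapshot for this round
        for i in callers:
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1  # uniform over the other n - 1 nodes
            if not informed[j]:
                informed[j] = True
                count += 1
        rounds += 1
    return rounds


n_nodes = 256
rounds = push_rounds(n_nodes, random.Random(0))
lower_bound = math.ceil(math.log2(n_nodes))  # informed set at most doubles per round
```

Since only informed nodes call and each makes one call per round, no push protocol can finish in fewer than ceil(log2 n) rounds, which is why a running time of $(1 + o(1)) \log_2 n$ counts as asymptotically optimal.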

Book Chapter•10.1002/9780470824443.CH3•

Asymptotically Optimal Tests

John C. W. Rayner¹, Olivier Thas², Donald John Best¹•Institutions (2)

University of Newcastle¹, Ghent University²

29 Jan 2010

Journal Article•10.1007/S10463-008-0185-1•

Oracle inequality for conditional density estimation and an actuarial example

Sam Efromovich¹•Institutions (1)

University of Texas at Dallas¹

01 Apr 2010-Annals of the Institute of Statistical Mathematics

TL;DR: In this paper, a nonparametric data-driven estimator is shown to match the performance of an oracle: it adapts to an unknown design of predictors, performs a dimension reduction if the response does not depend on the predictor, and is minimax over a vast set of anisotropic bivariate function classes.

Abstract: Conditional density estimation in a parametric regression setting, where the problem is to estimate a parametric density of the response given the predictor, is a classical and prominent topic in regression analysis. This article explores this problem in a nonparametric setting where no assumption about shape of an underlying conditional density is made. For the first time in the literature, it is proved that there exists a nonparametric data-driven estimator that matches performance of an oracle which: (i) knows the underlying conditional density, (ii) adapts to an unknown design of predictors, (iii) performs a dimension reduction if the response does not depend on the predictor, (iv) is minimax over a vast set of anisotropic bivariate function classes. All these results are established via an oracle inequality which is on par with ones known in the univariate density estimation literature. Further, the asymptotically optimal estimator is tested on an interesting actuarial example which explores a relationship between credit scoring and premium for basic auto-insurance for 54 undergraduate college students.

Journal Article•10.1109/TNET.2009.2032230•

Asymptotically optimal data dissemination in multichannel wireless sensor networks: single radios suffice

David Starobinski¹, Weiyao Xiao¹•Institutions (1)

Boston University¹

01 Jun 2010-IEEE ACM Transactions on Networking

TL;DR: This work focuses on two special classes of network topologies of practical interest, namely single-hop clusters and multihop cluster chains, and derives the structure of policies that achieve an asymptotically optimal average delay, in networks with large number of nodes.

Abstract: We analyze the performance limits of data dissemination with multichannel, single radio sensors under random packet loss. We formulate the problem of minimizing the average delay of data dissemination as a stochastic shortest path problem and show that, for an arbitrary topology network, an optimal control policy can be found in a finite number of steps, using value iteration or Dijkstra's algorithm. However, the computational complexity of this solution is generally prohibitive. We thus focus on two special classes of network topologies of practical interest, namely single-hop clusters and multihop cluster chains. For these topologies, we derive the structure of policies that achieve an asymptotically optimal average delay, in networks with large number of nodes. Our analysis reveals that a single radio in each node suffices to achieve performance gain directly proportional to the total number of channels available. Through simulation, we show that the derived policies perform close to optimal even for networks with small and moderate numbers of nodes and can be implemented with limited overhead.

Book Chapter•10.1016/B978-0-12-374726-6.00012-6•

Second-order methods based on color

Arie Yeredor

1 Jan 2010

TL;DR: Under the assumption of Gaussian sources, the ML estimate takes a relatively simple form and is based on SOS alone. Under the same assumption, it is also possible to apply optimal weighting to the joint-diagonalization based approach, obtaining estimates which are asymptotically optimal and thus asymptotically equivalent to the ML estimate.

Abstract: Random processes with second-order temporal statistical structures are classically divided into two broad classes: wide-sense stationary (WSS) processes and nonstationary processes. WSS processes are characterized by the property that the statistical correlation between any two samples thereof depends only on the time-difference between the sampling instants. Such processes exhibit many significant properties in the time and frequency domains, which can be conveniently exploited. The second-order statistics (SOS)-based separation approaches for WSS sources can roughly be divided into two categories: approaches exploiting the special structure of the correlation matrices through (approximate) joint diagonalization and approaches based on the principle of maximum likelihood (ML). ML estimation is based on more than SOS in general, but under the assumption of Gaussian sources, the ML estimate takes a relatively simple form and is indeed based on SOS alone. Using the Gaussianity assumption, it is also possible to apply optimal weighting to the joint-diagonalization based approach, thus obtaining estimates which are asymptotically optimal, and are thus asymptotically equivalent to the ML estimate.

Journal Article•10.1109/TIT.2010.2048487•

Optical Orthogonal Signature Pattern Codes With Maximum Collision Parameter $2$ and Weight $4$

Masanori Sawa¹•Institutions (1)

Nagoya University¹

01 Jul 2010-IEEE Transactions on Information Theory

TL;DR: It is proved that for a multiple n of 4, there exists no optimal OOSPC of size 6 n with weight 4 and maximum collision parameter 2, which shows a gap between optimal OOCs and optimal OOSPCs when 6 and n are not coprime.

Abstract: An optical orthogonal signature pattern code (OOSPC) finds application in transmitting 2-D images through multicore fiber in code-division multiple-access (CDMA) communication systems. Observing a one-to-one correspondence between an OOSPC and a certain combinatorial subject, called a packing design, we present a construction of optimal OOSPCs with weight 4 and maximum collision parameter 2, which generalizes a well-known Kohler construction of optimal optical orthogonal codes (OOC) with weight 4 and maximum collision parameter 2. Using this new construction enables one to obtain infinitely many optimal OOSPCs, whose existence was previously unknown. We prove that for a multiple n of 4, there exists no optimal OOSPC of size 6 n with weight 4 and maximum collision parameter 2, together with a report which shows a gap between optimal OOCs and optimal OOSPCs when 6 and n are not coprime. We also present a recursive construction of OOSPCs which are asymptotically optimal with respect to the Johnson bound. As a by-product, we obtain an asymptotically optimal (m, n, 4, 2)-OOSPC for all positive integers m and n.

Journal Article•10.1007/S10489-010-0228-1•

Optimal sampling for estimation with constrained resources using a learning automaton-based solution for the nonlinear fractional knapsack problem

Ole-Christoffer Granmo¹, B. John Oommen²•Institutions (2)

University of Agder¹, Carleton University²

01 Aug 2010-Applied Intelligence

TL;DR: This paper considers the problem of allocating limited sampling resources in a "real-time" manner, with the explicit purpose of estimating multiple binomial proportions, and presents a completely new on-line Learning Automata (LA) system, the Hierarchy of Twofold Resource Allocation Automata (H-TRAA), whose primitive component is a Twofold Resource Allocation Automaton (TRAA); both are asymptotically optimal.

Abstract: While training and estimation for Pattern Recognition (PR) have been extensively studied, the question of achieving these when the resources are both limited and constrained is relatively open. This is the focus of this paper. We consider the problem of allocating limited sampling resources in a "real-time" manner, with the explicit purpose of estimating multiple binomial proportions (the extension of these results to non-binomial proportions is, in our opinion, rather straightforward). More specifically, the user is presented with $n$ training sets of data points, $S_1, S_2, \ldots, S_n$, where the set $S_i$ has $N_i$ points drawn from two classes $\{\omega_1, \omega_2\}$. A random sample in set $S_i$ belongs to $\omega_1$ with probability $u_i$ and to $\omega_2$ with probability $1 - u_i$, with $\{u_i\}$, $i = 1, 2, \ldots, n$, being the quantities to be learnt. The problem is both interesting and non-trivial because while both $n$ and each $N_i$ are large, the number of samples that can be drawn is bounded by a constant, $c$. A web-related problem which is based on this model (Snaprud et al., The Accessibility for All Conference, 2003) is intriguing because the sampling resources can only be allocated optimally if the binomial proportions are already known. Further, no non-automaton solution has ever been reported if these proportions are unknown and must be sampled. Using the general LA philosophy as a paradigm to tackle this real-life problem, our scheme improves a current solution in an online manner, through a series of informed guesses which move towards the optimal solution. We solve the problem by first modelling it as a Stochastic Non-linear Fractional Knapsack Problem. We then present a completely new on-line Learning Automata (LA) system, namely, the Hierarchy of Twofold Resource Allocation Automata (H-TRAA), whose primitive component is a Twofold Resource Allocation Automaton (TRAA), both of which are asymptotically optimal. Furthermore, we demonstrate empirically that the H-TRAA provides orders of magnitude faster convergence compared to the Learning Automata Knapsack Game (LAKG) which represents the state-of-the-art. Finally, in contrast to the LAKG, the H-TRAA scales sub-linearly. Based on these results, we believe that the H-TRAA has also tremendous potential to handle demanding real-world applications, particularly those dealing with the world wide web.

Journal Article•10.1109/TIT.2010.2080411•

$n$-Channel Asymmetric Entropy-Constrained Multiple-Description Lattice Vector Quantization

Jan Ostergaard¹, Richard Heusdens², Jesper Jensen•Institutions (2)

Aalborg University¹, Delft University of Technology²

01 Dec 2010-IEEE Transactions on Information Theory

TL;DR: This paper presents the design and analysis of an index-assignment (IA)-based multiple-description coding scheme for the n-channel asymmetric case and shows that, for three descriptions, in the limit of large lattice vector dimensions, points on the inner bound of Pradhan can be achieved.

Abstract: This paper is about the design and analysis of an index-assignment (IA)-based multiple-description coding scheme for the n-channel asymmetric case. We use entropy constrained lattice vector quantization and restrict attention to simple reconstruction functions, which are given by the inverse IA function when all descriptions are received or otherwise by a weighted average of the received descriptions. We consider smooth sources with finite differential entropy rate and MSE fidelity criterion. As in previous designs, our construction is based on nested lattices which are combined through a single IA function. The results are exact under high-resolution conditions and asymptotically as the nesting ratios of the lattices approach infinity. For any n, the design is asymptotically optimal within the class of IA-based schemes. Moreover, in the case of two descriptions and finite lattice vector dimensions greater than one, the performance is strictly better than that of existing designs. In the case of three descriptions, we show that in the limit of large lattice vector dimensions, points on the inner bound of Pradhan can be achieved. Furthermore, for three descriptions and finite lattice vector dimensions, we show that the IA-based approach yields, in the symmetric case, a smaller rate loss than the recently proposed source-splitting approach.

Journal Article•

Optimal inverse Beta (3,3) transformation in kernel density estimation

Catalina Bolancé¹•Institutions (1)

University of Barcelona¹

01 Jan 2010-Sort-statistics and Operations Research Transactions

TL;DR: A double transformation kernel density estimator suitable for heavy-tailed distributions is presented; simulations show that it performs better than existing alternatives, and an application to insurance claim cost data is included.

Abstract: A double transformation kernel density estimator that is suitable for heavy-tailed distributions is presented. Using a double transformation, an asymptotically optimal bandwidth parameter can be calculated when minimizing the expression of the asymptotic mean integrated squared error of the transformed variable. Simulation results are presented showing that this approach performs better than existing alternatives. An application to insurance claim cost data is included.

Journal Article•10.1145/1868237.1868249•

I/O-efficient batched union-find and its applications to terrain analysis

Pankaj K. Agarwal¹, Lars Arge², Ke Yi³•Institutions (3)

Duke University¹, Aarhus University², Hong Kong University of Science and Technology³

08 Dec 2010-ACM Transactions on Algorithms

TL;DR: In this article, an I/O-efficient algorithm for the batched (off-line) version of the union-find problem is presented, and a simple and practical O(SORT(N) log(N/M))-I/O algorithm for this problem is described and implemented.

Abstract: In this article we present an I/O-efficient algorithm for the batched (off-line) version of the union-find problem. Given any sequence of N union and find operations, where each union operation joins two distinct sets, our algorithm uses $O(\mathrm{SORT}(N)) = O((N/B) \log_{M/B}(N/B))$ I/Os, where M is the memory size and B is the disk block size. This bound is asymptotically optimal in the worst case. If there are union operations that join a set with itself, our algorithm uses O(SORT(N) + MST(N)) I/Os, where MST(N) is the number of I/Os needed to compute the minimum spanning tree of a graph with N edges. We also describe a simple and practical O(SORT(N) log(N/M))-I/O algorithm for this problem, which we have implemented. We are interested in the union-find problem because of its applications in terrain analysis. A terrain can be abstracted as a height function defined over $\mathbb{R}^2$, and many problems that deal with such functions require a union-find data structure. With the emergence of modern mapping technologies, huge amounts of elevation data are being generated that are too large to fit in memory, thus I/O-efficient algorithms are needed to process this data efficiently. In this article, we study two terrain-analysis problems that benefit from a union-find data structure: (i) computing topological persistence and (ii) constructing the contour tree. We give the first O(SORT(N))-I/O algorithms for these two problems, assuming that the input terrain is represented as a triangular mesh with N vertices.
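
For readers unfamiliar with the problem, the batched union-find interface can be illustrated with the textbook in-memory structure (union by size with path compression). This is not the I/O-efficient algorithm of the article, which is designed for data that does not fit in memory; the class, the `run_batch` helper, and the operation encoding are assumptions introduced here.

```python
from typing import List, Sequence, Tuple


class UnionFind:
    """Textbook in-memory disjoint-set forest (not I/O-efficient)."""

    def __init__(self, n: int) -> None:
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x: int) -> int:
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path at the root.
        while self.parent[x] != root:
            next_node = self.parent[x]
            self.parent[x] = root
            x = next_node
        return root

    def union(self, a: int, b: int) -> None:
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra  # attach the smaller tree under the larger
        self.size[ra] += self.size[rb]


def run_batch(n: int, ops: Sequence[Tuple[str, int, int]]) -> List[bool]:
    """Process a whole sequence of operations known in advance.

    ("union", a, b) merges two sets; ("same", a, b) asks whether a and b are
    currently in the same set. Returns the answers to the "same" queries.
    """
    uf = UnionFind(n)
    answers: List[bool] = []
    for kind, a, b in ops:
        if kind == "union":
            uf.union(a, b)
        else:
            answers.append(uf.find(a) == uf.find(b))
    return answers


ops = [("union", 0, 1), ("same", 0, 2), ("union", 1, 2), ("same", 0, 2)]
answers = run_batch(3, ops)
```

Each `find` here follows pointers to arbitrary memory locations, which is exactly the access pattern that becomes expensive on disk and that a sorting-based batched algorithm avoids.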

Posted Content•

The Local Lemma Is Tight for SAT

Heidi Gebauer¹, Tibor Szabó², Gábor Tardos³•Institutions (3)

ETH Zurich¹, Free University of Berlin², Simon Fraser University³

03 Jun 2010-arXiv: Combinatorics

TL;DR: In this article, the authors construct unsatisfiable k-CNF formulas where every clause has k distinct literals and every variable appears in at most (2/e + o(1))2^{k}/k clauses, and show that the lower bound on l(k) obtained from the Local Lemma is asymptotically optimal.

Abstract: We construct unsatisfiable k-CNF formulas where every clause has k distinct literals and every variable appears in at most (2/e + o(1))2^{k}/k clauses. The Lopsided Local Lemma, applied with assignment of random values according to counterintuitive probabilities, shows that our result is asymptotically best possible. The determination of this extremal function is particularly important as it represents the value where the k-SAT problem exhibits its complexity hardness jump: from having every instance being a YES-instance it becomes NP-hard just by allowing each variable to occur in one more clause. The asymptotics of other related extremal functions are also determined. Let l(k) denote the maximum number, such that every k-CNF formula with each clause containing k distinct literals and each clause having a common variable with at most l(k) other clauses, is satisfiable. We establish that the lower bound on l(k) obtained from the Local Lemma is asymptotically optimal, i.e., l(k) = (1/e + o(1))2^{k}. The construction of our unsatisfiable CNF-formulas is based on the binary tree approach of [16] and thus the constructed formulas are in the class MU(1) of minimal unsatisfiable formulas having one more clauses than variables. To obtain the asymptotically optimal binary trees we consider a continuous approximation of the problem, set up a differential equation and estimate its solution. The trees are then obtained through a discretization of this solution. The binary trees constructed also give asymptotically precise answers for seemingly unrelated problems like the European Tenure Game introduced by Doerr [9] and the search problem with bounded number of consecutive lies, considered in a problem of the 2012 IMO contest. As yet another consequence we slightly improve two bounds related to the Neighborhood Conjecture of Beck.
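
For context, the Local Lemma lower bound on l(k) referred to in this abstract follows from the standard symmetric form of the lemma. The short derivation below is a textbook sketch rather than material from the paper, and the symbols $A_C$, $p$, and $d$ are notation introduced here. Assign each variable a uniformly random value and let $A_C$ be the event that clause $C$ is violated; $A_C$ depends only on the clauses sharing a variable with $C$, of which there are at most $d$.

```latex
\Pr[A_C] = 2^{-k} =: p,
\qquad
e \, p \, (d + 1) \le 1
\iff
d \le \frac{2^{k}}{e} - 1 .
```

The symmetric Local Lemma then guarantees that with positive probability no clause is violated, so $l(k) \ge \lfloor 2^{k}/e \rfloor - 1$. The abstract's result $l(k) = (1/e + o(1))2^{k}$ says this simple bound cannot be improved in the leading term.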

Posted Content•

The Local Lemma is asymptotically tight for SAT

Heidi Gebauer¹, Tibor Szabó², Gábor Tardos•Institutions (2)

Zürcher Fachhochschule¹, Free University of Berlin²

03 Jun 2010-arXiv: Combinatorics

TL;DR: In this paper, the authors construct unsatisfiable k-CNF formulas where every clause has k distinct literals and every variable appears in at most (2/e + o(1))*2^k/k clauses.

Abstract: The Local Lemma is a fundamental tool of probabilistic combinatorics and theoretical computer science, yet there are hardly any natural problems known where it provides an asymptotically tight answer. The main theme of our paper is to identify several of these problems, among them a couple of widely studied extremal functions related to certain restricted versions of the k-SAT problem, where the Local Lemma does give essentially optimal answers. As our main contribution, we construct unsatisfiable k-CNF formulas where every clause has k distinct literals and every variable appears in at most (2/e + o(1))*2^k/k clauses. The Lopsided Local Lemma shows that this is asymptotically best possible. The determination of this extremal function is particularly important as it represents the value where the corresponding k-SAT problem exhibits a complexity hardness jump: from having every instance being a YES-instance it becomes NP-hard just by allowing each variable to occur in one more clause. The construction of our unsatisfiable CNF-formulas is based on the binary tree approach of [16] and thus the constructed formulas are in the class MU(1) of minimal unsatisfiable formulas having one more clause than variables. The main novelty of our approach here comes in setting up an appropriate continuous approximation of the problem. This leads us to a differential equation, the solution of which we are able to estimate. The asymptotically optimal binary trees are then obtained through a discretization of this solution. The importance of the binary trees constructed is also underlined by their appearance in many other scenarios. In particular, they give asymptotically precise answers for seemingly unrelated problems like the European Tenure Game introduced by Doerr [9] and a search problem allowing a limited number of consecutive lies.

Journal Article•10.2139/SSRN.1649983•

A Class of Simple Distribution-Free Rank-Based Unit Root Tests

Marc Hallin¹, Ramon van den Akker², Bas J. M. Werker²•Institutions (2)

Université libre de Bruxelles¹, Tilburg University²

30 Jun 2010-Social Science Research Network

TL;DR: In this paper, a class of distribution-free rank-based tests for the null hypothesis of a unit root is proposed, indexed by the choice of a reference density g, which need not coincide with the unknown actual innovation density f. The validity of these tests, in terms of exact finite sample size, is guaranteed by distribution-freeness.

Abstract: We propose a class of distribution-free rank-based tests for the null hypothesis of a unit root. This class is indexed by the choice of a reference density g, which need not coincide with the unknown actual innovation density f. The validity of these tests, in terms of exact finite sample size, is guaranteed, irrespective of the actual underlying density, by distribution-freeness. Those tests are locally and asymptotically optimal under a particular asymptotic scheme, for which we provide a complete analysis of asymptotic relative efficiencies. Rather than asymptotic optimality, however, we emphasize finite-sample performances, and show that our rank-based tests perform significantly better than the traditional Dickey-Fuller tests.

Book Chapter•10.1007/978-3-642-13284-1_8•

Space-optimal rendezvous of mobile agents in asynchronous trees

Daisuke Baba¹, Tomoko Izumi², Fukuhito Ooshita¹, Hirotsugu Kakugawa¹, Toshimitsu Masuzawa¹•Institutions (2)

Osaka University¹, Ritsumeikan University²

7 Jun 2010

TL;DR: This work investigates the relation between the time complexity and the space complexity for the rendezvous problem with k agents in asynchronous tree networks and presents an asymptotically space-optimal rendezvous algorithm in which each agent uses only O(log n) bits of memory.

Abstract: We investigate the relation between the time complexity and the space complexity for the rendezvous problem with k agents in asynchronous tree networks. The rendezvous problem requires that all the agents in the system meet at a single node within finite time. First, we consider asymptotically time-optimal algorithms and investigate the minimum memory requirement per agent for such algorithms. We show that there exists a tree with n nodes in which Ω(n) bits of memory per agent are required to solve the rendezvous problem in O(n) time (asymptotically time-optimal). Then, we present an asymptotically time-optimal rendezvous algorithm. This algorithm can be executed if each agent has O(n) bits of memory. From this lower/upper bound, this algorithm is asymptotically space-optimal on the condition that the time complexity is asymptotically optimal. Finally, we consider asymptotically space-optimal algorithms while allowing slowdown in the time required to achieve rendezvous. We present an asymptotically space-optimal algorithm in which each agent uses only O(log n) bits of memory. This algorithm terminates in O(Δn^8) time, where Δ is the maximum degree of the tree.

Journal Article•10.1016/J.APM.2009.06.014•

A new heuristic for open shop total completion time problem

Lixin Tang¹, Danyu Bai¹•Institutions (1)

Northeastern University (China)¹

01 Mar 2010-Applied Mathematical Modelling

TL;DR: It is proved that the heuristic is asymptotically optimal when the job number goes to infinity.

Journal Article•10.1051/PS:2008026•

Asymptotically optimal quantization schemes for Gaussian processes on Hilbert spaces

Harald Luschgy¹, Gilles Pagès², Benedikt Wilbertz²•Institutions (2)

University of Trier¹, University of Paris²

01 May 2010-Esaim: Probability and Statistics

TL;DR: In this paper, the authors describe quantization designs which lead to asymptotically and order optimal functional quantizers for Gaussian processes in a Hilbert space setting, where regular variation of the eigenvalues of the covariance operator plays a crucial role to achieve these rates.

Abstract: We describe quantization designs which lead to asymptotically and order optimal functional quantizers for Gaussian processes in a Hilbert space setting. Regular variation of the eigenvalues of the covariance operator plays a crucial role to achieve these rates. For the development of a constructive quantization scheme we rely on the knowledge of the eigenvectors of the covariance operator in order to transform the problem into a finite dimensional quantization problem of normal distributions.

Posted Content•

n-Channel Asymmetric Entropy-Constrained Multiple-Description Lattice Vector Quantization

Jan Ostergaard¹, Richard Heusdens², Jesper Jensen•Institutions (2)

Aalborg University¹, Delft University of Technology²

30 Nov 2010-arXiv: Information Theory

TL;DR: In this article, an index-assignment (IA) based multiple-description coding scheme for the n-channel asymmetric case was proposed and the results were exact under high-resolution conditions and asymptotically as the nesting ratios of the lattices approach infinity.

Abstract: This paper is about the design and analysis of an index-assignment (IA) based multiple-description coding scheme for the n-channel asymmetric case. We use entropy constrained lattice vector quantization and restrict attention to simple reconstruction functions, which are given by the inverse IA function when all descriptions are received or otherwise by a weighted average of the received descriptions. We consider smooth sources with finite differential entropy rate and MSE fidelity criterion. As in previous designs, our construction is based on nested lattices which are combined through a single IA function. The results are exact under high-resolution conditions and asymptotically as the nesting ratios of the lattices approach infinity. For any n, the design is asymptotically optimal within the class of IA-based schemes. Moreover, in the case of two descriptions and finite lattice vector dimensions greater than one, the performance is strictly better than that of existing designs. In the case of three descriptions, we show that in the limit of large lattice vector dimensions, points on the inner bound of Pradhan et al. can be achieved. Furthermore, for three descriptions and finite lattice vector dimensions, we show that the IA-based approach yields, in the symmetric case, a smaller rate loss than the recently proposed source-splitting approach.

Proceedings Article•10.1109/ALLERTON.2010.5706921•

Distributed detection over time varying networks: Large deviations analysis

Dragana Bajovic¹, Dusan Jakovetic¹, Joao Xavier¹, Bruno Sinopoli², Jose M. F. Moura²•Institutions (2)

Instituto Superior Técnico¹, Carnegie Mellon University²

1 Sep 2010

TL;DR: In this article, the authors apply large deviations theory to study the asymptotic performance of running consensus distributed detection in sensor networks, and show through large deviations that, under stated assumptions on the network connectivity and sensors' observations, the running consensus detection asymptotically approaches in performance the optimal centralized detection.

Abstract: We apply large deviations theory to study the asymptotic performance of running consensus distributed detection in sensor networks. Running consensus is a recently proposed stochastic approximation type algorithm. At each time step k, the state at each sensor is updated by a local averaging of the sensor's own state and the states of its neighbors (consensus) and by accounting for the new observations (innovation). We assume Gaussian, spatially correlated observations. We allow the underlying network to be time varying, provided that the graph that collects the union of links that are online at least once over a finite time window is connected. This paper shows through large deviations that, under stated assumptions on the network connectivity and sensors' observations, the running consensus detection asymptotically approaches in performance the optimal centralized detection. That is, the Bayes probability of detection error (with the running consensus detector) decays exponentially to zero as k → ∞ at the Chernoff information rate, the best achievable rate of the asymptotically optimal centralized detector.
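
The consensus-plus-innovation update described above can be sketched as follows. This is a minimal illustration of one common form of running consensus, not necessarily the exact recursion analysed in the paper: the function name `running_consensus_step`, the weight matrix `W`, and the use of local log-likelihood ratios as the innovation term are all assumptions made for the example.

```python
from typing import List


def running_consensus_step(
    x: List[float],          # sensor states after step k-1
    W: List[List[float]],    # assumed mixing weights; W[i][j] > 0 only for neighbours
    llr: List[float],        # assumed innovation: new local log-likelihood ratios
    k: int,                  # current time step, k >= 1
) -> List[float]:
    """One step of a running-consensus style update (illustrative form).

    Consensus: each sensor averages its own state with its neighbours' states.
    Innovation: the new local observation enters with weight 1/k, so the state
    behaves like a running average over time.
    """
    n = len(x)
    # Local averaging over neighbours (consensus part).
    mixed = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
    # Running-average combination with the new observation (innovation part).
    return [((k - 1) / k) * mixed[i] + (1.0 / k) * llr[i] for i in range(n)]
```

With doubly stochastic weights, the network-wide mean of the states equals the running average of the network-wide mean innovation, which is the quantity a centralized detector would threshold. Each sensor could then decide locally by comparing its own state to a threshold; the threshold rule is left out here because the abstract does not specify it.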

Posted Content•

Asymptotic Bayes optimality under sparsity for generally distributed effect sizes under the alternative

Florian Frommlet, Arijit Chakrabarti, Magdalena Murawska, Małgorzata Bogdan

26 May 2010-arXiv: Statistics Theory

TL;DR: Modifications of Bayesian Information Criterion are considered, controlling either FWER or FDR, and conditions are provided under which these selection criteria are ABOS, and the performance of these criteria is examined in a brief simulation study.

Abstract: Recent results concerning asymptotic Bayes-optimality under sparsity (ABOS) of multiple testing procedures are extended to fairly generally distributed effect sizes under the alternative. An asymptotic framework is considered where both the number of tests m and the sample size n go to infinity, while the fraction p of true alternatives converges to zero. It is shown that under mild restrictions on the loss function nontrivial asymptotic inference is possible only if n increases to infinity at least at the rate of log m. Based on this assumption precise conditions are given under which the Bonferroni correction with nominal Family Wise Error Rate (FWER) level alpha and the Benjamini-Hochberg procedure (BH) at FDR level alpha are asymptotically optimal. When n is proportional to log m then alpha can remain fixed, whereas when n increases to infinity at a quicker rate, then alpha has to converge to zero roughly like n^(-1/2). Under these conditions the Bonferroni correction is ABOS in case of extreme sparsity, while BH adapts well to the unknown level of sparsity. In the second part of this article these optimality results are carried over to model selection in the context of multiple regression with orthogonal regressors. Several modifications of Bayesian Information Criterion are considered, controlling either FWER or FDR, and conditions are provided under which these selection criteria are ABOS. Finally the performance of these criteria is examined in a brief simulation study.
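
For readers unfamiliar with the two procedures compared above, the following is a minimal sketch of the textbook Bonferroni correction and the Benjamini-Hochberg step-up procedure. It illustrates the standard definitions only, not the paper's asymptotic analysis; the function names and the example p-values are made up for illustration.

```python
from typing import List


def bonferroni(p_values: List[float], alpha: float) -> List[bool]:
    """Reject H_i when p_i <= alpha / m (controls FWER at level alpha)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values: List[float], alpha: float) -> List[bool]:
    """BH step-up: find the largest rank r with p_(r) <= r * alpha / m,
    then reject the r hypotheses with the smallest p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    r_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            r_max = rank  # keep the largest rank that passes
    reject = [False] * m
    for idx in order[:r_max]:
        reject[idx] = True
    return reject


# Hypothetical p-values, chosen only to show the two rules disagreeing.
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(sum(bonferroni(example, 0.05)), sum(benjamini_hochberg(example, 0.05)))
```

On this example BH rejects more hypotheses than Bonferroni, because its per-rank threshold grows with the number of small p-values; this is the sense in which BH adapts to the unknown level of sparsity.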

Posted Content•

I/O Efficient Algorithms for Matrix Computations

Sraban Kumar Mohanty

07 Jun 2010-arXiv: Data Structures and Algorithms

TL;DR: It is shown that techniques like rescheduling of computational steps, appropriate choosing of the blocking parameters and incorporating of more matrix-matrix operations, can be used to improve the I/O and seek complexities of matrix computations.

Abstract: We analyse some QR decomposition algorithms, and show that the I/O complexity of the tile based algorithm is asymptotically the same as that of matrix multiplication. This algorithm, we show, performs the best when the tile size is chosen so that exactly one tile fits in the main memory. We propose a constant factor improvement, as well as a new recursive cache oblivious algorithm with the same asymptotic I/O complexity. We design Hessenberg, tridiagonal, and bidiagonal reductions that use banded intermediate forms, and perform only asymptotically optimal numbers of I/Os; these are the first I/O optimal algorithms for these problems. In particular, we show that known slab based algorithms for two sided reductions all have suboptimal asymptotic I/O performances, even though they have been reported to do better than the traditional algorithms on the basis of empirical evidence. We propose new tile based variants of multishift QR and QZ algorithms that under certain conditions on the number of shifts, have better seek and I/O complexities than all known variants. We show that techniques like rescheduling of computational steps, appropriate choosing of the blocking parameters and incorporating of more matrix-matrix operations, can be used to improve the I/O and seek complexities of matrix computations.
