Book Chapter, DOI: 10.1007/978-3-642-02441-2_3
Generalized Substring Compression
Orgad Keller, Tsvi Kopelowitz, Shir Landau, Moshe Lewenstein
- 18 Jun 2009
- pp 26-38
23
TL;DR: This work focuses on generalized substring compression and presents the first non-trivial correct algorithm for the problem. In doing so, it proposes a method for finding the bounded longest common prefix of substrings, which may be of independent interest.
Abstract: In substring compression one is given a text to preprocess so that, upon request, a compressed substring is returned. Generalized substring compression is the same with the following twist. The queries contain an additional context substring (or a collection of context substrings) and the answers are the substring in compressed format, where the context substring is used to make the compression more efficient.
We focus our attention on generalized substring compression and present the first non-trivial correct algorithm for this problem. In our algorithm we inherently propose a method for finding the bounded longest common prefix of substrings, which may be of independent interest. In addition, we propose an efficient algorithm for substring compression which makes use of range searching for minimum queries.
We present several tradeoffs for both problems. For compressing the substring $S[i..j]$ (possibly with the substring $S[\alpha..\beta]$ as a context), best query times we achieve are $O(C)$ and $O\big(C\log\big(\frac{j-i}{C}\big)\big)$ for substring compression query and generalized substring compression query, respectively, where $C$ is the number of phrases encoded.
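To make the query in the abstract concrete, the following is a minimal brute-force sketch in Python. It assumes a greedy, Lempel-Ziv-style phrase parse (suggested by the phrase count $C$, but not spelled out in the abstract) and restricts self-references to non-overlapping copies for simplicity. The names `longest_match`, `compress`, and `decompress` are hypothetical, and the quadratic scan stands in for the paper's data structures; it does not reproduce their query times.

```python
from typing import List, Optional, Tuple

Range = Tuple[int, int]  # inclusive (lo, hi) index range into the text
Phrase = tuple           # ("lit", ch) or ("copy", start, length)


def longest_match(s: str, pos: int, end: int, sources: List[Range]) -> Tuple[int, int]:
    """Longest prefix of s[pos..end] that occurs entirely inside one source range.

    Brute-force stand-in for a bounded longest-common-prefix query.
    Returns (start, length); length is 0 when nothing matches.
    """
    best_start, best_len = -1, 0
    for lo, hi in sources:
        for start in range(lo, hi + 1):
            length = 0
            while (start + length <= hi and pos + length <= end
                   and s[start + length] == s[pos + length]):
                length += 1
            if length > best_len:
                best_start, best_len = start, length
    return best_start, best_len


def compress(s: str, i: int, j: int, context: Optional[Range] = None) -> List[Phrase]:
    """Greedy parse of s[i..j]; copies may come from the context or from s[i..pos-1]."""
    phrases: List[Phrase] = []
    pos = i
    while pos <= j:
        sources: List[Range] = []
        if context is not None:
            sources.append(context)
        if pos > i:
            sources.append((i, pos - 1))  # already-parsed part of the target
        start, length = longest_match(s, pos, j, sources)
        if length == 0:
            phrases.append(("lit", s[pos]))
            pos += 1
        else:
            phrases.append(("copy", start, length))
            pos += length
    return phrases


def decompress(phrases: List[Phrase], i: int,
               context_text: str = "", context_lo: int = 0) -> str:
    """Rebuild s[i..j] from the phrases, given only the context substring."""
    out: List[str] = []
    for p in phrases:
        if p[0] == "lit":
            out.append(p[1])
            continue
        _, start, length = p
        off = start - context_lo
        if 0 <= off and off + length <= len(context_text):
            out.extend(context_text[off:off + length])  # copy from the context
        else:
            off = start - i                             # copy from earlier output
            out.extend(out[off:off + length])
    return "".join(out)
```

For `s = "abracadabra"`, compressing `s[7..10]` ("abra") with `s[0..3]` as context yields a single copy phrase, whereas the same query without context needs four phrases. This is the sense in which the context substring makes the compression more efficient.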
Citations
Extracting powers and periods in a word from its runs structure
Maxime Crochemore, Costas S. Iliopoulos, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń
TL;DR: Lyndon words are used, and the Lyndon structure of runs is introduced as a useful tool for computing powers; for problems related to periods, some versions of the Manhattan skyline problem are used.
84
Generalized substring compression
TL;DR: An efficient algorithm for substring compression which makes use of range successor queries and a new method for finding the bounded longest common prefix of substrings, which may be of independent interest are proposed.
29
•Posted Content
Minimal Suffix and Rotation of a Substring in Optimal Time
TL;DR: In this paper, the substring minimal suffix queries are used to determine the lexicographically minimal non-empty suffix of a substring specified by the location of its occurrence in the text.
19
Internal Dictionary Matching
Panagiotis Charalampopoulos, Tomasz Kociumaka, Manal Mohamed, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń
TL;DR: Data structures answering queries concerning the occurrences of patterns from a given dictionary in fragments of a given string T of length n are introduced and tight—up to subpolynomial factors—upper and lower bounds for the case of a dynamic dictionary are provided.
Faster Range LCP Queries
Manish Patil, Rahul Shah, Sharma V. Thankachan
- 07 Oct 2013
TL;DR: This paper describes a linear space data structure with $O\big((j-i)^{1/2}\log^{\epsilon}(j-i)\big)$ query time, where $\epsilon > 0$ is any constant, improving on the linear space and $O((j-i)\log\log n)$ query time solution by Amir et al.
14
References
A universal algorithm for sequential data compression
Jacob Ziv, A. Lempel
TL;DR: The compression ratio achieved by the proposed universal code uniformly approaches the lower bounds on the compression ratios attainable by block-to-variable codes and variable-to-block codes designed to match a completely specified source.
Linear pattern matching algorithms
Peter Weiner
- 15 Oct 1973
TL;DR: A linear time algorithm for obtaining a compacted version of a bi-tree associated with a given string is presented, and it is indicated how to solve several pattern matching problems, including some from [4], in linear time.
2.1K
A Space-Economical Suffix Tree Construction Algorithm
TL;DR: A new algorithm is presented for constructing auxiliary digital search trees to aid in exact-match substring searching that has the same asymptotic running time bound as previously published algorithms, but is more economical in space.
1.7K
On-line construction of suffix trees
TL;DR: An on-line algorithm is presented for constructing the suffix tree for a given string in time linear in the length of the string, developed as a linear-time version of a very simple algorithm for (quadratic size) suffix tries.
1.6K
Fast algorithms for finding nearest common ancestors
Dov Harel, Robert E. Tarjan
TL;DR: An algorithm is presented for a random access machine with uniform cost measure (and a bound of $\Omega (\log n)$ on the number of bits per word) that requires $O(1)$ time per query and $O(n)$ preprocessing time, assuming that the collection of trees is static.
1.3K