Improving PPM Algorithm Using Dictionaries

doi:10.1109/DCC.2011.63

Open Access • Proceedings Article • 10.1109/DCC.2011.63

Yichuan Hu, +3 more

- 29 Mar 2011

- pp 459-459

3

TL;DR: This paper proposes a character-based PPM text compression algorithm for natural languages in which nonwords and prefixes of words are encoded using character-based context models, and suffixes of words are encoded using dictionary models.
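
The split described in the summary above can be illustrated with a minimal sketch. It covers only the segmentation step, that is, deciding which model would code each piece of text, and performs no actual PPM or arithmetic coding. The fixed prefix length, the prefix-keyed dictionary layout, the escape event, and the names `PREFIX_LEN`, `build_suffix_dictionary` and `segment` are all assumptions made for illustration; none of them come from the paper.

```python
import re

PREFIX_LEN = 2  # assumed prefix length; the source does not state one

# A word is a run of ASCII letters; everything else is a nonword (assumption).
TOKEN = re.compile(r"([A-Za-z]+)|([^A-Za-z]+)")


def build_suffix_dictionary(words, prefix_len=PREFIX_LEN):
    """Map each word prefix to the sorted list of suffixes seen after it."""
    table = {}
    for word in words:
        if len(word) > prefix_len:
            table.setdefault(word[:prefix_len], set()).add(word[prefix_len:])
    return {prefix: sorted(suffixes) for prefix, suffixes in table.items()}


def segment(text, dictionary, prefix_len=PREFIX_LEN):
    """Tag each piece of text with the model that would encode it.

    "char"   -> character-based context model (nonwords and word prefixes)
    "dict"   -> dictionary model (index of the suffix among the candidates)
    "escape" -> suffix missing from the dictionary (hypothetical fallback)
    How a decoder learns that a short word has no suffix is left out.
    """
    events = []
    for match in TOKEN.finditer(text):
        word, nonword = match.groups()
        if nonword is not None:
            events.append(("char", nonword))
            continue
        prefix, suffix = word[:prefix_len], word[prefix_len:]
        events.append(("char", prefix))
        if not suffix:
            continue
        candidates = dictionary.get(prefix, [])
        if suffix in candidates:
            # One dictionary symbol stands for several characters at once.
            events.append(("dict", candidates.index(suffix)))
        else:
            events.append(("escape", suffix))
    return events


if __name__ == "__main__":
    vocabulary = ["compression", "computer", "context"]
    dictionary = build_suffix_dictionary(vocabulary)
    for event in segment("computer, context!", dictionary):
        print(event)
```

In this sketch the word "computer" becomes a two-character prefix event followed by a single dictionary event, which is the sense in which a dictionary model can encode multiple characters as a whole.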

Citations

•Posted Content

Improving PPM Algorithm Using Dictionaries

Yichuan Hu, +4 more

- 17 Dec 2010

- arXiv: Information Theory

TL;DR: This work proposes a method to improve the traditional character-based PPM text compression algorithm for natural languages by using dictionary models, which can encode multiple characters as a whole and thus enhance compression efficiency.

3

Journal Article•10.1016/J.CSI.2014.05.005

Rapid lossless compression of short text messages

Kenan Kalajdzic, +2 more

- 01 Jan 2015

- Computer Standards & Interfaces

TL;DR: b64pack is an efficient method for compressing short text messages; it is based on standards that facilitate easy deployment and interoperability, and it is faster than compress, gzip and bzip2 by orders of magnitude.

References

Journal Article•10.1109/TCOM.1984.1096090

Data Compression Using Adaptive Coding and Partial String Matching

John G. Cleary, +1 more

- 01 Apr 1984

- IEEE Transactions on Communications

TL;DR: This paper describes how the conflict can be resolved with partial string matching, and reports experimental results which show that mixed-case English text can be coded in as little as 2.2 bits/character with no prior knowledge of the source.

1.4K

•Journal Article•10.1109/26.61469

Implementing the PPM data compression scheme

Alistair Moffat

- 01 Nov 1990

- IEEE Transactions on Communications

TL;DR: It is shown that the estimates made by Cleary and Witten of the resources required to implement the PPM scheme can be revised to allow for a tractable and useful implementation.

477

Journal Article•10.1145/76894.76896

Modeling for text compression

Tim Bell, +2 more

- 01 Dec 1989

- ACM Computing Surveys

TL;DR: This paper surveys successful strategies for adaptive modeling that are suitable for use in practical text compression systems. The strategies fall into three main classes, the first being finite-context modeling, in which the last few characters are used to condition the probability distribution for the next one.

343

Proceedings Article•10.1109/DCC.2002.999958

PPM: one step to practicality

D. Shkarin

- 02 Apr 2002

TL;DR: This paper is devoted to a PPM algorithm implementation whose complexity is comparable with that of widespread practical compression schemes based on the LZ77, LZ78 and BWT algorithms.

202

Proceedings Article•10.1109/DCC.1996.488310

The entropy of English using PPM-based models

William J. Teahan, +1 more

- 31 Mar 1996

TL;DR: The importance of training text for PPM is demonstrated, showing that its performance can be improved by "adjusting" the alphabet used, and the results based on these improvements are given.

90