An analysis of the Burrows-Wheeler transform
- 1 May 2001
- journal article
- research article
- Published by Association for Computing Machinery (ACM) in Journal of the ACM
- Vol. 48 (3), 407-430
- https://doi.org/10.1145/382780.382782
Abstract
The Burrows-Wheeler Transform (also known as Block-Sorting) is at the base of compression algorithms that are the state of the art in lossless data compression. In this paper, we analyze two algorithms that use this technique. The first one is the original algorithm described by Burrows and Wheeler, which, despite its simplicity, outperforms the Gzip compressor. The second one uses an additional run-length encoding step to improve compression. We prove that the compression ratio of both algorithms can be bounded in terms of the kth-order empirical entropy of the input string for any k ≥ 0. We make no assumptions on the input, and we obtain bounds that hold in the worst case, that is, for every possible input string. All previous results for Block-Sorting algorithms were concerned with the average compression ratio and were established under the assumption that the input comes from a finite-order Markov source.
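The two notions the abstract relies on, the transform itself and the kth-order empirical entropy, can be illustrated with a minimal Python sketch. It is not the paper's algorithm or analysis. The `$` end-of-string sentinel, the function names `bwt` and `empirical_entropy`, and the naive sort over all rotations are assumptions made for illustration. The entropy function uses the usual definition: for each length-k context, take the symbols that follow it in the string, compute their zeroth-order entropy, and average, weighted by count, over the whole string.

```python
from collections import Counter, defaultdict
from math import log2


def bwt(s: str, sentinel: str = "$") -> str:
    """Naive Burrows-Wheeler transform: last column of the sorted rotations.

    Assumption: `sentinel` does not occur in `s` and sorts before every
    other symbol. Sorting all rotations is quadratic in space and is meant
    only for small illustrative inputs.
    """
    if sentinel in s:
        raise ValueError("sentinel must not occur in the input")
    t = s + sentinel
    rotations = sorted(t[i:] + t[:i] for i in range(len(t)))
    return "".join(rotation[-1] for rotation in rotations)


def empirical_entropy(s: str, k: int = 0) -> float:
    """k-th order empirical entropy of `s`, in bits per symbol."""
    if not s:
        return 0.0
    # For every length-k context, count the symbols that follow it in s.
    followers = defaultdict(Counter)
    for i in range(k, len(s)):
        followers[s[i - k:i]][s[i]] += 1
    total_bits = 0.0
    for counts in followers.values():
        m = sum(counts.values())
        total_bits -= sum(c * log2(c / m) for c in counts.values())
    return total_bits / len(s)


print(bwt("banana"))                 # annb$aa
print(empirical_entropy("abab", 0))  # 1.0
print(empirical_entropy("abab", 1))  # 0.0
```

In the `banana` output, equal symbols end up next to each other because rotations that begin with the same context sort together. In the `abab` example, the zeroth-order entropy is 1 bit per symbol, but it drops to 0 once one symbol of context is taken into account.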
This publication has 11 references indexed in Scilit:
- Compression of Low Entropy Strings with Lempel-Ziv Algorithms. SIAM Journal on Computing, 2000
- Unbounded Length Contexts for PPM. The Computer Journal, 1997
- The Burrows-Wheeler Transform for Block Sorting Text Compression: Principles and Improvements. The Computer Journal, 1996
- Analysis of arithmetic coding for data compression. Information Processing & Management, 1992
- Practical Implementations of Arithmetic Coding. Published by Springer Nature, 1992
- Implementing the PPM data compression scheme. IEEE Transactions on Communications, 1990
- Data Compression Using Dynamic Markov Modelling. The Computer Journal, 1987
- Design and analysis of dynamic Huffman codes. Journal of the ACM, 1987
- A locally adaptive data compression scheme. Communications of the ACM, 1986
- A Method for the Construction of Minimum-Redundancy Codes. Proceedings of the IRE, 1952