Skip to Main Content

Digital Revolution (II) - Compression Codes and Technologies

Compression methods and their applications

Feature Column Archive


6. Compression methods and their applications

Although Huffman's approach to data compression is very appealing it suffers from the problem that it requires one to have all of the data to be compressed in hand in order to get the relative frequencies of items to be compressed. In many situations it is not practical to do this. Systems of compression which are more flexible were desirable. Using other remarkably simple ideas Jakob Ziv, Abraham Lempel and Terry Welch developed other very valuable compression schemes. Ziv-Lempel actually developed two different compression schemes, each interesting in its own right. One of these was modified by Terry Welch into a widely used standard for compression, now known as LZW (Lempel, Ziv and Welch). In fact, the two papers of Ziv and Lempel spawned a whole family of compression schemes that are incorporated in a great variety of applications ranging from modems and archivers to databases.

Very briefly, the ideas involved in the Ziv, Lempel, Welch family of compression algorithms involve the use of either of the following steps:

1. A dynamical dictionary of strings with associated code words is constructed. When a string appears later in the material to be compressed, instead of recording the long string, the shorter code word associated with the string is used.


2. As the information to be compressed is processed, if a sufficiently long string appears which has occurred earlier, instead of repeating this string, instructions are given as to where in the text the earlier occurrence of the string begins and how many characters to repeat starting at this location.

A number of additional clever ideas have been developed as compression schemes and, again, are widely applied. Here are some of the major families of such compression schemes and very brief attempts to describe the ideas behind them:

Run Length Encoding (RLE)

This system uses the idea that when a very long string of identical symbols, say, H appear, one can replace this long string by saying in a way that can be distinguished from the text symbols themselves, H appears 1000 times.

Run length encoding is especially valuable in the compression of images and changing images. It lies behind many of the methods being used to compress video information.

Arithmetic coding

This scheme, for which IBM holds many of the patents, involves associating a real number with a text in a clever manner. Instead of using code words for pieces of a long string to be compressed, this method creates a long string, but one that is shorter than the original very long string. Typically two passes are involved where, during the first pass, relative frequencies for the original string are compiled.

Fractal compression (IFS, Iterated function systems)

In this method of compression one reconstructs, say, an image, by dividing it into pieces, and each piece is restored by displaying the result of iterating a function on some smaller object.


Wavelet compression techniques build on new ideas which use specialized functions to sample continuous functions.

Tailoring the different compression methods to the many rapidly changing environments in which compression is of potential value has led to applications of data compression to such diverse areas as:

*satellite imagery

* mini discs

* MP3 technology

* fax

* digital cameras

* DVD technology

* modems

* wireless telephony

* database design

* storage and transmission of CT and MRI scans

* mammography

* digital images, high definition television (HDTV), and video games

to mention a few!!

Without mathematics there would be no really effective compression systems, and without effective compression, many of the wonderful technologies we take for granted, and new technologies to be developed in the future, would not be possible. Truly, mathematics is a hidden partner of these technologies. Furthermore, the range of mathematical topics drawn on in compression ideas is staggering. Ideas from linear algebra, probability and statistics, graph theory, automata, calculus, Fourier series and analysis, and abstract algebra all come into play.



  1. Introduction
  2. Information and compression
  3. Encoding and decoding
  4. Lossy and non-lossy compression
  5. Huffman codes
  6. Compression methods and their applications
  7. Compression and intellectual property
  8. References

© Copyright 2003, American MathematicalSociety
Privacy Statement
  Search the AMS     Powered by Google