
Saturday, January 3, 2009

DOWNLOAD Compression Algorithms for Real Programmers

Compression Algorithms for Real Programmers

The science of compressing data is the art of creating shorthand representations for the data, that is, automatically finding abbreviations; i.e. yadda yadda yadda, etc. All of the algorithms can be described with a simple phrase: Look for repetition, and replace the repetition with a shorter representation. This repetition is usually fairly easy to find. The letters “rep” are repeated eight times in this paragraph alone. If they were replaced with, say, the asterisk character (*), then two characters would be saved eight times. It's not much, but it's a start. The algorithms succeed when they have a good model for the underlying data. They can even fail when the model does a bad job of matching the data. The model of looking for three letters like “rep” works well in some sentences, but it fails in others. The art of designing the algorithm is really the art of finding a good model of the data that can also be fit to the data efficiently. The algorithms in this book are different attempts to find a good, automatic way of identifying repetitive patterns and removing them from a file. Some work well on text data, while others are tuned to images or audio files. All of them, however, are far from perfect. If an algorithm has a strength, then it will also have a weakness. The best algorithm for some data is often the worst for other types of data. To paraphrase Abraham Lincoln: You can compress all of the types of files some of the time and some of the types of files all of the time, but you can't compress all of the types of files all of the time.
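The “rep” substitution described above can be tried out in a few lines. What follows is only a minimal sketch in Python, not code from the book: the function names `substitute` and `restore`, the sample sentences, and the rule that the marker must not already occur in the text are all assumptions made for illustration.

```python
def substitute(text: str, pattern: str = "rep", marker: str = "*") -> str:
    """Replace every occurrence of `pattern` with the one-character `marker`."""
    # Assumption: the marker must be absent from the input, otherwise the
    # substitution could not be undone unambiguously.
    if marker in text:
        raise ValueError("marker already occurs in the text; pick another one")
    return text.replace(pattern, marker)


def restore(packed: str, pattern: str = "rep", marker: str = "*") -> str:
    """Undo `substitute` by expanding each marker back into `pattern`."""
    return packed.replace(marker, pattern)


# A sentence where the "rep" model matches the data well.
sample = "Look for repetition, and replace the repetition with a shorter representation."
packed = substitute(sample)
saved = len(sample) - len(packed)  # two characters saved per occurrence

# A sentence where the same model finds nothing to abbreviate.
other = "Some work well on text data."
other_saved = len(other) - len(substitute(other))

print(packed)
print("saved:", saved, "characters; other sentence saved:", other_saved)
```

The first sentence shrinks by two characters for each “rep” it contains, while the second gains nothing at all, which is the sense in which a fixed model works well on some data and fails on other data.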

