An incompressible string is a string with Kolmogorov complexity equal to its length, so that it has no shorter encodings.
[V. Chandru and M. R. Rao, Algorithms and Theory of Computation Handbook, CRC Press, 1999, pp. 29-30.]
Example
Suppose we have the string 12349999123499991234, and we are using a compression method that works by putting a special character into the string (say '@') followed by a value that points to an entry in a lookup table (or dictionary) of repeating values. Let's imagine we have an algorithm that examines the string in 4-character chunks. Looking at our string, our algorithm might pick out the values 1234 and 9999 to place into its dictionary. Let's say 1234 is entry 0 and 9999 is entry 1. Now the string can become:
@0@1@0@1@0
This is half the length of the original (10 characters instead of 20), although storing the dictionary itself will cost some space. The more repeats there are in the string, the better the compression will be.
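The 4-character chunk scheme described above can be sketched in Python. This is only an illustration under stated assumptions: the function name `compress_fixed_chunks` is hypothetical, the input is assumed not to contain the '@' marker, and the dictionary is assumed to hold fewer than ten entries so that every placeholder is exactly 2 characters long.

```python
from collections import Counter


def compress_fixed_chunks(text, chunk_size=4, marker="@"):
    """Replace repeated fixed-size chunks with marker + dictionary index.

    Hypothetical helper: assumes `marker` does not occur in `text` and that
    fewer than ten dictionary entries are needed (single-digit indexes).
    """
    # Split the text into consecutive, non-overlapping chunks.
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    counts = Counter(chunks)

    # Only chunks that occur more than once earn a dictionary entry,
    # numbered in order of first appearance.
    dictionary = []
    for chunk in chunks:
        if counts[chunk] > 1 and chunk not in dictionary:
            dictionary.append(chunk)

    # Emit a placeholder for dictionary chunks and the raw chunk otherwise.
    encoded = "".join(
        f"{marker}{dictionary.index(chunk)}" if chunk in dictionary else chunk
        for chunk in chunks
    )
    return encoded, dictionary


encoded, dictionary = compress_fixed_chunks("12349999123499991234")
print(encoded, dictionary)  # @0@1@0@1@0 ['1234', '9999']
```

The function returns the dictionary alongside the encoded string because a decoder would need both, which is the storage cost mentioned above.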
Our algorithm can do better, though, if it can view the string in chunks larger than 4 characters. Then it can put 12349999 and 1234 into the dictionary as entries 0 and 1, giving us:
@0@0@1
Even shorter. Now consider another string:
1234999988884321
This string is incompressible by our algorithm. The only repeats that occur are 88 and 99. If we were to store 88 as entry 0 and 99 as entry 1 in our dictionary, we would produce:
1234@1@1@0@04321
Unfortunately, this is just as long as the original string (16 characters), because our placeholders for items in the dictionary are 2 characters long, and the items they replace are the same length. Hence, no space is saved by our algorithm.
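The length comparison in this example can be checked with a short Python sketch. It assumes the dictionary is supplied by hand rather than discovered by the algorithm, uses a hypothetical function name `encode_with_dictionary`, and relies on single-digit indexes so that each placeholder occupies exactly 2 characters.

```python
def encode_with_dictionary(text, dictionary, marker="@"):
    """Greedily replace dictionary entries in text with marker + index.

    Hypothetical helper: assumes `marker` does not occur in `text` and that
    the dictionary has fewer than ten entries.
    """
    # Try longer entries first so "12349999" wins over its prefix "1234".
    by_length = sorted(range(len(dictionary)), key=lambda i: -len(dictionary[i]))
    out = []
    pos = 0
    while pos < len(text):
        for index in by_length:
            entry = dictionary[index]
            if text.startswith(entry, pos):
                out.append(f"{marker}{index}")
                pos += len(entry)
                break
        else:
            # No entry matches here: copy one character through unchanged.
            out.append(text[pos])
            pos += 1
    return "".join(out)


examples = [
    ("12349999123499991234", ["12349999", "1234"]),
    ("1234999988884321", ["88", "99"]),
]
for original, dictionary in examples:
    encoded = encode_with_dictionary(original, dictionary)
    print(original, "->", encoded, f"({len(original)} vs {len(encoded)} characters)")
# 12349999123499991234 -> @0@0@1 (20 vs 6 characters)
# 1234999988884321 -> 1234@1@1@0@04321 (16 vs 16 characters)
```

A 2-character placeholder only saves space when the entry it replaces is longer than 2 characters, which is why the second string gains nothing from this scheme.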