
In
mathematics
Mathematics is a field of study that discovers and organizes methods, Mathematical theory, theories and theorems that are developed and Mathematical proof, proved for the needs of empirical sciences and mathematics itself. There are many ar ...
, the Kolakoski sequence, sometimes also known as the Oldenburger–Kolakoski sequence,
is an
infinite sequence
In mathematics, a sequence is an enumerated collection of objects in which repetitions are allowed and order matters. Like a set, it contains members (also called ''elements'', or ''terms''). The number of elements (possibly infinite) is call ...
of symbols that is the sequence of run lengths in its own
run-length encoding
Run-length encoding (RLE) is a form of lossless data compression in which ''runs'' of data (consecutive occurrences of the same data value) are stored as a single occurrence of that data value and a count of its consecutive occurrences, rather th ...
.
It is named after the
recreational mathematician William Kolakoski (1944–97) who described it in 1965,
but it was previously discussed by
Rufus Oldenburger in 1939.
Definition

The initial terms of the Kolakoski sequence are:
:1,2,2,1,1,2,1,2,2,1,2,2,1,1,2,1,1,2,2,1,2,1,1,2,1,2,2,1,1,...
Each symbol occurs in a "run" (a sequence of equal elements) of either one or two consecutive terms, and writing down the lengths of these runs gives exactly the same sequence:
:1,
2,2,
1,1,2,1,
2,2,1,
2,2,
1,1,2,
1,1,
2,2,1,2,
1,1,2,1,
2,2,
1,1,2,
1,1,2,1,
2,2,1,
2,2,
1,1,2,1,
2,2,...
:1, 2 , 2 ,1,1, 2 ,1, 2 , 2 ,1, 2 , 2 ,1,1, 2 ,1,1, 2 , 2 ,1, 2 ,1,1, 2 ,1, 2 , 2 ,1,1, 2 ,...
The description of the Kolakoski sequence is therefore reversible. If ''K'' stands for "the Kolakoski sequence", description #1 logically implies description #2 (and vice versa):
:1. The terms of ''K'' are generated by the runs (i.e., run-lengths) of ''K''
:2. The runs of ''K'' are generated by the terms of ''K''
Accordingly, one can say that each term of the Kolakoski sequence generates a run of one or two future terms. The first 1 of the sequence generates a run of "1", i.e. itself; the first 2 generates a run of "22", which includes itself; the second 2 generates a run of "11"; and so on. Each number in the sequence is the ''length'' of the next run to be generated, and the ''element'' to be generated alternates between 1 and 2:
:1,2 (length of sequence ''l'' = 2; sum of terms ''s'' = 3)
:1,2,2 (''l'' = 3, ''s'' = 5)
:1,2,2,1,1 (''l'' = 5, ''s'' = 7)
:1,2,2,1,1,2,1 (''l'' = 7, ''s'' = 10)
:1,2,2,1,1,2,1,2,2,1 (''l'' = 10, ''s'' = 15)
:1,2,2,1,1,2,1,2,2,1,2,2,1,1,2 (''l'' = 15, ''s'' = 23)
As can be seen, the length of the sequence at each stage is equal to the sum of terms in the previous stage. This animation illustrates the process:

These self-generating properties, which remain if the sequence is written without the initial 1, mean that the Kolakoski sequence can be described as a
fractal
In mathematics, a fractal is a Shape, geometric shape containing detailed structure at arbitrarily small scales, usually having a fractal dimension strictly exceeding the topological dimension. Many fractals appear similar at various scale ...
, or mathematical object that encodes its own representation on other scales.
Bertran Steinsky has created a recursive formula for the ''i''-th term of the sequence.
Research
Aperiodicity and cube-freeness
The sequence is not eventually
periodic, that is, its terms do not have a general repeating pattern (cf.
irrational number
In mathematics, the irrational numbers are all the real numbers that are not rational numbers. That is, irrational numbers cannot be expressed as the ratio of two integers. When the ratio of lengths of two line segments is an irrational number, ...
s like
π and
). More generally, the sequence is cube-free, i.e., has no substring of the form
with
some nonempty finite string.
Density
It seems plausible that the density of 1s in the Kolakoski -sequence is 1/2, but this conjecture remains unproved.
Václav Chvátal
Václav (Vašek) Chvátal () is a Professor Emeritus in the Department of Computer Science and Software Engineering at Concordia University in Montreal, Quebec, Canada, and a visiting professor at Charles University in Prague. He has published ex ...
has proved that the upper density of 1s is less than 0.50084.
Nilsson has used the same method with far greater computational power to obtain the bound 0.500080.
Although calculations of the first 3×10
8 values of the sequence appeared to show its density converging to a value slightly different from 1/2,
later calculations that extended the sequence to its first 10
13 values show the deviation from a density of 1/2 growing smaller, as one would expect if the limiting density actually is 1/2.
Connection with tag systems
The Kolakoski sequence can also be described as the result of a simple cyclic
tag system
In the theory of computation, a tag system is a deterministic model of computation published by Emil Leon Post in 1943 as a simple form of a Post canonical system. A tag system may also be viewed as an abstract machine, called a Post tag machine ( ...
. However, as this system is a 2-tag system rather than a 1-tag system (that is, it replaces pairs of symbols by other sequences of symbols, rather than operating on a single symbol at a time) it lies in the region of parameters for which tag systems are
Turing complete
Alan Mathison Turing (; 23 June 1912 – 7 June 1954) was an English mathematician, computer scientist, logician, cryptanalyst, philosopher and theoretical biologist. He was highly influential in the development of theoretical comput ...
, making it difficult to use this representation to reason about the sequence.
Algorithms
The Kolakoski sequence may be generated by an
algorithm
In mathematics and computer science, an algorithm () is a finite sequence of Rigour#Mathematics, mathematically rigorous instructions, typically used to solve a class of specific Computational problem, problems or to perform a computation. Algo ...
that, in the ''i''-th iteration, reads the value ''x''
''i'' that has already been output as the ''i''-th value of the sequence (or, if no such value has been output yet, sets ''x''
''i'' = ''i''). Then, if ''i'' is odd, it outputs ''x''
''i'' copies of the number 1, while if ''i'' is even, it outputs ''x''
''i'' copies of the number 2.
Thus, the first few steps of the algorithm are:
#The first value has not yet been output, so set ''x''
1 = 1, and output 1 copy of the number 1
#The second value has not yet been output, so set ''x''
2 = 2, and output 2 copies of the number 2
#The third value ''x''
3 was output as 2 in the second step, so output 2 copies of the number 1.
#The fourth value ''x''
4 was output as 1 in the third step, so output 1 copy of the number 2. Etc.
This algorithm takes
linear time
In theoretical computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations ...
, but because it needs to refer back to earlier positions in the sequence it needs to store the whole sequence, taking linear space. An alternative algorithm that generates multiple copies of the sequence at different speeds, with each copy of the sequence using the output of the previous copy to determine what to do at each step, can be used to generate the sequence in linear time and only
logarithmic space.
See also
*
Golomb sequence In mathematics, the Golomb sequence, named after Solomon W. Golomb (but also called Silverman's sequence), is a monotonically increasing integer sequence where ''an'' is the number of times that ''n'' occurs in the sequence, starting with ''a''1 = ...
— another self-generating sequence based on run-length
*
Gijswijt's sequence
*
Look-and-say sequence
In mathematics, the look-and-say sequence is the integer sequence, sequence of integers beginning as follows:
: 1, 11, 21, 1211, 111221, 312211, 13112221, 1113213211, 31131211131221, ... .
To generate a member of the sequence from the previous ...
Notes
Further reading
*
*
*
*
*
*
*
External links
*
Kolakoski Constant to 25000 digits as computed by Olivier Gerard in April 1998* {{cbignore
Integer sequences
Parity (mathematics)
Fractals