Solomonoff's theory of inductive inference proves that, under its common sense assumptions (axioms), the best possible scientific model is the shortest algorithm that generates the empirical data under consideration. In addition to the choice of data, other assumptions are that, to avoid the post-hoc fallacy, the programming language must be chosen prior to the data and that the environment being observed is generated by an unknown algorithm. This is also called a theory of
induction. Due to its basis in the dynamical (
state-space model) character of
Algorithmic Information Theory, it encompasses
''statistical'' as well as dynamical information criteria for model selection. It was introduced by
Ray Solomonoff, based on
probability theory
Probability theory or probability calculus is the branch of mathematics concerned with probability. Although there are several different probability interpretations, probability theory treats the concept in a rigorous mathematical manner by expre ...
and
theoretical computer science
Theoretical computer science is a subfield of computer science and mathematics that focuses on the Abstraction, abstract and mathematical foundations of computation.
It is difficult to circumscribe the theoretical areas precisely. The Associati ...
.
In essence, Solomonoff's induction derives the
posterior probability
The posterior probability is a type of conditional probability that results from updating the prior probability with information summarized by the likelihood via an application of Bayes' rule. From an epistemological perspective, the posteri ...
of any
computable
Computability is the ability to solve a problem by an effective procedure. It is a key topic of the field of computability theory within mathematical logic and the theory of computation within computer science. The computability of a problem is cl ...
theory, given a sequence of observed data. This posterior probability is derived from
Bayes' rule and some ''universal'' prior, that is, a prior that assigns a positive probability to any computable theory.
Solomonoff proved that this induction is
incomputable (or more precisely, lower semi-computable), but noted that "this incomputability is of a very benign kind", and that it "in no way inhibits its use for practical prediction" (as it can be approximated from below more accurately with more computational resources).
It is only "incomputable" in the benign sense that no scientific consensus is able to prove that the best current
scientific theory
A scientific theory is an explanation of an aspect of the universe, natural world that can be or that has been reproducibility, repeatedly tested and has corroborating evidence in accordance with the scientific method, using accepted protocol (s ...
is the best of all possible theories. However, Solomonoff's theory does provide an objective criterion for deciding among the current scientific theories explaining a given set of observations.
Solomonoff's induction naturally formalizes
Occam's razor
In philosophy, Occam's razor (also spelled Ockham's razor or Ocham's razor; ) is the problem-solving principle that recommends searching for explanations constructed with the smallest possible set of elements. It is also known as the principle o ...
[JJ McCall. Induction: From Kolmogorov and Solomonoff to De Finetti and Back to Kolmogorov – Metroeconomica, 2004 – Wiley Online Library.][D Stork. Foundations of Occam's razor and parsimony in learning from ricoh.com – NIPS 2001 Workshop, 2001][A.N. Soklakov. Occam's razor as a formal basis for a physical theor]
from arxiv.org
– Foundations of Physics Letters, 2002 – Springer[M Hutter. On the existence and convergence of computable universal prior]
arxiv.org
– Algorithmic Learning Theory, 2003 – Springer by assigning larger prior credences to theories that require a shorter algorithmic description.
Origin
Philosophical
The theory is based in philosophical foundations, and was founded by
Ray Solomonoff around 1960. It is a mathematically formalized combination of
Occam's razor
In philosophy, Occam's razor (also spelled Ockham's razor or Ocham's razor; ) is the problem-solving principle that recommends searching for explanations constructed with the smallest possible set of elements. It is also known as the principle o ...
and the
Principle of Multiple Explanations
Epicurus (, ; ; 341–270 BC) was an Greek philosophy, ancient Greek philosopher who founded Epicureanism, a highly influential school of philosophy that asserted that philosophy's purpose is to attain as well as to help others attain tranqui ...
.
[Ming Li and Paul Vitanyi, ''An Introduction to Kolmogorov Complexity and Its Applications.'' Springer-Verlag, N.Y., 2008p 339 ff.] All
computable
Computability is the ability to solve a problem by an effective procedure. It is a key topic of the field of computability theory within mathematical logic and the theory of computation within computer science. The computability of a problem is cl ...
theories which perfectly describe previous observations are used to calculate the probability of the next observation, with more weight put on the shorter computable theories.
Marcus Hutter's
universal artificial intelligence builds upon this to calculate the
expected value
In probability theory, the expected value (also called expectation, expectancy, expectation operator, mathematical expectation, mean, expectation value, or first Moment (mathematics), moment) is a generalization of the weighted average. Informa ...
of an action.
Principle
Solomonoff's induction has been argued to be the computational formalization of pure
Bayesianism.
To understand, recall that Bayesianism derives the posterior probability