
Information geometry is an interdisciplinary field that applies the techniques of differential geometry to study probability theory and statistics. It studies statistical manifolds, which are Riemannian manifolds whose points correspond to probability distributions.
Introduction
Historically, information geometry can be traced back to the work of C. R. Rao, who was the first to treat the Fisher matrix as a Riemannian metric. The modern theory is largely due to Shun'ichi Amari, whose work has been greatly influential on the development of the field.
Classically, information geometry considered a parametrized statistical model as a Riemannian manifold, and more generally as a conjugate connection manifold, a statistical manifold, or a dually flat manifold. Unlike ordinary smooth manifolds equipped with a metric tensor and the Levi-Civita connection, these structures take into account conjugate connections, torsion, and the Amari-Chentsov metric. All of these geometric structures find application in information theory and machine learning. For such models, there is a natural choice of Riemannian metric, known as the Fisher information metric.
In the special case that the statistical model is an exponential family, it is possible to endow the statistical manifold with a Hessian metric (i.e., a Riemannian metric given by the Hessian of a convex potential function). In this case, the manifold naturally inherits two flat affine connections, as well as a canonical Bregman divergence. Historically, much of the work was devoted to studying the associated geometry of these examples.
In the modern setting, information geometry applies to a much wider context, including non-exponential families, nonparametric statistics, and even abstract statistical manifolds not induced from a known statistical model. The results combine techniques from information theory, affine differential geometry, convex analysis and many other fields. Some of the most promising applications of information geometry are in machine learning, for example the development of information-geometric optimization methods such as mirror descent and natural gradient descent.
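As a rough illustration of the exponential-family case described above, the following sketch uses the Bernoulli family written in its natural parameter. The choice of family and all function names (`log_partition`, `fisher_information`, `bregman`, `natural_gradient_step`) are assumptions made for this example and are not taken from the references. For this family the Hessian of the convex potential coincides with the Fisher information, the Bregman divergence of the potential coincides with a Kullback–Leibler divergence, and a natural gradient step rescales the ordinary gradient by the inverse of the metric.

```python
import math


def log_partition(theta):
    """Convex potential psi(theta) = log(1 + exp(theta)) of the Bernoulli family."""
    return math.log1p(math.exp(theta))


def mean(theta):
    """First derivative psi'(theta): the expectation parameter P(x = 1)."""
    return 1.0 / (1.0 + math.exp(-theta))


def fisher_information(theta):
    """Second derivative psi''(theta): the Hessian metric in the natural parameter."""
    mu = mean(theta)
    return mu * (1.0 - mu)


def bregman(theta_a, theta_b):
    """Bregman divergence of psi between two natural parameters."""
    return (log_partition(theta_a) - log_partition(theta_b)
            - mean(theta_b) * (theta_a - theta_b))


def kl_bernoulli(p, q):
    """Kullback-Leibler divergence KL(Bernoulli(p) || Bernoulli(q))."""
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def natural_gradient_step(theta, sample_mean, lr=0.5):
    """One natural gradient step on the average negative log-likelihood.

    The ordinary gradient is mean(theta) - sample_mean; dividing by the
    Fisher information turns it into the natural gradient.
    """
    grad = mean(theta) - sample_mean
    return theta - lr * grad / fisher_information(theta)


if __name__ == "__main__":
    # The Bregman divergence of psi matches a KL divergence with swapped arguments.
    a, b = 0.3, -1.2
    print(bregman(a, b), kl_bernoulli(mean(b), mean(a)))

    # Fit the natural parameter to data whose sample mean is 0.8.
    theta = 0.0
    for _ in range(50):
        theta = natural_gradient_step(theta, sample_mean=0.8)
    print(theta, math.log(0.8 / 0.2))
```

In one dimension the natural gradient step reduces to a damped Newton step on the negative log-likelihood, which is why the iteration settles on the logit of the sample mean. In higher-dimensional models the division would be replaced by solving a linear system with the Fisher information matrix.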
The standard references in the field are Shun'ichi Amari and Hiroshi Nagaoka's book, ''Methods of Information Geometry'', and the more recent book by Nihat Ay and others. A gentle introduction is given in the survey by Frank Nielsen. In 2018, the journal ''Information Geometry'', which is devoted to the field, was launched.
Contributors
The history of information geometry is associated with the discoveries of at least the following people, and many others.
* Ronald Fisher
* Harald Cramér
* Calyampudi Radhakrishna Rao
* Harold Jeffreys
* Solomon Kullback
* Jean-Louis Koszul
* Richard Leibler
* Claude Shannon
* Imre Csiszár
* Nikolai Chentsov (also written as N. N. Čencov)
* Bradley Efron
* Shun'ichi Amari
* Ole Barndorff-Nielsen
* Frank Nielsen
* Damiano Brigo
* A. W. F. Edwards
* Grant Hillier
* Kees Jan van Garderen
Applications
As an interdisciplinary field, information geometry has been used in various applications.
Here is an incomplete list:
* Statistical inference
* Time series and linear systems
* Filtering problem
* Quantum systems
* Neural networks
* Machine learning
* Statistical mechanics
* Biology
* Statistics
* Mathematical finance
See also
* Ruppeiner geometry
* Kullback–Leibler divergence
* Stochastic geometry
* Stochastic differential geometry
* Projection filters
External links
Information Geometry journal by Springer
Information Geometry overview by Cosma Rohilla Shalizi, July 2010
Information Geometry notes by John Baez, November 2012
Information geometry for neural networks (pdf) by Daniel Wagenaar