Algebraic statistics is the use of algebra to advance statistics. Algebra has been useful for experimental design, parameter estimation, and hypothesis testing. Traditionally, algebraic statistics has been associated with the design of experiments and multivariate analysis (especially time series). In recent years, the term "algebraic statistics" has sometimes been used in a narrower sense, to label the use of algebraic geometry and commutative algebra in statistics.


The tradition of algebraic statistics

In the past, statisticians have used algebra to advance research in statistics. Some of this work led to the development of new topics in algebra and combinatorics, such as association schemes.


Design of experiments

For example, Ronald A. Fisher, Henry B. Mann, and Rosemary A. Bailey applied Abelian groups to the design of experiments. Experimental designs were also studied with affine geometry over finite fields, and later with association schemes, which were introduced by R. C. Bose. Orthogonal arrays were introduced by C. R. Rao, also for experimental designs.
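
As a small illustration of how an Abelian group can produce an experimental design, the sketch below builds a Latin square from the addition table of the cyclic group Z_n. This is a standard textbook construction chosen for illustration, not necessarily the one used by the authors named above; the function names `cyclic_latin_square` and `is_latin_square` are hypothetical.

```python
def cyclic_latin_square(n):
    """Addition (Cayley) table of the cyclic group Z_n.

    Entry (i, j) is (i + j) mod n. Reading rows as blocks of one kind,
    columns as blocks of another kind, and entries as treatments gives
    an n x n Latin square design.
    """
    return [[(i + j) % n for j in range(n)] for i in range(n)]


def is_latin_square(square):
    """Check that every symbol 0..n-1 occurs once in each row and column."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(n)} == symbols for j in range(n))
    return rows_ok and cols_ok


if __name__ == "__main__":
    for row in cyclic_latin_square(4):
        print(row)
```

The Latin property follows from the group axioms: for fixed i, the map j -> i + j is a bijection of Z_n, so each treatment appears exactly once in every row, and by commutativity the same holds for every column.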


Algebraic analysis and abstract statistical inference

Invariant measures on locally compact groups have long been used in statistical theory, particularly in multivariate analysis. Beurling's factorization theorem and much of the work on (abstract) harmonic analysis sought a better understanding of the Wold decomposition of stationary stochastic processes, which is important in time series statistics. Encompassing previous results on probability theory on algebraic structures, Ulf Grenander developed a theory of "abstract inference". Grenander's abstract inference and his theory of patterns are useful for spatial statistics and image analysis; these theories rely on lattice theory.


Partially ordered sets and lattices

Partially ordered vector spaces and vector lattices are used throughout statistical theory. Garrett Birkhoff metrized the positive cone using Hilbert's projective metric and proved Jentzsch's theorem using the contraction mapping theorem. Birkhoff's results have been used for maximum entropy estimation (which can be viewed as linear programming in infinite dimensions) by Jonathan Borwein and colleagues. Vector lattices and conical measures were introduced into statistical decision theory by Lucien Le Cam.
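
The contraction property behind Birkhoff's argument can be checked numerically in the finite-dimensional case. The sketch below assumes the usual formula for Hilbert's projective metric on vectors with strictly positive entries, d(x, y) = log(max_i(x_i/y_i) / min_i(x_i/y_i)), and uses an arbitrary 2 x 2 matrix with positive entries as the linear map; the helper names and the example numbers are illustrative choices, not taken from the source.

```python
import math


def hilbert_metric(x, y):
    """Hilbert's projective metric between two strictly positive vectors."""
    ratios = [a / b for a, b in zip(x, y)]
    return math.log(max(ratios) / min(ratios))


def apply_matrix(matrix, x):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * v for a, v in zip(row, x)) for row in matrix]


# Illustrative data: a matrix with strictly positive entries and two
# strictly positive vectors.
A = [[2.0, 1.0], [1.0, 3.0]]
x = [1.0, 5.0]
y = [4.0, 1.0]

if __name__ == "__main__":
    print("distance before:", hilbert_metric(x, y))
    print("distance after: ", hilbert_metric(apply_matrix(A, x), apply_matrix(A, y)))
```

The metric is projective: rescaling either vector by a positive constant leaves the distance unchanged, so it measures the distance between rays in the positive cone. Because a strictly positive matrix shrinks this distance, repeated application drives all positive vectors toward a single ray, which is the finite-dimensional analogue of the eigenvector statement in Jentzsch's theorem.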


Recent work using commutative algebra and algebraic geometry

In recent years, the term "algebraic statistics" has been used more restrictively, to label the use of algebraic geometry and commutative algebra to study problems related to discrete random variables with finite state spaces. Commutative algebra and algebraic geometry have applications in statistics because many commonly used classes of discrete random variables can be viewed as algebraic varieties.


Introductory example

Consider a random variable ''X'' which can take on the values 0, 1, 2. Such a variable is completely characterized by the three probabilities
:p_i=\mathrm{Pr}(X=i),\quad i=0,1,2
and these numbers satisfy
:\sum_{i=0}^2 p_i = 1 \quad \mbox{and}\quad 0\leq p_i \leq 1.
Conversely, any three such numbers unambiguously specify a random variable, so we can identify the random variable ''X'' with the tuple (p_0,p_1,p_2)\in\R^3.

Now suppose ''X'' is a binomial random variable with parameter ''q'' and ''n = 2'', i.e. ''X'' represents the number of successes when repeating a certain experiment two times, where each experiment has an individual success probability of ''q''. Then
:p_i=\mathrm{Pr}(X=i)=\binom{2}{i} q^i (1-q)^{2-i}
so that p_0=(1-q)^2, p_1=2q(1-q) and p_2=q^2. It is not hard to show that the tuples (p_0,p_1,p_2) which arise in this way are precisely the ones satisfying
:4 p_0 p_2-p_1^2=0.
The latter is a polynomial equation defining an algebraic variety (or surface) in \R^3, and this variety, when intersected with the simplex given by
:\sum_{i=0}^2 p_i = 1 \quad \mbox{and}\quad 0\leq p_i \leq 1,
yields a piece of an algebraic curve which may be identified with the set of all binomial random variables with ''n = 2''. Determining the parameter ''q'' amounts to locating one point on this curve; testing the hypothesis that a given variable ''X'' is binomial amounts to testing whether a certain point lies on that curve or not.
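
The introductory example can be checked with exact rational arithmetic. The following is a minimal sketch, not a statistical test: it maps a parameter ''q'' to the tuple (p_0,p_1,p_2), checks the simplex conditions together with the equation 4 p_0 p_2 - p_1^2 = 0, and recovers ''q'' from a point on the curve using the identity p_1 + 2 p_2 = 2q. The function names are hypothetical.

```python
from fractions import Fraction


def binomial2_point(q):
    """Return (p0, p1, p2) for a binomial variable with n = 2 and parameter q."""
    q = Fraction(q)
    return ((1 - q) ** 2, 2 * q * (1 - q), q ** 2)


def on_binomial_curve(p):
    """Check that p lies in the probability simplex and on 4*p0*p2 - p1^2 = 0."""
    p0, p1, p2 = p
    in_simplex = p0 + p1 + p2 == 1 and all(0 <= x <= 1 for x in p)
    return in_simplex and 4 * p0 * p2 - p1 ** 2 == 0


def recover_q(p):
    """For a point on the curve, p1 + 2*p2 = 2*q, so q = (p1 + 2*p2) / 2."""
    _, p1, p2 = p
    return (p1 + 2 * p2) / 2


if __name__ == "__main__":
    point = binomial2_point(Fraction(1, 3))
    print(point, on_binomial_curve(point), recover_q(point))
    uniform = (Fraction(1, 3),) * 3
    print(uniform, on_binomial_curve(uniform))
```

Exact fractions are used so that membership in the variety is a true equality check. With observed frequencies instead of exact probabilities, the quantity 4 p_0 p_2 - p_1^2 would only be approximately zero, and deciding whether it is close enough is a hypothesis-testing question that this sketch does not address.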


Application of algebraic geometry to statistical learning theory

Algebraic geometry has also recently found applications to statistical learning theory, including a generalization of the Akaike information criterion to singular statistical models.


References

* R. A. Bailey. ''Association Schemes: Designed Experiments, Algebra and Combinatorics''. Cambridge University Press, Cambridge, 2004. 387 pp. (Chapters from preliminary draft are available on-line.)
* H. B. Mann. ''Analysis and Design of Experiments: Analysis of Variance and Analysis-of-Variance Designs''. Dover, 1949.
* L. Pachter and B. Sturmfels. ''Algebraic Statistics for Computational Biology''. Cambridge University Press, 2005.
* G. Pistone, E. Riccomagno, H. P. Wynn. ''Algebraic Statistics''. CRC Press, 2001.
* M. Drton, B. Sturmfels, S. Sullivant. ''Lectures on Algebraic Statistics''. Springer, 2009.
* S. Watanabe. ''Algebraic Geometry and Statistical Learning Theory''. Cambridge University Press, 2009.
* P. Gibilisco, E. Riccomagno, M.-P. Rogantin, H. P. Wynn. ''Algebraic and Geometric Methods in Statistics''. Cambridge, 2009.


External links


Algebraic Statistics

Journal of Algebraic Statistics

Archives of Journal of Algebraic Statistics