In
statistics
Statistics (from German language, German: ''wikt:Statistik#German, Statistik'', "description of a State (polity), state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of ...
and
computational geometry
Computational geometry is a branch of computer science devoted to the study of algorithms which can be stated in terms of geometry. Some purely geometrical problems arise out of the study of computational geometric algorithms, and such problems ar ...
, the notion of centerpoint is a generalization of the
median
In statistics and probability theory, the median is the value separating the higher half from the lower half of a data sample, a population, or a probability distribution. For a data set, it may be thought of as "the middle" value. The basic fe ...
to data in higher-dimensional
Euclidean space. Given a set of points in ''d''-dimensional space, a centerpoint of the set is a point such that any hyperplane that goes through that point divides the set of points in two roughly equal subsets: the smaller part should have at least a 1/(''d'' + 1) fraction of the points. Like the median, a centerpoint need not be one of the data points. Every non-empty set of points (with no duplicates) has at least one centerpoint.
Related concepts
Closely related concepts are the Tukey depth of a point (the minimum number of sample points on one side of a hyperplane through the point) and a Tukey median of a point set (a point maximizing the Tukey depth). A centerpoint is a point of depth at least ''n''/(''d'' + 1), and a Tukey median must be a centerpoint, but not every centerpoint is a Tukey median. Both terms are named after
John Tukey
John Wilder Tukey (; June 16, 1915 – July 26, 2000) was an American mathematician and statistician, best known for the development of the fast Fourier Transform (FFT) algorithm and box plot. The Tukey range test, the Tukey lambda distributi ...
.
For a different generalization of the median to higher dimensions, see
geometric median.
Existence
A simple proof of the existence of a centerpoint may be obtained using
Helly's theorem. Suppose there are ''n'' points, and consider the family of closed
half-spaces that contain more than ''dn''/(''d'' + 1) of the points. Fewer than ''n''/(''d'' + 1) points are excluded from any one of these halfspaces, so the intersection of any subset of ''d'' + 1 of these halfspaces must be nonempty. By Helly's theorem, it follows that the intersection of all of these halfspaces must also be nonempty. Any point in this intersection is necessarily a centerpoint.
Algorithms
For points in the
Euclidean plane
In mathematics, the Euclidean plane is a Euclidean space of dimension two. That is, a geometric setting in which two real quantities are required to determine the position of each point ( element of the plane), which includes affine notions of ...
, a centerpoint may be constructed in
linear time
In computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by ...
. In any dimension ''d'', a Tukey median (and therefore also a centerpoint) may be constructed in time O(''n''
''d'' − 1 + ''n'' log ''n'').
A
randomized algorithm
A randomized algorithm is an algorithm that employs a degree of randomness as part of its logic or procedure. The algorithm typically uses uniformly random bits as an auxiliary input to guide its behavior, in the hope of achieving good performan ...
that repeatedly replaces sets of ''d'' + 2 points by their
Radon point can be used to compute an
approximation
An approximation is anything that is intentionally similar but not exactly equality (mathematics), equal to something else.
Etymology and usage
The word ''approximation'' is derived from Latin ''approximatus'', from ''proximus'' meaning ''very ...
to a centerpoint of any point set, in the sense that its Tukey depth is linear in the sample set size, in an amount of time that is polynomial in both the number of points and the dimension.
[.]
References
Citations
Sources
* .
* .
* .
* {{citation
, last1 = Jadhav , first1 = S.
, last2 = Mukhopadhyay , first2 = A.
, doi = 10.1007/BF02574382
, issue = 1
, journal =
Discrete and Computational Geometry
, pages = 291–312
, title = Computing a centerpoint of a finite planar set of points in linear time
, volume = 12
, year = 1994, doi-access = free
.
Euclidean geometry
Multi-dimensional geometry
Means