HOME

TheInfoList



OR:

In
statistics Statistics (from German language, German: ', "description of a State (polity), state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a s ...
and computational geometry, the notion of centerpoint is a generalization of the
median The median of a set of numbers is the value separating the higher half from the lower half of a Sample (statistics), data sample, a statistical population, population, or a probability distribution. For a data set, it may be thought of as the “ ...
to data in higher-dimensional
Euclidean space Euclidean space is the fundamental space of geometry, intended to represent physical space. Originally, in Euclid's ''Elements'', it was the three-dimensional space of Euclidean geometry, but in modern mathematics there are ''Euclidean spaces ...
. Given a set of points in ''d''-dimensional space, a centerpoint of the set is a point such that any hyperplane that goes through that point divides the set of points in two roughly equal subsets: the smaller part should have at least a 1/(''d'' + 1) fraction of the points. Like the median, a centerpoint need not be one of the data points. Every non-empty set of points (with no duplicates) has at least one centerpoint.


Related concepts

Closely related concepts are the
Tukey depth In statistics and computational geometry, the Tukey depth or half-space depth is a measure of the depth of a point in a fixed set of points. The concept is named after its inventor, John Tukey. Given a set of ''n'' points \mathcal_n = \ in ''d' ...
of a point (the minimum number of sample points on one side of a hyperplane through the point) and a Tukey median of a point set (a point maximizing the Tukey depth). A centerpoint is a point of depth at least ''n''/(''d'' + 1), and a Tukey median must be a centerpoint, but not every centerpoint is a Tukey median. Both terms are named after
John Tukey John Wilder Tukey (; June 16, 1915 – July 26, 2000) was an American mathematician and statistician, best known for the development of the fast Fourier Transform (FFT) algorithm and box plot. The Tukey range test, the Tukey lambda distributi ...
. For a different generalization of the median to higher dimensions, see
geometric median In geometry, the geometric median of a discrete point set in a Euclidean space is the point minimizing the sum of distances to the sample points. This generalizes the median, which has the property of minimizing the sum of distances or absolute ...
.


Existence

A simple proof of the existence of a centerpoint may be obtained using
Helly's theorem Helly's theorem is a basic result in discrete geometry on the intersection of convex sets. It was discovered by Eduard Helly in 1913,. but not published by him until 1923, by which time alternative proofs by and had already appeared. Helly's ...
. Suppose there are ''n'' points, and consider the family of closed half-spaces that contain more than ''dn''/(''d'' + 1) of the points. Fewer than ''n''/(''d'' + 1) points are excluded from any one of these halfspaces, so the intersection of any subset of ''d'' + 1 of these halfspaces must be nonempty. By Helly's theorem, it follows that the intersection of all of these halfspaces must also be nonempty. Any point in this intersection is necessarily a centerpoint.


Algorithms

For points in the
Euclidean plane In mathematics, a Euclidean plane is a Euclidean space of Two-dimensional space, dimension two, denoted \textbf^2 or \mathbb^2. It is a geometric space in which two real numbers are required to determine the position (geometry), position of eac ...
, a centerpoint may be constructed in
linear time In theoretical computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations ...
. In any dimension ''d'', a Tukey median (and therefore also a centerpoint) may be constructed in time O(''n''''d'' − 1 + ''n'' log ''n''). A
randomized algorithm A randomized algorithm is an algorithm that employs a degree of randomness as part of its logic or procedure. The algorithm typically uses uniformly random bits as an auxiliary input to guide its behavior, in the hope of achieving good performan ...
that repeatedly replaces sets of ''d'' + 2 points by their Radon point can be used to compute an
approximation An approximation is anything that is intentionally similar but not exactly equal to something else. Etymology and usage The word ''approximation'' is derived from Latin ''approximatus'', from ''proximus'' meaning ''very near'' and the prefix ...
to a centerpoint of any point set, in the sense that its Tukey depth is linear in the sample set size, in an amount of time that is polynomial in the dimension.


References


Citations


Sources

* . * . * . * . * {{citation , last1=Har-Peled , first1=S. , last2=Jones , first2=M. , date=2020-12-31 , title=Journey to the Center of the Point Set , url=https://doi.org/10.1145/3431285 , journal=ACM Transactions on Algorithms , volume=17 , issue=1 , pages=9:1–9:21 , doi=10.1145/3431285 , issn=1549-6325 , url-access=subscription . Euclidean geometry Multi-dimensional geometry Means Point (geometry)