In
combinatorial mathematics and
extremal set theory, the Sauer–Shelah lemma states that every
family of sets
In set theory and related branches of mathematics, a family (or collection) can mean, depending upon the context, any of the following: set, indexed set, multiset, or class. A collection F of subsets of a given set S is called a family of su ...
with small
VC dimension
VC may refer to:
Military decorations
* Victoria Cross, a military decoration awarded by the United Kingdom and other Commonwealth nations
** Victoria Cross for Australia
** Victoria Cross (Canada)
** Victoria Cross for New Zealand
* Victorious ...
consists of a small number of sets. It is named after
Norbert Sauer and
Saharon Shelah
Saharon Shelah (; , ; born July 3, 1945) is an Israeli mathematician. He is a professor of mathematics at the Hebrew University of Jerusalem and Rutgers University in New Jersey.
Biography
Shelah was born in Jerusalem on July 3, 1945. He is th ...
, who published it independently of each other in 1972. The same result was also published slightly earlier and again independently, by
Vladimir Vapnik and
Alexey Chervonenkis, after whom the VC dimension is named. In his paper containing the lemma, Shelah gives credit also to
Micha Perles
Micha Asher Perles () is an Israeli mathematician working in geometry, a professor emeritus at the Hebrew University. He earned his Ph.D. in 1964 from the Hebrew University, under the supervision of Branko Grünbaum.
His contributions include:
*Th ...
, and for this reason the lemma has also been called the Perles–Sauer–Shelah lemma and the Sauer–Shelah–Perles lemma.
Buzaglo et al. call this lemma "one of the most fundamental results on VC-dimension", and it has applications in many areas. Sauer's motivation was in the
combinatorics
Combinatorics is an area of mathematics primarily concerned with counting, both as a means and as an end to obtaining results, and certain properties of finite structures. It is closely related to many other areas of mathematics and has many ...
of set systems, while Shelah's was in
model theory
In mathematical logic, model theory is the study of the relationship between theory (mathematical logic), formal theories (a collection of Sentence (mathematical logic), sentences in a formal language expressing statements about a Structure (mat ...
and that of Vapnik and Chervonenkis was in
statistics
Statistics (from German language, German: ', "description of a State (polity), state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a s ...
. It has also been applied in
discrete geometry
Discrete geometry and combinatorial geometry are branches of geometry that study combinatorial properties and constructive methods of discrete geometric objects. Most questions in discrete geometry involve finite or discrete sets of basic geom ...
and
graph theory
In mathematics and computer science, graph theory is the study of ''graph (discrete mathematics), graphs'', which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of ''Vertex (graph ...
.
Definitions and statement
If
is a family of sets and
is a set, then
is said to be
shattered by
if every subset of
(including the
empty set
In mathematics, the empty set or void set is the unique Set (mathematics), set having no Element (mathematics), elements; its size or cardinality (count of elements in a set) is 0, zero. Some axiomatic set theories ensure that the empty set exi ...
and
itself) can be obtained as the intersection
of
with some set
in the family. The
VC dimension
VC may refer to:
Military decorations
* Victoria Cross, a military decoration awarded by the United Kingdom and other Commonwealth nations
** Victoria Cross for Australia
** Victoria Cross (Canada)
** Victoria Cross for New Zealand
* Victorious ...
of
is the largest
cardinality
The thumb is the first digit of the hand, next to the index finger. When a person is standing in the medical anatomical position (where the palm is facing to the front), the thumb is the outermost digit. The Medical Latin English noun for thum ...
of a set shattered by
.
In terms of these definitions, the Sauer–Shelah lemma states that if the VC dimension of
is
, and the union of
has
elements, then
can consist of at most
sets, as expressed using
big O notation
Big ''O'' notation is a mathematical notation that describes the asymptotic analysis, limiting behavior of a function (mathematics), function when the Argument of a function, argument tends towards a particular value or infinity. Big O is a memb ...
. Equivalently, if the number of sets in the family,
, obeys the inequality
then
shatters a set of size
.
The bound of the lemma is tight: Let the family
be composed of all subsets of
with size less than
. Then the number of sets in
is exactly
but it does not shatter any set of size
.
The number of shattered sets
A strengthening of the Sauer–Shelah lemma, due to , states that every finite set family
shatters at least
sets. This immediately implies the Sauer–Shelah lemma, because only
of the subsets of an
-item universe have cardinality less than
. Thus, when
there are not enough small sets to be shattered, so one of the shattered sets must have cardinality at least
.
For a restricted type of shattered set, called an order-shattered set, the number of shattered sets always equals the cardinality of the set family.
Proof
Pajor's variant of the Sauer–Shelah lemma may be proved by
mathematical induction
Mathematical induction is a method for mathematical proof, proving that a statement P(n) is true for every natural number n, that is, that the infinitely many cases P(0), P(1), P(2), P(3), \dots all hold. This is done by first proving a ...
; the proof has variously been credited to
Noga Alon
Noga Alon (; born 1956) is an Israeli mathematician and a professor of mathematics at Princeton University noted for his contributions to combinatorics and theoretical computer science, having authored hundreds of papers.
Education and career
...
or to
Ron Aharoni
Ron Aharoni (; born 1952) is an Israeli mathematician, working in finite and infinite combinatorics. Aharoni is a professor at the Technion – Israel Institute of Technology, where he received his Ph.D. in mathematics in 1979. With Nash-William ...
and Ron Holzman.
;Base: Every family of only one set shatters the empty set.
;Step: Assume the lemma is true for all families of size less than
and let
be a family of two or more sets. Let
be an element that belongs to some but not all of the sets in
. Split
into two subfamilies, of the sets that contain
and the sets that do not contain
. By the induction assumption, these two subfamilies shatter two collections of sets whose sizes add to at least
. None of these shattered sets contain
, since a set that contains
cannot be shattered by a family in which all sets contain
or all sets do not contain
. Some of the shattered sets may be shattered by both subfamilies. When a set
is shattered by only one of the two subfamilies, it contributes one unit both to the number of shattered sets of the subfamily and to the number of shattered sets of
. When a set
is shattered by both subfamilies, both
and
are shattered by
, so
contributes two units to the number of shattered sets of the subfamilies and of
. Therefore, the number of shattered sets of
is at least equal to the number shattered by the two subfamilies of
, which is at least
.
A different proof of the Sauer–Shelah lemma in its original form, by
Péter Frankl
Péter Frankl (born 26 March 1953 in Kaposvár, Somogy County, Hungary) is a mathematician, street performer, columnist and educator, active in Japan. Frankl studied mathematics at Eötvös Loránd University in Budapest and submitted his PhD ...
and
János Pach
János Pach (born May 3, 1954) is a mathematician and computer scientist working in the fields of combinatorics and discrete and computational geometry.
Biography
Pach was born and grew up in Hungary. He comes from a noted academic family: his f ...
, is based on
linear algebra
Linear algebra is the branch of mathematics concerning linear equations such as
:a_1x_1+\cdots +a_nx_n=b,
linear maps such as
:(x_1, \ldots, x_n) \mapsto a_1x_1+\cdots +a_nx_n,
and their representations in vector spaces and through matrix (mathemat ...
and the
inclusion–exclusion principle
In combinatorics, the inclusion–exclusion principle is a counting technique which generalizes the familiar method of obtaining the number of elements in the union (set theory), union of two finite sets; symbolically expressed as
: , A \cup B, ...
. This proof extends to other settings such as families of vector spaces and, more generally,
geometric lattices.
Applications
The original application of the lemma, by Vapnik and Chervonenkis, was in showing that every probability distribution can be approximated (with respect to a family of events of a given VC dimension) by a finite set of sample points whose
cardinality
The thumb is the first digit of the hand, next to the index finger. When a person is standing in the medical anatomical position (where the palm is facing to the front), the thumb is the outermost digit. The Medical Latin English noun for thum ...
depends only on the VC dimension of the family of events. In this context, there are two important notions of approximation, both parameterized by a number
: a set
of samples, and a probability distribution on
, is said to be an
-approximation of the original distribution if the probability of each event with respect to
differs from its original probability by at most
. A set
of (unweighted) samples is said to be an
-net if every event with probability at least
includes at least one point of
. An
-approximation must also be an
-net but not necessarily vice versa.
Vapnik and Chervonenkis used the lemma to show that set systems of VC dimension
always have
-approximations of cardinality
Later authors including and similarly showed that there always exist
-nets of cardinality
, and more precisely of cardinality at most
The main idea of the proof of the existence of small
-nets is to choose a random sample
of cardinality
and a second independent random sample
of cardinality
, and to bound the probability that
is missed by some large event
by the probability that
is missed and simultaneously the intersection of
with
is larger than its median value. For any particular
, the probability that
is missed while
is larger than its median is very small,
and the Sauer–Shelah lemma (applied to
) shows that only a small number of distinct events
need to be considered, so by the
union bound
In probability theory, Boole's inequality, also known as the union bound, says that for any finite or countable set of events, the probability that at least one of the events happens is no greater than the sum of the probabilities of the indiv ...
, with nonzero probability,
is an
-net.
In turn,
-nets and
-approximations, and the likelihood that a random sample of large enough cardinality has these properties, have important applications in
machine learning
Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of Computational statistics, statistical algorithms that can learn from data and generalise to unseen data, and thus perform Task ( ...
, in the area of
probably approximately correct learning
In computational learning theory, probably approximately correct (PAC) learning is a framework for mathematical analysis of machine learning. It was proposed in 1984 by Leslie Valiant.L. Valiant. A theory of the learnable.' Communications of the ...
. In
computational geometry, they have been applied to
range searching
In computer science, the range searching problem consists of processing a set ''S'' of objects, in order to determine which objects from ''S'' intersect with a query object, called the ''range''. For example, if ''S'' is a set of points correspond ...
,
derandomization
A randomized algorithm is an algorithm that employs a degree of randomness as part of its logic or procedure. The algorithm typically uses uniformly random bits as an auxiliary input to guide its behavior, in the hope of achieving good performan ...
, and
approximation algorithm
In computer science and operations research, approximation algorithms are efficient algorithms that find approximate solutions to optimization problems (in particular NP-hard problems) with provable guarantees on the distance of the returned sol ...
s.
use generalizations of the Sauer–Shelah lemma to prove results in
graph theory
In mathematics and computer science, graph theory is the study of ''graph (discrete mathematics), graphs'', which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of ''Vertex (graph ...
such as that the number of
strong orientation
In graph theory, a strong orientation of an undirected graph is an assignment of a direction to each edge (an Orientation (graph theory), orientation) that makes it into a strongly connected graph.
Strong orientations have been applied to the des ...
s of a given graph is sandwiched between its numbers of
connected
Connected may refer to:
Film and television
* ''Connected'' (2008 film), a Hong Kong remake of the American movie ''Cellular''
* '' Connected: An Autoblogography About Love, Death & Technology'', a 2011 documentary film
* ''Connected'' (2015 TV ...
and
2-edge-connected subgraphs.
See also
*
Growth function The growth function, also called the shatter coefficient or the shattering number, measures the richness of a set family or class of functions. It is especially used in the context of statistical learning theory, where it is used to study propertie ...
References
{{DEFAULTSORT:Sauer-Shelah lemma
Families of sets
Lemmas
Extremal combinatorics