The growth function, also called the shatter coefficient or the shattering number, measures the richness of a
set family. It is especially used in the context of
statistical learning theory
Statistical learning theory is a framework for machine learning drawing from the fields of statistics and functional analysis. Statistical learning theory deals with the statistical inference problem of finding a predictive function based on d ...
, where it measures the complexity of a hypothesis class.
The term 'growth function' was coined by Vapnik and Chervonenkis in their 1968 paper, where they also proved many of its properties.
It is a basic concept in
machine learning
Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence.
Machine ...
.
[, especially Section 3.2]
Definitions
Set-family definition
Let
be a
set family (a set of sets) and
a set. Their ''intersection'' is defined as the following set-family:
:
The ''intersection-size'' (also called the ''index'') of
with respect to
is
. If a set
has
elements then the index is at most
. If the index is exactly 2
''m'' then the set
is said to be shattered by
, because
contains all the subsets of
, i.e.:
:
The growth function measures the size of
as a function of
. Formally:
:
Hypothesis-class definition
Equivalently, let
be a hypothesis-class (a set of binary functions) and
a set with
elements. The ''restriction'' of
to
is the set of binary functions on
that can be derived from
:
[
:
The growth function measures the size of as a function of :][
:
]
Examples
1. The domain is the real line .
The set-family contains all the half-line
In geometry, a line is an infinitely long object with no width, depth, or curvature. Thus, lines are one-dimensional objects, though they may exist in two, three, or higher dimension spaces. The word ''line'' may also refer to a line segment ...
s (rays) from a given number to positive infinity, i.e., all sets of the form for some .
For any set of real numbers, the intersection contains sets: the empty set, the set containing the largest element of , the set containing the two largest elements of , and so on. Therefore: .[ The same is true whether contains open half-lines, closed half-lines, or both.
2. The domain is the segment ]