Multiset (C )
   HOME

TheInfoList



OR:

In mathematics, a multiset (or bag, or mset) is a modification of the concept of a set that, unlike a set, allows for multiple instances for each of its elements. The number of instances given for each element is called the multiplicity of that element in the multiset. As a consequence, an infinite number of multisets exist which contain only elements and , but vary in the multiplicities of their elements: * The set contains only elements and , each having multiplicity 1 when is seen as a multiset. * In the multiset , the element has multiplicity 2, and has multiplicity 1. * In the multiset , and both have multiplicity 3. These objects are all different when viewed as multisets, although they are the same set, since they all consist of the same elements. As with sets, and in contrast to
tuple In mathematics, a tuple is a finite ordered list (sequence) of elements. An -tuple is a sequence (or ordered list) of elements, where is a non-negative integer. There is only one 0-tuple, referred to as ''the empty tuple''. An -tuple is defi ...
s, order does not matter in discriminating multisets, so and denote the same multiset. To distinguish between sets and multisets, a notation that incorporates square brackets is sometimes used: the multiset can be denoted by . The cardinality of a multiset is the sum of the multiplicities of all its elements. For example, in the multiset the multiplicities of the members , , and are respectively 2, 3, and 1, and therefore the cardinality of this multiset is 6. Nicolaas Govert de Bruijn coined the word ''multiset'' in the 1970s, according to
Donald Knuth Donald Ervin Knuth ( ; born January 10, 1938) is an American computer scientist, mathematician, and professor emeritus at Stanford University. He is the 1974 recipient of the ACM Turing Award, informally considered the Nobel Prize of computer sc ...
. However, the concept of multisets predates the coinage of the word ''multiset'' by many centuries. Knuth himself attributes the first study of multisets to the Indian mathematician Bhāskarāchārya, who described permutations of multisets around 1150. Other names have been proposed or used for this concept, including ''list'', ''bunch'', ''bag'', ''heap'', ''sample'', ''weighted set'', ''collection'', and ''suite''.


History

Wayne Blizard traced multisets back to the very origin of numbers, arguing that "in ancient times, the number ''n'' was often represented by a collection of ''n'' strokes, tally marks, or units." These and similar collections of objects are multisets, because strokes, tally marks, or units are considered indistinguishable. This shows that people implicitly used multisets even before mathematics emerged. Practical needs for this structure have caused multisets to be rediscovered several times, appearing in literature under different names. For instance, they were important in early AI languages, such as QA4, where they were referred to as ''bags,'' a term attributed to Peter Deutsch. A multiset has been also called an aggregate, heap, bunch, sample, weighted set, occurrence set, and fireset (finitely repeated element set). Although multisets were used implicitly from ancient times, their explicit exploration happened much later. The first known study of multisets is attributed to the Indian mathematician Bhāskarāchārya circa 1150, who described permutations of multisets. The work of Marius Nizolius (1498–1576) contains another early reference to the concept of multisets.
Athanasius Kircher Athanasius Kircher (2 May 1602 – 27 November 1680) was a German Jesuit scholar and polymath who published around 40 major works, most notably in the fields of comparative religion, geology, and medicine. Kircher has been compared to fe ...
found the number of multiset permutations when one element can be repeated. Jean Prestet published a general rule for multiset permutations in 1675. John Wallis explained this rule in more detail in 1685. Multisets appeared explicitly in the work of Richard Dedekind. Other mathematicians formalized multisets and began to study them as precise mathematical structures in the 20th century. For example, Whitney (1933) described ''generalized sets'' ("sets" whose
characteristic function In mathematics, the term "characteristic function" can refer to any of several distinct concepts: * The indicator function of a subset, that is the function ::\mathbf_A\colon X \to \, :which for a given subset ''A'' of ''X'', has value 1 at points ...
s may take any
integer An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign ( −1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the languag ...
value - positive, negative or zero). Monro (1987) investigated the
category Category, plural categories, may refer to: Philosophy and general uses *Categorization, categories in cognitive science, information science and generally * Category of being * ''Categories'' (Aristotle) * Category (Kant) * Categories (Peirce) ...
Mul of multisets and their morphisms, defining a ''multiset'' as a set with an equivalence relation between elements "of the same ''sort''", and a ''morphism'' between multisets as a function which respects ''sorts''. He also introduced a ''multinumber'': a function ''f''(''x'') from a multiset to the
natural number In mathematics, the natural numbers are those numbers used for counting (as in "there are ''six'' coins on the table") and ordering (as in "this is the ''third'' largest city in the country"). Numbers used for counting are called ''cardinal ...
s, giving the ''multiplicity'' of element ''x'' in the multiset. Monro argued that the concepts of multiset and multinumber are often mixed indiscriminately, though both are useful.


Examples

One of the simplest and most natural examples is the multiset of
prime A prime number (or a prime) is a natural number greater than 1 that is not a product of two smaller natural numbers. A natural number greater than 1 that is not prime is called a composite number. For example, 5 is prime because the only ways ...
factors of a natural number . Here the underlying set of elements is the set of prime factors of . For example, the number
120 120 may refer to: *120 (number), the number * AD 120, a year in the 2nd century AD *120 BC, a year in the 2nd century BC *120 film, a film format for still photography * ''120'' (film), a 2008 film * 120 (MBTA bus) * 120 (New Jersey bus) * 120 (Ken ...
has the prime factorization :120 = 2^3 3^1 5^1 which gives the multiset . A related example is the multiset of solutions of an algebraic equation. A
quadratic equation In algebra, a quadratic equation () is any equation that can be rearranged in standard form as ax^2 + bx + c = 0\,, where represents an unknown value, and , , and represent known numbers, where . (If and then the equation is linear, not q ...
, for example, has two solutions. However, in some cases they are both the same number. Thus the multiset of solutions of the equation could be , or it could be . In the latter case it has a solution of multiplicity 2. More generally, the
fundamental theorem of algebra The fundamental theorem of algebra, also known as d'Alembert's theorem, or the d'Alembert–Gauss theorem, states that every non- constant single-variable polynomial with complex coefficients has at least one complex root. This includes polynomia ...
asserts that the
complex Complex commonly refers to: * Complexity, the behaviour of a system whose components interact in multiple ways so possible interactions are difficult to describe ** Complex system, a system composed of many components which may interact with each ...
solutions of a
polynomial equation In mathematics, an algebraic equation or polynomial equation is an equation of the form :P = 0 where ''P'' is a polynomial with coefficients in some field (mathematics), field, often the field of the rational numbers. For many authors, the term '' ...
of degree always form a multiset of cardinality . A special case of the above are the
eigenvalue In linear algebra, an eigenvector () or characteristic vector of a linear transformation is a nonzero vector that changes at most by a scalar factor when that linear transformation is applied to it. The corresponding eigenvalue, often denoted ...
s of a matrix, whose multiplicity is usually defined as their multiplicity as roots of the characteristic polynomial. However two other multiplicities are naturally defined for eigenvalues, their multiplicities as roots of the minimal polynomial, and the
geometric multiplicity In linear algebra, an eigenvector () or characteristic vector of a linear transformation is a nonzero vector that changes at most by a scalar factor when that linear transformation is applied to it. The corresponding eigenvalue, often denoted ...
, which is defined as the
dimension In physics and mathematics, the dimension of a mathematical space (or object) is informally defined as the minimum number of coordinates needed to specify any point within it. Thus, a line has a dimension of one (1D) because only one coor ...
of the
kernel Kernel may refer to: Computing * Kernel (operating system), the central component of most operating systems * Kernel (image processing), a matrix used for image convolution * Compute kernel, in GPGPU programming * Kernel method, in machine learn ...
of (where is an eigenvalue of the matrix ). These three multiplicities define three multisets of eigenvalues, which may be all different: Let be a matrix in
Jordan normal form In linear algebra, a Jordan normal form, also known as a Jordan canonical form (JCF), is an upper triangular matrix of a particular form called a Jordan matrix representing a linear operator on a finite-dimensional vector space with respect to so ...
that has a single eigenvalue. Its multiplicity is , its multiplicity as a root of the minimal polynomial is the size of the largest Jordan block, and its geometric multiplicity is the number of Jordan blocks.


Definition

A multiset may be formally defined as an ordered pair where is the ''underlying set'' of the multiset, formed from its distinct elements, and m\colon A \to \mathbb^ is a
function Function or functionality may refer to: Computing * Function key, a type of key on computer keyboards * Function model, a structured representation of processes in a system * Function object or functor or functionoid, a concept of object-oriente ...
from to the set of the
positive Positive is a property of positivity and may refer to: Mathematics and science * Positive formula, a logical formula not containing negation * Positive number, a number that is greater than 0 * Plus sign, the sign "+" used to indicate a posi ...
integer An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign ( −1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the languag ...
s, giving the ''multiplicity'', that is, the number of occurrences, of the element in the multiset as the number . Representing the function by its
graph Graph may refer to: Mathematics *Graph (discrete mathematics), a structure made of vertices and edges **Graph theory, the study of such graphs and their properties *Graph (topology), a topological space resembling a graph in the sense of discre ...
(the set of ordered pairs \left\) allows for writing the multiset as , and the multiset as . This notation is however not commonly used and more compact notations are employed. If A=\ is a
finite set In mathematics, particularly set theory, a finite set is a set that has a finite number of elements. Informally, a finite set is a set which one could in principle count and finish counting. For example, :\ is a finite set with five elements. T ...
, the multiset is often represented as :\left\,\quad sometimes simplified to \quad a_1^ \cdots a_n^, where upper indices equal to 1 are omitted. For example, the multiset may be written \ or a^2b. If the elements of the multiset are numbers, a confusion is possible with ordinary
arithmetic operations Arithmetic () is an elementary part of mathematics that consists of the study of the properties of the traditional operations on numbers—addition, subtraction, multiplication, division, exponentiation, and extraction of roots. In the 19th ce ...
, those normally can be excluded from the context. On the other hand, the latter notation is coherent with the fact that the prime factorization of a positive integer is a uniquely defined multiset, as asserted by the
fundamental theorem of arithmetic In mathematics, the fundamental theorem of arithmetic, also called the unique factorization theorem and prime factorization theorem, states that every integer greater than 1 can be represented uniquely as a product of prime numbers, up to the ord ...
. Also, a
monomial In mathematics, a monomial is, roughly speaking, a polynomial which has only one term. Two definitions of a monomial may be encountered: # A monomial, also called power product, is a product of powers of variables with nonnegative integer expone ...
is a multiset of indeterminates; for example, the monomial ''x''3''y''2 corresponds to the multiset . A multiset corresponds to an ordinary set if the multiplicity of every element is one (as opposed to some larger positive integer). An
indexed family In mathematics, a family, or indexed family, is informally a collection of objects, each associated with an index from some index set. For example, a ''family of real numbers, indexed by the set of integers'' is a collection of real numbers, wher ...
, , where varies over some index-set ''I'', may define a multiset, sometimes written . In this view the underlying set of the multiset is given by the image of the family, and the multiplicity of any element is the number of index values such that a_i = x. In this article the multiplicities are considered to be finite, i.e. no element occurs infinitely many times in the family: even in an infinite multiset, the multiplicities are finite numbers. It is possible to extend the definition of a multiset by allowing multiplicities of individual elements to be infinite
cardinals Cardinal or The Cardinal may refer to: Animals * Cardinal (bird) or Cardinalidae, a family of North and South American birds **''Cardinalis'', genus of cardinal in the family Cardinalidae **''Cardinalis cardinalis'', or northern cardinal, the ...
instead of positive integers, but not all properties carry over to this generalization.


Basic properties and operations

Elements of a multiset are generally taken in a fixed set , sometimes called a ''universe'', which is often the set of
natural number In mathematics, the natural numbers are those numbers used for counting (as in "there are ''six'' coins on the table") and ordering (as in "this is the ''third'' largest city in the country"). Numbers used for counting are called ''cardinal ...
s. An element of that does not belong to a given multiset is said to have a multiplicity 0 in this multiset. This extends the multiplicity function of the multiset to a function from to the set \N of nonnegative integers. This defines a one-to-one correspondence between these functions and the multisets that have their elements in . This extended multiplicity function is commonly called simply the multiplicity function, and suffices for defining multisets when the universe containing the elements has been fixed. This multiplicity function is a generalization of the indicator function of a subset, and shares some properties with it. The support of a multiset A in a universe is the underlying set of the multiset. Using the multiplicity function m, it is characterized as :\operatorname(A) := \left\. A multiset is ''finite'' if its support is finite, or, equivalently, if its cardinality :, A, = \sum_m_A(x)= \sum_m_A(x) is finite. The ''empty multiset'' is the unique multiset with an
empty Empty may refer to: ‍ Music Albums * ''Empty'' (God Lives Underwater album) or the title song, 1995 * ''Empty'' (Nils Frahm album), 2020 * ''Empty'' (Tait album) or the title song, 2001 Songs * "Empty" (The Click Five song), 2007 * ...
support (underlying set), and thus a cardinality 0. The usual operations of sets may be extended to multisets by using the multiplicity function, in a similar way to using the indicator function for subsets. In the following, and are multisets in a given universe , with multiplicity functions m_A and m_B. * Inclusion: is ''included'' in , denoted , if :: m_A(x) \le m_B(x)\quad\forall x\in U. * Union: the ''union'' (called, in some contexts, the ''maximum'' or ''lowest common multiple'') of and is the multiset with multiplicity function :: m_C(x) = \max(m_A(x),m_B(x))\quad\forall x\in U. * Intersection: the ''intersection'' (called, in some contexts, the ''infimum'' or ''greatest common divisor'') of and is the multiset with multiplicity function :: m_C(x) = \min(m_A(x),m_B(x))\quad\forall x\in U. * Sum: the ''sum'' of and is the multiset with multiplicity function :: m_C(x) = m_A(x) + m_B(x)\quad\forall x\in U. : It may be viewed as a generalization of the
disjoint union In mathematics, a disjoint union (or discriminated union) of a family of sets (A_i : i\in I) is a set A, often denoted by \bigsqcup_ A_i, with an injection of each A_i into A, such that the images of these injections form a partition of A ( ...
of sets. It defines a
commutative monoid In abstract algebra, a branch of mathematics, a monoid is a set equipped with an associative binary operation and an identity element. For example, the nonnegative integers with addition form a monoid, the identity element being 0. Monoids ar ...
structure on the finite multisets in a given universe. This monoid is a free commutative monoid, with the universe as a basis. * Difference: the ''difference'' of and is the multiset with multiplicity function :: m_C(x) = \max(m_A(x) - m_B(x), 0)\quad\forall x\in U. Two multisets are ''disjoint'' if their supports are disjoint sets. This is equivalent to saying that their intersection is the empty multiset or that their sum equals their union. There is an inclusion–exclusion principle for finite multisets (similar to the one for sets), stating that a finite union of finite multisets is the difference of two sums of multisets: in the first sum we consider all possible intersections of an odd number of the given multisets, while in the second sum we consider all possible intersections of an even number of the given multisets.


Counting multisets

The number of multisets of cardinality , with elements taken from a finite set of cardinality , is called the multiset coefficient or multiset number. This number is written by some authors as \textstyle\left(\!\!\!\!\right), a notation that is meant to resemble that of binomial coefficients; it is used for instance in (Stanley, 1997), and could be pronounced " multichoose " to resemble " choose " for \tbinom nk. Like the binomial distribution that involves binomial coefficients, there is a negative binomial distribution in which the multiset coefficients occur. Multiset coefficients should not be confused with the unrelated
multinomial coefficient In mathematics, the multinomial theorem describes how to expand a power of a sum in terms of powers of the terms in that sum. It is the generalization of the binomial theorem from binomials to multinomials. Theorem For any positive integer ...
s that occur in the
multinomial theorem In mathematics, the multinomial theorem describes how to expand a power of a sum in terms of powers of the terms in that sum. It is the generalization of the binomial theorem from binomials to multinomials. Theorem For any positive integer ...
. The value of multiset coefficients can be given explicitly as :\left(\!\!\!\!\right) = = \frac = , where the second expression is as a binomial coefficient; many authors in fact avoid separate notation and just write binomial coefficients. So, the number of such multisets is the same as the number of subsets of cardinality in a set of cardinality . The analogy with binomial coefficients can be stressed by writing the numerator in the above expression as a
rising factorial power In mathematics, the falling factorial (sometimes called the descending factorial, falling sequential product, or lower factorial) is defined as the polynomial :\begin (x)_n = x^\underline &= \overbrace^ \\ &= \prod_^n(x-k+1) = \prod_^(x-k) \,. \e ...
:\left(\!\!\!\!\right) = , to match the expression of binomial coefficients using a falling factorial power: : = . either of which is well defined even if we allow to be an arbitrary complex number. There are for example 4 multisets of cardinality 3 with elements taken from the set of cardinality 2 (, ), namely , , , . There are also 4 ''subsets'' of cardinality 3 in the set of cardinality 4 (), namely , , , . One simple way to prove the equality of multiset coefficients and binomial coefficients given above, involves representing multisets in the following way. First, consider the notation for multisets that would represent (6 s, 2 s, 3 s, 7 s) in this form: : This is a multiset of cardinality = 18 made of elements of a set of cardinality = 4. The number of characters including both dots and vertical lines used in this notation is 18 + 4 − 1. The number of vertical lines is 4 − 1. The number of multisets of cardinality 18 is then the number of ways to arrange the 4 − 1 vertical lines among the 18 + 4 − 1 characters, and is thus the number of subsets of cardinality 4 − 1 in a set of cardinality 18 + 4 − 1. Equivalently, it is the number of ways to arrange the 18 dots among the 18 + 4 − 1 characters, which is the number of subsets of cardinality 18 of a set of cardinality 18 + 4 − 1. This is := = 1330, thus is the value of the multiset coefficient and its equivalencies: :\left(\!\!\!\!\right)

\frac=,
: ::=\frac, : ::=\frac , : ::=\frac. From the relation between binomial coefficients and multiset coefficients, it follows that the number of multisets of cardinality in a set of cardinality can be written :\left(\!\!\!\!\right)=(-1)^k. Additionally, :\left(\!\!\!\!\right)=\left(\!\!\!\!\right).


Recurrence relation

A
recurrence relation In mathematics, a recurrence relation is an equation according to which the nth term of a sequence of numbers is equal to some combination of the previous terms. Often, only k previous terms of the sequence appear in the equation, for a parameter ...
for multiset coefficients may be given as :\left(\!\!\!\!\right) = \left(\!\!\!\!\right) + \left(\!\!\!\!\right) \quad \mbox n,k>0 with :\left(\!\!\!\!\right) = 1,\quad n\in\N, \quad\mbox\quad \left(\!\!\!\!\right) = 0,\quad k>0. The above recurrence may be interpreted as follows. Let  := \ be the source set. There is always exactly one (empty) multiset of size 0, and if there are no larger multisets, which gives the initial conditions. Now, consider the case in which . A multiset of cardinality with elements from might or might not contain any instance of the final element . If it does appear, then by removing once, one is left with a multiset of cardinality of elements from , and every such multiset can arise, which gives a total of :\left(\!\!\!\!\right) possibilities. If does not appear, then our original multiset is equal to a multiset of cardinality with elements from , of which there are :\left(\!\!\!\!\right). Thus, :\left(\!\!\!\!\right) = \left(\!\!\!\!\right) + \left(\!\!\!\!\right).


Generating series

The generating function of the multiset coefficients is very simple, being :\sum_^\infty \left(\!\!\!\!\right)t^d = \frac. As multisets are in one-to-one correspondence with monomials, \left(\!\!\!\!\right) is also the number of
monomial In mathematics, a monomial is, roughly speaking, a polynomial which has only one term. Two definitions of a monomial may be encountered: # A monomial, also called power product, is a product of powers of variables with nonnegative integer expone ...
s of degree in indeterminates. Thus, the above series is also the Hilbert series of the
polynomial ring In mathematics, especially in the field of algebra, a polynomial ring or polynomial algebra is a ring (which is also a commutative algebra) formed from the set of polynomials in one or more indeterminates (traditionally also called variables ...
k _1, \ldots, x_n As \left(\!\!\!\!\right) is a polynomial in , it and the generating function are well defined for any
complex Complex commonly refers to: * Complexity, the behaviour of a system whose components interact in multiple ways so possible interactions are difficult to describe ** Complex system, a system composed of many components which may interact with each ...
value of .


Generalization and connection to the negative binomial series

The multiplicative formula allows the definition of multiset coefficients to be extended by replacing ''n'' by an arbitrary number ''α'' (negative, real, complex): :\left(\!\!\!\!\right) = \frac = \frac \quad\text k\in\N \text \alpha. With this definition one has a generalization of the negative binomial formula (with one of the variables set to 1), which justifies calling the \left(\!\!\!\!\right) negative binomial coefficients: : (1-X)^ = \sum_^\infty \left(\!\!\!\!\right) X^k. This
Taylor series In mathematics, the Taylor series or Taylor expansion of a function is an infinite sum of terms that are expressed in terms of the function's derivatives at a single point. For most common functions, the function and the sum of its Taylor ser ...
formula is valid for all complex numbers ''α'' and ''X'' with  < 1. It can also be interpreted as an identity of formal power series in ''X'', where it actually can serve as definition of arbitrary powers of series with constant coefficient equal to 1; the point is that with this definition all identities hold that one expects for
exponentiation Exponentiation is a mathematical operation, written as , involving two numbers, the '' base'' and the ''exponent'' or ''power'' , and pronounced as " (raised) to the (power of) ". When is a positive integer, exponentiation corresponds to r ...
, notably :(1-X)^(1-X)^=(1-X)^ \quad\text\quad ((1-X)^)^=(1-X)^, and formulas such as these can be used to prove identities for the multiset coefficients. If ''α'' is a nonpositive integer ''n'', then all terms with ''k'' > −''n'' are zero, and the infinite series becomes a finite sum. However, for other values of ''α'', including positive integers and
rational number In mathematics, a rational number is a number that can be expressed as the quotient or fraction of two integers, a numerator and a non-zero denominator . For example, is a rational number, as is every integer (e.g. ). The set of all rat ...
s, the series is infinite.


Applications

Multisets have various applications. They are becoming fundamental in combinatorics. Multisets have become an important tool in the theory of relational databases, which often uses the synonym ''bag''. For instance, multisets are often used to implement relations in database systems. In particular, a table (without a primary key) works as a multiset, because it can have multiple identical records. Similarly, SQL operates on multisets and return identical records. For instance, consider "SELECT name from Student". In the case that there are multiple records with name "Sara" in the student table, all of them are shown. That means the result of an SQL query is a multiset; if the result were instead a set, the repetitive records in the result set would have been eliminated. Another application of multiset is in modeling
multigraph In mathematics, and more specifically in graph theory, a multigraph is a graph which is permitted to have multiple edges (also called ''parallel edges''), that is, edges that have the same end nodes. Thus two vertices may be connected by more ...
s. In multigraphs there can be multiple edges between any two given vertices. As such, the entity that shows edges is a multiset, and not a set. There are also other applications. For instance,
Richard Rado Richard Rado FRS (28 April 1906 – 23 December 1989) was a German-born British mathematician whose research concerned combinatorics and graph theory. He was Jewish and left Germany to escape Nazi persecution. He earned two PhDs: in 1933 from th ...
used multisets as a device to investigate the properties of families of sets. He wrote, "The notion of a set takes no account of multiple occurrence of any one of its members, and yet it is just this kind of information which is frequently of importance. We need only think of the set of roots of a polynomial ''f''(''x'') or the spectrum of a linear operator."


Generalizations

Different generalizations of multisets have been introduced, studied and applied to solving problems. * Real-valued multisets (in which multiplicity of an element can be any real number) * Fuzzy multisets * Rough multisets * Hybrid sets * Multisets whose multiplicity is any real-valued step function * Soft multisets * Soft fuzzy multisets * Named sets (unification of all generalizations of sets)


See also

* Frequency (statistics) as multiplicity analog * Quasi-sets *
Set theory Set theory is the branch of mathematical logic that studies sets, which can be informally described as collections of objects. Although objects of any kind can be collected into a set, set theory, as a branch of mathematics, is mostly conce ...
* {{Wikiversity-inline, Partitions of multisets


References

Basic concepts in set theory Factorial and binomial topics