Multi-set
   HOME

TheInfoList



OR:

In
mathematics Mathematics is an area of knowledge that includes the topics of numbers, formulas and related structures, shapes and the spaces in which they are contained, and quantities and their changes. These topics are represented in modern mathematics ...
, a multiset (or bag, or mset) is a modification of the concept of a
set Set, The Set, SET or SETS may refer to: Science, technology, and mathematics Mathematics *Set (mathematics), a collection of elements *Category of sets, the category whose objects and morphisms are sets and total functions, respectively Electro ...
that, unlike a set, allows for multiple instances for each of its elements. The number of instances given for each element is called the multiplicity of that element in the multiset. As a consequence, an infinite number of multisets exist which contain only elements and , but vary in the multiplicities of their elements: * The set contains only elements and , each having multiplicity 1 when is seen as a multiset. * In the multiset , the element has multiplicity 2, and has multiplicity 1. * In the multiset , and both have multiplicity 3. These objects are all different when viewed as multisets, although they are the same
set Set, The Set, SET or SETS may refer to: Science, technology, and mathematics Mathematics *Set (mathematics), a collection of elements *Category of sets, the category whose objects and morphisms are sets and total functions, respectively Electro ...
, since they all consist of the same elements. As with sets, and in contrast to tuples,
order Order, ORDER or Orders may refer to: * Categorization, the process in which ideas and objects are recognized, differentiated, and understood * Heterarchy, a system of organization wherein the elements have the potential to be ranked a number of d ...
does not matter in discriminating multisets, so and denote the same multiset. To distinguish between sets and multisets, a notation that incorporates square brackets is sometimes used: the multiset can be denoted by . The
cardinality In mathematics, the cardinality of a set is a measure of the number of elements of the set. For example, the set A = \ contains 3 elements, and therefore A has a cardinality of 3. Beginning in the late 19th century, this concept was generalized ...
of a multiset is the sum of the multiplicities of all its elements. For example, in the multiset the multiplicities of the members , , and are respectively 2, 3, and 1, and therefore the cardinality of this multiset is 6.
Nicolaas Govert de Bruijn Nicolaas Govert (Dick) de Bruijn (; 9 July 1918 – 17 February 2012) was a Dutch mathematician, noted for his many contributions in the fields of analysis, number theory, combinatorics and logic.
coined the word ''multiset'' in the 1970s, according to Donald Knuth. However, the concept of multisets predates the coinage of the word ''multiset'' by many centuries. Knuth himself attributes the first study of multisets to the Indian mathematician Bhāskarāchārya, who described permutations of multisets around 1150. Other names have been proposed or used for this concept, including ''list'', ''bunch'', ''bag'', ''heap'', ''sample'', ''weighted set'', ''collection'', and ''suite''.


History

Wayne Blizard traced multisets back to the very origin of numbers, arguing that "in ancient times, the number ''n'' was often represented by a collection of ''n'' strokes, tally marks, or units." These and similar collections of objects are multisets, because strokes, tally marks, or units are considered indistinguishable. This shows that people implicitly used multisets even before mathematics emerged. Practical needs for this structure have caused multisets to be rediscovered several times, appearing in literature under different names. For instance, they were important in early AI languages, such as QA4, where they were referred to as ''bags,'' a term attributed to Peter Deutsch. A multiset has been also called an aggregate, heap, bunch, sample, weighted set, occurrence set, and fireset (finitely repeated element set). Although multisets were used implicitly from ancient times, their explicit exploration happened much later. The first known study of multisets is attributed to the Indian mathematician Bhāskarāchārya circa 1150, who described permutations of multisets. The work of
Marius Nizolius Marius Nizolius ( it, Mario Nizolio; 1498–1576) was an Italian humanist scholar, known as a proponent of Cicero. He considered rhetoric to be the central intellectual discipline, slighting other aspects of the philosophical tradition. He is des ...
(1498–1576) contains another early reference to the concept of multisets. Athanasius Kircher found the number of multiset permutations when one element can be repeated.
Jean Prestet Jean Prestet (1648–1690) was a French Oratorian priest and mathematician who contributed to the fields of combinatorics and number theory. Prestet grew up poor. As a teenager, he worked as a servant of the Oratory of Jesus in Paris. He was prom ...
published a general rule for multiset permutations in 1675.
John Wallis John Wallis (; la, Wallisius; ) was an English clergyman and mathematician who is given partial credit for the development of infinitesimal calculus. Between 1643 and 1689 he served as chief cryptographer for Parliament and, later, the royal ...
explained this rule in more detail in 1685. Multisets appeared explicitly in the work of
Richard Dedekind Julius Wilhelm Richard Dedekind (6 October 1831 – 12 February 1916) was a German mathematician who made important contributions to number theory, abstract algebra (particularly ring theory), and the axiomatic foundations of arithmetic. His ...
. Other mathematicians formalized multisets and began to study them as precise mathematical structures in the 20th century. For example, Whitney (1933) described ''generalized sets'' ("sets" whose characteristic functions may take any integer value - positive, negative or zero). Monro (1987) investigated the category Mul of multisets and their morphisms, defining a ''multiset'' as a set with an equivalence relation between elements "of the same ''sort''", and a ''morphism'' between multisets as a function which respects ''sorts''. He also introduced a ''multinumber'': a function ''f''(''x'') from a multiset to the natural numbers, giving the ''multiplicity'' of element ''x'' in the multiset. Monro argued that the concepts of multiset and multinumber are often mixed indiscriminately, though both are useful.


Examples

One of the simplest and most natural examples is the multiset of prime factors of a natural number . Here the underlying set of elements is the set of prime factors of . For example, the number 120 has the
prime factorization In number theory, integer factorization is the decomposition of a composite number into a product of smaller integers. If these factors are further restricted to prime numbers, the process is called prime factorization. When the numbers are suf ...
:120 = 2^3 3^1 5^1 which gives the multiset . A related example is the multiset of solutions of an algebraic equation. A quadratic equation, for example, has two solutions. However, in some cases they are both the same number. Thus the multiset of solutions of the equation could be , or it could be . In the latter case it has a solution of multiplicity 2. More generally, the fundamental theorem of algebra asserts that the complex solutions of a polynomial equation of degree always form a multiset of cardinality . A special case of the above are the eigenvalues of a
matrix Matrix most commonly refers to: * ''The Matrix'' (franchise), an American media franchise ** ''The Matrix'', a 1999 science-fiction action film ** "The Matrix", a fictional setting, a virtual reality environment, within ''The Matrix'' (franchis ...
, whose multiplicity is usually defined as their multiplicity as roots of the
characteristic polynomial In linear algebra, the characteristic polynomial of a square matrix is a polynomial which is invariant under matrix similarity and has the eigenvalues as roots. It has the determinant and the trace of the matrix among its coefficients. The chara ...
. However two other multiplicities are naturally defined for eigenvalues, their multiplicities as roots of the minimal polynomial, and the geometric multiplicity, which is defined as the dimension of the kernel of (where is an eigenvalue of the matrix ). These three multiplicities define three multisets of eigenvalues, which may be all different: Let be a matrix in Jordan normal form that has a single eigenvalue. Its multiplicity is , its multiplicity as a root of the minimal polynomial is the size of the largest Jordan block, and its geometric multiplicity is the number of Jordan blocks.


Definition

A multiset may be formally defined as an
ordered pair In mathematics, an ordered pair (''a'', ''b'') is a pair of objects. The order in which the objects appear in the pair is significant: the ordered pair (''a'', ''b'') is different from the ordered pair (''b'', ''a'') unless ''a'' = ''b''. (In con ...
where is the ''underlying set'' of the multiset, formed from its distinct elements, and m\colon A \to \mathbb^ is a function from to the set of the positive integers, giving the ''multiplicity'', that is, the number of occurrences, of the element in the multiset as the number . Representing the function by its graph (the set of
ordered pair In mathematics, an ordered pair (''a'', ''b'') is a pair of objects. The order in which the objects appear in the pair is significant: the ordered pair (''a'', ''b'') is different from the ordered pair (''b'', ''a'') unless ''a'' = ''b''. (In con ...
s \left\) allows for writing the multiset as , and the multiset as . This notation is however not commonly used and more compact notations are employed. If A=\ is a finite set, the multiset is often represented as :\left\,\quad sometimes simplified to \quad a_1^ \cdots a_n^, where upper indices equal to 1 are omitted. For example, the multiset may be written \ or a^2b. If the elements of the multiset are numbers, a confusion is possible with ordinary arithmetic operations, those normally can be excluded from the context. On the other hand, the latter notation is coherent with the fact that the
prime factorization In number theory, integer factorization is the decomposition of a composite number into a product of smaller integers. If these factors are further restricted to prime numbers, the process is called prime factorization. When the numbers are suf ...
of a positive integer is a uniquely defined multiset, as asserted by the fundamental theorem of arithmetic. Also, a monomial is a multiset of
indeterminate Indeterminate may refer to: In mathematics * Indeterminate (variable), a symbol that is treated as a variable * Indeterminate system, a system of simultaneous equations that has more than one solution * Indeterminate equation, an equation that ha ...
s; for example, the monomial ''x''3''y''2 corresponds to the multiset . A multiset corresponds to an ordinary set if the multiplicity of every element is one (as opposed to some larger positive integer). An indexed family, , where varies over some index-set ''I'', may define a multiset, sometimes written . In this view the underlying set of the multiset is given by the
image An image is a visual representation of something. It can be two-dimensional, three-dimensional, or somehow otherwise feed into the visual system to convey information. An image can be an artifact, such as a photograph or other two-dimensiona ...
of the family, and the multiplicity of any element is the number of index values such that a_i = x. In this article the multiplicities are considered to be finite, i.e. no element occurs infinitely many times in the family: even in an infinite multiset, the multiplicities are finite numbers. It is possible to extend the definition of a multiset by allowing multiplicities of individual elements to be infinite cardinals instead of positive integers, but not all properties carry over to this generalization.


Basic properties and operations

Elements of a multiset are generally taken in a fixed set , sometimes called a ''universe'', which is often the set of natural numbers. An element of that does not belong to a given multiset is said to have a multiplicity 0 in this multiset. This extends the multiplicity function of the multiset to a function from to the set \N of
nonnegative integer In mathematics, the natural numbers are those numbers used for counting (as in "there are ''six'' coins on the table") and ordering (as in "this is the ''third'' largest city in the country"). Numbers used for counting are called ''cardinal n ...
s. This defines a one-to-one correspondence between these functions and the multisets that have their elements in . This extended multiplicity function is commonly called simply the multiplicity function, and suffices for defining multisets when the universe containing the elements has been fixed. This multiplicity function is a generalization of the
indicator function In mathematics, an indicator function or a characteristic function of a subset of a set is a function that maps elements of the subset to one, and all other elements to zero. That is, if is a subset of some set , one has \mathbf_(x)=1 if x\i ...
of a subset, and shares some properties with it. The support of a multiset A in a universe is the underlying set of the multiset. Using the multiplicity function m, it is characterized as :\operatorname(A) := \left\. A multiset is ''finite'' if its support is finite, or, equivalently, if its cardinality :, A, = \sum_m_A(x)= \sum_m_A(x) is finite. The ''empty multiset'' is the unique multiset with an empty support (underlying set), and thus a cardinality 0. The usual operations of sets may be extended to multisets by using the multiplicity function, in a similar way to using the indicator function for subsets. In the following, and are multisets in a given universe , with multiplicity functions m_A and m_B. * Inclusion: is ''included'' in , denoted , if :: m_A(x) \le m_B(x)\quad\forall x\in U. * Union: the ''union'' (called, in some contexts, the ''maximum'' or ''lowest common multiple'') of and is the multiset with multiplicity function :: m_C(x) = \max(m_A(x),m_B(x))\quad\forall x\in U. * Intersection: the ''intersection'' (called, in some contexts, the ''infimum'' or ''greatest common divisor'') of and is the multiset with multiplicity function :: m_C(x) = \min(m_A(x),m_B(x))\quad\forall x\in U. * Sum: the ''sum'' of and is the multiset with multiplicity function :: m_C(x) = m_A(x) + m_B(x)\quad\forall x\in U. : It may be viewed as a generalization of the disjoint union of sets. It defines a commutative monoid structure on the finite multisets in a given universe. This monoid is a
free commutative monoid In abstract algebra, the free monoid on a set is the monoid whose elements are all the finite sequences (or strings) of zero or more elements from that set, with string concatenation as the monoid operation and with the unique sequence of zero eleme ...
, with the universe as a basis. * Difference: the ''difference'' of and is the multiset with multiplicity function :: m_C(x) = \max(m_A(x) - m_B(x), 0)\quad\forall x\in U. Two multisets are ''disjoint'' if their supports are
disjoint sets In mathematics, two sets are said to be disjoint sets if they have no element in common. Equivalently, two disjoint sets are sets whose intersection is the empty set.. For example, and are ''disjoint sets,'' while and are not disjoint. A ...
. This is equivalent to saying that their intersection is the empty multiset or that their sum equals their union. There is an inclusion–exclusion principle for finite multisets (similar to the one for sets), stating that a finite union of finite multisets is the difference of two sums of multisets: in the first sum we consider all possible intersections of an
odd Odd means unpaired, occasional, strange or unusual, or a person who is viewed as eccentric. Odd may also refer to: Acronym * ODD (Text Encoding Initiative) ("One Document Does it all"), an abstracted literate-programming format for describing X ...
number of the given multisets, while in the second sum we consider all possible intersections of an
even Even may refer to: General * Even (given name), a Norwegian male personal name * Even (surname) * Even (people), an ethnic group from Siberia and Russian Far East ** Even language, a language spoken by the Evens * Odd and Even, a solitaire game w ...
number of the given multisets.


Counting multisets

The number of multisets of cardinality , with elements taken from a finite set of cardinality , is called the multiset coefficient or multiset number. This number is written by some authors as \textstyle\left(\!\!\!\!\right), a notation that is meant to resemble that of
binomial coefficient In mathematics, the binomial coefficients are the positive integers that occur as coefficients in the binomial theorem. Commonly, a binomial coefficient is indexed by a pair of integers and is written \tbinom. It is the coefficient of the t ...
s; it is used for instance in (Stanley, 1997), and could be pronounced " multichoose " to resemble " choose " for \tbinom nk. Like the
binomial distribution In probability theory and statistics, the binomial distribution with parameters ''n'' and ''p'' is the discrete probability distribution of the number of successes in a sequence of ''n'' independent experiments, each asking a yes–no quest ...
that involves binomial coefficients, there is a
negative binomial distribution In probability theory and statistics, the negative binomial distribution is a discrete probability distribution that models the number of failures in a sequence of independent and identically distributed Bernoulli trials before a specified (non-r ...
in which the multiset coefficients occur. Multiset coefficients should not be confused with the unrelated multinomial coefficients that occur in the multinomial theorem. The value of multiset coefficients can be given explicitly as :\left(\!\!\!\!\right) = = \frac = , where the second expression is as a binomial coefficient; many authors in fact avoid separate notation and just write binomial coefficients. So, the number of such multisets is the same as the number of subsets of cardinality in a set of cardinality . The analogy with binomial coefficients can be stressed by writing the numerator in the above expression as a rising factorial power :\left(\!\!\!\!\right) = , to match the expression of binomial coefficients using a falling factorial power: : = . either of which is well defined even if we allow to be an arbitrary complex number. There are for example 4 multisets of cardinality 3 with elements taken from the set of cardinality 2 (, ), namely , , , . There are also 4 ''subsets'' of cardinality 3 in the set of cardinality 4 (), namely , , , . One simple way to prove the equality of multiset coefficients and binomial coefficients given above, involves representing multisets in the following way. First, consider the notation for multisets that would represent (6 s, 2 s, 3 s, 7 s) in this form: : This is a multiset of cardinality = 18 made of elements of a set of cardinality = 4. The number of characters including both dots and vertical lines used in this notation is 18 + 4 − 1. The number of vertical lines is 4 − 1. The number of multisets of cardinality 18 is then the number of ways to arrange the 4 − 1 vertical lines among the 18 + 4 − 1 characters, and is thus the number of subsets of cardinality 4 − 1 in a set of cardinality 18 + 4 − 1. Equivalently, it is the number of ways to arrange the 18 dots among the 18 + 4 − 1 characters, which is the number of subsets of cardinality 18 of a set of cardinality 18 + 4 − 1. This is := = 1330, thus is the value of the multiset coefficient and its equivalencies: :\left(\!\!\!\!\right)

\frac=,
: ::=\frac, : ::=\frac , : ::=\frac. From the relation between binomial coefficients and multiset coefficients, it follows that the number of multisets of cardinality in a set of cardinality can be written :\left(\!\!\!\!\right)=(-1)^k. Additionally, :\left(\!\!\!\!\right)=\left(\!\!\!\!\right).


Recurrence relation

A recurrence relation for multiset coefficients may be given as :\left(\!\!\!\!\right) = \left(\!\!\!\!\right) + \left(\!\!\!\!\right) \quad \mbox n,k>0 with :\left(\!\!\!\!\right) = 1,\quad n\in\N, \quad\mbox\quad \left(\!\!\!\!\right) = 0,\quad k>0. The above recurrence may be interpreted as follows. Let  := \ be the source set. There is always exactly one (empty) multiset of size 0, and if there are no larger multisets, which gives the initial conditions. Now, consider the case in which . A multiset of cardinality with elements from might or might not contain any instance of the final element . If it does appear, then by removing once, one is left with a multiset of cardinality of elements from , and every such multiset can arise, which gives a total of :\left(\!\!\!\!\right) possibilities. If does not appear, then our original multiset is equal to a multiset of cardinality with elements from , of which there are :\left(\!\!\!\!\right). Thus, :\left(\!\!\!\!\right) = \left(\!\!\!\!\right) + \left(\!\!\!\!\right).


Generating series

The
generating function In mathematics, a generating function is a way of encoding an infinite sequence of numbers () by treating them as the coefficients of a formal power series. This series is called the generating function of the sequence. Unlike an ordinary seri ...
of the multiset coefficients is very simple, being :\sum_^\infty \left(\!\!\!\!\right)t^d = \frac. As multisets are in one-to-one correspondence with monomials, \left(\!\!\!\!\right) is also the number of monomials of degree in indeterminates. Thus, the above series is also the
Hilbert series In commutative algebra, the Hilbert function, the Hilbert polynomial, and the Hilbert series of a graded commutative algebra finitely generated over a field are three strongly related notions which measure the growth of the dimension of the homoge ...
of the polynomial ring k _1, \ldots, x_n As \left(\!\!\!\!\right) is a polynomial in , it and the generating function are well defined for any complex value of .


Generalization and connection to the negative binomial series

The multiplicative formula allows the definition of multiset coefficients to be extended by replacing ''n'' by an arbitrary number ''α'' (negative, real, complex): :\left(\!\!\!\!\right) = \frac = \frac \quad\text k\in\N \text \alpha. With this definition one has a generalization of the negative binomial formula (with one of the variables set to 1), which justifies calling the \left(\!\!\!\!\right) negative binomial coefficients: : (1-X)^ = \sum_^\infty \left(\!\!\!\!\right) X^k. This Taylor series formula is valid for all complex numbers ''α'' and ''X'' with  < 1. It can also be interpreted as an identity of
formal power series In mathematics, a formal series is an infinite sum that is considered independently from any notion of convergence, and can be manipulated with the usual algebraic operations on series (addition, subtraction, multiplication, division, partial sum ...
in ''X'', where it actually can serve as definition of arbitrary powers of series with constant coefficient equal to 1; the point is that with this definition all identities hold that one expects for exponentiation, notably :(1-X)^(1-X)^=(1-X)^ \quad\text\quad ((1-X)^)^=(1-X)^, and formulas such as these can be used to prove identities for the multiset coefficients. If ''α'' is a nonpositive integer ''n'', then all terms with ''k'' > −''n'' are zero, and the infinite series becomes a finite sum. However, for other values of ''α'', including positive integers and rational numbers, the series is infinite.


Applications

Multisets have various applications. They are becoming fundamental in
combinatorics Combinatorics is an area of mathematics primarily concerned with counting, both as a means and an end in obtaining results, and certain properties of finite structures. It is closely related to many other areas of mathematics and has many appl ...
. Multisets have become an important tool in the theory of
relational database A relational database is a (most commonly digital) database based on the relational model of data, as proposed by E. F. Codd in 1970. A system used to maintain relational databases is a relational database management system (RDBMS). Many relatio ...
s, which often uses the synonym ''bag''. For instance, multisets are often used to implement relations in database systems. In particular, a table (without a primary key) works as a multiset, because it can have multiple identical records. Similarly, SQL operates on multisets and return identical records. For instance, consider "SELECT name from Student". In the case that there are multiple records with name "sara" in the student table, all of them are shown. That means the result set of SQL is a multiset. If it was a set, the repetitive records in the result set were eliminated. Another application of multiset is in modeling multigraphs. In multigraphs there can be multiple edges between any two given vertices. As such, the entity that shows edges is a multiset, and not a set. There are also other applications. For instance, Richard Rado used multisets as a device to investigate the properties of families of sets. He wrote, "The notion of a set takes no account of multiple occurrence of any one of its members, and yet it is just this kind of information which is frequently of importance. We need only think of the set of roots of a polynomial ''f''(''x'') or the spectrum of a linear operator."


Generalizations

Different generalizations of multisets have been introduced, studied and applied to solving problems. * Real-valued multisets (in which multiplicity of an element can be any real number) * Fuzzy multisets * Rough multisets * Hybrid sets * Multisets whose multiplicity is any real-valued step function * Soft multisets * Soft fuzzy multisets * Named sets (unification of all generalizations of sets)


See also

*
Frequency (statistics) In statistics, the frequency (or absolute frequency) of an event i is the number n_i of times the observation has occurred/recorded in an experiment or study. These frequencies are often depicted graphically or in tabular form. Types The cumula ...
as multiplicity analog * Quasi-sets * Set theory * {{Wikiversity-inline, Partitions of multisets


References

Basic concepts in set theory Factorial and binomial topics