In computer science, a set is an abstract data type that can store unique values, without any particular order. It is a computer implementation of the mathematical concept of a finite set. Unlike most other collection types, rather than retrieving a specific element from a set, one typically tests a value for membership in a set.

Some set data structures are designed for static or frozen sets that do not change after they are constructed. Static sets allow only query operations on their elements, such as checking whether a given value is in the set, or enumerating the values in some arbitrary order. Other variants, called dynamic or mutable sets, also allow the insertion and deletion of elements. A multiset is a special kind of set in which an element can appear several times.


Type theory

In type theory, sets are generally identified with their indicator function (characteristic function): accordingly, a set of values of type A may be denoted by 2^{A} or \mathcal{P}(A). (Subtypes and subsets may be modeled by refinement types, and quotient sets may be replaced by setoids.) The characteristic function F of a set S is defined as:

:F(x) = \begin{cases} 1, & \mbox{if } x \in S \\ 0, & \mbox{if } x \notin S \end{cases}

In theory, many other abstract data structures can be viewed as set structures with additional operations and/or additional axioms imposed on the standard operations. For example, an abstract heap can be viewed as a set structure with a min(''S'') operation that returns the element of smallest value.
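
The identification of a set with its characteristic function can be sketched in Python as follows. This is only an illustration: the names characteristic_function and abstract_min are hypothetical and do not come from any library.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def characteristic_function(s: Iterable[T]) -> Callable[[T], int]:
    """Return F with F(x) = 1 if x is in s, and F(x) = 0 otherwise."""
    # Take a frozen snapshot so F does not change if the caller mutates s later.
    members = frozenset(s)
    return lambda x: 1 if x in members else 0


def abstract_min(s: Iterable[T]) -> T:
    """The extra min(S) operation from the heap example: smallest element of S."""
    return min(s)


F = characteristic_function({2, 3, 5})
print(F(3), F(4))
print(abstract_min({4, 1, 7}))
```

The function F answers only membership questions, which is why a set of values of type A can be read as a function from A to a two-element type.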


Operations


Core set-theoretical operations

One may define the operations of the algebra of sets:
* union(''S'',''T''): returns the union of sets ''S'' and ''T''.
* intersection(''S'',''T''): returns the intersection of sets ''S'' and ''T''.
* difference(''S'',''T''): returns the difference of sets ''S'' and ''T''.
* subset(''S'',''T''): a predicate that tests whether the set ''S'' is a subset of set ''T''.
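
As a brief illustration, the four operations above map directly onto the operators of Python's built-in set type. The wrapper function names simply mirror the list and are not part of Python itself.

```python
def union(s: set, t: set) -> set:
    """All elements that are in s, in t, or in both."""
    return s | t


def intersection(s: set, t: set) -> set:
    """All elements that are in both s and t."""
    return s & t


def difference(s: set, t: set) -> set:
    """All elements of s that are not in t."""
    return s - t


def subset(s: set, t: set) -> bool:
    """True if every element of s is also an element of t."""
    return s <= t


S = {1, 2, 3}
T = {3, 4}
print(union(S, T), intersection(S, T), difference(S, T), subset({3}, T))
```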


Static sets

Typical operations that may be provided by a static set structure ''S'' are:
* is_element_of(''x'',''S''): checks whether the value ''x'' is in the set ''S''.
* is_empty(''S''): checks whether the set ''S'' is empty.
* size(''S'') or cardinality(''S''): returns the number of elements in ''S''.
* iterate(''S''): returns a function that returns one more value of ''S'' at each call, in some arbitrary order.
* enumerate(''S''): returns a list containing the elements of ''S'' in some arbitrary order.
* build(''x''_1,''x''_2,…,''x''_n): creates a set structure with values ''x''_1,''x''_2,…,''x''_n.
* create_from(''collection''): creates a new set structure containing all the elements of the given collection or all the elements returned by the given iterator.
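
One possible reading of these static operations uses Python's immutable frozenset. The sketch below is illustrative only; enumerate_set is renamed (an assumption of this example) so that it does not shadow Python's built-in enumerate.

```python
def build(*values):
    """Create a static set from the given values; duplicates collapse."""
    return frozenset(values)


def create_from(collection):
    """Create a static set from any collection or iterator."""
    return frozenset(collection)


def is_element_of(x, s):
    return x in s


def is_empty(s):
    return len(s) == 0


def size(s):
    return len(s)


def iterate(s):
    """Return a function yielding one more value of s per call, in arbitrary order."""
    it = iter(s)
    return lambda: next(it)  # raises StopIteration once s is exhausted


def enumerate_set(s):
    """Return a list of the elements of s, in arbitrary order."""
    return list(s)


s = build(1, 2, 2, 3)
print(size(s), is_element_of(2, s), is_empty(s))
```

Because a frozenset cannot be modified after construction, it is also hashable and can itself be stored as an element of another set.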


Dynamic sets

Dynamic set structures typically add:
* create(): creates a new, initially empty set structure.
** create_with_capacity(''n''): creates a new set structure, initially empty but capable of holding up to ''n'' elements.
* add(''S'',''x''): adds the element ''x'' to ''S'', if it is not present already.
* remove(''S'', ''x''): removes the element ''x'' from ''S'', if it is present.
* capacity(''S''): returns the maximum number of values that ''S'' can hold.

Some set structures may allow only some of these operations. The cost of each operation will depend on the implementation, and possibly also on the particular values stored in the set, and the order in which they are inserted.
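
A minimal sketch of the dynamic operations, assuming a hypothetical BoundedSet class that wraps Python's built-in set and adds an optional capacity (Python's own set type has no capacity limit):

```python
class BoundedSet:
    """Hypothetical mutable set with an optional maximum number of elements."""

    def __init__(self, capacity=None):
        self._items = set()
        self._capacity = capacity  # None means "no fixed limit"

    def add(self, x):
        """Add x if it is not already present; fail if the set is full."""
        if x in self._items:
            return
        if self._capacity is not None and len(self._items) >= self._capacity:
            raise OverflowError("set is at capacity")
        self._items.add(x)

    def remove(self, x):
        """Remove x if it is present; do nothing otherwise."""
        self._items.discard(x)

    def capacity(self):
        return self._capacity

    def __contains__(self, x):
        return x in self._items

    def __len__(self):
        return len(self._items)


def create():
    return BoundedSet()


def create_with_capacity(n):
    return BoundedSet(capacity=n)


s = create_with_capacity(2)
s.add(1)
s.add(1)  # already present, so the set is unchanged
s.add(2)
print(len(s), s.capacity())
```

Raising an error when the capacity is exceeded is one design choice among several; an implementation could equally grow the structure or silently reject the element.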


Additional operations

There are many other operations that can (in principle) be defined in terms of the above, such as:
* pop(''S''): returns an arbitrary element of ''S'', deleting it from ''S''.
* pick(''S''): returns an arbitrary element of ''S''. Functionally, the mutator pop can be interpreted as the pair of selectors (pick, rest), where rest returns the set consisting of all elements except for the arbitrary element. It can be interpreted in terms of iterate.
* map(''F'',''S''): returns the set of distinct values resulting from applying function ''F'' to each element of ''S''.
* filter(''P'',''S''): returns the subset containing all elements of ''S'' that satisfy a given predicate ''P''.
* fold(''A''_0,''F'',''S''): returns the value ''A''_{|''S''|} after applying ''A''_{i+1} := ''F''(''A''_i, ''e'') for each element ''e'' of ''S'', for some binary operation ''F''. ''F'' must be associative and commutative for this to be well-defined.
* clear(''S''): deletes all elements of ''S''.
* equal(''S''_1, ''S''_2): checks whether the two given sets are equal (i.e. contain all and only the same elements).
* hash(''S''): returns a hash value for the static set ''S'' such that if equal(''S''_1, ''S''_2) then hash(''S''_1) = hash(''S''_2).

Other operations can be defined for sets with elements of a special type:
* sum(''S''): returns the sum of all elements of ''S'' for some definition of "sum". For example, over integers or reals, it may be defined as fold(0, add, ''S'').
* collapse(''S''): given a set of sets, returns the union. For example, collapse({{1}, {2, 3}}) == {1, 2, 3}. May be considered a kind of sum.
* flatten(''S''): given a set consisting of sets and atomic elements (elements that are not sets), returns a set whose elements are the atomic elements of the original top-level set or elements of the sets it contains. In other words, it removes a level of nesting, like collapse, but allows atoms. This can be done a single time, or by recursively flattening to obtain a set of only atomic elements. For example, flatten({1, {2, 3}}) == {1, 2, 3}.
* nearest(''S'',''x''): returns the element of ''S'' that is closest in value to ''x'' (by some metric).
* min(''S''), max(''S''): returns the minimum/maximum element of ''S''.
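
Several of these derived operations can be sketched on top of Python's built-in sets. The function names below are illustrative (set_map and set_filter are renamed to avoid shadowing built-ins), nested sets are represented as frozenset because Python set elements must be hashable, and nearest assumes numeric elements with absolute difference as the metric.

```python
from functools import reduce
import operator


def pick(s):
    """Return an arbitrary element without removing it."""
    return next(iter(s))


def pop(s):
    """Return an arbitrary element and delete it from s."""
    return s.pop()


def set_map(f, s):
    """Set of distinct values of f over s."""
    return {f(e) for e in s}


def set_filter(p, s):
    """Subset of s whose elements satisfy predicate p."""
    return {e for e in s if p(e)}


def fold(a0, f, s):
    """Apply a := f(a, e) for each e in s; f should be associative and commutative."""
    return reduce(f, s, a0)


def collapse(s):
    """Union of a set of sets."""
    return set().union(*s)


def flatten(s, recursive=False):
    """Remove one level of nesting (or all levels if recursive), keeping atoms."""
    out = set()
    for e in s:
        if isinstance(e, (set, frozenset)):
            out |= flatten(e, True) if recursive else e
        else:
            out.add(e)
    return out


def nearest(s, x):
    """Element of s closest to x under the metric |e - x| (ties broken arbitrarily)."""
    return min(s, key=lambda e: abs(e - x))


print(fold(0, operator.add, {1, 2, 3}))
print(collapse({frozenset({1}), frozenset({2, 3})}))
print(flatten({1, frozenset({2, 3})}))
```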


Implementations

Sets can be implemented using various data structures, which provide different time and space trade-offs for various operations. Some implementations are designed to improve the efficiency of very specialized operations, such as nearest or union. Implementations described as "general use" typically strive to optimize the element_of, add, and delete operations. A simple implementation is to use a list, ignoring the order of the elements and taking care to avoid repeated values. This is simple but inefficient, because operations like set membership or element deletion are ''O''(''n''): they require scanning the entire list. Sets are often instead implemented using more efficient data structures, particularly various flavors of trees, tries, or hash tables.

As sets can be interpreted as a kind of map (by the indicator function), sets are commonly implemented in the same way as (partial) maps (associative arrays), in which case the value of each key-value pair has the unit type or a sentinel value (like 1): namely, a self-balancing binary search tree for sorted sets (which has O(log n) for most operations), or a hash table for unsorted sets (which has O(1) average-case, but O(n) worst-case, for most operations). A sorted linear hash table may be used to provide deterministically ordered sets.

Further, in languages that support maps but not sets, sets can be implemented in terms of maps. For example, a common programming idiom in Perl that converts an array to a hash whose values are the sentinel value 1, for use as a set, is: my %elements = map { $_ => 1 } @elements;

Other popular methods include arrays. In particular a subset of the integers 1..''n'' can be implemented efficiently as an ''n''-bit bit array, which also supports very efficient union and intersection operations. A Bloom map implements a set probabilistically, using a very compact representation but risking a small chance of false positives on queries.

The Boolean set operations can be implemented in terms of more elementary operations (pop, clear, and add), but specialized algorithms may yield lower asymptotic time bounds. If sets are implemented as sorted lists, for example, the naive algorithm for union(''S'',''T'') will take time proportional to the length ''m'' of ''S'' times the length ''n'' of ''T'', whereas a variant of the list merging algorithm will do the job in time proportional to ''m''+''n''. Moreover, there are specialized set data structures (such as the union-find data structure) that are optimized for one or more of these operations, at the expense of others.
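
Two of the techniques mentioned above can be sketched in Python: the merge-based union of sorted lists in time proportional to ''m''+''n'', and a bit-array representation of a subset of 1..''n''. The function names are hypothetical, and a Python integer stands in for the bit array (bit ''k''-1 represents the value ''k''), which is an assumption of this example rather than a prescribed layout.

```python
def union_sorted(s, t):
    """Union of two sorted, duplicate-free lists using a single merge pass."""
    out = []
    i = j = 0
    while i < len(s) and j < len(t):
        if s[i] < t[j]:
            out.append(s[i])
            i += 1
        elif t[j] < s[i]:
            out.append(t[j])
            j += 1
        else:
            # Equal heads: emit the value once and advance both lists.
            out.append(s[i])
            i += 1
            j += 1
    # At most one of these tails is non-empty.
    out.extend(s[i:])
    out.extend(t[j:])
    return out


def to_bits(values):
    """Pack a subset of 1..n into an integer used as a bit array."""
    bits = 0
    for v in values:
        bits |= 1 << (v - 1)
    return bits


def from_bits(bits):
    """Unpack a bit array back into a sorted list of values."""
    return [k + 1 for k in range(bits.bit_length()) if (bits >> k) & 1]


print(union_sorted([1, 3, 5], [2, 3, 6]))
a, b = to_bits([1, 3]), to_bits([3, 4])
print(from_bits(a | b), from_bits(a & b))
```

With the bit-array representation, union and intersection reduce to bitwise OR and AND, which is what makes them cheap.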


Language support

One of the earliest languages to support sets was Pascal; many languages now include them, whether in the core language or in a standard library.
* In C++, the Standard Template Library (STL) provides the set template class, which is typically implemented using a binary search tree (e.g. red–black tree); SGI's STL also provides the hash_set template class, which implements a set using a hash table. C++11 has support for the unordered_set template class, which is implemented using a hash table. In sets, the elements themselves are the keys, in contrast to sequenced containers, where elements are accessed using their (relative or absolute) position. Set elements must have a strict weak ordering.
* Java offers the Set interface to support sets (with the HashSet class implementing it using a hash table), and the SortedSet sub-interface to support sorted sets (with the TreeSet class implementing it using a binary search tree).
* Apple's Foundation framework (part of Cocoa) provides the Objective-C classes NSSet, NSMutableSet, NSCountedSet, NSOrderedSet, and NSMutableOrderedSet. The CoreFoundation APIs provide the CFSet and CFMutableSet types for use in C.
* Python has built-in set and frozenset types since 2.4, and since Python 3.0 and 2.7, supports non-empty set literals using a curly-bracket syntax, e.g.: {x, y, z}; empty sets must be created using set(), because Python uses {} to represent the empty dictionary.
* The .NET Framework provides the generic HashSet and SortedSet classes that implement the generic ISet interface.
* Smalltalk's class library includes Set and IdentitySet, using equality and identity for inclusion test respectively. Many dialects provide variations for compressed storage (NumberSet, CharacterSet), for ordering (OrderedSet, SortedSet, etc.) or for weak references (WeakIdentitySet).
* Ruby's standard library includes a set module which contains Set and SortedSet classes that implement sets using hash tables, the latter allowing iteration in sorted order.
* OCaml's standard library contains a Set module, which implements a functional set data structure using binary search trees.
* The GHC implementation of Haskell provides a Data.Set module, which implements immutable sets using binary search trees.
* The Tcl Tcllib package provides a set module which implements a set data structure based upon Tcl lists.
* The Swift standard library contains a Set type, since Swift 1.2.
* JavaScript introduced Set as a standard built-in object with the ECMAScript 2015 standard.
* Erlang's standard library has a sets module.
* Clojure has literal syntax for hashed sets, and also implements sorted sets.
* LabVIEW has native support for sets, from version 2019.

As noted in the previous section, in languages which do not directly support sets but do support associative arrays, sets can be emulated using associative arrays, by using the elements as keys, and using a dummy value as the values, which are ignored.
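
The emulation described in the last paragraph can be sketched with a Python dictionary standing in for the associative array. Python has real set types, so this is purely illustrative; the helper names and the dummy value 1 are choices made for this example.

```python
def make_set(elements):
    """Build a dict whose keys are the set elements; the value 1 is a dummy."""
    return dict.fromkeys(elements, 1)


def contains(d, x):
    """Membership is a key lookup; the stored value is ignored."""
    return x in d


def add(d, x):
    d[x] = 1


def remove(d, x):
    d.pop(x, None)  # no error if x is absent


d = make_set(["a", "b", "a"])
add(d, "c")
remove(d, "a")
print(sorted(d), contains(d, "b"))
```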


Multiset

A generalization of the notion of a set is that of a
multiset In mathematics Mathematics (from Greek: ) includes the study of such topics as numbers (arithmetic and number theory), formulas and related structures (algebra), shapes and spaces in which they are contained (geometry), and quantities and t ...
or bag, which is similar to a set but allows repeated ("equal") values (duplicates). This is used in two distinct senses: either equal values are considered ''identical,'' and are simply counted, or equal values are considered ''equivalent,'' and are stored as distinct items. For example, given a list of people (by name) and ages (in years), one could construct a multiset of ages, which simply counts the number of people of a given age. Alternatively, one can construct a multiset of people, where two people are considered equivalent if their ages are the same (but may be different people and have different names), in which case each pair (name, age) must be stored, and selecting on a given age gives all the people of a given age. Formally, it is possible for objects in computer science to be considered "equal" under some
equivalence relation In mathematics Mathematics (from Greek: ) includes the study of such topics as numbers (arithmetic and number theory), formulas and related structures (algebra), shapes and spaces in which they are contained (geometry), and quantities a ...
but still distinct under another relation. Some types of multiset implementations will store distinct equal objects as separate items in the data structure; while others will collapse it down to one version (the first one encountered) and keep a positive integer count of the multiplicity of the element. As with sets, multisets can naturally be implemented using hash table or trees, which yield different performance characteristics. The set of all bags over type T is given by the expression bag T. If by multiset one considers equal items identical and simply counts them, then a multiset can be interpreted as a function from the input domain to the non-negative integers (
natural number In mathematics, the natural numbers are those numbers used for counting (as in "there are ''six'' coins on the table") and total order, ordering (as in "this is the ''third'' largest city in the country"). In common mathematical terminology, w ...
s), generalizing the identification of a set with its indicator function. In some cases a multiset in this counting sense may be generalized to allow negative values, as in Python. * C++'s
Standard Template Library The Standard Template Library (STL) is a software library for the C++ programming language that influenced many parts of the C++ Standard Library. It provides four components called ''algorithm of an algorithm (Euclid's algorithm) for calcu ...
implements both sorted and unsorted multisets. It provides the
multiset In mathematics Mathematics (from Greek: ) includes the study of such topics as numbers (arithmetic and number theory), formulas and related structures (algebra), shapes and spaces in which they are contained (geometry), and quantities and t ...
class for the sorted multiset, as a kind of
associative container In computer science, an associative array, map, symbol table, or dictionary is an abstract data type composed of a collection (abstract data type), collection of attribute–value pair, (key, value) pairs, such that each possible key appears at m ...
, which implements this multiset using a
self-balancing binary search tree In computer science, a self-balancing binary search tree (BST) is any node (computer science), node-based binary search tree that automatically keeps its height (maximal number of levels below the root) small in the face of arbitrary item insert ...
. It provides the unordered_multiset class for the unsorted multiset, as a kind of unordered associative containers, which implements this multiset using a
hash table In computing Computing is any goal-oriented activity requiring, benefiting from, or creating computing machinery. It includes the study and experimentation of algorithmic processes and development of both computer hardware , hardware and ...
. The unsorted multiset is standard as of
C++11 C11, C.XI, C-11 or C.11 may refer to: Transport * C-11 Fleetster The Consolidated Model 17 Fleetster was a 1920s United States, American light transport monoplane aircraft built by the Consolidated Aircraft, Consolidated Aircraft Corporation. De ...
; previously SGI's STL provides the hash_multiset class, which was copied and eventually standardized. * For
Java Java ( id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 147.7 million people, Java is the world's List of ...
, third-party libraries provide multiset functionality: **
Apache Commons The Apache Commons is a project of the Apache Software Foundation, formerly under the Jakarta Project. The purpose of the Commons is to provide reusable, Open-source software, open source Java software. The Commons is composed of three parts: proper ...
Collections provides the Bag
/code> and SortedBag interfaces, with implementing classes like HashBag and TreeBag. **
Google Guava
provides the Multiset interface, with several implementing classes. * Apple provides the NSCountedSet class as part of
Cocoa
, and the CFBag and CFMutableBag types as part of CoreFoundation. * Python's standard library includes collections.Counter, which is similar to a multiset. *
Smalltalk
includes the Bag class, which can be instantiated to use either identity or equality as the predicate for the inclusion test.

Where a multiset data structure is not available, a workaround is to use a regular set but override the equality predicate of its items to always return "not equal" on distinct objects (however, such a set will still not be able to store multiple occurrences of the same object), or to use an
associative array
mapping the values to their integer multiplicities (this approach cannot distinguish between equal elements at all).

Typical operations on bags:
* contains(B, x): checks whether the element x is present (at least once) in the bag B.
* is_sub_bag(B1, B2): checks whether each element in the bag B1 occurs in B1 no more often than it occurs in the bag B2; sometimes denoted as B1 ⊑ B2.
* count(B, x): returns the number of times that the element x occurs in the bag B; sometimes denoted as B # x.
* scaled_by(B, n): given a natural number n, returns a bag which contains the same elements as the bag B, except that every element that occurs m times in B occurs n * m times in the resulting bag; sometimes denoted as n ⊗ B.
* union(B1, B2): returns a bag containing just those values that occur in either the bag B1 or the bag B2, except that the number of times a value x occurs in the resulting bag is equal to (B1 # x) + (B2 # x); sometimes denoted as B1 ⊎ B2.
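The operations above can be sketched on top of Python's collections.Counter, which is itself an associative array from values to integer multiplicities. This is only an illustrative sketch: the function names follow the list above and are not part of the Counter API, and the sketch assumes that all stored counts are positive integers.

```python
from collections import Counter


def contains(bag, x):
    """True if x occurs at least once in bag."""
    return bag[x] > 0


def is_sub_bag(b1, b2):
    """True if every element of b1 occurs in b2 at least as often."""
    return all(b2[x] >= m for x, m in b1.items())


def count(bag, x):
    """Multiplicity of x in bag (0 if absent)."""
    return bag[x]


def scaled_by(bag, n):
    """Bag in which every multiplicity m of the input becomes n * m."""
    # Entries whose scaled count is zero are dropped, so n == 0 gives an empty bag.
    return Counter({x: n * m for x, m in bag.items() if n * m > 0})


def union(b1, b2):
    """Bag whose multiplicities are the sums of those in b1 and b2."""
    # Counter addition sums the counts and returns a new Counter.
    return b1 + b2


# Example bags (hypothetical data, for illustration only).
b1 = Counter("aab")
b2 = Counter("aaabc")
```

Because a Counter stores only a count per value, this sketch shares the limitation noted above for associative arrays: equal elements cannot be told apart.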


Multisets in SQL

In
relational databases
, a table can be a (mathematical) set or a multiset, depending on the presence of uniqueness constraints on some columns (which turn those columns into a
candidate key
).
SQL
allows the selection of rows from a relational table: this operation will in general yield a multiset, unless the keyword DISTINCT is used to force the rows to be all different, or the selection includes the primary (or a candidate) key.

In ANSI SQL the MULTISET keyword can be used to transform a subquery into a collection expression: SELECT expression1, expression2... FROM table_name... is a general select that can be used as a subquery expression of another more general query, while MULTISET(SELECT expression1, expression2... FROM table_name...) transforms the subquery into a collection expression that can be used in another query, or in an assignment to a column of an appropriate collection type.
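The effect of DISTINCT and of selecting a key column can be observed with a small experiment. The following sketch uses Python's built-in sqlite3 module with an in-memory database; the table name orders, its columns, and its rows are hypothetical and chosen only for illustration. It does not cover the MULTISET keyword, which SQLite does not provide.

```python
import sqlite3

# In-memory database with a hypothetical table; id is the primary key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
conn.executemany(
    "INSERT INTO orders (customer) VALUES (?)",
    [("ann",), ("bob",), ("ann",)],
)

# A plain selection of a non-key column may repeat rows: the result is a multiset.
plain = [row[0] for row in conn.execute("SELECT customer FROM orders")]

# DISTINCT forces all rows of the result to be different: the result is a set.
distinct = [row[0] for row in conn.execute("SELECT DISTINCT customer FROM orders")]

# Including the primary key also makes every result row unique.
with_key = conn.execute("SELECT id, customer FROM orders").fetchall()

conn.close()
```

Here plain holds the value "ann" twice, while distinct holds it once, and every row of with_key differs from the others because each carries its own id.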


See also

* Bloom filter
* Disjoint set
* Set (mathematics)

