computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to practical disciplines (includin ...

, an enumeration algorithm is an

algorithm In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing ...

that enumerates the answers to a

computational problem In theoretical computer science, a computational problem is a problem that may be solved by an algorithm. For example, the problem of factoring :"Given a positive integer ''n'', find a nontrivial prime factor of ''n''." is a computational probl ...

. Formally, such an algorithm applies to problems that take an input and produce a list of solutions, similarly to

function problems In computational complexity theory, a function problem is a computational problem where a single output (of a total function) is expected for every input, but the output is more complex than that of a decision problem. For function problems, th ...

. For each input, the enumeration algorithm must produce the list of all solutions, without duplicates, and then halt. The performance of an enumeration algorithm is measured in terms of the time required to produce the solutions, either in terms of the total time required to produce all solutions, or in terms of the maximal delay between two consecutive solutions and in terms of a preprocessing time, counted as the time before outputting the first solution. This complexity can be expressed in terms of the size of the input, the size of each individual output, or the total size of the set of all outputs, similarly to what is done with

output-sensitive algorithm In computer science, an output-sensitive algorithm is an algorithm whose running time depends on the size of the output, instead of, or in addition to, the size of the input. For certain problems where the output size varies widely, for example fro ...

Formal definitions

An enumeration problem

P

is defined as a relation

R

over strings of an arbitrary

alphabet An alphabet is a standardized set of basic written graphemes (called letters) that represent the phonemes of certain spoken languages. Not all writing systems represent language in this way; in a syllabary, each character represents a s ...

\Sigma

R \subseteq \Sigma^* \times \Sigma^*

An algorithm solves

P

if for every input

x

the algorithm produces the (possibly infinite) sequence

y

such that

y

has no duplicate and

z\in y

if and only if

(x,z)\in R

. The algorithm should halt if the sequence

y

is finite.

Common complexity classes

Enumeration problems have been studied in the context of

computational complexity theory In theoretical computer science and mathematics, computational complexity theory focuses on classifying computational problems according to their resource usage, and relating these classes to each other. A computational problem is a task solved ...

, and several

complexity class In computational complexity theory, a complexity class is a set of computational problems of related resource-based complexity. The two most commonly analyzed resources are time and memory. In general, a complexity class is defined in terms ...

es have been introduced for such problems. A very general such class is EnumP, the class of problems for which the correctness of a possible output can be checked in

polynomial time In computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by ...

in the input and output. Formally, for such a problem, there must exist an algorithm A which takes as input the problem input ''x'', the candidate output ''y'', and solves the

decision problem In computability theory and computational complexity theory, a decision problem is a computational problem that can be posed as a yes–no question of the input values. An example of a decision problem is deciding by means of an algorithm whethe ...

of whether ''y'' is a correct output for the input ''x'', in polynomial time in ''x'' and ''y''. For instance, this class contains all problems that amount to enumerating the witnesses of a problem in the class NP. Other classes that have been defined include the following. In the case of problems that are also in EnumP, these problems are ordered from least to most specific: * Output polynomial, the class of problems whose complete output can be computed in polynomial time. * Incremental polynomial time, the class of problems where, for all ''i'', the ''i''-th output can be produced in polynomial time in the input size and in the number ''i''. *

Polynomial delay In the analysis of algorithms, an enumeration algorithm (i.e., an algorithm for listing a large or infinite collection of structures) is said to have polynomial delay if the time between the output of any one structure and the next is bounded by a ...

, the class of problems where the delay between two consecutive outputs is polynomial in the input (and independent from the output). * Strongly polynomial delay, the class of problems where the delay before each output is polynomial in the size of this specific output (and independent from the input or from the other outputs). The preprocessing is generally assumed to be polynomial. * Constant delay, the class of problems where the delay before each output is constant, i.e., independent from the input and output. The preprocessing phase is generally assumed to be polynomial in the input.

Common techniques

Backtracking Backtracking is a class of algorithms for finding solutions to some computational problems, notably constraint satisfaction problems, that incrementally builds candidates to the solutions, and abandons a candidate ("backtracks") as soon as it de ...

: The simplest way to enumerate all solutions is by systematically exploring the space of possible results ( partitioning it at each successive step). However, performing this may not give good guarantees on the delay, i.e., a backtracking algorithm may spend a long time exploring parts of the space of possible results that do not give rise to a full solution. * : This technique improves on backtracking by exploring the space of all possible solutions but solving at each step the problem of whether the current partial solution can be extended to a partial solution. If the answer is no, then the algorithm can immediately backtrack and avoid wasting time, which makes it easier to show guarantees on the delay between any two complete solutions. In particular, this technique applies well to

self-reducible In computational complexity theory, a function problem is a computational problem where a single output (of a total function) is expected for every input, but the output is more complex than that of a decision problem. For function problems, th ...

problems. * Closure under set operations: If we wish to enumerate the disjoint union of two sets, then we can solve the problem by enumerating the first set and then the second set. If the union is non disjoint but the sets can be enumerated in sorted order, then the enumeration can be performed in parallel on both sets while eliminating duplicates on the fly. If the union is not disjoint and both sets are not sorted then duplicates can be eliminated at the expense of a higher memory usage, e.g., using a

hash table In computing, a hash table, also known as hash map, is a data structure that implements an associative array or dictionary. It is an abstract data type that maps keys to values. A hash table uses a hash function to compute an ''index'', ...

. Likewise, the

cartesian product In mathematics, specifically set theory, the Cartesian product of two sets ''A'' and ''B'', denoted ''A''×''B'', is the set of all ordered pairs where ''a'' is in ''A'' and ''b'' is in ''B''. In terms of set-builder notation, that is : A\ ...

of two sets can be enumerated efficiently by enumerating one set and joining each result with all results obtained when enumerating the second step.

Examples of enumeration problems

* The vertex enumeration problem, where we are given a

polytope In elementary geometry, a polytope is a geometric object with flat sides ('' faces''). Polytopes are the generalization of three-dimensional polyhedra to any number of dimensions. Polytopes may exist in any general number of dimensions as an ...

described as a system of

linear inequalities Linearity is the property of a mathematical relationship ('' function'') that can be graphically represented as a straight line. Linearity is closely related to '' proportionality''. Examples in physics include rectilinear motion, the linear ...

and we must enumerate the vertices of the polytope. * Enumerating the minimal transversals of a

hypergraph In mathematics, a hypergraph is a generalization of a graph in which an edge can join any number of vertices. In contrast, in an ordinary graph, an edge connects exactly two vertices. Formally, an undirected hypergraph H is a pair H = (X,E) ...

. This problem is related to monotone dualization and is connected to many applications in

database theory Database theory encapsulates a broad range of topics related to the study and research of the theoretical realm of databases and database management systems. Theoretical aspects of data management include, among other areas, the foundations of qu ...

and

graph theory In mathematics, graph theory is the study of '' graphs'', which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of '' vertices'' (also called ''nodes'' or ''points'') which are conn ...

. * Enumerating the answers to a

database query In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases span ...

, for instance a

conjunctive query In database theory, a conjunctive query is a restricted form of first-order queries using the logical conjunction operator. Many first-order queries can be written as conjunctive queries. In particular, a large part of queries issued on relation ...

or a query expressed in

monadic second-order In mathematical logic, monadic second-order logic (MSO) is the fragment of second-order logic where the second-order quantification is limited to quantification over sets. It is particularly important in the logic of graphs, because of Courcelle's ...

. There have been characterizations in

of which conjunctive queries could be enumerated with

linear Linearity is the property of a mathematical relationship ('' function'') that can be graphically represented as a straight line. Linearity is closely related to '' proportionality''. Examples in physics include rectilinear motion, the linear ...

preprocessing and constant delay. * The problem of enumerating maximal cliques in an input graph, e.g., with the Bron–Kerbosch algorithm * Listing all elements of structures such as

matroid In combinatorics, a branch of mathematics, a matroid is a structure that abstracts and generalizes the notion of linear independence in vector spaces. There are many equivalent ways to define a matroid axiomatically, the most significant being ...

s and greedoids * Several problems on graphs, e.g., enumerating independent sets, paths, cuts, etc. * Enumerating the satisfying assignments of representations of

Boolean function In mathematics, a Boolean function is a function whose arguments and result assume values from a two-element set (usually , or ). Alternative names are switching function, used especially in older computer science literature, and truth function ...

s, e.g., a Boolean formula written in

conjunctive normal form In Boolean logic, a formula is in conjunctive normal form (CNF) or clausal normal form if it is a conjunction of one or more clauses, where a clause is a disjunction of literals; otherwise put, it is a product of sums or an AND of ORs. As a cano ...

disjunctive normal form In boolean logic, a disjunctive normal form (DNF) is a canonical normal form of a logical formula consisting of a disjunction of conjunctions; it can also be described as an OR of ANDs, a sum of products, or (in philosophical logic) a ''cluster co ...

, a

binary decision diagram In computer science, a binary decision diagram (BDD) or branching program is a data structure that is used to represent a Boolean function. On a more abstract level, BDDs can be considered as a compressed representation of sets or relations. U ...

such as an

OBDD In computer science, a binary decision diagram (BDD) or branching program is a data structure that is used to represent a Boolean function. On a more abstract level, BDDs can be considered as a compressed representation of sets or relations. ...

, or a

Boolean circuit In computational complexity theory and circuit complexity, a Boolean circuit is a mathematical model for combinational digital logic circuits. A formal language can be decided by a family of Boolean circuits, one circuit for each possible in ...

in restricted classes studied in

knowledge compilation Knowledge compilation is a family of approaches for addressing the intractability of a number of artificial intelligence problems. A propositional model is compiled in an off-line phase in order to support some queries in polynomial time. Many way ...

, e.g., NNF.{{Cite journal, last1=Marquis, first1=P., last2=Darwiche, first2=A., date=2002, title=A Knowledge Compilation Map, journal=Journal of Artificial Intelligence Research, volume=17, pages=229–264, language=en, doi=10.1613/jair.989, arxiv=1106.1819, s2cid=9919794

Connection to computability theory

The notion of enumeration algorithms is also used in the field of

computability theory Computability theory, also known as recursion theory, is a branch of mathematical logic, computer science, and the theory of computation that originated in the 1930s with the study of computable functions and Turing degrees. The field has since ...

to define some high complexity classes such as RE, the class of all

recursively enumerable In computability theory, a set ''S'' of natural numbers is called computably enumerable (c.e.), recursively enumerable (r.e.), semidecidable, partially decidable, listable, provable or Turing-recognizable if: *There is an algorithm such that the ...

problems. This is the class of sets for which there exist an enumeration algorithm that will produce all elements of the set: the algorithm may run forever if the set is infinite, but each solution must be produced by the algorithm after a finite time.

References

Algorithms