
The set cover problem is a classical question in combinatorics, computer science, operations research, and complexity theory. It is one of Karp's 21 NP-complete problems, shown to be NP-complete in 1972. It is a problem "whose study has led to the development of fundamental techniques for the entire field" of approximation algorithms.

Given a set of elements $\{1, 2, \ldots, n\}$ (called the universe) and a collection $S$ of $m$ sets whose union equals the universe, the set cover problem is to identify the smallest sub-collection of $S$ whose union equals the universe. For example, consider the universe $U = \{1, 2, 3, 4, 5\}$ and the collection of sets $S = \{\{1, 2, 3\}, \{2, 4\}, \{3, 4\}, \{4, 5\}\}$. Clearly the union of $S$ is $U$. However, we can cover all of the elements with the following, smaller number of sets: $\{\{1, 2, 3\}, \{4, 5\}\}$.

More formally, given a universe $\mathcal{U}$ and a family $\mathcal{S}$ of subsets of $\mathcal{U}$, a ''cover'' is a subfamily $\mathcal{C}\subseteq\mathcal{S}$ of sets whose union is $\mathcal{U}$. In the set covering
decision problem, the input is a pair $(\mathcal{U},\mathcal{S})$ and an integer $k$; the question is whether there is a set covering of size $k$ or less. In the set covering optimization problem, the input is a pair $(\mathcal{U},\mathcal{S})$, and the task is to find a set covering that uses the fewest sets. The decision version of set covering is NP-complete, and the optimization/search version of set cover is NP-hard. If each set is assigned a cost, it becomes a ''weighted'' set cover problem.
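The decision and optimization versions can be made concrete with a small exhaustive search. The following is a minimal sketch, not an efficient method: it tries every sub-collection in order of size, so its running time is exponential in the number of sets. The function names and the five-element instance are illustrative assumptions.

```python
from itertools import combinations


def is_cover(universe, subfamily):
    """Return True if the union of the sets in `subfamily` equals `universe`."""
    covered = set()
    for s in subfamily:
        covered |= set(s)
    return covered == set(universe)


def has_cover_of_size(universe, family, k):
    """Decision version: is there a cover that uses at most k sets?"""
    family = list(family)
    for size in range(min(k, len(family)) + 1):
        for candidate in combinations(family, size):
            if is_cover(universe, candidate):
                return True
    return False


def minimum_cover(universe, family):
    """Optimization version by brute force (exponential in len(family))."""
    family = list(family)
    for size in range(len(family) + 1):
        for candidate in combinations(family, size):
            if is_cover(universe, candidate):
                return list(candidate)
    return None  # the family does not cover the universe


# Illustrative instance (an assumption for this sketch).
universe = {1, 2, 3, 4, 5}
family = [frozenset({1, 2, 3}), frozenset({2, 4}),
          frozenset({3, 4}), frozenset({4, 5})]
best = minimum_cover(universe, family)
```

On this instance no single set covers the universe, so the search stops at the first pair that does.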

# Integer linear program formulation

The minimum set cover problem can be formulated as an integer linear program (ILP). This ILP belongs to the more general class of ILPs for covering problems. The integrality gap of this ILP is at most $\log n$, so its relaxation gives a factor-$\log n$ approximation algorithm for the minimum set cover problem (where $n$ is the size of the universe).

In weighted set cover, the sets are assigned weights. Denote the weight of set $s\in \mathcal{S}$ by $w_s$. Then the integer linear program describing weighted set cover is identical to the unweighted one, except that the objective function to minimize is $\sum_{s \in \mathcal{S}} w_s x_s$.
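The program itself is not written out in this text. In its standard form, with one 0/1 variable $x_S$ per set $S \in \mathcal{S}$ (a notational assumption chosen to match the constraint $x_S \in \{0,1\}$ used later in the article), it can be sketched as:

```latex
\begin{align*}
\text{minimize}   \quad & \sum_{S \in \mathcal{S}} x_S \\
\text{subject to} \quad & \sum_{S \,:\, e \in S} x_S \ge 1 && \text{for all } e \in \mathcal{U}, \\
                        & x_S \in \{0, 1\}               && \text{for all } S \in \mathcal{S}.
\end{align*}
```

Here $x_S = 1$ means that $S$ is chosen, and each constraint says that element $e$ lies in at least one chosen set.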

# Hitting set formulation

Set covering is equivalent to the hitting set problem. That is seen by observing that an instance of set covering can be viewed as an arbitrary bipartite graph, with sets represented by vertices on the left, the universe represented by vertices on the right, and edges representing the inclusion of elements in sets. The task is then to find a minimum cardinality subset of left-vertices which covers all of the right-vertices. In the hitting set problem, the objective is to cover the left-vertices using a minimum subset of the right vertices. Converting from one problem to the other is therefore achieved by interchanging the two sets of vertices.
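The interchange of the two vertex sets can be sketched in a few lines. This is an illustrative sketch, assuming an instance is stored as a dictionary from set names to their elements; `transpose_instance`, `is_set_cover`, and `is_hitting_set` are hypothetical helper names, not part of any library.

```python
def transpose_instance(named_sets):
    """Swap the two sides of the bipartite incidence graph.

    Input:  name -> elements contained in that set.
    Output: element -> names of the sets that contain it.
    """
    transposed = {}
    for name, members in named_sets.items():
        for element in members:
            transposed.setdefault(element, set()).add(name)
    return transposed


def is_set_cover(named_sets, chosen_names):
    """True if the chosen sets together contain every element."""
    universe = set().union(*named_sets.values())
    covered = set().union(*(named_sets[name] for name in chosen_names))
    return covered == universe


def is_hitting_set(family, chosen):
    """True if `chosen` intersects every set in `family`."""
    chosen = set(chosen)
    return all(chosen & set(members) for members in family.values())


# Illustrative instance (an assumption for this sketch).
named_sets = {"A": {1, 2, 3}, "B": {2, 4}, "C": {3, 4}, "D": {4, 5}}
hitting_instance = transpose_instance(named_sets)
```

A choice of set names is a cover of `named_sets` exactly when it is a hitting set of `hitting_instance`, and transposing twice gives back the original instance (provided no set is empty).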

# Greedy algorithm

There is a greedy algorithm for polynomial time approximation of set covering that chooses sets according to one rule: at each stage, choose the set that contains the largest number of uncovered elements. This method can be implemented in time linear in the sum of sizes of the input sets, using a bucket queue to prioritize the sets. It achieves an approximation ratio of $H(s)$, where $s$ is the size of the set to be covered. In other words, it finds a covering that may be $H(n)$ times as large as the minimum one, where $H(n)$ is the $n$-th harmonic number:

$H(n) = \sum_{k=1}^{n} \frac{1}{k} \le \ln n + 1$

This greedy algorithm actually achieves an approximation ratio of $H(s^\prime)$, where $s^\prime$ is the cardinality of the largest set in $S$. For $\delta$-dense instances, however, there exists a $c \ln m$-approximation algorithm for every $c > 0$.

There is a standard example on which the greedy algorithm achieves an approximation ratio of $\log_2(n)/2$. The universe consists of $n=2^{k+1}-2$ elements. The set system consists of $k$ pairwise disjoint sets $S_1,\ldots,S_k$ with sizes $2,4,8,\ldots,2^k$ respectively, as well as two additional disjoint sets $T_0,T_1$, each of which contains half of the elements from each $S_i$. On this input, the greedy algorithm takes the sets $S_k,\ldots,S_1$, in that order, while the optimal solution consists only of $T_0$ and $T_1$. An example of such an input for $k=3$ is pictured on the right.

Inapproximability results show that the greedy algorithm is essentially the best-possible polynomial time approximation algorithm for set cover up to lower order terms (see Inapproximability results below), under plausible complexity assumptions. A tighter analysis for the greedy algorithm shows that the approximation ratio is exactly $\ln n - \ln \ln n + \Theta(1)$.
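The greedy rule and the standard bad example can be sketched as follows. This is a deliberately simple version that rescans every set in each round, so it does not achieve the linear running time of the bucket-queue implementation mentioned above. The names `greedy_set_cover` and `tight_instance` are illustrative assumptions.

```python
def greedy_set_cover(universe, named_sets):
    """Repeatedly pick the set that contains the most uncovered elements."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(named_sets, key=lambda name: len(uncovered & named_sets[name]))
        gained = uncovered & named_sets[best]
        if not gained:
            raise ValueError("the sets do not cover the universe")
        chosen.append(best)
        uncovered -= gained
    return chosen


def tight_instance(k):
    """Build S_1..S_k of sizes 2, 4, ..., 2^k plus T0 and T1.

    T0 and T1 are disjoint and each holds half of every S_i.
    """
    named_sets = {"T0": set(), "T1": set()}
    next_element = 0
    for i in range(1, k + 1):
        block = list(range(next_element, next_element + 2 ** i))
        next_element += 2 ** i
        named_sets[f"S{i}"] = set(block)
        half = len(block) // 2
        named_sets["T0"].update(block[:half])
        named_sets["T1"].update(block[half:])
    universe = set(range(next_element))
    return universe, named_sets


universe, named_sets = tight_instance(3)
picked = greedy_set_cover(universe, named_sets)
```

For $k=3$ the universe has 14 elements. In every round the largest remaining $S_i$ has exactly one more uncovered element than $T_0$ or $T_1$, so the greedy choice does not depend on tie-breaking and the algorithm returns three sets where two suffice.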

# Low-frequency systems

If each element occurs in at most $f$ sets, then a solution can be found in polynomial time that approximates the optimum to within a factor of $f$ using LP relaxation.

If the constraint $x_S\in\{0,1\}$ is replaced by $x_S \geq 0$ for all $S$ in $\mathcal{S}$ in the integer linear program shown above, then it becomes a (non-integer) linear program. The algorithm can be described as follows:

1. Find an optimal solution for the linear program using some polynomial-time method of solving linear programs.
2. Pick all sets $S$ for which the corresponding variable $x_S$ has value at least $1/f$ in that solution.
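The rounding step can be sketched on its own, assuming a fractional solution has already been computed by some LP solver. No solver is called here, and the helper names and the triangle instance are illustrative assumptions. The step yields a cover because each element's constraint sums at most $f$ variables to a value of at least 1, so at least one of them is at least $1/f$.

```python
def max_frequency(universe, named_sets):
    """f = the largest number of sets that any single element belongs to."""
    return max(
        sum(1 for members in named_sets.values() if element in members)
        for element in universe
    )


def round_fractional_cover(universe, named_sets, x, tol=1e-9):
    """Keep every set whose LP value is at least 1/f.

    `x` maps set names to fractional values and is assumed to be a
    feasible solution of the LP relaxation (supplied by the caller).
    """
    f = max_frequency(universe, named_sets)
    chosen = [name for name in named_sets if x[name] >= 1.0 / f - tol]
    covered = set().union(*(named_sets[name] for name in chosen))
    if covered != set(universe):
        raise ValueError("x is not a feasible fractional cover")
    return chosen


# Triangle instance: every element lies in exactly two sets, so f = 2.
universe = {1, 2, 3}
named_sets = {"A": {1, 2}, "B": {2, 3}, "C": {1, 3}}
x = {"A": 0.5, "B": 0.5, "C": 0.5}  # optimal LP solution, value 1.5
chosen = round_fractional_cover(universe, named_sets, x)
```

Here the rounded cover uses three sets, which is within the factor $f = 2$ of the LP value 1.5 and therefore of the integer optimum as well.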

# Inapproximability results

When $n$ refers to the size of the universe, it has been shown that set covering cannot be approximated in polynomial time to within a factor of $\tfrac{1}{2}\log_2 n \approx 0.72\ln n$, unless NP has quasi-polynomial time algorithms. Feige (1998) improved this lower bound to $\bigl(1-o(1)\bigr)\cdot\ln n$ under the same assumptions, which essentially matches the approximation ratio achieved by the greedy algorithm. A lower bound of $c\cdot\ln n$, where $c$ is a certain constant, has also been established under the weaker assumption that P$\neq$NP, and a similar result with a higher value of $c$ was proved later. Finally, optimal inapproximability has been shown: set cover cannot be approximated to within $\bigl(1 - o(1)\bigr) \cdot \ln n$ unless P$=$NP.

# Weighted set cover

Relaxing the integer linear program for weighted set cover stated above, one may use randomized rounding to get an $O(\log n)$-factor approximation. The corresponding analysis for nonweighted set cover is outlined in the article on randomized rounding (section "Randomized-rounding algorithm for set cover") and can be adapted to the weighted case.
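One simple way to round a fractional solution at random is sketched below. It is an illustrative variant, not necessarily the scheme analysed in the randomized rounding article: it runs a logarithmic number of independent rounds, includes each set with probability equal to its LP value in each round, and then patches any element still uncovered with the cheapest set containing it. The fractional solution `x` is assumed to come from an LP solver, and the function name, the round count, and the `seed` parameter are assumptions.

```python
import math
import random


def randomized_rounding_cover(universe, named_sets, weights, x, seed=0):
    """Round a fractional weighted cover `x` into an integral cover."""
    rng = random.Random(seed)
    universe = set(universe)
    # About 2 ln n rounds; each round costs the LP value in expectation.
    rounds = max(1, math.ceil(2 * math.log(max(len(universe), 2))))
    chosen = set()
    for _ in range(rounds):
        for name in named_sets:
            if rng.random() < x[name]:
                chosen.add(name)
    covered = set().union(*(named_sets[name] for name in chosen))
    # Patch step: guarantee feasibility for any element left uncovered.
    for element in list(universe - covered):
        if element in covered:
            continue
        candidates = [n for n in named_sets if element in named_sets[n]]
        if not candidates:
            raise ValueError("the sets do not cover the universe")
        cheapest = min(candidates, key=lambda n: weights[n])
        chosen.add(cheapest)
        covered |= named_sets[cheapest]
    cost = sum(weights[name] for name in chosen)
    return chosen, cost


# Illustrative weighted instance with an integral LP solution.
universe = {1, 2, 3, 4, 5}
named_sets = {"A": {1, 2, 3}, "B": {2, 4}, "C": {3, 4}, "D": {4, 5}}
weights = {"A": 3.0, "B": 1.0, "C": 1.0, "D": 2.0}
x = {"A": 1.0, "B": 0.0, "C": 0.0, "D": 1.0}
chosen, cost = randomized_rounding_cover(universe, named_sets, weights, x)
```

Because each round has expected cost equal to the LP objective value, the number of rounds is where the logarithmic factor enters. The patch step only ensures that the output is always a valid cover.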

# Related problems

* Hitting set is an equivalent reformulation of Set Cover.
* Vertex cover is a special case of Hitting Set.
* Edge cover is a special case of Set Cover.
* Geometric set cover is a special case of Set Cover when the universe is a set of points in $\mathbb{R}^d$ and the sets are induced by the intersection of the universe and geometric shapes (e.g., disks, rectangles).
* Set packing.
* Maximum coverage problem is to choose at most $k$ sets to cover as many elements as possible.
* Dominating set is the problem of selecting a set of vertices (the dominating set) in a graph such that all other vertices are adjacent to at least one vertex in the dominating set. The Dominating set problem was shown to be NP-complete through a reduction from Set cover.
* Exact cover problem is to choose a set cover with no element included in more than one covering set.
* Red Blue Set Cover.
* Set-cover abduction.
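The two graph special cases in the list above can be illustrated by treating each edge as a two-element set of vertices. In this minimal sketch, vertex cover asks for vertices that hit every edge (hitting set), while edge cover asks for edges whose union is the whole vertex set (set cover). The helper names and the path graph are illustrative assumptions.

```python
def is_vertex_cover(edges, chosen_vertices):
    """Hitting-set view: every edge must contain a chosen vertex."""
    chosen = set(chosen_vertices)
    return all(chosen & set(edge) for edge in edges)


def is_edge_cover(vertices, chosen_edges):
    """Set-cover view: the chosen edges must touch every vertex."""
    covered = set()
    for edge in chosen_edges:
        covered |= set(edge)
    return covered == set(vertices)


# Path graph 1 - 2 - 3 - 4 (an assumption for this sketch).
vertices = {1, 2, 3, 4}
edges = [(1, 2), (2, 3), (3, 4)]
```

On this path, the vertices $\{2, 3\}$ hit all three edges, and the edges $(1,2)$ and $(3,4)$ together touch all four vertices.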
