HOME

TheInfoList



OR:

The set cover problem is a classical question in
combinatorics Combinatorics is an area of mathematics primarily concerned with counting, both as a means and an end in obtaining results, and certain properties of finite structures. It is closely related to many other areas of mathematics and has many a ...
,
computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical discipli ...
,
operations research Operations research ( en-GB, operational research) (U.S. Air Force Specialty Code: Operations Analysis), often shortened to the initialism OR, is a discipline that deals with the development and application of analytical methods to improve decis ...
, and complexity theory. It is one of Karp's 21 NP-complete problems shown to be NP-complete in 1972. Given a
set Set, The Set, SET or SETS may refer to: Science, technology, and mathematics Mathematics *Set (mathematics), a collection of elements *Category of sets, the category whose objects and morphisms are sets and total functions, respectively Electro ...
of elements (called the
universe The universe is all of space and time and their contents, including planets, stars, galaxies, and all other forms of matter and energy. The Big Bang theory is the prevailing cosmological description of the development of the univers ...
) and a collection of sets whose union equals the universe, the set cover problem is to identify the smallest sub-collection of whose union equals the universe. For example, consider the universe and the collection of sets Clearly the union of is . However, we can cover all of the elements with the following, smaller number of sets: More formally, given a universe \mathcal and a family \mathcal of subsets of \mathcal, a ''cover'' is a subfamily \mathcal\subseteq\mathcal of sets whose union is \mathcal. In the set covering
decision problem In computability theory and computational complexity theory, a decision problem is a computational problem that can be posed as a yes–no question of the input values. An example of a decision problem is deciding by means of an algorithm whe ...
, the input is a pair (\mathcal,\mathcal) and an integer k; the question is whether there is a set covering of size k or less. In the set covering
optimization problem In mathematics, computer science and economics, an optimization problem is the problem of finding the ''best'' solution from all feasible solutions. Optimization problems can be divided into two categories, depending on whether the variables ...
, the input is a pair (\mathcal,\mathcal), and the task is to find a set covering that uses the fewest sets. The decision version of set covering is NP-complete, and the optimization/search version of set cover is
NP-hard In computational complexity theory, NP-hardness ( non-deterministic polynomial-time hardness) is the defining property of a class of problems that are informally "at least as hard as the hardest problems in NP". A simple example of an NP-hard pr ...
. It is a problem "whose study has led to the development of fundamental techniques for the entire field" of
approximation algorithms In computer science and operations research, approximation algorithms are efficient algorithms that find approximate solutions to optimization problems (in particular NP-hard problems) with provable guarantees on the distance of the returned s ...
. If each set is assigned a
cost In production, research, retail, and accounting, a cost is the value of money that has been used up to produce something or deliver a service, and hence is not available for use anymore. In business, the cost may be one of acquisition, in whic ...
, it becomes a ''weighted'' set cover problem.


Integer linear program formulation

The minimum set cover problem can be formulated as the following
integer linear program An integer programming problem is a mathematical optimization or feasibility program in which some or all of the variables are restricted to be integers. In many settings the term refers to integer linear programming (ILP), in which the objectiv ...
(ILP). This ILP belongs to the more general class of ILPs for
covering problem In combinatorics and computer science, covering problems are computational problems that ask whether a certain combinatorial structure 'covers' another, or how large the structure has to be to do that. Covering problems are minimization problems ...
s. The
integrality gap In mathematics, the relaxation of a (mixed) integer linear program is the problem that arises by removing the integrality constraint of each variable. For example, in a 0–1 integer program, all constraints are of the form :x_i\in\. The relax ...
of this ILP is at most \scriptstyle \log n, so its relaxation gives a factor-\scriptstyle \log n
approximation algorithm In computer science and operations research, approximation algorithms are efficient algorithms that find approximate solutions to optimization problems (in particular NP-hard problems) with provable guarantees on the distance of the returned sol ...
for the minimum set cover problem (where \scriptstyle n is the size of the universe). In weighted set cover, the sets are assigned weights. Denote the weight of set s\in \mathcal by w_. Then the integer linear program describing weighted set cover is identical to the one given above, except that the objective function to minimize is \sum_ w_s x_s.


Hitting set formulation

Set covering is equivalent to the hitting set problem. That is seen by observing that an instance of set covering can be viewed as an arbitrary
bipartite graph In the mathematical field of graph theory, a bipartite graph (or bigraph) is a graph whose vertices can be divided into two disjoint and independent sets U and V, that is every edge connects a vertex in U to one in V. Vertex sets U and V a ...
, with the universe represented by vertices on the left, the sets represented by vertices on the right, and edges representing the inclusion of elements in sets. The task is then to find a minimum cardinality subset of right-vertices which covers all of the left-vertices, which is precisely the Hitting set problem.


Greedy algorithm

There is a greedy algorithm for polynomial time approximation of set covering that chooses sets according to one rule: at each stage, choose the set that contains the largest number of uncovered elements. This method can be implemented in time linear in the sum of sizes of the input sets, using a bucket queue to prioritize the sets. It achieves an approximation ratio of H(s), where s is the size of the set to be covered. In other words, it finds a covering that may be H(n) times as large as the minimum one, where H(n) is the n-th
harmonic number In mathematics, the -th harmonic number is the sum of the reciprocals of the first natural numbers: H_n= 1+\frac+\frac+\cdots+\frac =\sum_^n \frac. Starting from , the sequence of harmonic numbers begins: 1, \frac, \frac, \frac, \frac, \do ...
: H(n) = \sum_^ \frac \le \ln +1 This greedy algorithm actually achieves an approximation ratio of H(s^\prime) where s^\prime is the maximum cardinality set of S. For \delta-dense instances, however, there exists a c \ln-approximation algorithm for every c > 0. There is a standard example on which the greedy algorithm achieves an approximation ratio of \log_2(n)/2. The universe consists of n=2^-2 elements. The set system consists of k pairwise disjoint sets S_1,\ldots,S_k with sizes 2,4,8,\ldots,2^k respectively, as well as two additional disjoint sets T_0,T_1, each of which contains half of the elements from each S_i. On this input, the greedy algorithm takes the sets S_k,\ldots,S_1, in that order, while the optimal solution consists only of T_0 and T_1. An example of such an input for k=3 is pictured on the right. Inapproximability results show that the greedy algorithm is essentially the best-possible polynomial time approximation algorithm for set cover up to lower order terms (see Inapproximability results below), under plausible complexity assumptions. A tighter analysis for the greedy algorithm shows that the approximation ratio is exactly \ln - \ln + \Theta(1).


Low-frequency systems

If each element occurs in at most sets, then a solution can be found in polynomial time that approximates the optimum to within a factor of using LP relaxation. If the constraint x_S\in\ is replaced by x_S \geq 0 for all in \mathcal in the integer linear program shown above, then it becomes a (non-integer) linear program . The algorithm can be described as follows: # Find an optimal solution for the program using some polynomial-time method of solving linear programs. # Pick all sets for which the corresponding variable has value at least 1/ in the solution .


Inapproximability results

When n refers to the size of the universe, showed that set covering cannot be approximated in polynomial time to within a factor of \tfrac\log_2 \approx 0.72\ln, unless NP has
quasi-polynomial time In computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by ...
algorithms. Feige (1998) improved this lower bound to \bigl(1-o(1)\bigr)\cdot\ln under the same assumptions, which essentially matches the approximation ratio achieved by the greedy algorithm. established a lower bound of c\cdot\ln, where c is a certain constant, under the weaker assumption that P\not=NP. A similar result with a higher value of c was recently proved by . showed optimal inapproximability by proving that it cannot be approximated to \bigl(1 - o(1)\bigr) \cdot \ln unless P=NP.


Weighted set cover

Relaxing Leisure has often been defined as a quality of experience or as free time. Free time is time spent away from business, work, job hunting, domestic chores, and education, as well as necessary activities such as eating and sleeping. Leisure ...
the integer linear program for weighted set cover stated above, one may use
randomized rounding Within computer science and operations research, many combinatorial optimization problems are computationally intractable to solve exactly (to optimality). Many such problems do admit fast ( polynomial time) approximation algorithms—that is, a ...
to get an O(\log n)-factor approximation. Non weighted set cover can be adapted to the weighted case.


Related problems

* Hitting set is an equivalent reformulation of Set Cover. * Vertex cover is a special case of Hitting Set. *
Edge cover In graph theory, an edge cover of a graph is a set of edges such that every vertex of the graph is incident to at least one edge of the set. In computer science, the minimum edge cover problem is the problem of finding an edge cover of minimum siz ...
is a special case of Set Cover. * Geometric set cover is a special case of Set Cover when the universe is a set of points in \mathbb^d and the sets are induced by the intersection of the universe and geometric shapes (e.g., disks, rectangles). *
Set packing Set packing is a classical NP-complete problem in computational complexity theory and combinatorics, and was one of Karp's 21 NP-complete problems. Suppose one has a finite set ''S'' and a list of subsets of ''S''. Then, the set packing problem asks ...
*
Maximum coverage problem The maximum coverage problem is a classical question in computer science, computational complexity theory, and operations research. It is a problem that is widely taught in approximation algorithms. As input you are given several sets and a numbe ...
is to choose at most k sets to cover as many elements as possible. * Dominating set is the problem of selecting a set of vertices (the dominating set) in a graph such that all other vertices are adjacent to at least one vertex in the dominating set. The Dominating set problem was shown to be NP complete through a reduction from Set cover. * Exact cover problem is to choose a set cover with no element included in more than one covering set. * Red Blue Set Cover. * Set-cover abduction.


Notes


References

* . * * . * * . * . * . * * *


External links


Benchmarks with Hidden Optimum Solutions for Set Covering, Set Packing and Winner Determination


{{DEFAULTSORT:Set Cover Problem Families of sets NP-complete problems Linear programming Approximation algorithms Covering problems