The maximum coverage problem is a classical question in

computer science Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...

computational complexity theory In theoretical computer science and mathematics, computational complexity theory focuses on classifying computational problems according to their resource usage, and explores the relationships between these classifications. A computational problem ...

, and

operations research Operations research () (U.S. Air Force Specialty Code: Operations Analysis), often shortened to the initialism OR, is a branch of applied mathematics that deals with the development and application of analytical methods to improve management and ...

. It is a problem that is widely taught in

approximation algorithms In computer science and operations research, approximation algorithms are efficient algorithms that find approximate solutions to optimization problems (in particular NP-hard problems) with provable guarantees on the distance of the returned sol ...

. As input you are given several sets and a number

k

. The sets may have some elements in common. You must select at most

k

of these sets such that the maximum number of elements are covered, i.e. the union of the selected sets has maximal size. Formally, (unweighted) Maximum Coverage : Instance: A number

k

and a collection of sets

S = \

. : Objective: Find a subset

S' \subseteq S

of sets, such that

\left,  S' \ \leq k

and the number of covered elements

\left,  \bigcup_ \

is maximized. The maximum coverage problem is

NP-hard In computational complexity theory, a computational problem ''H'' is called NP-hard if, for every problem ''L'' which can be solved in non-deterministic polynomial-time, there is a polynomial-time reduction from ''L'' to ''H''. That is, assumi ...

, and cannot be approximated to within

1 - \frac + o(1) \approx 0.632

under standard assumptions. This result essentially matches the approximation ratio achieved by the generic greedy algorithm used for maximization of submodular functions with a cardinality constraint. G. L. Nemhauser, L. A. Wolsey and M. L. Fisher. An analysis of approximations for maximizing submodular set functions I, Mathematical Programming 14 (1978), 265–294

ILP formulation

The maximum coverage problem can be formulated as the following integer linear program.

Greedy algorithm

The

greedy algorithm A greedy algorithm is any algorithm that follows the problem-solving heuristic of making the locally optimal choice at each stage. In many problems, a greedy strategy does not produce an optimal solution, but a greedy heuristic can yield locally ...

for maximum coverage chooses sets according to one rule: at each stage, choose a set which contains the largest number of uncovered elements. It can be shown that this algorithm achieves an approximation ratio of

1 - \frac

. ln-approximability results show that the greedy algorithm is essentially the best-possible polynomial time approximation algorithm for maximum coverage unless

P = NP

Known extensions

The inapproximability results apply to all extensions of the maximum coverage problem since they hold the maximum coverage problem as a special case. The Maximum Coverage Problem can be applied to road traffic situations; one such example is selecting which bus routes in a public transportation network should be installed with pothole detectors to maximise coverage, when only a limited number of sensors is available. This problem is a known extension of the Maximum Coverage Problem and was first explored in literature by Junade Ali and Vladimir Dyo.

Weighted version

In the weighted version every element

e_j

has a weight

w(e_j)

. The task is to find a maximum coverage which has maximum weight. The basic version is a special case when all weights are

1

. :maximize

\sum_ w(e_j) \cdot y_j

. (maximizing the weighted sum of covered elements). :subject to

\sum  \leq k

; (no more than

k

sets are selected). ::

\sum_ x_i \geq y_j

; (if

y_j > 0

then at least one set

e_j \in S_i

is selected). ::

y_j \in \

; (if

y_j=1

then

e_j

is covered) ::

x_i \in \

(if

x_i=1

then

S_i

is selected for the cover). The greedy algorithm for the weighted maximum coverage at each stage chooses a set that contains the maximum weight of uncovered elements. This algorithm achieves an approximation ratio of

1 - \frac

Budgeted maximum coverage

In the budgeted maximum coverage version, not only does every element

e_j

have a weight

w(e_j)

, but also every set

S_i

has a cost

c(S_i)

. Instead of

k

that limits the number of sets in the cover a budget

B

is given. This budget

B

limits the total cost of the cover that can be chosen. :maximize

\sum_ w(e_j) \cdot y_j

. (maximizing the weighted sum of covered elements). :subject to

\sum  \leq B

; (the cost of the selected sets cannot exceed

B

). ::

\sum_ x_i \geq y_j

; (if

y_j > 0

then at least one set

e_j \in S_i

is selected). ::

y_j \in \

; (if

y_j=1

then

e_j

is covered) ::

x_i \in \

(if

x_i=1

then

S_i

is selected for the cover). A greedy algorithm will no longer produce solutions with a performance guarantee. Namely, the worst case behavior of this algorithm might be very far from the optimal solution. The approximation algorithm is extended by the following way. First, define a modified greedy algorithm, that selects the set

S_i

that has the best ratio of weighted uncovered elements to cost. Second, among covers of cardinality

1, 2, ..., k-1

, find the best cover that does not violate the budget. Call this cover

H_1

. Third, find all covers of cardinality

k

that do not violate the budget. Using these covers of cardinality

k

as starting points, apply the modified greedy algorithm, maintaining the best cover found so far. Call this cover

H_2

. At the end of the process, the approximate best cover will be either

H_1

H_2

. This algorithm achieves an approximation ratio of

1-

for values of

k \geq 3

. This is the best possible approximation ratio unless

NP \subseteq DTIME(n^)

Generalized maximum coverage

In the generalized maximum coverage version every set

S_i

has a cost

c(S_i)

, element

e_j

has a different weight and cost depending on which set covers it. Namely, if

e_j

is covered by set

S_i

the weight of

e_j

w_i(e_j)

and its cost is

c_i(e_j)

. A budget

B

is given for the total cost of the solution. :maximize

\sum_ w_i(e_j) \cdot y_

. (maximizing the weighted sum of covered elements in the sets in which they are covered). :subject to

\sum + \sum  \leq B

; (the cost of the selected sets cannot exceed

B

). ::

\sum_ y_ \leq 1

; (element

e_j=1

can only be covered by at most one set). ::

\sum_ x_i \geq y_

; (if

y_j > 0

then at least one set

e_j \in S_i

is selected). ::

y_ \in \

; (if

y_=1

then

e_j

is covered by set

S_i

) ::

x_i \in \

(if

x_i=1

then

S_i

is selected for the cover).

Generalized maximum coverage algorithm

The algorithm uses the concept of residual cost/weight. The residual cost/weight is measured against a tentative solution and it is the difference of the cost/weight from the cost/weight gained by a tentative solution. The algorithm has several stages. First, find a solution using greedy algorithm. In each iteration of the greedy algorithm the tentative solution is added the set which contains the maximum residual weight of elements divided by the residual cost of these elements along with the residual cost of the set. Second, compare the solution gained by the first step to the best solution which uses a small number of sets. Third, return the best out of all examined solutions. This algorithm achieves an approximation ratio of

1-1/e - o(1)

Notes

References

* {{Cite book , last=Vazirani , first=Vijay V. , author-link=Vijay Vazirani , title=Approximation Algorithms , year=2001 , publisher=Springer-Verlag , isbn=978-3-540-65367-7 Families of sets NP-complete problems

ILP formulation

Greedy algorithm

Known extensions

Weighted version

Budgeted maximum coverage

Generalized maximum coverage

Generalized maximum coverage algorithm

Related problems

Notes

References