The vertex ''k''-center problem is a classical
NP-hard
In computational complexity theory, NP-hardness ( non-deterministic polynomial-time hardness) is the defining property of a class of problems that are informally "at least as hard as the hardest problems in NP". A simple example of an NP-hard pr ...
problem in
computer science
Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to practical disciplines (includin ...
. It has application in
facility location Facility location is a name given to several different problems in computer science and in game theory:
* Facility location problem
The study of facility location problems (FLP), also known as location analysis, is a branch of operations research ...
and
clustering. Basically, the vertex ''k''-center problem models the following real problem: given a city with
facilities, find the best
facilities where to build fire stations. Since firemen must attend any emergency as quickly as possible, the distance from the farthest facility to its nearest fire station has to be as small as possible. In other words, the position of the fire stations must be such that every possible fire is attended as quickly as possible.
Formal definition
The vertex ''k''-center problem is a classical
NP-Hard
In computational complexity theory, NP-hardness ( non-deterministic polynomial-time hardness) is the defining property of a class of problems that are informally "at least as hard as the hardest problems in NP". A simple example of an NP-hard pr ...
problem in
computer science
Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to practical disciplines (includin ...
. It was first proposed by Hakimi in 1964. Formally, the vertex ''k''-center problem consists in: given a complete undirected
graph
Graph may refer to:
Mathematics
*Graph (discrete mathematics), a structure made of vertices and edges
**Graph theory, the study of such graphs and their properties
*Graph (topology), a topological space resembling a graph in the sense of discre ...
in a
metric space
In mathematics, a metric space is a set together with a notion of '' distance'' between its elements, usually called points. The distance is measured by a function called a metric or distance function. Metric spaces are the most general sett ...
, and a positive integer
, find a subset
such that
and the objective function
is minimized. The distance
is defined as the distance from the vertex
to its nearest center in
.
Approximation algorithms
If
, the vertex ''k''-center problem can not be (optimally) solved in polynomial time. However, there are some polynomial time
approximation algorithm
In computer science and operations research, approximation algorithms are efficient algorithms that find approximate solutions to optimization problems (in particular NP-hard problems) with provable guarantees on the distance of the returned solu ...
s that get near-optimal solutions. Specifically,
2-approximated solutions. Actually, if
the best possible solution that can be achieved by a polynomial time algorithm is a 2-approximated one.
In the context of a minimization problem, such as the vertex ''k''-center problem, a 2-approximated solution is any solution
such that
, where
is the size of an optimal solution. An algorithm that guarantees to generate 2-approximated solutions is known as a 2-approximation algorithm. The main 2-approximated algorithms for the vertex ''k''-center problem reported in the literature are the Sh algorithm,
the HS algorithm,
and the Gon algorithm.
Even though these algorithms are the (polynomial) best possible ones, their performance on most benchmark datasets is very deficient. Because of this, many
heuristics
A heuristic (; ), or heuristic technique, is any approach to problem solving or self-discovery that employs a practical method that is not guaranteed to be optimal, perfect, or rational, but is nevertheless sufficient for reaching an immediate, ...
and
metaheuristic
In computer science and mathematical optimization, a metaheuristic is a higher-level procedure or heuristic designed to find, generate, or select a heuristic (partial search algorithm) that may provide a sufficiently good solution to an optimizatio ...
s have been developed through the time. Contrary to common sense, one of the most practical (polynomial) heuristics for the vertex ''k''-center problem is based on the CDS algorithm, which is a 3-approximation algorithm
The Sh algorithm
Formally characterized by
David Shmoys
David Bernard Shmoys (born 1959) is a Professor in the School of Operations Research and Information Engineering and the Department of Computer Science at Cornell University. He obtained his Ph.D. from the University of California, Berkeley in 1 ...
in 1995,
the Sh algorithm takes as input a complete undirected graph
, a positive integer
, and an assumption
on what the optimal solution size is. The Sh algorithm works as follows: selects the first center
at random. So far, the solution consists of only one vertex,
. Next, selects center
at random from the set containing all the vertices whose distance from
is greater than
. At this point,
. Finally, selects the remaining
centers the same way
was selected. The complexity of the Sh algorithm is
, where
is the number of vertices.
The HS algorithm
Proposed by
Dorit Hochbaum and
David Shmoys
David Bernard Shmoys (born 1959) is a Professor in the School of Operations Research and Information Engineering and the Department of Computer Science at Cornell University. He obtained his Ph.D. from the University of California, Berkeley in 1 ...
in 1985, the HS algorithm takes the Sh algorithm as basis.
By noticing that the value of
must equals the cost of some edge in
, and since there are
edges in
, the HS algorithm basically repeats the Sh algorithm with every edge cost. The complexity of the HS algorithm is
. However, by running a
binary search
In computer science, binary search, also known as half-interval search, logarithmic search, or binary chop, is a search algorithm that finds the position of a target value within a sorted array. Binary search compares the target value to the m ...
over the ordered set of edge costs, its complexity is reduced to
.
The Gon algorithm
Proposed independently by
Teofilo Gonzalez,
and by Martin Dyer and
Alan Frieze in 1985, the Gon algorithm is basically a more powerful version of the Sh algorithm. While the Sh algorithm requires a guess
on
, the Gon algorithm prescinds from such guess by noticing that if any set of vertices at distance greater than
exists, then the farthest vertex must be inside such set. Therefore, instead of computing at each iteration the set of vertices at distance greater than
and then selecting a random vertex, the Gon algorithm simply selects the farthest vertex from every partial solution
. The complexity of the Gon algorithm is
, where
is the number of vertices.
The CDS algorithm
Proposed by García Díaz et al. in 2017,
the CDS algorithm is a 3-approximation algorithm that takes ideas from the Gon algorithm (farthest point heuristic), the HS algorithm (parametric pruning), and the relationship between the vertex ''k''-center problem and the
Dominating Set
In graph theory, a dominating set for a graph is a subset of its vertices, such that any vertex of is either in , or has a neighbor in . The domination number is the number of vertices in a smallest dominating set for .
The dominating set ...
problem. The CDS algorithm has a complexity of
. However, by performing a binary search over the ordered set of edge costs, a more efficiente heuristic named CDSh is proposed. The CDSh algorithm complexity is
. Despite the suboptimal performance of the CDS algorithm, and the heuristic performance of CDSh, both present a much better performance than the Sh, HS, and Gon algorithms.
Parameterized approximations
It can be shown that the ''k''-Center problem is
W hard">hard to approximate within a factor of 2 − ε for any ε > 0, when using ''k'' as the parameter. This is also true when parameterizing by the
doubling dimension (in fact the dimension of a
Manhattan metric
A taxicab geometry or a Manhattan geometry is a geometry whose usual distance function or metric of Euclidean geometry is replaced by a new metric in which the distance between two points is the sum of the absolute differences of their Cartesian co ...
), unless
P=NP
The P versus NP problem is a major unsolved problem in theoretical computer science. In informal terms, it asks whether every problem whose solution can be quickly verified can also be quickly solved.
The informal term ''quickly'', used abov ...
. When considering the combined parameter given by ''k'' and the
doubling dimension, ''k''-Center is still W
hard but it is possible to obtain a
parameterized approximation scheme. This is even possible for the variant with vertex capacities, which bound how many vertices can be assigned to an opened center of the solution.
Experimental comparison
Some of the most widely used benchmark datasets for the vertex ''k''-center problem are the pmed instances from OR-Lib., and some instances from TSP-Lib. Table 1 shows the mean and standard deviation of the experimental approximation factors of the solutions generated by each algorithm over the 40 pmed instances from OR-Lib
Polynomial heuristics
Greedy pure algorithm
The greedy pure algorithm (or Gr) follows the core idea of
greedy algorithm
A greedy algorithm is any algorithm that follows the problem-solving heuristic of making the locally optimal choice at each stage. In many problems, a greedy strategy does not produce an optimal solution, but a greedy heuristic can yield locall ...
s: to take optimal local decisions. In the case of the vertex ''k''-center problem, the optimal local decision consists in selecting each center in such a way that the size of the solution (covering radius) is minimum at each iteration. In other words, the first center selected is the one that solves the 1-center problem. The second center selected is the one that, along with the previous center, generates a solution with minimum covering radius. The remaining centers are selected the same way. The complexity of the Gr algorithm is
. The empirical performance of the Gr algorithm is poor on most benchmark instances.
Scoring algorithm
The Scoring algorithm (or Scr) was introduced by Jurij Mihelič and Borut Robič in 2005.
This algorithm takes advantage of the reduction from the vertex ''k''-center problem to the minimum dominating set problem. The problem is solved by pruning the input graph with every possible value of the optimal solution size and then solving the minimum dominating set problem heuristically. This heuristic follows the ''lazy principle,'' which takes every decision as slow as possible (opossed to the greedy strategy). The complexity of the Scr algorithm is
. The empirical performance of the Scr algorithm is very good on most benchmark instances. However, its running time rapidly becomes impractical as the input grows. So, it seems to be a good algorithm only for small instances.
References
{{reflist
NP-hard problems
Graph theory
Combinatorial optimization
Approximation algorithms