
The knapsack problem is a problem in
combinatorial optimization
Combinatorial optimization is a subfield of mathematical optimization that consists of finding an optimal object from a finite set of objects, where the set of feasible solutions is discrete or can be reduced to a discrete set. Typical combi ...
: Given a set of items, each with a weight and a value, determine the number of each item to include in a collection so that the total weight is less than or equal to a given limit and the total value is as large as possible. It derives its name from the problem faced by someone who is constrained by a fixed-size
knapsack
A backpack—also called knapsack, schoolbag, rucksack, rucksac, pack, sackpack, booksack, bookbag or backsack—is, in its simplest frameless form, a fabric sack carried on one's back and secured with two straps that go over the shoulders ...
and must fill it with the most valuable items. The problem often arises in
resource allocation
In economics, resource allocation is the assignment of available resources to various uses. In the context of an entire economy, resources can be allocated by various means, such as markets, or planning.
In project management, resource allocati ...
where the decision-makers have to choose from a set of non-divisible projects or tasks under a fixed budget or time constraint, respectively.
The knapsack problem has been studied for more than a century, with early works dating as far back as 1897. The name "knapsack problem" dates back to the early works of the mathematician
Tobias Dantzig Tobias Dantzig (; February 19, 1884 – August 9, 1956) was an American mathematician, the father of George Dantzig, and the author of '' Number: The Language of Science (A critical survey written for the cultured non-mathematician)'' (1930) and ''A ...
(1884–1956), and refers to the commonplace problem of packing the most valuable or useful items without overloading the luggage.
Applications
Knapsack problems appear in real-world decision-making processes in a wide variety of fields, such as finding the least wasteful way to cut raw materials, selection of
investments and
portfolios, selection of assets for
asset-backed securitization, and generating keys for the
Merkle–Hellman and other
knapsack cryptosystems Knapsack cryptosystems are cryptosystems whose security is based on the hardness of solving the knapsack problem. They remain quite unpopular because simple versions of these algorithms have been broken for several decades. However, that type of cr ...
.
One early application of knapsack algorithms was in the construction and scoring of tests in which the test-takers have a choice as to which questions they answer. For small examples, it is a fairly simple process to provide the test-takers with such a choice. For example, if an exam contains 12 questions each worth 10 points, the test-taker need only answer 10 questions to achieve a maximum possible score of 100 points. However, on tests with a heterogeneous distribution of point values, it is more difficult to provide choices. Feuerman and Weiss proposed a system in which students are given a heterogeneous test with a total of 125 possible points. The students are asked to answer all of the questions to the best of their abilities. Of the possible subsets of problems whose total point values add up to 100, a knapsack algorithm would determine which subset gives each student the highest possible score.
A 1999 study of the Stony Brook University Algorithm Repository showed that, out of 75 algorithmic problems, the knapsack problem was the 19th most popular and the third most needed after
suffix tree
In computer science, a suffix tree (also called PAT tree or, in an earlier form, position tree) is a compressed trie containing all the suffixes of the given text as their keys and positions in the text as their values. Suffix trees allow parti ...
s and the
bin packing problem
The bin packing problem is an optimization problem, in which items of different sizes must be packed into a finite number of bins or containers, each of a fixed given capacity, in a way that minimizes the number of bins used. The problem has ma ...
.
Definition
The most common problem being solved is the 0-1 knapsack problem, which restricts the number ''
'' of copies of each kind of item to zero or one. Given a set of ''
'' items numbered from 1 up to ''
'', each with a weight ''
'' and a value ''
'', along with a maximum weight capacity ''
'',
: maximize
: subject to
and
.
Here ''
'' represents the number of instances of item ''
'' to include in the knapsack. Informally, the problem is to maximize the sum of the values of the items in the knapsack so that the sum of the weights is less than or equal to the knapsack's capacity.
The bounded knapsack problem (BKP) removes the restriction that there is only one of each item, but restricts the number
of copies of each kind of item to a maximum non-negative integer value
:
: maximize
: subject to
and
The unbounded knapsack problem (UKP) places no upper bound on the number of copies of each kind of item and can be formulated as above except that the only restriction on
is that it is a non-negative integer.
: maximize
: subject to
and
One example of the unbounded knapsack problem is given using the figure shown at the beginning of this article and the text "if any number of each box is available" in the caption of that figure.
Computational complexity
The knapsack problem is interesting from the perspective of computer science for many reasons:
* The
decision problem
In computability theory and computational complexity theory, a decision problem is a computational problem that can be posed as a yes–no question of the input values. An example of a decision problem is deciding by means of an algorithm whethe ...
form of the knapsack problem (''Can a value of at least ''V'' be achieved without exceeding the weight ''W''?'') is
NP-complete
In computational complexity theory, a problem is NP-complete when:
# it is a problem for which the correctness of each solution can be verified quickly (namely, in polynomial time) and a brute-force search algorithm can find a solution by tryin ...
, thus there is no known algorithm that is both correct and fast (polynomial-time) in all cases.
* While the decision problem is NP-complete, the optimization problem is not, its resolution is at least as difficult as the decision problem, and there is no known polynomial algorithm which can tell, given a solution, whether it is optimal (which would mean that there is no solution with a larger ''V'', thus solving the NP-complete decision problem).
* There is a
pseudo-polynomial time In computational complexity theory, a numeric algorithm runs in pseudo-polynomial time if its running time is a polynomial in the ''numeric value'' of the input (the largest integer present in the input)—but not necessarily in the ''length'' of t ...
algorithm using
dynamic programming
Dynamic programming is both a mathematical optimization method and a computer programming method. The method was developed by Richard Bellman in the 1950s and has found applications in numerous fields, from aerospace engineering to economics.
I ...
.
* There is a
fully polynomial-time approximation scheme A fully polynomial-time approximation scheme (FPTAS) is an algorithm for finding approximate solutions to function problems, especially optimization problems. An FPTAS takes as input an instance of the problem and a parameter ε > 0. It r ...
, which uses the pseudo-polynomial time algorithm as a subroutine, described below.
* Many cases that arise in practice, and "random instances" from some distributions, can nonetheless be solved exactly.
There is a link between the "decision" and "optimization" problems in that if there exists a polynomial algorithm that solves the "decision" problem, then one can find the maximum value for the optimization problem in polynomial time by applying this algorithm iteratively while increasing the value of k. On the other hand, if an algorithm finds the optimal value of the optimization problem in polynomial time, then the decision problem can be solved in polynomial time by comparing the value of the solution output by this algorithm with the value of k. Thus, both versions of the problem are of similar difficulty.
One theme in research literature is to identify what the "hard" instances of the knapsack problem look like,
[Pisinger, D. 2003]
Where are the hard knapsack problems?
Technical Report 2003/08, Department of Computer Science, University of Copenhagen, Copenhagen, Denmark. or viewed another way, to identify what properties of instances in practice might make them more amenable than their worst-case NP-complete behaviour suggests.
The goal in finding these "hard" instances is for their use in
public key cryptography
Public-key cryptography, or asymmetric cryptography, is the field of cryptographic systems that use pairs of related keys. Each key pair consists of a public key and a corresponding private key. Key pairs are generated with cryptographic alg ...
systems, such as the
Merkle-Hellman knapsack cryptosystem.
Furthermore, notable is the fact that the hardness of the knapsack problem depends on the form of the input. If the weights and profits are given as integers, it is
weakly NP-complete In computational complexity, an NP-complete (or NP-hard) problem is weakly NP-complete (or weakly NP-hard) if there is an algorithm for the problem whose running time is polynomial in the dimension of the problem and the magnitudes of the data in ...
, while it is
strongly NP-complete In computational complexity, strong NP-completeness is a property of computational problems that is a special case of NP-completeness. A general computational problem may have numerical parameters. For example, the input to the bin packing problem ...
if the weights and profits are given as rational numbers.
However, in the case of rational weights and profits it still admits a
fully polynomial-time approximation scheme A fully polynomial-time approximation scheme (FPTAS) is an algorithm for finding approximate solutions to function problems, especially optimization problems. An FPTAS takes as input an instance of the problem and a parameter ε > 0. It r ...
.
Solving
Several algorithms are available to solve knapsack problems, based on the dynamic programming approach,
the
branch and bound
Branch and bound (BB, B&B, or BnB) is an algorithm design paradigm for discrete and combinatorial optimization problems, as well as mathematical optimization. A branch-and-bound algorithm consists of a systematic enumeration of candidate solutio ...
approach
[S. Martello, P. Toth, Knapsack Problems: Algorithms and Computer Implementations,
John Wiley and Sons, 1990] or
hybridizations of both approaches.
[S. Martello, D. Pisinger, P. Toth]
Dynamic programming and strong bounds for the 0-1 knapsack problem
''Manag. Sci.'', 45:414–424, 1999.
Dynamic programming in-advance algorithm
The unbounded knapsack problem (UKP) places no restriction on the number of copies of each kind of item. Besides, here we assume that
:
: subject to
and
Observe that