bucket queue
   HOME

TheInfoList



OR:

A bucket queue is a
data structure In computer science, a data structure is a data organization, management, and storage format that is usually chosen for efficient access to data. More precisely, a data structure is a collection of data values, the relationships among them, ...
that implements the
priority queue In computer science, a priority queue is an abstract data-type similar to a regular queue or stack data structure in which each element additionally has a ''priority'' associated with it. In a priority queue, an element with high priority is se ...
abstract data type In computer science, an abstract data type (ADT) is a mathematical model for data types. An abstract data type is defined by its behavior (semantics) from the point of view of a ''user'', of the data, specifically in terms of possible values, pos ...
: it maintains a dynamic collection of elements with numerical priorities and allows quick access to the element with minimum (or maximum) priority. In the bucket queue, the priorities must be
integer An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign ( −1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the languag ...
s, and it is particularly suited to applications in which the priorities have a small range. A bucket queue has the form of an array of
buckets A bucket is typically a watertight, vertical cylinder or truncated cone or square, with an open top and a flat bottom, attached to a semicircular carrying handle called the ''bail''. A bucket is usually an open-top container. In contrast, a ...
: an
array data structure In computer science, an array is a data structure consisting of a collection of ''elements'' (values or variables), each identified by at least one ''array index'' or ''key''. An array is stored such that the position of each element can be ...
, indexed by the priorities, whose cells contain collections of items with the same priority as each other. With this data structure, insertion of elements and changes of their priority take constant time. Searching for and removing the minimum-priority element takes time proportional to the number of buckets or, by maintaining a pointer to the most recently found bucket, in time proportional to the difference in priorities between successive operations. The bucket queue is the priority-queue analogue of
pigeonhole sort __NOTOC__ Pigeonhole sorting is a sorting algorithm that is suitable for sorting lists of elements where the number ''n'' of elements and the length ''N'' of the range of possible key values are approximately the same. It requires O(''n'' + ''N'' ...
(also called bucket sort), a sorting algorithm that places elements into buckets indexed by their priorities and then concatenates the buckets. Using a bucket queue as the priority queue in a
selection sort In computer science, selection sort is an in-place comparison sorting algorithm. It has an O(''n''2) time complexity, which makes it inefficient on large lists, and generally performs worse than the similar insertion sort. Selection sort is no ...
gives a form of the pigeonhole sort algorithm. Bucket queues are also called bucket priority queues or bounded-height priority queues. When used for quantized approximations to
real number In mathematics, a real number is a number that can be used to measure a ''continuous'' one-dimensional quantity such as a distance, duration or temperature. Here, ''continuous'' means that values can have arbitrarily small variations. Every ...
priorities, they are also called untidy priority queues or pseudo priority queues. They are closely related to the calendar queue, a structure that uses a similar array of buckets for exact prioritization by real numbers. Applications of the bucket queue include computation of the degeneracy of a graph, fast
algorithm In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing ...
s for
shortest path In graph theory, the shortest path problem is the problem of finding a path between two vertices (or nodes) in a graph such that the sum of the weights of its constituent edges is minimized. The problem of finding the shortest path between ...
s and widest paths for graphs with weights that are small integers or are already sorted, and greedy
approximation algorithm In computer science and operations research, approximation algorithms are efficient algorithms that find approximate solutions to optimization problems (in particular NP-hard problems) with provable guarantees on the distance of the returned sol ...
s for the
set cover problem The set cover problem is a classical question in combinatorics, computer science, operations research, and complexity theory. It is one of Karp's 21 NP-complete problems shown to be NP-complete in 1972. Given a set of elements (called the un ...
. The quantized version of the structure has also been applied to scheduling and to marching cubes in
computer graphics Computer graphics deals with generating images with the aid of computers. Today, computer graphics is a core technology in digital photography, film, video games, cell phone and computer displays, and many specialized applications. A great de ...
. The first use of the bucket queue. See also p. 157 for the history and naming of this structure. was in a shortest path algorithm by ..


Operation


Basic data structure

A bucket queue can handle elements with integer priorities in the range from 0 or 1 up to some known bound , and operations that insert elements, change the priority of elements, or extract (find and remove) the element that has the minimum (or maximum) priority. It consists of an array of container data structures; in most sources these containers are
doubly linked list In computer science, a doubly linked list is a linked data structure that consists of a set of sequentially linked records called nodes. Each node contains three fields: two link fields (references to the previous and to the next node in the s ...
s but they could alternatively be
dynamic array In computer science, a dynamic array, growable array, resizable array, dynamic table, mutable array, or array list is a random access, variable-size list data structure that allows elements to be added or removed. It is supplied with standard li ...
s or dynamic sets. The container in the th array cell stores the collection of elements whose priority A bucket queue can handle the following operations: * To insert an element with priority , add to the container at . * To change the priority of an element, remove it from the container for its old priority and re-insert it into the container for its new priority. * To extract an element with the minimum or maximum priority, perform a
sequential search In computer science, a linear search or sequential search is a method for finding an element within a list. It sequentially checks each element of the list until a match is found or the whole list has been searched. A linear search runs in at ...
in the array to find the first or last non-empty container, respectively, choose an arbitrary element from this container, and remove it from the container. In this way, insertions and priority changes take constant time, and extracting the minimum or maximum priority element takes time ...


Optimizations

As an optimization, the data structure can start each sequential search for a non-empty bucket at the most recently-found non-empty bucket instead of at the start of the array. This can be done in either of two different ways, lazy (delaying these sequential searches until they are necessary) or eager (doing the searches ahead of time). The choice of when to do the search affects which of the data structure operations is slowed by these searches. Dial's original version of the structure used a lazy search. This can be done by maintaining an index that is a
lower bound In mathematics, particularly in order theory, an upper bound or majorant of a subset of some preordered set is an element of that is greater than or equal to every element of . Dually, a lower bound or minorant of is defined to be an elemen ...
on the minimum priority of any element currently in the queue. When inserting a new element, should be updated to the minimum of its old value and the new element's priority. When searching for the minimum priority element, the search can start at instead of at zero, and after the search should be left equal to the priority that was found in the search. Alternatively, the eager version of this optimization keeps updated so that it always points to the first non-empty bucket. When inserting a new element with a priority smaller than , the data structure sets to the new priority, and when removing the last element from a bucket with priority , it performs a sequential search through larger indexes until finding a non-empty bucket and setting to the priority of the resulting bucket. In either of these two variations, each sequential search takes time proportional to the difference between the old and new values of . This could be significantly faster than the time bound for the searches in the un-optimized version of the data structure. In many applications of priority queues such as
Dijkstra's algorithm Dijkstra's algorithm ( ) is an algorithm for finding the shortest paths between nodes in a graph, which may represent, for example, road networks. It was conceived by computer scientist Edsger W. Dijkstra in 1956 and published three years ...
, the minimum priorities form a
monotonic sequence In mathematics, a monotonic function (or monotone function) is a function between ordered sets that preserves or reverses the given order. This concept first arose in calculus, and was later generalized to the more abstract setting of order ...
, allowing a monotone priority queue to be used. In these applications, for both the lazy and eager variations of the optimized structure, the sequential searches for non-empty buckets cover disjoint ranges of buckets. Because each bucket is in at most one of these ranges, their numbers of steps add to at most . Therefore, in these applications, the total time for a sequence of operations is , rather than the slower time bound that would result without this optimization. A corresponding optimization can be applied in applications where a bucket queue is used to find elements of maximum priority, but in this case it should maintain an index that upper-bounds the maximum priority, and the sequential search for a non-empty bucket should proceed downwards from this upper bound. Another optimization (already given by ) can be used to save space when the priorities are monotonic and, throughout the course of an algorithm, always fall within a range of values rather than extending over the whole range from 0 to . In this case, one can index the array by the priorities modulo rather than by their actual values. The search for the minimum priority element should always begin at the previous minimum, to avoid priorities that are higher than the minimum but have lower moduli. In particular, this idea can be applied in Dijkstra's algorithm on graphs whose edge lengths are integers in the range from 1 to . Because creating a new bucket queue involves initializing an array of empty buckets, this initialization step takes time proportional to the number of priorities. A variation of the bucket queue described by Donald B. Johnson in 1981 instead stores only the non-empty buckets in a linked list, sorted by their priorities, and uses an auxiliary search tree to quickly find the position in this linked list for any new buckets. It takes time to initialize this variant structure, constant time to find an element with minimum or maximum priority, and time to insert or delete an element, where is the difference between the nearest smaller and larger priorities to the priority of the inserted or deleted element.


Example

For example, consider a bucket queue with four priorities, the numbers 0, 1, 2, and 3. It consists of an array A whose four cells each contain a collection of elements, initially empty. For the purposes of this example, A can be written as a bracketed sequence of four sets: A = emptyset, \emptyset, \emptyset, \emptyset/math>. Consider a sequence of operations in which we insert two elements x and y with the same priority 1, insert a third element z with priority 3, change the priority of x to 3, and then perform two extractions of the minimum-priority element. *After inserting x with priority 1, A = emptyset, \, \emptyset, \emptyset/math>. *After inserting y with priority 1, A = emptyset, \, \emptyset, \emptyset/math>. *After inserting z with priority 3, A = emptyset, \, \emptyset, \/math>. *Changing the priority of x from 1 to three involves removing it from A /math> and adding it to A /math>, after which A = emptyset, \, \emptyset, \/math>. *Extracting the minimum-priority element, in the basic version of the bucket queue, searches from the start of A to find its first non-empty element: A /math> is empty but A = \, a non-empty set. It chooses an arbitrary element of this set (the only element, y) as the minimum-priority element. Removing y from the structure leaves A = emptyset, \emptyset, \emptyset, \/math>. *The second extract operation, in the basic version of the bucket queue, searches again from the start of the array: A = \emptyset, A = \emptyset, A = \emptyset, A = non-empty. In the improved variants of the bucket queue, this search starts instead at the last position that was found to be non-empty, A /math>. In either case, A = \ is found to be the first non-empty set. One of its elements is chosen arbitrarily as the minimum-priority element; for example, z might be chosen. This element is removed, leaving A = emptyset, \emptyset, \emptyset, \/math>.


Applications


Graph degeneracy

A bucket queue can be used to maintain the vertices of an
undirected graph In discrete mathematics, and more specifically in graph theory, a graph is a structure amounting to a set of objects in which some pairs of the objects are in some sense "related". The objects correspond to mathematical abstractions called '' ve ...
, prioritized by their degrees, and repeatedly find and remove the vertex of minimum degree. This
greedy algorithm A greedy algorithm is any algorithm that follows the problem-solving heuristic of making the locally optimal choice at each stage. In many problems, a greedy strategy does not produce an optimal solution, but a greedy heuristic can yield locally ...
can be used to calculate the degeneracy of a given graph, equal to the largest degree of any vertex at the time of its removal. The algorithm takes
linear time In computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by ...
, with or without the optimization that maintains a lower bound on the minimum priority, because each vertex is found in time proportional to its degree and the sum of all vertex degrees is linear in the number of edges of the graph.


Dial's algorithm for shortest paths

In Dijkstra's algorithm for
shortest path In graph theory, the shortest path problem is the problem of finding a path between two vertices (or nodes) in a graph such that the sum of the weights of its constituent edges is minimized. The problem of finding the shortest path between ...
s in
directed graph In mathematics, and more specifically in graph theory, a directed graph (or digraph) is a graph that is made up of a set of vertices connected by directed edges, often called arcs. Definition In formal terms, a directed graph is an ordered pa ...
s with edge weights that are positive integers, the priorities are monotone,. and a monotone bucket queue can be used to obtain a time bound of , where is the number of edges, is the diameter of the network, and is the maximum (integer) link cost.; see in particular Section 8.3.3.6, "Dial's implementation", pp. 194–195. This variant of Dijkstra's algorithm is also known as Dial's algorithm, after Robert B. Dial, who published it in 1969. The same idea also works, using a quantized bucket queue, for graphs with positive real edge weights when the ratio of the maximum to minimum weight is at most . In this quantized version of the algorithm, the vertices are processed out of order, compared to the result with a non-quantized priority queue, but the correct shortest paths are still found. In these algorithms, the priorities will only span a range of width , so the modular optimization can be used to reduce the space to . A variant of the same algorithm can be used for the
widest path problem In graph algorithms, the widest path problem is the problem of finding a path between two designated vertices in a weighted graph, maximizing the weight of the minimum-weight edge in the path. The widest path problem is also known as the maximum ...
. In combination with methods for quickly partitioning non-integer edge weights into subsets that can be assigned integer priorities, it leads to near-linear-time solutions to the single-source single-destination version of the widest path problem.


Greedy set cover

The
set cover problem The set cover problem is a classical question in combinatorics, computer science, operations research, and complexity theory. It is one of Karp's 21 NP-complete problems shown to be NP-complete in 1972. Given a set of elements (called the un ...
has as its input a
family of sets In set theory and related branches of mathematics, a collection F of subsets of a given set S is called a family of subsets of S, or a family of sets over S. More generally, a collection of any sets whatsoever is called a family of sets, set fami ...
. The output should be a subfamily of these sets, with the same union as the original family, including as few sets as possible. It is
NP-hard In computational complexity theory, NP-hardness ( non-deterministic polynomial-time hardness) is the defining property of a class of problems that are informally "at least as hard as the hardest problems in NP". A simple example of an NP-hard pr ...
, but has a greedy approximation algorithm that achieves a logarithmic approximation ratio, essentially the best possible unless
P = NP The P versus NP problem is a major unsolved problem in theoretical computer science. In informal terms, it asks whether every problem whose solution can be quickly verified can also be quickly solved. The informal term ''quickly'', used above ...
. This approximation algorithm selects its subfamily by repeatedly choosing a set that covers the maximum possible number of remaining uncovered elements. A standard exercise in algorithm design asks for an implementation of this algorithm that takes linear time in the input size, which is the sum of sizes of all the input sets. This may be solved using a bucket queue of sets in the input family, prioritized by the number of remaining elements that they cover. Each time that the greedy algorithm chooses a set as part of its output, the newly covered set elements should be subtracted from the priorities of the other sets that cover them; over the course of the algorithm the number of these changes of priorities is just the sum of sizes of the input sets. The priorities are monotonically decreasing integers, upper-bounded by the number of elements to be covered. Each choice of the greedy algorithm involves finding the set with the maximum priority, which can be done by scanning downwards through the buckets of the bucket queue, starting from the most recent previous maximum value. The total time is linear in the input size.. See in particular Section 2.4, "Priority Queue", p. 22.


Scheduling

Bucket queues can be used to schedule tasks with deadlines, for instance in
packet forwarding Packet forwarding is the relaying of packets from one network segment to another by nodes in a computer network. The network layer in the OSI model is responsible for packet forwarding. Models The simplest forwarding modelunicastinginvolve ...
for internet data with
quality of service Quality of service (QoS) is the description or measurement of the overall performance of a service, such as a telephony or computer network, or a cloud computing service, particularly the performance seen by the users of the network. To quantitat ...
guarantees. For this application, the deadlines should be quantized into discrete intervals, and tasks whose deadlines fall into the same interval are considered to be of equivalent priority. A variation of the quantized bucket queue data structure, the calendar queue, has been applied to scheduling of
discrete-event simulation A discrete-event simulation (DES) models the operation of a system as a ( discrete) sequence of events in time. Each event occurs at a particular instant in time and marks a change of state in the system. Between consecutive events, no change in t ...
s, where the elements in the queue are future events prioritized by the time within the simulation that the events should happen. In this application, the ordering of events is critical, so the priorities cannot be approximated. Therefore, the calendar queue performs searches for the minimum-priority element in a different way than a bucket queue: in the bucket queue, any element of the first non-empty bucket may be returned, but instead the calendar queue searches all the elements in that bucket to determine which of them has the smallest non-quantized priority. To keep these searches fast, this variation attempts to keep the number of buckets proportional to the number of elements, by adjusting the scale of quantization and rebuilding the data structure when it gets out of balance. Calendar queues may be slower than bucket queues in the worst case (when many elements all land in the same smallest bucket) but are fast when elements are uniformly distributed among buckets causing the average bucket size to be constant.


Fast marching

In
applied mathematics Applied mathematics is the application of mathematical methods by different fields such as physics, engineering, medicine, biology, finance, business, computer science, and industry. Thus, applied mathematics is a combination of mathemati ...
and
numerical method In numerical analysis, a numerical method is a mathematical tool designed to solve numerical problems. The implementation of a numerical method with an appropriate convergence check in a programming language is called a numerical algorithm. Mathem ...
s for the solution of
differential equation In mathematics, a differential equation is an equation that relates one or more unknown functions and their derivatives. In applications, the functions generally represent physical quantities, the derivatives represent their rates of change, ...
s, untidy priority queues have been used to prioritize the steps of the
fast marching method The fast marching methodJ.A. Sethian. A Fast Marching Level Set Method for Monotonically Advancing Fronts, Proc. Natl. Acad. Sci., 93, 4, pp.1591--1595, 1996/ref> is a numerical method created by James Sethian for solving boundary value problems ...
for solving
boundary value problem In mathematics, in the field of differential equations, a boundary value problem is a differential equation together with a set of additional constraints, called the boundary conditions. A solution to a boundary value problem is a solution to t ...
s of the
Eikonal equation An eikonal equation (from Greek εἰκών, image) is a non-linear first-order partial differential equation that is encountered in problems of wave propagation. The classical eikonal equation in geometric optics is a differential equation o ...
, used to model
wave propagation Wave propagation is any of the ways in which waves travel. Single wave propagation can be calculated by 2nd order wave equation ( standing wavefield) or 1st order one-way wave equation. With respect to the direction of the oscillation relative ...
. This method finds the times at which a moving boundary crosses a set of discrete points (such as the points of an integer grid) using a prioritization method resembling a continuous version of Dijkstra's algorithm, and its running time is dominated by its priority queue of these points. It can be sped up to linear time by rounding the priorities used in this algorithm to integers, and using a bucket queue for these integers. As in Dijkstra's and Dial's algorithms, the priorities are monototone, so fast marching can use the monotone optimization of the bucket queue and its analysis. However, the discretization introduces some error into the resulting calculations.


See also

* Soft heap, a different way of speeding up priority queues by using approximate priorities


References

{{reflist Priority queues