In
computer science
Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to practical disciplines (includin ...
, the predecessor problem involves maintaining a
set of items to, given an element, efficiently
query which element precedes or succeeds that element in an order.
Data structures
In computer science, a data structure is a data organization, management, and storage format that is usually chosen for efficient access to data. More precisely, a data structure is a collection of data values, the relationships among them, a ...
used to solve the problem include
balanced binary search trees,
van Emde Boas trees, and
fusion tree
In computer science, a fusion tree is a type of tree data structure that implements an associative array on -bit integers on a finite universe, where each of the input integer has size less than 2w and is non-negative. When operating on a collecti ...
s. In the static predecessor problem, the set of elements does not change, but in the dynamic predecessor problem, insertions into and deletions from the set are allowed.
The predecessor problem is a simple case of the
nearest neighbor problem, and data structures that solve it have applications in problems like
integer sorting
In computer science, integer sorting is the algorithmic problem of sorting a collection of data values by integer keys. Algorithms designed for integer sorting may also often be applied to sorting problems in which the keys are floating point num ...
.
Definition
The problem consists of maintaining a set , which contains a subset of integers. Each of these
integers
An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign ( −1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the language ...
can be stored with a
word size
In computing, a word is the natural unit of data used by a particular processor design. A word is a fixed-sized datum handled as a unit by the instruction set or the hardware of the processor. The number of bits or digits in a word (the ''word ...
of , implying that
. Data structures that solve the problem support these operations:
*
predecessor(x)
, which returns the largest element in less than or equal to
*
successor(x)
, which returns the smallest element in greater than or equal to
In addition, data structures which solve the
dynamic
Dynamics (from Greek δυναμικός ''dynamikos'' "powerful", from δύναμις ''dynamis'' " power") or dynamic may refer to:
Physics and engineering
* Dynamics (mechanics)
** Aerodynamics, the study of the motion of air
** Analytical dyna ...
version of the problem also support these operations:
*
insert(x)
, which adds to the set
*
delete(x)
, which removes from the set
The problem is typically analyzed in a
transdichotomous model of computation
In computer science, and more specifically in computability theory and computational complexity theory, a model of computation is a model which describes how an output of a mathematical function is computed given an input. A model describes h ...
such as
word RAM In theoretical computer science, the word RAM (word random-access machine) model is a model of computation in which a random-access machine does bitwise operations on a word of bits. Michael Fredman and Dan Willard created it in 1990 to simulate ...
.
Data structures

One simple solution to this problem is to use a
balanced binary search tree, which achieves (in
Big O notation) a
running time
In computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by t ...
of
for predecessor queries. The
Van Emde Boas tree achieves a query time of
, but requires
space
Space is the boundless three-dimensional extent in which objects and events have relative position and direction. In classical physics, physical space is often conceived in three linear dimensions, although modern physicists usually con ...
.
Dan Willard
Dan Edward Willard is an American computer scientist and logician, and is a professor of computer science at the University at Albany.
Education and career
Willard did his undergraduate studies in mathematics at Stony Brook University, graduating ...
proposed an improvement on this space usage with the
x-fast trie, which requires
space and the same query time, and the more complicated
y-fast trie
In computer science, a y-fast trie is a data structure for storing integers from a bounded domain. It supports exact and predecessor or successor queries in time ''O''(log log ''M''), using ''O''(''n'') space, where ''n'' is the number ...
, which only requires
space.
Fusion tree
In computer science, a fusion tree is a type of tree data structure that implements an associative array on -bit integers on a finite universe, where each of the input integer has size less than 2w and is non-negative. When operating on a collecti ...
s, introduced by
Michael Fredman
Michael Lawrence Fredman is an emeritus professor at the Computer Science Department at Rutgers University, United States. He earned his Ph.D. degree from Stanford University in 1972 under the supervision of Donald Knuth. He was a member of the ...
and Willard, achieve
query time and
for predecessor queries for the static problem.
The dynamic problem has been solved using
exponential trees with
query time, and with
expected time
In computational complexity theory, the average-case complexity of an algorithm is the amount of some computational resource (typically time) used by the algorithm, averaged over all possible inputs. It is frequently contrasted with worst-case comp ...
using
hashing
Hash, hashes, hash mark, or hashing may refer to:
Substances
* Hash (food), a coarse mixture of ingredients
* Hash, a nickname for hashish, a cannabis product
Hash mark
* Hash mark (sports), a marking on hockey rinks and gridiron football fiel ...
.
[.]
Mathematical properties
There have been a number of papers proving
lower bounds on the predecessor problem, or identifying what the running time of
asymptotically optimal
In computer science, an algorithm is said to be asymptotically optimal if, roughly speaking, for large inputs it performs at worst a constant factor (independent of the input size) worse than the best possible algorithm. It is a term commonly en ...
solutions would be. For example, Michael Beame and
Faith Ellen
Faith Ellen (formerly known as Faith E. Fich) is a professor of computer science at the University of Toronto who studies distributed data structures and the theory of distributed computing.
She earned her bachelor's degree and masters from the ...
proved that
for all
In mathematical logic, a universal quantification is a type of quantifier, a logical constant which is interpreted as "given any" or "for all". It expresses that a predicate can be satisfied by every member of a domain of discourse. In other ...
values of ,
there exists
In predicate logic, an existential quantification is a type of quantifier, a logical constant which is interpreted as "there exists", "there is at least one", or "for some". It is usually denoted by the logical operator symbol ∃, which, wh ...
a value of with query time (in
Big Theta notation)
, and similarly, for all values of , there exists a value of such that the query time is
.
Other proofs of lower bounds include the notion of
communication complexity
In theoretical computer science, communication complexity studies the amount of communication required to solve a problem when the input to the problem is distributed among two or more parties. The study of communication complexity was first intro ...
.
For the static predecessor problem,
Mihai Pătrașcu and
Mikkel Thorup
Mikkel Thorup (born 1965) is a Danish computer scientist working at University of Copenhagen.
He completed his undergraduate education at Technical University of Denmark and his doctoral studies at Oxford University in 1993. From 1993 to 199 ...
showed the following lower bound for the optimal search time, in the
cell-probe model:
where the RAM has word length
, the set contains
integers of
bits each and is represented in the RAM using
words of space, and defining
.
In the case where
for
and
, the optimal search time is
and the
van Emde Boas tree achieves this bound.
[
]
See also
* Integer sorting
In computer science, integer sorting is the algorithmic problem of sorting a collection of data values by integer keys. Algorithms designed for integer sorting may also often be applied to sorting problems in which the keys are floating point num ...
* y-fast trie
In computer science, a y-fast trie is a data structure for storing integers from a bounded domain. It supports exact and predecessor or successor queries in time ''O''(log log ''M''), using ''O''(''n'') space, where ''n'' is the number ...
* Fusion tree
In computer science, a fusion tree is a type of tree data structure that implements an associative array on -bit integers on a finite universe, where each of the input integer has size less than 2w and is non-negative. When operating on a collecti ...
References
{{reflist
Data structures
Computational problems