In
computer science
Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...
, the predecessor problem involves maintaining a
set
Set, The Set, SET or SETS may refer to:
Science, technology, and mathematics Mathematics
*Set (mathematics), a collection of elements
*Category of sets, the category whose objects and morphisms are sets and total functions, respectively
Electro ...
of items to, given an element, efficiently
query which element precedes or succeeds that element in an order.
Data structures
In computer science, a data structure is a data organization and storage format that is usually chosen for efficient access to data. More precisely, a data structure is a collection of data values, the relationships among them, and the functi ...
used to solve the problem include
balanced binary search tree
In computer science, a self-balancing binary search tree (BST) is any node-based binary search tree that automatically keeps its height (maximal number of levels below the root) small in the face of arbitrary item insertions and deletions.Donald ...
s,
van Emde Boas trees, and
fusion tree
In computer science, a fusion tree is a type of tree data structure that implements an associative array on -bit integers on a finite universe, where each of the input integers has size less than 2w and is non-negative. When operating on a collect ...
s. In the static predecessor problem, the set of elements does not change, but in the dynamic predecessor problem, insertions into and deletions from the set are allowed.
The predecessor problem is a simple case of the
nearest neighbor problem, and data structures that solve it have applications in problems like
integer sorting
In computer science, integer sorting is the algorithmic problem of sorting a collection of data values by integer keys. Algorithms designed for integer sorting may also often be applied to sorting problems in which the keys are floating point numb ...
.
Definition
The problem consists of maintaining a set , which contains a subset of integers. Each of these
integers
An integer is the number zero (0), a positive natural number (1, 2, 3, ...), or the negation of a positive natural number (−1, −2, −3, ...). The negations or additive inverses of the positive natural numbers are referred to as negative in ...
can be stored with a
word size
In computing, a word is any processor design's natural unit of data. A word is a fixed-sized datum handled as a unit by the instruction set or the hardware of the processor. The number of bits or digits in a word (the ''word size'', ''word wid ...
of , implying that
. Data structures that solve the problem support these operations:
*
predecessor(x)
, which returns the largest element in strictly smaller than
*
successor(x)
, which returns the smallest element in strictly greater than
In addition, data structures which solve the
dynamic version of the problem also support these operations:
*
insert(x)
, which adds to the set
*
delete(x)
, which removes from the set
The problem is typically analyzed in a
transdichotomous model of computation
In computer science, and more specifically in computability theory and computational complexity theory, a model of computation is a model which describes how an output of a mathematical function is computed given an input. A model describes how ...
such as
word RAM In theoretical computer science, the word RAM (word random-access machine) model is a model of computation in which a random-access machine does arithmetic and bitwise operations on a word of bits. Michael Fredman and Dan Willard created it in 1990 ...
.
Data structures

One simple solution to this problem is to use a
balanced binary search tree
In computer science, a self-balancing binary search tree (BST) is any node-based binary search tree that automatically keeps its height (maximal number of levels below the root) small in the face of arbitrary item insertions and deletions.Donald ...
, which achieves (in
Big O notation
Big ''O'' notation is a mathematical notation that describes the asymptotic analysis, limiting behavior of a function (mathematics), function when the Argument of a function, argument tends towards a particular value or infinity. Big O is a memb ...
) a
running time
In theoretical computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations p ...
of
for predecessor queries. The
Van Emde Boas tree achieves a query time of
, but requires
space
Space is a three-dimensional continuum containing positions and directions. In classical physics, physical space is often conceived in three linear dimensions. Modern physicists usually consider it, with time, to be part of a boundless ...
.
Dan Willard
Dan Edward Willard (September 19, 1948 – January 21, 2023) was an American computer scientist and logician, and a professor of computer science at the University at Albany.
Education and career
Willard did his undergraduate studies in mathemat ...
proposed an improvement on this space usage with the
x-fast trie, which requires
space and the same query time, and the more complicated
y-fast trie
In computer science, a y-fast trie is a data structure for storing integers from a bounded domain. It supports exact and predecessor or successor queries in time ''O''(log log ''M''), using ''O''(''n'') space, where ''n'' is the number ...
, which only requires
space.
Fusion tree
In computer science, a fusion tree is a type of tree data structure that implements an associative array on -bit integers on a finite universe, where each of the input integers has size less than 2w and is non-negative. When operating on a collect ...
s, introduced by
Michael Fredman
Michael Lawrence Fredman is an emeritus professor at the Computer Science Department at Rutgers University, United States. He earned his Ph.D. from Stanford University in 1972 under the supervision of Donald Knuth. He was a member of the mathemat ...
and Willard, achieve
query time and
for predecessor queries for the static problem.
The dynamic problem has been solved using
exponential tree
An exponential tree is a type of search tree
In computer science, a search tree is a tree data structure used for locating specific keys from within a set. In order for a tree to function as a search tree, the key for each node must be greate ...
s with
query time, and with
expected time
In computational complexity theory, the average-case complexity of an algorithm is the amount of some computational resource (typically time) used by the algorithm, averaged over all possible inputs. It is frequently contrasted with worst-case com ...
using
hashing
Hash, hashes, hash mark, or hashing may refer to:
Substances
* Hash (food), a coarse mixture of ingredients, often based on minced meat
* Hash (stew), a pork and onion-based gravy found in South Carolina
* Hash, a nickname for hashish, a cannab ...
.
[.]
Mathematical properties
There have been a number of papers proving
lower bounds on the predecessor problem, or identifying what the running time of
asymptotically optimal
In computer science, an algorithm is said to be asymptotically optimal if, roughly speaking, for large inputs it performs at worst a constant factor (independent of the input size) worse than the best possible algorithm. It is a term commonly en ...
solutions would be. For example, Michael Beame and
Faith Ellen proved that
for all
In mathematical logic, a universal quantification is a type of quantifier, a logical constant which is interpreted as "given any", "for all", "for every", or "given an arbitrary element". It expresses that a predicate can be satisfied by e ...
values of ,
there exists
There may refer to:
* ''There'' (film), a 2009 Turkish film (Turkish title: ''Orada'')
* ''There'' (virtual world)
*''there'', a deictic adverb in English
*''there'', an English pronoun used in phrases such as '' there is'' and ''there are''
{ ...
a value of with query time (in
Big Theta notation)
, and similarly, for all values of , there exists a value of such that the query time is
.
Other proofs of lower bounds include the notion of
communication complexity
In theoretical computer science, communication complexity studies the amount of communication required to solve a problem when the input to the problem is distributed among two or more parties. The study of communication complexity was first intro ...
.
For the static predecessor problem,
Mihai Pătrașcu and
Mikkel Thorup
Mikkel Thorup (born 1965) is a Danish computer scientist working at University of Copenhagen.
He completed his undergraduate education at Technical University of Denmark and his doctoral studies at Oxford University in 1993. From 1993 to 1998, h ...
showed the following lower bound for the optimal search time, in the
cell-probe model In computer science, the cell-probe model is a model of computation similar to the random-access machine, except that all operations are free except memory access. This model is useful for proving lower bounds of algorithms for data structure probl ...
:
where the RAM has word length
, the set contains
integers of
bits each and is represented in the RAM using
words of space, and defining
.
In the case where
for
and
, the optimal search time is
and the
van Emde Boas tree achieves this bound.
[
]
See also
* Integer sorting
In computer science, integer sorting is the algorithmic problem of sorting a collection of data values by integer keys. Algorithms designed for integer sorting may also often be applied to sorting problems in which the keys are floating point numb ...
* y-fast trie
In computer science, a y-fast trie is a data structure for storing integers from a bounded domain. It supports exact and predecessor or successor queries in time ''O''(log log ''M''), using ''O''(''n'') space, where ''n'' is the number ...
* Fusion tree
In computer science, a fusion tree is a type of tree data structure that implements an associative array on -bit integers on a finite universe, where each of the input integers has size less than 2w and is non-negative. When operating on a collect ...
References
{{reflist
Data structures
Computational problems