In
computer science
Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to practical disciplines (includin ...
, a 2–3–4 tree (also called a 2–4 tree) is a self-balancing
data structure
In computer science, a data structure is a data organization, management, and storage format that is usually chosen for Efficiency, efficient Data access, access to data. More precisely, a data structure is a collection of data values, the rel ...
that can be used to implement
dictionaries. The numbers mean a
tree
In botany, a tree is a perennial plant with an elongated stem, or trunk, usually supporting branches and leaves. In some usages, the definition of a tree may be narrower, including only woody plants with secondary growth, plants that are ...
where every
node
In general, a node is a localized swelling (a "knot") or a point of intersection (a vertex).
Node may refer to:
In mathematics
* Vertex (graph theory), a vertex in a mathematical graph
* Vertex (geometry), a point where two or more curves, line ...
with children (
internal node
In computer science, a tree is a widely used abstract data type that represents a hierarchical tree structure with a set of connected nodes. Each node in the tree can be connected to many children (depending on the type of tree), but must be con ...
) has either two, three, or four child nodes:
* a 2-node has one
data element
In metadata, the term data element is an atomic unit of data that has precise meaning or precise semantics. A data element has:
# An identification such as a data element name
# A clear data element definition
# One or more representation terms
...
, and if internal has two child nodes;
* a 3-node has two data elements, and if internal has three child nodes;
* a 4-node has three data elements, and if internal has four child nodes;
Image:2-3-4-tree-2-node.svg, 2-node
Image:2-3-4-tree-3-node.svg, 3-node
Image:2-3-4-tree-4-node.svg, 4-node
2–3–4 trees are
B-tree
In computer science, a B-tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time. The B-tree generalizes the binary search tree, allowing for ...
s of order 4; like B-trees in general, they can search, insert and delete in
O(log ''n'') time. One property of a 2–3–4 tree is that all external nodes are at the same depth.
2–3–4 trees are
isomorphic to
red–black tree
In computer science, a red–black tree is a kind of self-balancing binary search tree. Each node stores an extra bit representing "color" ("red" or "black"), used to ensure that the tree remains balanced during insertions and deletions.
When the ...
s, meaning that they are equivalent data structures. In other words, for every 2–3–4 tree, there exists at least one and at most one red–black tree with data elements in the same order. Moreover, insertion and deletion operations on 2–3–4 trees that cause node expansions, splits and merges are equivalent to the color-flipping and rotations in red–black trees. Introductions to red–black trees usually introduce 2–3–4 trees first, because they are conceptually simpler. 2–3–4 trees, however, can be difficult to implement in most programming languages because of the large number of special cases involved in operations on the tree.
Red–black tree
In computer science, a red–black tree is a kind of self-balancing binary search tree. Each node stores an extra bit representing "color" ("red" or "black"), used to ensure that the tree remains balanced during insertions and deletions.
When the ...
s are simpler to implement, so tend to be used instead.
Properties
* Every node (leaf or internal) is a 2-node, 3-node or a 4-node, and holds one, two, or three data elements, respectively.
* All leaves are at the same depth (the bottom level).
* All data is kept in sorted order.
Insertion
To insert a value, we start at the root of the 2–3–4 tree:
# If the current node is a 4-node:
#* Remove and save the middle value to get a 3-node.
#* Split the remaining 3-node up into a pair of 2-nodes (the now missing middle value is handled in the next step).
#* If this is the root node (which thus has no parent):
#** the middle value becomes the new root 2-node and the tree height increases by 1. Ascend into the root.
#* Otherwise, push the middle value up into the parent node. Ascend into the parent node.
# Find the child whose interval contains the value to be inserted.
# If that child is a leaf, insert the value into the child node and finish.
#* Otherwise, descend into the child and repeat from step 1.
Example
To insert the value "25" into this 2–3–4 tree:
:

* Begin at the root (10, 20) and descend towards the rightmost child (22, 24, 29). (Its interval (20, ∞) contains 25.)
* Node (22, 24, 29) is a 4-node, so its middle element 24 is pushed up into the parent node.
:

* The remaining 3-node (22, 29) is split into a pair of 2-nodes (22) and (29). Ascend back into the new parent (10, 20, 24).
* Descend towards the rightmost child (29). (Its interval (24, ∞) contains 25.)
:

* Node (29) has no leftmost child. (The child for interval (24, 29) is empty.) Stop here and insert value 25 into this node.
:
Deletion
The simplest possibility to delete an element is to just leave the element there and mark it as "deleted", adapting the previous algorithms so that deleted elements are ignored. Deleted elements can then be re-used by overwriting them when performing an insertion later. However, the drawback of this method is that the size of the tree does not decrease. If a large proportion of the elements of the tree are deleted, then the tree will become much larger than the current size of the stored elements, and the performance of other operations will be adversely affected by the deleted elements.
When this is undesirable, the following algorithm can be followed to remove a value from the 2–3–4 tree:
# Find the element to be deleted.
#* If the element is not in a leaf node, remember its location and continue searching until a leaf, which will contain the element's successor, is reached. The successor can be either the largest key that is smaller than the one to be removed, or the smallest key that is larger than the one to be removed. It is simplest to make adjustments to the tree from the top down such that the leaf node found is not a 2-node. That way, after the swap, there will not be an empty leaf node.
#* If the element is in a 2-node leaf, just make the adjustments below.
Make the following adjustments when a 2-node – except the root node – is encountered on the way to the leaf we want to remove:
# If a sibling on either side of this node is a 3-node or a 4-node (thus having more than 1 key), perform a rotation with that sibling:
#* The key from the other sibling closest to this node moves up to the parent key that overlooks the two nodes.
#* The parent key moves down to this node to form a 3-node.
#* The child that was originally with the rotated sibling key is now this node's additional child.
# If the parent is a 2-node and the sibling is also a 2-node, combine all three elements to form a new 4-node and shorten the tree. (This rule can only trigger if the parent 2-node is the root, since all other 2-nodes along the way will have been modified to not be 2-nodes. This is why "shorten the tree" here preserves balance; this is also an important assumption for the fusion operation.)
# If the parent is a 3-node or a 4-node and all adjacent siblings are 2-nodes, do a fusion operation with the parent and an adjacent sibling:
#* The adjacent sibling and the parent key overlooking the two sibling nodes come together to form a 4-node.
#* Transfer the sibling's children to this node.
Once the sought value is reached, it can now be placed at the removed entry's location without a problem because we have ensured that the leaf node has more than 1 key.
Deletion in a 2–3–4 tree is O(log n), assuming transfer and fusion run in constant time (O(1)).
Parallel operations
Since 2–3–4 trees are similar in structure to
red–black tree
In computer science, a red–black tree is a kind of self-balancing binary search tree. Each node stores an extra bit representing "color" ("red" or "black"), used to ensure that the tree remains balanced during insertions and deletions.
When the ...
s,
parallel algorithms for red–black trees can be applied to 2–3–4 trees as well.
See also
*
2–3 tree
In computer science, a 2–3 tree is a tree data structure, where every node with children (internal node) has either two children (2-node) and one data element or three children (3-nodes) and two data elements. A 2–3 tree is a B-tree of order 3 ...
*
Red–black tree
In computer science, a red–black tree is a kind of self-balancing binary search tree. Each node stores an extra bit representing "color" ("red" or "black"), used to ensure that the tree remains balanced during insertions and deletions.
When the ...
*
B-tree
In computer science, a B-tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time. The B-tree generalizes the binary search tree, allowing for ...
*
Queap
In computer science, a queap is a priority queue data structure. The data structure allows insertions and deletions of arbitrary elements, as well as retrieval of the highest-priority element. Each deletion takes amortized time logarithmic in t ...
References
External links
Algorithms In Action, with 2–3–4 Tree animationLeft-leaning Red–Black Trees – Robert Sedgewick, Princeton University, 2008 Pat Morin
2–3–4 Trees: A Visual Introduction, 2017
{{DEFAULTSORT:2-3-4 tree
B-tree