In computer science, a 2–3 tree is a tree data structure in which every node with children (internal node) has either two children (2-node) and one data element, or three children (3-node) and two data elements. A 2–3 tree is a B-tree of order 3. Nodes on the outside of the tree (leaf nodes) have no children and one or two data elements. 2–3 trees were invented by John Hopcroft in 1970. 2–3 trees are required to be balanced, meaning that each leaf is at the same level. It follows that the left, center, and right subtrees of any node have the same height and therefore hold comparable amounts of data.

Definitions

We say that an internal node is a 2-node if it has ''one'' data element and ''two'' children. We say that an internal node is a 3-node if it has ''two'' data elements and ''three'' children. A 4-node, with three data elements, may be temporarily created during manipulation of the tree but is never persistently stored in the tree.

[Figures: a 2-node; a 3-node]

We say that ''T'' is a 2–3 tree if and only if one of the following statements holds:
* ''T'' is empty. In other words, ''T'' does not have any nodes.
* ''T'' is a 2-node with data element ''a''. If ''T'' has left child ''p'' and right child ''q'', then
** ''p'' and ''q'' are 2–3 trees of the same height;
** ''a'' is greater than each data element in ''p''; and
** ''a'' is less than each data element in ''q''.
* ''T'' is a 3-node with data elements ''a'' and ''b'', where ''a'' < ''b''. If ''T'' has left child ''p'', middle child ''q'', and right child ''r'', then
** ''p'', ''q'', and ''r'' are 2–3 trees of equal height;
** ''a'' is greater than each data element in ''p'' and less than each data element in ''q''; and
** ''b'' is greater than each data element in ''q'' and less than each data element in ''r''.
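
The recursive definition above translates fairly directly into a validity check. The following is a minimal sketch, not a reference implementation: the `Node` class, its `keys`/`children` layout, and the function names `check` and `is_two_three_tree` are assumptions made for illustration, and the empty tree is represented by `None`.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A 2-node (one key) or 3-node (two keys). Hypothetical layout."""
    keys: List[int]
    children: List["Node"] = field(default_factory=list)  # empty for a leaf


def check(t: Optional[Node], lo: Optional[int] = None,
          hi: Optional[int] = None) -> int:
    """Return the height of t if it is a valid 2-3 tree, else raise ValueError.

    lo and hi are exclusive bounds inherited from the ancestors of t.
    """
    if t is None:                          # the empty tree is a 2-3 tree
        return 0
    if len(t.keys) not in (1, 2):
        raise ValueError("a node must hold one or two data elements")
    if any(x >= y for x, y in zip(t.keys, t.keys[1:])):
        raise ValueError("data elements in a node must satisfy a < b")
    if lo is not None and t.keys[0] <= lo:
        raise ValueError("data element too small for this subtree")
    if hi is not None and t.keys[-1] >= hi:
        raise ValueError("data element too large for this subtree")
    if not t.children:                     # leaf: no children to inspect
        return 1
    if len(t.children) != len(t.keys) + 1:
        raise ValueError("a 2-node needs 2 children, a 3-node needs 3")
    # Child i must lie strictly between bounds[i] and bounds[i + 1].
    bounds = [lo] + t.keys + [hi]
    heights = {check(c, bounds[i], bounds[i + 1])
               for i, c in enumerate(t.children)}
    if len(heights) != 1:
        raise ValueError("all subtrees must have the same height")
    return heights.pop() + 1


def is_two_three_tree(t: Optional[Node]) -> bool:
    try:
        check(t)
        return True
    except ValueError:
        return False
```

For instance, `Node([5], [Node([2, 3]), Node([8])])` passes the check, while a tree whose two subtrees differ in height does not.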

Properties

* Every internal node is a 2-node or a 3-node.
* All leaves are at the same level.
* All data is kept in sorted order.
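
One consequence of these properties can be worked out as a short derivation. The symbols are introduced here and do not appear elsewhere in the article: ''h'' is the number of levels of a non-empty tree (a single node has ''h'' = 1) and ''n'' is the number of data elements. Level ''i'' holds at least 2^(''i''−1) nodes with one element each and at most 3^(''i''−1) nodes with two elements each, so:

```latex
\[
\sum_{i=1}^{h} 2^{\,i-1} \;\le\; n \;\le\; \sum_{i=1}^{h} 2\cdot 3^{\,i-1}
\quad\Longleftrightarrow\quad
2^{h} - 1 \;\le\; n \;\le\; 3^{h} - 1
\quad\Longleftrightarrow\quad
\log_3(n+1) \;\le\; h \;\le\; \log_2(n+1).
\]
```

The height therefore grows logarithmically with the number of stored data elements.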

Operations

Searching

Searching for an item in a 2–3 tree is similar to searching for an item in a binary search tree. Since the data elements in each node are ordered, a search function will be directed to the correct subtree and eventually to the correct node which contains the item.
# Let ''T'' be a 2–3 tree and ''d'' be the data element we want to find. If ''T'' is empty, then ''d'' is not in ''T'' and we're done.
# Let ''t'' be the root of ''T''.
# Suppose ''t'' is a leaf.
#* If ''d'' is not in ''t'', then ''d'' is not in ''T''. Otherwise, ''d'' is in ''T''. We need no further steps and we're done.
# Suppose ''t'' is a 2-node with left child ''p'' and right child ''q''. Let ''a'' be the data element in ''t''. There are three cases:
#* If ''d'' is equal to ''a'', then we've found ''d'' in ''T'' and we're done.
#* If ''d'' < ''a'', then set ''T'' to ''p'', which by definition is a 2–3 tree, and go back to step 2.
#* If ''d'' > ''a'', then set ''T'' to ''q'' and go back to step 2.
# Suppose ''t'' is a 3-node with left child ''p'', middle child ''q'', and right child ''r''. Let ''a'' and ''b'' be the two data elements of ''t'', where ''a'' < ''b''. There are four cases:
#* If ''d'' is equal to ''a'' or ''b'', then ''d'' is in ''T'' and we're done.
#* If ''d'' < ''a'', then set ''T'' to ''p'' and go back to step 2.
#* If ''a'' < ''d'' < ''b'', then set ''T'' to ''q'' and go back to step 2.
#* If ''d'' > ''b'', then set ''T'' to ''r'' and go back to step 2.
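
The steps above can be sketched as a short loop. This is an illustrative sketch under stated assumptions: the `Node` class and the `search` function are hypothetical names, a node stores its data elements in a sorted list `keys`, a leaf has an empty `children` list, and the empty tree is `None`.

```python
class Node:
    """Hypothetical node: 1 or 2 sorted keys, 0 children (leaf) or len(keys) + 1."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def search(root, d):
    """Return True if data element d occurs in the 2-3 tree rooted at root."""
    t = root
    while t is not None:
        if d in t.keys:          # d equals a (or b): found
            return True
        if not t.children:       # leaf that does not contain d
            return False
        # Number of keys smaller than d selects the child:
        # 0 -> left, 1 -> middle (or right of a 2-node), 2 -> right.
        i = sum(1 for k in t.keys if d > k)
        t = t.children[i]
    return False                 # only reached for the empty tree
```

Counting the keys smaller than ''d'' covers the 2-node and 3-node cases with one expression, which is a design choice of this sketch rather than part of the algorithm as described.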

Insertion

Insertion maintains the balanced property of the tree. A new key is always added at a leaf. To insert into a 2-node, the new key is added to the 2-node in the appropriate order, turning it into a 3-node. To insert into a 3-node, more work may be required depending on the location of the 3-node. If the tree consists only of a 3-node, the node is split into three 2-nodes with the appropriate keys and children.

[Figure: 2–3 insertion]

If the target node is a 3-node whose parent is a 2-node, the key is inserted into the 3-node to create a temporary 4-node. For example, suppose the key 10 is inserted into the 3-node with 6 and 9, creating a temporary 4-node with 6, 9, and 10. The middle key is 9, and it is promoted to the parent 2-node, which becomes a 3-node. This leaves 6 and 10, which are split into two 2-nodes held as children of the parent 3-node. If the target node is a 3-node and the parent is a 3-node, a temporary 4-node is created and then split as above; the promoted key turns the parent into a temporary 4-node, which is split in the same way. This process continues up the tree, if necessary as far as the root. If the root must be split, then the process for a single 3-node is followed: a temporary 4-node root is split into three 2-nodes, one of which is considered to be the root. This operation grows the height of the tree by one.
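
The split-and-promote process can be sketched recursively. This is one possible formulation under stated assumptions, not a definitive implementation: `Node`, `_insert`, and `insert` are hypothetical names, keys are kept in a sorted list, the empty tree is `None`, and duplicate keys are silently ignored (a choice made here, not stated in the text).

```python
class Node:
    """Hypothetical node: sorted keys, and no children if it is a leaf."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def _insert(t, key):
    """Insert key below t.

    Return None if t absorbed the key, or (middle_key, new_right_node)
    if t became a temporary 4-node and had to be split.
    """
    if key in t.keys:
        return None                      # assumption: ignore duplicates
    if not t.children:                   # leaf: add the key in order
        t.keys.append(key)
        t.keys.sort()
    else:
        i = sum(1 for k in t.keys if key > k)
        split = _insert(t.children[i], key)
        if split is None:
            return None
        mid, right = split               # child i was split: absorb promoted key
        t.keys.insert(i, mid)
        t.children.insert(i + 1, right)
    if len(t.keys) <= 2:
        return None                      # still a 2-node or 3-node
    # Temporary 4-node: keep the left key here, promote the middle key,
    # and move the right key (with the last two children) to a new node.
    left_key, mid, right_key = t.keys
    right = Node([right_key], t.children[2:])
    t.keys, t.children = [left_key], t.children[:2]
    return mid, right


def insert(root, key):
    """Insert key and return the (possibly new) root."""
    if root is None:
        return Node([key])
    split = _insert(root, key)
    if split is None:
        return root
    mid, right = split                   # root was split: tree grows by one level
    return Node([mid], [root, right])
```

Applied to the example in the text, inserting 10 into a tree whose root 2-node has a leaf child holding 6 and 9 promotes 9 into the root and leaves 6 and 10 as separate 2-node children.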

Deletion

Deleting a key from a non-leaf node can be done by replacing it with its immediate predecessor or successor, and then deleting the predecessor or successor from a leaf node. Deleting a key from a leaf node is easy if the leaf is a 3-node. Otherwise, it may require creating a temporary 1-node, which may be absorbed by reorganizing the tree, or which may repeatedly travel upwards before it can be absorbed, as a temporary 4-node may in the case of insertion. Alternatively, it is possible to use an algorithm which is both top-down and bottom-up, creating temporary 4-nodes on the way down that are then destroyed on the way back up. Deletion methods are explained in more detail in Lyn Turbak, "2-3 Trees", handout #26, course notes, CS230 Data Structures, Wellesley College, December 2, 2004 (accessed Mar. 11, 2024).
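
Only the first step of deletion, locating the immediate predecessor or successor of a key stored in an internal node, is sketched here; the rebalancing that follows is not shown. The `Node` class and the helper names `predecessor` and `successor` are hypothetical, and both helpers assume they are called on an internal node of a valid 2–3 tree.

```python
class Node:
    """Hypothetical node: sorted keys, and no children if it is a leaf."""

    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def predecessor(t, i):
    """Largest key smaller than t.keys[i]: rightmost key of the subtree to its left."""
    node = t.children[i]
    while node.children:
        node = node.children[-1]
    return node.keys[-1]


def successor(t, i):
    """Smallest key larger than t.keys[i]: leftmost key of the subtree to its right."""
    node = t.children[i + 1]
    while node.children:
        node = node.children[0]
    return node.keys[0]
```

Both helpers always end at a leaf, which is why the key actually removed from the tree is a leaf key.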

Parallel operations

Since 2–3 trees are similar in structure to red–black trees, parallel algorithms for red–black trees can be applied to 2–3 trees as well.
