A B+ tree is an
m-ary tree with a variable but often large number of children per node. A B+ tree consists of a root, internal nodes and leaves.
The root may be either a leaf or a node with two or more children.
A B+ tree can be viewed as a
B-tree
In computer science, a B-tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time. The B-tree generalizes the binary search tree, allowing for ...
in which each node contains only keys (not key–value pairs), and to which an additional level is added at the bottom with linked leaves.
The primary value of a B+ tree is in storing data for efficient retrieval in a
block-oriented storage context — in particular,
filesystems. This is primarily because unlike
binary search tree
In computer science, a binary search tree (BST), also called an ordered or sorted binary tree, is a rooted binary tree data structure with the key of each internal node being greater than all the keys in the respective node's left subtree and ...
s, B+ trees have very high fanout (number of pointers to child nodes in a node,
typically on the order of 100 or more), which reduces the number of I/O operations required to find an element in the tree.
History
There is no single paper introducing the B+ tree concept. Instead, the notion of maintaining all data in leaf nodes is repeatedly brought up as an interesting variant.
Douglas Comer
Douglas Earl Comer is a professor of computer science at Purdue University, where he teaches courses on operating systems and computer networks. He has written numerous research papers and textbooks, and currently heads several networking resear ...
notes in an early survey of B-trees (which also covers B+ trees) that the B+ tree was used in IBM's
VSAM data access software, and refers to an IBM published article from 1973.
Structure
As with other trees, B+ trees can be represented as a collection of three types of nodes: ''root'', ''internal'', and ''leaf''. These node types have the following properties:
* Individual nodes will have either ''keys'' or ''children'', but never both at the same time: this is the main distinction from a
B-tree
In computer science, a B-tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time. The B-tree generalizes the binary search tree, allowing for ...
.
* The ''order'' or ''branching factor'' of a B+ tree measures the capacity of internal nodes, i.e. their maximum allowed number of direct child nodes. This value is constant over the entire tree.
* Internal nodes have no keys, but will always have nonzero children. The ''actual'' number of children for a given internal node is constrained such that
. Each child is then referred to as
for