HOME

TheInfoList



OR:

In
computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical discipli ...
, a binary tree is a k-ary k = 2
tree data structure In computer science, a tree is a widely used abstract data type that represents a hierarchical tree structure with a set of connected nodes. Each node in the tree can be connected to many children (depending on the type of tree), but must be c ...
in which each node has at most two
children A child ( : children) is a human being between the stages of birth and puberty, or between the developmental period of infancy and puberty. The legal definition of ''child'' generally refers to a minor, otherwise known as a person younger ...
, which are referred to as the ' and the '. A recursive definition using just
set theory Set theory is the branch of mathematical logic that studies sets, which can be informally described as collections of objects. Although objects of any kind can be collected into a set, set theory, as a branch of mathematics, is mostly concern ...
notions is that a (non-empty) binary tree is a
tuple In mathematics, a tuple is a finite ordered list (sequence) of elements. An -tuple is a sequence (or ordered list) of elements, where is a non-negative integer. There is only one 0-tuple, referred to as ''the empty tuple''. An -tuple is defi ...
(''L'', ''S'', ''R''), where ''L'' and ''R'' are binary trees or the
empty set In mathematics, the empty set is the unique set having no elements; its size or cardinality (count of elements in a set) is zero. Some axiomatic set theories ensure that the empty set exists by including an axiom of empty set, while in othe ...
and ''S'' is a singleton set containing the root. Some authors allow the binary tree to be the empty set as well. From a
graph theory In mathematics, graph theory is the study of '' graphs'', which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of '' vertices'' (also called ''nodes'' or ''points'') which are conn ...
perspective, binary (and K-ary) trees as defined here are arborescences. A binary tree may thus be also called a bifurcating arborescence—a term which appears in some very old programming books, before the modern computer science terminology prevailed. It is also possible to interpret a binary tree as an
undirected In discrete mathematics, and more specifically in graph theory, a graph is a structure amounting to a set of objects in which some pairs of the objects are in some sense "related". The objects correspond to mathematical abstractions called '' v ...
, rather than a directed graph, in which case a binary tree is an ordered,
rooted tree In graph theory, a tree is an undirected graph in which any two vertices are connected by ''exactly one'' path, or equivalently a connected acyclic undirected graph. A forest is an undirected graph in which any two vertices are connected by '' ...
. Some authors use rooted binary tree instead of ''binary tree'' to emphasize the fact that the tree is rooted, but as defined above, a binary tree is always rooted. A binary tree is a special case of an ordered
K-ary tree In graph theory, an ''m''-ary tree (also known as ''n''-ary, ''k''-ary or ''k''-way tree) is a rooted tree in which each node has no more than ''m'' children. A binary tree is the special case where ''m'' = 2, and a ternary tree is another c ...
, where ''K'' is 2. In mathematics, what is termed ''binary tree'' can vary significantly from author to author. Some use the definition commonly used in computer science, but others define it as every non-leaf having exactly two children and don't necessarily order (as left/right) the children either. In computing, binary trees are used in two very different ways: *First, as a means of accessing nodes based on some value or label associated with each node. Binary trees labelled this way are used to implement binary search trees and binary heaps, and are used for efficient searching and sorting. The designation of non-root nodes as left or right child even when there is only one child present matters in some of these applications, in particular, it is significant in binary search trees. However, the arrangement of particular nodes into the tree is not part of the conceptual information. For example, in a normal binary search tree the placement of nodes depends almost entirely on the order in which they were added, and can be re-arranged (for example by balancing) without changing the meaning. *Second, as a representation of data with a relevant bifurcating structure. In such cases, the particular arrangement of nodes under and/or to the left or right of other nodes is part of the information (that is, changing it would change the meaning). Common examples occur with Huffman coding and cladograms. The everyday division of documents into chapters, sections, paragraphs, and so on is an analogous example with n-ary rather than binary trees.


Definitions


Recursive definition

To actually define a binary tree in general, we must allow for the possibility that only one of the children may be empty. An artifact, which in some textbooks is called an ''extended binary tree,'' is needed for that purpose. An extended binary tree is thus recursively defined as: * the
empty set In mathematics, the empty set is the unique set having no elements; its size or cardinality (count of elements in a set) is zero. Some axiomatic set theories ensure that the empty set exists by including an axiom of empty set, while in othe ...
is an extended binary tree * if T1 and T2 are extended binary trees, then denote by T1 • T2 the extended binary tree obtained by by adding edges when these sub-trees are non-empty. Another way of imagining this construction (and understanding the terminology) is to consider instead of the empty set a different type of node—for instance square nodes if the regular ones are circles.


Using graph theory concepts

A binary tree is a
rooted tree In graph theory, a tree is an undirected graph in which any two vertices are connected by ''exactly one'' path, or equivalently a connected acyclic undirected graph. A forest is an undirected graph in which any two vertices are connected by '' ...
that is also an ordered tree (a.k.a. plane tree) in which every node has at most two children. A rooted tree naturally imparts a notion of levels (distance from the root), thus for every node a notion of children may be defined as the nodes connected to it a level below. Ordering of these children (e.g., by drawing them on a plane) makes it possible to distinguish a left child from a right child. But this still doesn't distinguish between a node with left but not a right child from a one with right but no left child. The necessary distinction can be made by first partitioning the edges, i.e., defining the binary tree as triplet (V, E1, E2), where (V, E1 ∪ E2) is a rooted tree (equivalently arborescence) and E1 ∩ E2 is empty, and also requiring that for all ''j'' ∈ every node has at most one E''j'' child. A more informal way of making the distinction is to say, quoting the Encyclopedia of Mathematics, that "every node has a left child, a right child, neither, or both" and to specify that these "are all different" binary trees. also in print as


Types of binary trees

Tree terminology is not well-standardized and so varies in the literature. * A binary
tree In botany, a tree is a perennial plant with an elongated stem, or trunk, usually supporting branches and leaves. In some usages, the definition of a tree may be narrower, including only woody plants with secondary growth, plants that are ...
has a root node and every node has at most two children. * A binary tree (sometimes referred to as a proper or plane or strict binary tree) is a tree in which every node has either 0 or 2 children. Another way of defining a full binary tree is a recursive definition. A full binary tree is either: ** A single vertex. ** A tree whose root node has two subtrees, both of which are full binary trees. * A binary tree is a binary tree in which all interior nodes have two children ''and'' all leaves have the same ''depth'' or same ''level''. An example of a perfect binary tree is the (non-incestuous) ancestry chart of a person to a given depth, as each person has exactly two biological parents (one mother and one father). Provided the ancestry chart always displays the mother and the father on the same side for a given node, their sex can be seen as an analogy of left and right children, ''children'' being understood here as an algorithmic term. * A binary tree is a binary tree in which every level, ''except possibly the last'', is completely filled, and all nodes in the last level are as far left as possible. It can have between 1 and 2''h'' nodes at the last level ''h''. A perfect tree is therefore always complete but a complete tree is not necessarily perfect. An alternative definition is a perfect tree whose rightmost leaves (perhaps all) have been removed. Some authors use the term complete to refer instead to a perfect binary tree as defined above, in which case they call this type of tree (with a possibly not filled last level) an almost complete binary tree or nearly complete binary tree. A complete binary tree can be efficiently represented using an array. * In the infinite complete binary tree, every node has two children (and so the set of levels is
countably infinite In mathematics, a set is countable if either it is finite or it can be made in one to one correspondence with the set of natural numbers. Equivalently, a set is ''countable'' if there exists an injective function from it into the natural number ...
). The set of all nodes is countably infinite, but the set of all infinite paths from the root is uncountable, having the
cardinality of the continuum In set theory, the cardinality of the continuum is the cardinality or "size" of the set of real numbers \mathbb R, sometimes called the continuum. It is an infinite cardinal number and is denoted by \mathfrak c (lowercase fraktur "c") or , \math ...
. That's because these paths correspond by an order-preserving
bijection In mathematics, a bijection, also known as a bijective function, one-to-one correspondence, or invertible function, is a function between the elements of two sets, where each element of one set is paired with exactly one element of the other ...
to the points of the
Cantor set In mathematics, the Cantor set is a set of points lying on a single line segment that has a number of unintuitive properties. It was discovered in 1874 by Henry John Stephen Smith and introduced by German mathematician Georg Cantor in 1883. T ...
, or (using the example of a Stern–Brocot tree) to the set of positive irrational numbers. * A balanced binary tree is a binary tree structure in which the left and right subtrees of every node differ in height by no more than 1. One may also consider binary trees where no leaf is much farther away from the root than any other leaf. (Different balancing schemes allow different definitions of "much farther".) * A degenerate (or pathological) tree is where each parent node has only one associated child node. This means that the tree will behave like a
linked list In computer science, a linked list is a linear collection of data elements whose order is not given by their physical placement in memory. Instead, each element points to the next. It is a data structure consisting of a collection of nodes which ...
data structure.


Properties of binary trees

* The number of nodes n in a full binary tree is at least 2h+1 and at most 2^-1 , where h is the
height Height is measure of vertical distance, either vertical extent (how "tall" something or someone is) or vertical position (how "high" a point is). For example, "The height of that building is 50 m" or "The height of an airplane in-flight is ab ...
of the tree. A tree consisting of only a root node has a height of 0. * The number of leaf nodes l in a perfect binary tree, is l = (n + 1) / 2 because the number of non-leaf (a.k.a. internal) nodes n - l = \sum_^ 2^k = 2^ - 1 = l - 1. * This means that a full binary tree with l leaves has n = 2l - 1 nodes. * In a balanced full binary tree, h = \lceil \log_2(l)\rceil + 1 = \lceil \log_2((n + 1) / 2)\rceil + 1 = \lceil \log_2(n + 1)\rceil (see ceiling function). * In a perfect full binary tree, l = 2^ thus n = 2^ - 1. * The number of null links (i.e., absent children of the nodes) in a binary tree of ''n'' nodes is (''n''+1). * The number of internal nodes in a complete binary tree of ''n'' nodes is \lfloor n/2\rfloor . * For any non-empty binary tree with ''n''0 leaf nodes and ''n''2 nodes of degree 2, ''n''0 = ''n''2 + 1.


Combinatorics

In
combinatorics Combinatorics is an area of mathematics primarily concerned with counting, both as a means and an end in obtaining results, and certain properties of finite structures. It is closely related to many other areas of mathematics and has many a ...
one considers the problem of counting the number of full binary trees of a given size. Here the trees have no values attached to their nodes (this would just multiply the number of possible trees by an easily determined factor), and trees are distinguished only by their structure; however, the left and right child of any node are distinguished (if they are different trees, then interchanging them will produce a tree distinct from the original one). The size of the tree is taken to be the number ''n'' of internal nodes (those with two children); the other nodes are leaf nodes and there are of them. The number of such binary trees of size ''n'' is equal to the number of ways of fully parenthesizing a string of symbols (representing leaves) separated by ''n'' binary operators (representing internal nodes), to determine the argument subexpressions of each operator. For instance for one has to parenthesize a string like , which is possible in five ways: : ((X*X)*X)*X,\qquad (X*(X*X))*X,\qquad (X*X)*(X*X),\qquad X*((X*X)*X),\qquad X*(X*(X*X)). The correspondence to binary trees should be obvious, and the addition of redundant parentheses (around an already parenthesized expression or around the full expression) is disallowed (or at least not counted as producing a new possibility). There is a unique binary tree of size 0 (consisting of a single leaf), and any other binary tree is characterized by the pair of its left and right children; if these have sizes ''i'' and ''j'' respectively, the full tree has size . Therefore, the number C_n of binary trees of size ''n'' has the following recursive description C_0=1, and \textstyle C_n=\sum_^C_iC_ for any positive integer ''n''. It follows that C_n is the Catalan number of index ''n''. The above parenthesized strings should not be confused with the set of words of length 2''n'' in the
Dyck language In the theory of formal languages of computer science, mathematics, and linguistics, a Dyck word is a balanced string of square brackets and The set of Dyck words forms the Dyck language. Dyck words and language are named after the mathemat ...
, which consist only of parentheses in such a way that they are properly balanced. The number of such strings satisfies the same recursive description (each Dyck word of length 2''n'' is determined by the Dyck subword enclosed by the initial '(' and its matching ')' together with the Dyck subword remaining after that closing parenthesis, whose lengths 2''i'' and 2''j'' satisfy ); this number is therefore also the Catalan number C_n. So there are also five Dyck words of length 6: : These Dyck words do not correspond to binary trees in the same way. Instead, they are related by the following recursively defined bijection: the Dyck word equal to the empty string corresponds to the binary tree of size 0 with only one leaf. Any other Dyck word can be written as (w_1)w_2, where w_1,w_2 are themselves (possibly empty) Dyck words and where the two written parentheses are matched. The bijection is then defined by letting the words w_1 and w_2 correspond to the binary trees that are the left and right children of the root. A bijective correspondence can also be defined as follows: enclose the Dyck word in an extra pair of parentheses, so that the result can be interpreted as a Lisp list expression (with the empty list () as only occurring atom); then the dotted-pair expression for that proper list is a fully parenthesized expression (with NIL as symbol and '.' as operator) describing the corresponding binary tree (which is, in fact, the internal representation of the proper list). The ability to represent binary trees as strings of symbols and parentheses implies that binary trees can represent the elements of a
free magma In abstract algebra, a magma, binar, or, rarely, groupoid is a basic kind of algebraic structure. Specifically, a magma consists of a set equipped with a single binary operation that must be closed by definition. No other properties are imposed. ...
on a singleton set.


Methods for storing binary trees

Binary trees can be constructed from
programming language A programming language is a system of notation for writing computer programs. Most programming languages are text-based formal languages, but they may also be graphical. They are a kind of computer language. The description of a programming ...
primitives in several ways.


Nodes and references

In a language with records and references, binary trees are typically constructed by having a tree node structure which contains some data and references to its left child and its right child. Sometimes it also contains a reference to its unique parent. If a node has fewer than two children, some of the child pointers may be set to a special null value, or to a special sentinel node. This method of storing binary trees wastes a fair bit of memory, as the pointers will be null (or point to the sentinel) more than half the time; a more conservative representation alternative is threaded binary tree. In languages with tagged unions such as ML, a tree node is often a tagged union of two types of nodes, one of which is a 3-tuple of data, left child, and right child, and the other of which is a "leaf" node, which contains no data and functions much like the null value in a language with pointers. For example, the following line of code in
OCaml OCaml ( , formerly Objective Caml) is a general-purpose, multi-paradigm programming language Programming paradigms are a way to classify programming languages based on their features. Languages can be classified into multiple paradigms. ...
(an ML dialect) defines a binary tree that stores a character in each node. type chr_tree = Empty , Node of char * chr_tree * chr_tree


Arrays

Binary trees can also be stored in breadth-first order as an implicit data structure in arrays, and if the tree is a complete binary tree, this method wastes no space. In this compact arrangement, if a node has an index ''i'', its children are found at indices 2i + 1 (for the left child) and 2i +2 (for the right), while its parent (if any) is found at index ''\left \lfloor \frac \right \rfloor'' (assuming the root has index zero). Alternatively, with a 1-indexed array, the implementation is simplified with children found at 2i and 2i+1, and parent found at \lfloor i/2 \rfloor. This method benefits from more compact storage and better locality of reference, particularly during a preorder traversal. However, it is expensive to grow and wastes space proportional to 2''h'' - ''n'' for a tree of depth ''h'' with ''n'' nodes. This method of storage is often used for binary heaps.


Encodings


Succinct encodings

A succinct data structure is one which occupies close to minimum possible space, as established by information theoretical lower bounds. The number of different binary trees on n nodes is \mathrm_, the nth Catalan number (assuming we view trees with identical ''structure'' as identical). For large n, this is about 4^; thus we need at least about \log_4^ = 2n bits to encode it. A succinct binary tree therefore would occupy 2n+o(n) bits. One simple representation which meets this bound is to visit the nodes of the tree in preorder, outputting "1" for an internal node and "0" for a leaf. If the tree contains data, we can simply simultaneously store it in a consecutive array in preorder. This function accomplishes this: function EncodeSuccinct(''node'' n, ''bitstring'' structure, ''array'' data) The string ''structure'' has only 2n + 1 bits in the end, where n is the number of (internal) nodes; we don't even have to store its length. To show that no information is lost, we can convert the output back to the original tree like this: function DecodeSuccinct(''bitstring'' structure, ''array'' data) More sophisticated succinct representations allow not only compact storage of trees but even useful operations on those trees directly while they're still in their succinct form.


Encoding general trees as binary trees

There is a one-to-one mapping between general ordered trees and binary trees, which in particular is used by Lisp to represent general ordered trees as binary trees. To convert a general ordered tree to a binary tree, we only need to represent the general tree in left-child right-sibling way. The result of this representation will automatically be a binary tree if viewed from a different perspective. Each node ''N'' in the ordered tree corresponds to a node ''N' '' in the binary tree; the ''left'' child of ''N' '' is the node corresponding to the first child of ''N'', and the ''right'' child of ''N' '' is the node corresponding to ''N''s next sibling --- that is, the next node in order among the children of the parent of ''N''. This binary tree representation of a general order tree is sometimes also referred to as a left-child right-sibling binary tree (also known as LCRS tree, doubly chained tree, filial-heir chain). One way of thinking about this is that each node's children are in a
linked list In computer science, a linked list is a linear collection of data elements whose order is not given by their physical placement in memory. Instead, each element points to the next. It is a data structure consisting of a collection of nodes which ...
, chained together with their ''right'' fields, and the node only has a pointer to the beginning or head of this list, through its ''left'' field. For example, in the tree on the left, A has the 6 children . It can be converted into the binary tree on the right. The binary tree can be thought of as the original tree tilted sideways, with the black left edges representing ''first child'' and the blue right edges representing ''next sibling''. The leaves of the tree on the left would be written in Lisp as: :(((N O) I J) C D ((P) (Q)) F (M)) which would be implemented in memory as the binary tree on the right, without any letters on those nodes that have a left child.


Common operations

There are a variety of different operations that can be performed on binary trees. Some are mutator operations, while others simply return useful information about the tree.


Insertion

Nodes can be inserted into binary trees in between two other nodes or added after a leaf node. In binary trees, a node that is inserted is specified as to whose child it will be.


Leaf nodes

To add a new node after leaf node A, A assigns the new node as one of its children and the new node assigns node A as its parent.


Internal nodes

Insertion on internal nodes is slightly more complex than on leaf nodes. Say that the internal node is node A and that node B is the child of A. (If the insertion is to insert a right child, then B is the right child of A, and similarly with a left child insertion.) A assigns its child to the new node and the new node assigns its parent to A. Then the new node assigns its child to B and B assigns its parent as the new node.


Deletion

Deletion is the process whereby a node is removed from the tree. Only certain nodes in a binary tree can be removed unambiguously.


Node with zero or one children

Suppose that the node to delete is node A. If A has no children, deletion is accomplished by setting the child of A's parent to null. If A has one child, set the parent of A's child to A's parent and set the child of A's parent to A's child.


Node with two children

In a binary tree, a node with two children cannot be deleted unambiguously. However, in certain binary trees (including binary search trees) these nodes ''can'' be deleted, though with a rearrangement of the tree structure.


Traversal

Pre-order, in-order, and post-order traversal visit each node in a tree by recursively visiting each node in the left and right subtrees of the root.


Depth-first order

In depth-first order, we always attempt to visit the node farthest from the root node that we can, but with the caveat that it must be a child of a node we have already visited. Unlike a depth-first search on graphs, there is no need to remember all the nodes we have visited, because a tree cannot contain cycles. Pre-order is a special case of this. See
depth-first search Depth-first search (DFS) is an algorithm for traversing or searching tree or graph data structures. The algorithm starts at the root node (selecting some arbitrary node as the root node in the case of a graph) and explores as far as possible a ...
for more information.


Breadth-first order

Contrasting with depth-first order is breadth-first order, which always attempts to visit the node closest to the root that it has not already visited. See breadth-first search for more information. Also called a ''level-order traversal''. In a complete binary tree, a node's breadth-index (''i'' − (2''d'' − 1)) can be used as traversal instructions from the root. Reading bitwise from left to right, starting at bit ''d'' − 1, where ''d'' is the node's distance from the root (''d'' = ⌊log(''i''+1)⌋) and the node in question is not the root itself (''d'' > 0). When the breadth-index is masked at bit ''d'' − 1, the bit values and mean to step either left or right, respectively. The process continues by successively checking the next bit to the right until there are no more. The rightmost bit indicates the final traversal from the desired node's parent to the node itself. There is a time-space trade-off between iterating a complete binary tree this way versus each node having pointer/s to its sibling/s.


See also


References


Citations


Bibliography

* Donald Knuth. ''
The Art of Computer Programming ''The Art of Computer Programming'' (''TAOCP'') is a comprehensive monograph written by the computer scientist Donald Knuth presenting programming algorithms and their analysis. Volumes 1–5 are intended to represent the central core of com ...
vol 1. Fundamental Algorithms'', Third Edition. Addison-Wesley, 1997. . Section 2.3, especially subsections 2.3.1–2.3.2 (pp. 318–348).


External links


binary trees
entry in th
FindStat
database


Balanced binary search tree on array How to create bottom-up an Ahnentafel list, or a balanced binary search tree on array

Binary trees and Implementation of the same with working code examples

Binary Tree JavaScript Implementation with source code
{{DEFAULTSORT:Binary Tree