HOME

TheInfoList



OR:

In the theory of optimal binary search trees, the interleave lower bound is a lower bound on the number of operations required by a Binary Search Tree (BST) to execute a given sequence of accesses. Several variants of this lower bound have been proven. This article is based on a variation of the first Wilber's bound. This lower bound is used in the design and analysis of Tango tree. Furthermore, this lower bound can be rephrased and proven geometrically,
Geometry of binary search trees In computer science, one approach to the dynamic optimality problem on online algorithms for binary search trees involves reformulating the problem geometrically, in terms of augmenting a set of points in the plane with as few additional points a ...
.


Definition

The bound is based on a fixed ''perfect BST'' P , called the lower bound tree, over the keys \. For example, for n = 7 , P can be represented by the following parenthesis structure: :: [12 [3">[1.html" ;"title="[1">[12 [3 4 ([5">[1.html"_;"title="[1">[1<_a>2_[3.html" ;"title="[1.html" ;"title="[1">[12 [3">[1.html" ;"title="[1">[12 [3 4 ([56 [7])] For each node y in P , define: * Left(y) to be the set of nodes in the left sub-tree of y , including y . * Right(y) to be the set of nodes in the right sub-tree of y . Consider the following access sequence: X = x_1, x_2, ..., x_m . For a fixed node y , and for each access x_i , define the label of x_i with respect to y as: * "''L''" - if x_i is in Left(y) . * "''R''" - if x_i is in Right(y) ; * Null - otherwise. The label of y is the concatenation of the labels from all the accesses. For example, if the sequence of accesses is: 7,6,3 then the label of the root (4) is: "RRL", the label of 6 is: "RL", and the label of 2 is: "L". For every node y , define the ''amount of interleaving through y'' as the number of alternations between L and R in the label of y . In the above example, the interleaving through 4 and 6 is 1 and the interleaving through all other nodes is 0 . The ''interleave bound'', \mathit(X), is the sum of the interleaving through all the nodes of the tree. The interleave bound of the above sequence is 2 .


The Lower Bound Statement and its Proof

The ''interleave bound'' is summarized by the following theorem. The following proof is based on.


Proof

Let X = x_1,x_2,...,x_m be an access sequence. Denote by T_i the state of an arbitrary BST at time i i.e. after executing the sequence x_1,x_2,...,x_i . We also fix a lower bound BST P. For a node y in P , define the ''transition point'' for y at time i to be the minimum-depth node z in the BST T_i such that the path from the root of T_i to z includes both a node from ''Left''(''y'') and a node from ''Right''(''y''). Intuitively, any BST algorithm on T_i that accesses an element from ''Right''(''y'') and then an element from ''Left''(''y'') (or vice versa) must touch the transition point of y at least once. In the following Lemma, we will show that transition point is well-defined. The second lemma that we need to prove states that the transition point is stable. It will not change until it is touched. The last Lemma toward the proof states that every node y \in P has its unique transition point. Now, we are ready to prove the theorem. First of all, observe that the number of touched transition points by the offline BST algorithm is a lower bound on its cost, we are counting less nodes than the required for the total cost. We know by Lemma 3 that at any time i , any node y in T_i can be only a transition for at most one node in P . Thus, It is enough to count the number of touches of a transition node of y , the sum over all y . Therefore, for a fixed node y \in P , let \ell and r to be defined as in Lemma 1. The transition point of y is among these two nodes. In fact, it is the deeper one. Let x_, x_, ..., x_ be a maximal ordered access sequence to nodes that alternate between Left(y) and Right(y) . Then p is the amount of interleaving through the node y . Suppose that the even indexed accesses are in the Left(y) , and the odd ones are in Right(y) i.e. x_ \in Left(y) and x_ \in Right(y) . We know by the properties of lowest common ancestor that an access to a node in Left(y) , it must touch \ell . Similarly, an access to a node in Right(y) must touch r . Consider every j \in , \lfloor p/2 \rfloor. For two consecutive accesses x_ and x_ , if they avoid touching the access point of y , then \ell and r must change in between. However, by Lemma 2, such change requires touching the transition point. Consequently, the BST access algorithm touches the transition point of y at least once in the interval of
_, i_ The comma is a punctuation mark that appears in several variants in different languages. It has the same shape as an apostrophe or single closing quotation mark () in many typefaces, but it differs from them in being placed on the baseline ...
. Summing over all j \in , \lfloor p/2 \rfloor , the best algorithm touches the transition point of y at least \lfloor p/2 \rfloor \geq p/2 - 1. Summing over all y , \sum_ p_y/2 - 1 \geq IB(X)/2 - n where p_y is the amount of interleave through y . By definition, the p_y's add up to IB(X) . That concludes the proof.


See also

* Tango tree * Optimal binary search tree *
Geometry of binary search trees In computer science, one approach to the dynamic optimality problem on online algorithms for binary search trees involves reformulating the problem geometrically, in terms of augmenting a set of points in the plane with as few additional points a ...


References

{{reflist Binary trees