optimal binary search trees with costs depending on the access paths

Optimal Binary Search Trees with Costs Depending on the Access Paths

Jayme L. Szwarcfiter (1), Gonzalo Navarro (2), Joísa de S. Oliveira (3), Ricardo Baeza-Yates (4), Walter Cunto (5), Nívio Ziviani (6)

August 1998

Key Words: algorithms, binary search trees, path dependent costs.

ABSTRACT

We describe algorithms for constructing optimal binary search trees, in which the access cost to each key depends on the $k$ preceding keys which were reached in the path to the desired key. Two kinds of optimal trees are considered, namely optimal worst case and weighted average case trees. The time complexities of the corresponding algorithms are $O(n^{k+2})$ and $O(k\,n^{k+2})$ respectively, while the space complexity is $O(n^{k+1})$ for both cases. The algorithms are based on a convenient decomposition theorem, and on a characterization of sequences of keys which are paths in some binary search tree. In addition, we give a characterization of sequences of keys which lead to a subtree formed by consecutive keys, in some binary search tree. An exact analysis of the number of steps performed by the algorithms is included.

(1) Universidade Federal do Rio de Janeiro, Instituto de Matemática, NCE and COPPE, Caixa Postal 2324, 20001-970 Rio de Janeiro, RJ, Brasil. E-mail: [email protected]
(2) Universidad de Chile, Departamento de Ciencia de la Computación. E-mail: [email protected]
(3) Universidade Federal do Rio de Janeiro, COPPE, Caixa Postal 68511, Rio de Janeiro, RJ, Brasil. E-mail: [email protected]
(4) Universidad de Chile, Departamento de Ciencia de la Computación. E-mail: [email protected]
(5) Universidad Simón Bolívar and IBM, Caracas, Venezuela. E-mail: [email protected]
(6) Universidade Federal de Minas Gerais, Departamento de Ciencia da Computação, Belo Horizonte, MG, Brasil. E-mail: [email protected]

1 Introduction

Binary search trees form one of the topics most widely studied in computer science, probably due to their range of applications. Their importance can be assessed by reading [3] and [4]. Relevant papers on binary search trees date back to the 1950s, while a tutorial on the subject has recently appeared [5].

In this paper we consider the problems of finding optimal binary search trees in which the access cost to each key depends on the $k$ preceding keys which were reached in the path to the desired key. The classical optimal binary search tree construction by Gilbert and Moore [1] and Knuth [2] thus corresponds to the fundamental case $k = 0$. In this work we are concerned with the values $k \ge 1$. Two kinds of optimal trees are considered, namely optimal worst case and weighted average case trees. The inputs of these problems are a number $n$ of keys, the value $k$, $1 \le k < n$, and a cost associated with each possible sequence formed by at most $k+1$ keys, all of them distinct. For the weighted average case minimization problem, each key is additionally given a weight. We remark that the input size grows exponentially with $k$.

We describe algorithms for solving the two problems above. Their complexities are $O(n^{k+2})$ and $O(k\,n^{k+2})$, respectively for minimizing worst case and weighted average case. The algorithms are polynomial in the size of the input.

The optimal binary search tree for $k = 0$ and with uniform key access costs, as considered in [1, 2], is a model for situations in which the keys are in the main memory. Greater values of $k$ and arbitrary access costs could model the cases in which other kinds of memory are involved. For example, when all keys are stored on a disk, the access cost to a given key depends on the position on the disk of the key previously accessed. Therefore finding an optimal tree when all keys are stored on a disk would correspond to the case $k = 1$. In this situation, the input size is $O(n^2)$ and the complexity of the proposed algorithm is $O(n^3)$.

Besides practical motivations, we believe that some of the concepts presented in this paper are of interest in the study of search trees in general.

The following are some basic definitions.

A binary tree is a rooted tree $T$ in which every node $z$, other than the root, is labelled left child or right child, in such a way that any two brothers have different labels. When $z$ has no brothers it is called an only child. A path of $T$ is a sequence of nodes $z_1, \dots, z_t$ such that $z_q$ is the father of $z_{q+1}$. In this case, $z_1$ is an ancestor of $z_t$, while $z_t$ is a descendant of $z_1$. When $z_1 \ne z_t$ they are called proper ancestor and proper descendant, respectively. A $t$-path is a path formed by $t$ nodes. The notation $N(T)$ represents the set of nodes of $T$. For $z \in N(T)$, the binary tree defined in $T$ by all descendants of $z$ is called the subtree of $T$ rooted at $z$, and denoted by $T(z)$. The left subtree of $z$ is the binary tree formed in $T$ by the left child of $z$ and all of its descendants. The right subtree of $z$ is defined similarly. We represent by $T_L(z)$ and $T_R(z)$ the left and right subtrees of $z$, respectively. A binary tree defined in $T$ by a subset of $N(T)$ is called a partial subtree of $T$. A root path is a path starting at the root of $T$, while a root-leaf path starts at the root and ends at some leaf of $T$.

Let $\{x_1, \dots, x_n\}$ be a set of elements called keys, $x_q < x_{q+1}$. A binary search tree for $\{x_1, \dots, x_n\}$ is a binary tree $T$ in which $N(T) = \{x_1, \dots, x_n\}$, with every pair of keys $x_p, x_q \in N(T)$ satisfying: $x_q \in N(T_L(x_p))$ implies $x_q < x_p$, and $x_q \in N(T_R(x_p))$ implies $x_q > x_p$. A legal path is a sequence of keys which is a path in some binary search tree.

The described minimization problems are solved by dynamic programming equations. The corresponding decompositions employ the concept of a legal path and that of an $(i,j)$-legal path. The latter are the legal paths leading to a subtree formed by consecutive keys. We then describe characterizations for both legal and $(i,j)$-legal paths. The algorithms are obtained by combining the decompositions and the characterizations. The decompositions are presented in Section 2, the characterizations in Section 3, while the algorithms and their analysis are described in Section 4. Some additional remarks form the last section.

2 The decompositions

Let $k \ge 1$ be a given integer and $\{x_1, \dots, x_n\}$ a set of keys, $x_q < x_{q+1}$. For each $x_q$ and legal path $y_1, \dots, y_t$, where $1 \le t \le k+1$ and $x_q = y_t$, we are given a real non-negative key cost $c(y_1, \dots, y_t)$ of $y_t$ relative to $y_1, \dots, y_t$. It corresponds to the cost of reaching $y_t$ through the path $y_1, \dots, y_t$. In addition, each key $x_q$ is given a non-negative real weight $w(x_q)$. For a legal path $y_1, \dots, y_m$, define its path cost as

\[
C(y_1, \dots, y_m) = \sum_{1 \le q \le m} c\bigl(y_{\max\{1,\,q-k\}}, \dots, y_q\bigr) \tag{1}
\]

Let $T$ be a binary search tree for $\{x_1, \dots, x_n\}$. Denote by $x_q^*$ the root path to key $x_q$. The values $\max_{1 \le q \le n} \{C(x_q^*)\}$ and $\sum_{1 \le q \le n} w(x_q) \cdot C(x_q^*)$ are called worst case tree cost and weighted average case tree cost, respectively. The question consists of finding the tree $T$ which minimizes one of these two costs, as desired. A minimizing tree is called optimal.

Observe that subtrees of an optimal tree are not necessarily optimal, for any $k > 0$. Consider the example having $k = 1$, $n = 3$, with key costs as given by Figure 1(a) and having all weights equal to 1.

legal paths   x1   x2   x3   x1x2   x1x3   x2x1   x2x3   x3x1   x3x2
key costs     0    0    0    0      2      3      2      3      1

(a)

[Tree consisting of the single path x1, x2, x3: x2 is the right child of x1 and x3 is the right child of x2.]

(b)

Figure 1
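The example of Figure 1 can be checked by exhaustive enumeration. The following Python sketch is an illustration added here, not part of the algorithms of the paper: it assumes keys are represented by the integers 1, 2, 3, trees by nested tuples `(key, left, right)`, and the names `all_bsts`, `root_paths`, `path_cost`, `tree_costs`, `FIG1_COSTS` and `WEIGHTS` are hypothetical. It evaluates the path cost (1) on every binary search tree for three keys.

```python
def all_bsts(lo, hi):
    """Yield every binary search tree on the keys lo..hi as (key, left, right); None is empty."""
    if lo > hi:
        yield None
        return
    for root in range(lo, hi + 1):
        for left in all_bsts(lo, root - 1):
            for right in all_bsts(root + 1, hi):
                yield (root, left, right)

def root_paths(tree, prefix=()):
    """Yield the root path (as a tuple of keys) to every key of the tree."""
    if tree is None:
        return
    key, left, right = tree
    path = prefix + (key,)
    yield path
    yield from root_paths(left, path)
    yield from root_paths(right, path)

def path_cost(path, k, c):
    """Path cost (1): each key is charged relative to at most k preceding keys."""
    return sum(c[path[max(0, q - k): q + 1]] for q in range(len(path)))

def tree_costs(tree, k, c, w):
    """Return (worst case tree cost, weighted average case tree cost)."""
    costs = {p[-1]: path_cost(p, k, c) for p in root_paths(tree)}
    return max(costs.values()), sum(w[x] * costs[x] for x in costs)

# Key costs of Figure 1(a), with k = 1 and unit weights.
FIG1_COSTS = {(1,): 0, (2,): 0, (3,): 0, (1, 2): 0, (1, 3): 2,
              (2, 1): 3, (2, 3): 2, (3, 1): 3, (3, 2): 1}
WEIGHTS = {1: 1, 2: 1, 3: 1}

for tree in all_bsts(1, 3):
    print(tree, tree_costs(tree, 1, FIG1_COSTS, WEIGHTS))
```

Under these assumptions the path x1, x2, x3 has worst case and weighted average cost 2, which is the minimum over the five trees, while its subtree rooted at x2 costs 2 against 1 for the alternative tree with root x3 and left child x2.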

The tree of Figure 1(b) is both worst and average case optimal, but $T(x_2)$ is not optimal in any case. However, special kinds of partial subtrees are optimal, making it possible to solve our minimization problems by conveniently decomposing them into smaller subproblems. We need more notation.

First, introduce $k$ additional keys $\{x_{n+1}, \dots, x_{n+k}\}$, called dummy keys, also satisfying $x_q < x_{q+1}$, $n \le q < n+k$. Each of these keys has weight 1. The key costs relative to paths containing dummy keys are defined as follows. Let $y_1, \dots, y_t$ be a legal path having at least one dummy key, $1 \le t \le k+1$. Then

\[
c(y_1, \dots, y_t) =
\begin{cases}
0, & \text{when } y_1, \dots, y_t \text{ are all dummy keys} \quad (2)\\
c(y_q, \dots, y_t), & \text{when } \exists\, q > 1 \text{ such that } y_1, \dots, y_{q-1} \text{ are dummy keys, but } y_q, \dots, y_t \text{ are not} \quad (3)\\
\infty, & \text{otherwise} \quad (4)
\end{cases}
\]

Denote $X = \{x_1, \dots, x_{n+k}\}$, $X_i^- = \{x_1, \dots, x_i\}$, $X_i^+ = \{x_{i+1}, \dots, x_{n+k}\}$, $X_{ij} = \{x_{i+1}, \dots, x_j\}$ and $W_{ij} = \sum_{i < q \le j} w(x_q)$.

Let $i, j$ be a pair of integers, $0 \le i \le j \le n$. A path $y_1, \dots, y_k$ is $(i,j)$-legal when there exists a binary search tree $T$ having node set $X$, containing the path $y_1, \dots, y_k$, and such that either $i = j$ and $y_k$ is a leaf of $T$, or $y_k$ has a child $x_\ell \in X_{ij}$ satisfying $N(T(x_\ell)) = X_{ij}$.

A root extension of a binary search tree $T$ is a root path $y_1, \dots, y_t$ such that each $y_q$ has at most one child in $T$, $1 \le q \le t$.

Let $y_1, \dots, y_k$ be an $(i,j)$-legal path. Denote by $T_{ij}(y_1, \dots, y_k)$ an optimal binary search tree having root extension $y_1, \dots, y_k$ and node set $X_{ij} \cup \{y_1, \dots, y_k\}$. Represent by $C_{ij}(y_1, \dots, y_k)$ the (optimal) tree cost of $T_{ij}(y_1, \dots, y_k)$. In terms of this notation, a solution to the stated minimization problems is the subtree of $T_{0n}(x_{n+k}, x_{n+k-1}, \dots, x_{n+1})$ having as root the child of $x_{n+1}$.

The following are decomposition theorems for the computation of the costs of the above optimal trees. For each $(i,j)$-legal path $y_1, \dots, y_k$, Theorems 1 and 2 compute $C_{ij}(y_1, \dots, y_k)$ for the worst and average case cost criteria, respectively. The theorems decompose $T_{ij}(y_1, \dots, y_k)$ into the optimal trees $T_{i,\ell-1}(y_2, \dots, y_k, x_\ell)$ and $T_{\ell,j}(y_2, \dots, y_k, x_\ell)$, where $x_\ell$ is the child of $y_k$ in $T_{ij}(y_1, \dots, y_k)$. See Figure 2.

Theorem 1 (Worst case minimization):

\[
C_{ij}(y_1, \dots, y_k) =
\begin{cases}
C(y_1, \dots, y_k), & \text{when } i = j \quad (5)\\
\min\limits_{i < \ell \le j} \bigl\{ \max\{ C_{i,\ell-1}(y_2, \dots, y_k, x_\ell),\; C_{\ell j}(y_2, \dots, y_k, x_\ell) \} + C(y_1, \dots, y_k, x_\ell) - C(y_2, \dots, y_k, x_\ell) \bigr\}, & \text{otherwise} \quad (6)
\end{cases}
\]

for all $0 \le i \le j \le n$ and $(i,j)$-legal $k$-paths $y_1, \dots, y_k$, $k \ge 1$.

Theorem 2 (Weighted average case minimization):

\[
C_{ij}(y_1, \dots, y_k) =
\begin{cases}
\sum\limits_{1 \le t \le k} w(y_t) \cdot C(y_1, \dots, y_t), & \text{when } i = j \quad (7)\\[1ex]
C_{i,\ell-1}(y_2, \dots, y_k, x_\ell) + C_{\ell j}(y_2, \dots, y_k, x_\ell) + W_{ij} \cdot C(y_1, \dots, y_k, x_\ell) \\
\quad - [W_{ij} + w(x_\ell)] \cdot C(y_2, \dots, y_k, x_\ell) \\
\quad + \sum\limits_{1 < t \le k} w(y_t) \cdot [C(y_1, \dots, y_t) - 2 \cdot C(y_2, \dots, y_t)] + w(y_1) \cdot C(y_1), & \text{otherwise} \quad (8)
\end{cases}
\]

for all $0 \le i \le j \le n$ and $(i,j)$-legal $k$-paths $y_1, \dots, y_k$, $k \ge 1$. Here $x_\ell$ is the child of $y_k$ in $T_{ij}(y_1, \dots, y_k)$; since the optimal child is not known in advance, (8) is evaluated for each $\ell$, $i < \ell \le j$, and the minimum is taken, as in (6).
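Recurrence (5)-(6) of Theorem 1 can be evaluated top-down with memoization, starting from the path of dummy keys. The following Python sketch is only an illustration under stated assumptions, not the authors' implementation: keys are the integers 1..n, dummy keys are n+1..n+k, `cost` is a hypothetical callable returning the key cost of a tuple of real keys, and the "otherwise" case (4) is treated as an infinite (forbidden) cost. The names `optimal_worst_case_cost`, `key_cost`, `path_cost` and `best` are likewise hypothetical.

```python
from functools import lru_cache
from math import inf

def optimal_worst_case_cost(n, k, cost):
    """Worst case cost C_0n(x_{n+k}, ..., x_{n+1}) via recurrence (5)-(6)."""

    def key_cost(seq):
        # Rules (2)-(4): leading dummy keys are dropped; a dummy key that
        # follows a real key is forbidden (assumed infinite cost).
        q = 0
        while q < len(seq) and seq[q] > n:
            q += 1
        rest = seq[q:]
        if not rest:
            return 0
        if any(y > n for y in rest):
            return inf
        return cost(rest)

    def path_cost(path):
        # Path cost (1), with windows of at most k + 1 keys.
        return sum(key_cost(path[max(0, q - k): q + 1]) for q in range(len(path)))

    @lru_cache(maxsize=None)
    def best(i, j, path):
        # C_ij(path) for the k-path `path` and the key interval X_ij = {i+1, ..., j}.
        if i == j:
            return path_cost(path)                      # equation (5)
        result = inf
        for l in range(i + 1, j + 1):                   # x_l is the child of y_k
            shifted = path[1:] + (l,)
            sub = max(best(i, l - 1, shifted), best(l, j, shifted))
            result = min(result, sub + path_cost(path + (l,)) - path_cost(shifted))
        return result                                   # equation (6)

    dummies = tuple(range(n + k, n, -1))                # x_{n+k}, ..., x_{n+1}
    return best(0, n, dummies)

# Key costs of the example with k = 1 and n = 3 (Figure 1(a)).
FIG1 = {(1,): 0, (2,): 0, (3,): 0, (1, 2): 0, (1, 3): 2,
        (2, 1): 3, (2, 3): 2, (3, 1): 3, (3, 2): 1}
print(optimal_worst_case_cost(3, 1, FIG1.__getitem__))
```

Only paths reachable from the dummy path are ever generated, so no explicit legality test is needed in this sketch. With a unit cost for every access, the returned value is simply the height (in nodes) of a balanced tree.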

PROOF OF THEOREMS 1 and 2:

By hypothesis, $y_1, \dots, y_k$ is an $(i,j)$-legal $k$-path, $k \ge 1$ and $0 \le i \le j \le n$. If $i = j$ it is simple to verify that (5) and (7) are correct. Let $i < j$ and $i < \ell \le j$. First, we show that $y_2, \dots, y_k, x_\ell$ is $(i,\ell-1)$-legal. Since $y_1, \dots, y_k$ is $(i,j)$-legal, there exists a binary search tree $T$ such that $N(T) = X$, while $y_1, \dots, y_k$ is a path of it, with $x_\ell \in X_{ij}$ a child of $y_k$ and $N(T(x_\ell)) = X_{ij}$. Hence $T$ contains the path $y_2, \dots, y_k, x_\ell$. The latter implies that $y_2, \dots, y_k, x_\ell$ is $(i,\ell-1)$-legal whenever $X_{i,\ell-1} = \emptyset$. Consider now $X_{i,\ell-1} \ne \emptyset$. In this case, $x_\ell$ has a left child $x_m \in X_{i,\ell-1}$ in $T$ and $N(T(x_m)) = X_{i,\ell-1}$. Consequently, $y_2, \dots, y_k, x_\ell$ is an $(i,\ell-1)$-legal $k$-path. Analogously, $y_2, \dots, y_k, x_\ell$ is an $(\ell,j)$-legal $k$-path.

Also, by hypothesis, $T_{ij}(y_1, \dots, y_k)$ is an optimal binary search tree having root extension $y_1, \dots, y_k$ and node set $X_{ij} \cup \{y_1, \dots, y_k\}$. Call the tree $T_{ij}(y_1, \dots, y_k)$ $T'$ and let $C'$ be its cost. Denote by $T''$ and $T'''$ the partial subtrees of $T'$ with root $y_2$, containing $X_{i,\ell-1}$ and $X_{\ell j}$ respectively, each of them containing additionally the path $y_2, \dots, y_k, x_\ell$. That is, $T''$ and $T'''$ are binary search trees with root extension $y_2, \dots, y_k, x_\ell$ and node sets $X_{i,\ell-1} \cup \{y_2, \dots, y_k, x_\ell\}$ and $X_{\ell j} \cup \{y_2, \dots, y_k, x_\ell\}$, respectively. Since $y_2, \dots, y_k, x_\ell$ is both $(i,\ell-1)$-legal and $(\ell,j)$-legal, it follows that $T''$ and $T'''$ are exactly the kind of trees that are needed for the construction of the subproblems. Denote by $C''$, $C'''$ the costs of $T''$ and $T'''$, respectively.

WORST CASE MINIMIZATION:

Let $z \in N(T')$ be a key of $T'$ with maximum access cost through the root path $z_1, \dots, z_q$ to it, where $z = z_q$. That is, $C' = C(z_1, \dots, z_q)$. Since all costs are non-negative, we can assume that $z_q$ is a leaf of $T'$. Hence, $z_q \in X_{ij} \setminus \{x_\ell\}$. Assume that $z_q \in X_{i,\ell-1}$. Then $z_2, \dots, z_q$ is a root path to $z_q$ in $T''$. Compare $C(z_1, \dots, z_q)$ with $C(z_2, \dots, z_q)$.
The contribution of each key of $X_{i,\ell-1}$ to these costs is the same in both cases. However, $x_\ell$ contributes $c(y_1, \dots, y_k, x_\ell)$ to the first cost, but $c(y_2, \dots, y_k, x_\ell)$ to the second. Similarly, for $1 \le a \le k$, $y_a$ contributes $c(y_1, \dots, y_a)$ to $C(z_1, \dots, z_q)$ and $c(y_2, \dots, y_a)$ to $C(z_2, \dots, z_q)$, except that $y_1$ does not contribute to $C(z_2, \dots, z_q)$. That is, $C' = C(z_2, \dots, z_q) + C(y_1, \dots, y_k, x_\ell) - C(y_2, \dots, y_k, x_\ell)$. Note that $z_2, \dots, z_q$ must be a root path of maximal access cost in $T''$. Otherwise, if there exists a key $z_t$ such that the cost of the root path in $T'''$ to $z_t$ is greater than that to $z_q$, then $C' > C(z_1, \dots, z_q)$, a contradiction. By the same argument it follows that $C'' \ge C'''$. Hence $C'' = C(z_2, \dots, z_q)$. Since $C'$ is minimal, it follows that $C''$ must be so. That is, $C'' = C_{i,\ell-1}(y_2, \dots, y_k, x_\ell)$. Similarly, $C''' = C_{\ell j}(y_2, \dots, y_k, x_\ell)$ and $C''' \ge C''$ when $z_q \in X_{\ell j}$. Equation (6) follows.

WEIGHTED AVERAGE CASE MINIMIZATION:

The contribution of each key $z \in N(T')$ to the cost of $T'$ is $w(z) \cdot C(z^*)$, where $z^*$ is the root path to $z$. Compare $C'$ with $C'' + C'''$. For each $z \ne y_1$, let $d(z)$ be the difference between the contributions of key $z$ to $C'$ and to $C'' + C'''$. The following are clear. When $z \in X_{i,\ell-1} \cup X_{\ell j}$ the difference is $d(z) = w(z) \cdot [C(y_1, \dots, y_k, x_\ell) - C(y_2, \dots, y_k, x_\ell)]$. For $z = x_\ell$, we have $d(x_\ell) = w(x_\ell) \cdot [C(y_1, \dots, y_k, x_\ell) - 2 \cdot C(y_2, \dots, y_k, x_\ell)]$. The latter follows from the fact that $x_\ell$ contributes twice to $C'' + C'''$. For $z = y_t \in \{y_2, \dots, y_k\}$, the value is $d(y_t) = w(y_t) \cdot [C(y_1, \dots, y_t) - 2 \cdot C(y_2, \dots, y_t)]$, for the same reason. Finally, $y_1$ contributes $w(y_1) \cdot C(y_1)$ to $C'$ and nothing to $C'' + C'''$. The difference $d(z)$ in all cases does not depend on the costs of the trees. Hence $C''$ and $C'''$ are minimal, otherwise $C'$ would not be. That is, $C'' = C_{i,\ell-1}(y_2, \dots, y_k, x_\ell)$ and $C''' = C_{\ell j}(y_2, \dots, y_k, x_\ell)$. By writing $C'$ in terms of $C'' + C'''$, as mentioned above, equation (8) is obtained. $\Box$

3 Characterizing legal paths

In this section we describe characterizations for legal and $(i,j)$-legal paths. That is, for sequences of keys which are paths in some binary search tree, and which lead to subtrees formed by consecutive keys, respectively. The following definition is useful.

Let $Y \subseteq X$. An ordering $y_1, \dots, y_m$ of the keys of $Y$ is called min-max when each $y_q$ is either minimal or maximal in $\{y_q, \dots, y_m\}$. In this case, label each $y_q$, $1 \le q \le m$, as min or max, respectively.
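The definition of a min-max ordering can be tested directly. The sketch below is an illustration only: the function name `min_max_labels` is hypothetical, keys are assumed to be distinct comparable values, and the last key (which is both minimal and maximal) is labelled min by an arbitrary convention.

```python
def min_max_labels(seq):
    """Return the min/max labels of seq if it is a min-max ordering, else None."""
    if len(set(seq)) != len(seq):
        return None                      # keys must be distinct
    labels = []
    for q, y in enumerate(seq):
        tail = seq[q:]                   # the keys y_q, ..., y_m
        if y == min(tail):
            labels.append("min")
        elif y == max(tail):
            labels.append("max")
        else:
            return None                  # y_q is neither minimal nor maximal
    return labels

print(min_max_labels((5, 1, 4, 2, 3)))   # ['max', 'min', 'max', 'min', 'min']
print(min_max_labels((2, 1, 3)))         # None
```

This direct check takes quadratic time in the length of the sequence; maintaining the running lower and upper bounds would give a linear scan, which matters little for the short sequences of length at most k + 1 considered here.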

[Figure 2: The decomposition of $T_{ij}(y_1, \dots, y_k)$, with child $x_\ell$ of $y_k$ and subtrees $T_L(x_\ell)$, $T_R(x_\ell)$, into $T_{i,\ell-1}(y_2, \dots, y_k, x_\ell)$ and $T_{\ell,j}(y_2, \dots, y_k, x_\ell)$.]

The following characterizes legal paths.

Theorem 3: A path is legal if and only if it is a min-max ordering.

Proof: Let $y_1, \dots, y_m$ be a legal path. Then there exists a binary search tree $T$ such that $y_1, \dots, y_m$ is a path of $T$. If it is not a min-max ordering there exists a key $y_i$ which is neither the minimal nor the maximal of $\{y_i, y_{i+1}, \dots, y_m\}$, $i \le m-2$. If $y_{i+1}$ is a left child in $T$ then $y_i > y_{i+1}, \dots, y_m$, implying that $y_i$ is a max key. Similarly, $y_{i+1}$ cannot be a right child, because it would imply that $y_i$ is a min key. The contradiction implies that $y_1, \dots, y_m$ is a min-max ordering.

Conversely, let $y_1, \dots, y_m$ be a min-max ordering. We construct a binary tree $T$ such that $y_1, \dots, y_m$ is a path of it. For each $i$, $1 < i \le m$, let $y_i$ be either the left or right child of $y_{i-1}$ in $T$, according to whether $y_{i-1}$ is a max or min key, respectively. It follows that $T$ is a binary search tree. Consequently, $y_1, \dots, y_m$ is a legal path. $\Box$

The following ordering is also of interest.

For $Y \subseteq X$ and $0 \le i < j \le n+k$, an ordering $Y'$ of $Y$ is called $(i,j)$-inc-dec when

• $Y \subseteq X_i^- \cup X_j^+$,
• the keys of $Y \cap X_i^-$ are in increasing ordering in $Y'$, while those of $Y \cap X_j^+$ are in decreasing ordering,
• $Y \cap X_i^- \ne \emptyset \implies x_i \in Y$, and $Y \cap X_j^+ \ne \emptyset \implies x_{j+1} \in Y$.

Lemma 1: An $(i,j)$-inc-dec ordering is necessarily a min-max ordering.

Proof: Label the keys of $Y \cap X_i^-$ as min, and as max those of $Y \cap X_j^+$. $\Box$

The next theorem characterizes $(i,j)$-legal paths.

Theorem 4: For $i = j$ a path is $(i,j)$-legal if and only if it is a min-max ordering. For $i < j$, a path is $(i,j)$-legal if and only if it is an $(i,j)$-inc-dec ordering.

Proof: When $i = j$ the result follows from Theorem 3. Let $i < j$. By hypothesis, $y_1, \dots, y_m$ is an $(i,j)$-legal path. Then there is a binary search tree $T$, having $X$ as its node set, where $y_1, \dots, y_m$ is a path of it, $y_m$ is the father of some $x_\ell \in X_{ij}$ and the subtree $T(x_\ell)$ contains exactly the keys of $X_{ij}$. Let $Y = \{y_1, \dots, y_m\}$. First, clearly $Y \subseteq X_i^- \cup X_j^+$. Second, suppose there exists a key $y_q \in X_i^- \cap Y$ such that $y_q > y_{q+1}$ for some $1 \le q < m$. Since $T$ is a binary search tree, it follows that $y_{q+1}$ is a key of the left subtree of $y_q$. Since $y_q$ is an ancestor of $y_m$, we know that $x_\ell$ also belongs to this subtree, contradicting $x_\ell > y_q$, implied by $y_q \in X_i^-$. Hence no such $q$ can exist.

Consequently, the keys of $X_i^- \cap Y$ are in increasing ordering in $y_1, \dots, y_m$. Similarly, we prove that those of $X_j^+ \cap Y$ form a decreasing ordering. Third, suppose that $X_i^- \cap Y \ne \emptyset$ and $x_i \notin Y$. Denote by $y_t$ the maximal key of $X_i^- \cap Y$. Then $y_t < x_i$. We try to locate key $x_i$ in $T$. Suppose that $x_i$ is a descendant of $y_t$. Then $x_i$ belongs to the right subtree $R$ of $y_t$. Consequently, $T(x_\ell)$ is also in $R$. If $t = m$ then $x_i \in N(T(x_\ell))$, a contradiction. When $t < m$ we know that $y_{t+1}, \dots, y_m$ is a path of $R$. Because $T$ is a binary search tree and $y_t$ is maximal in $X_i^- \cap Y$, it follows that $y_{t+1}, \dots, y_m \in X_j^+$. Consequently, because the keys of $X_j^+ \cap Y$ are in decreasing ordering in $y_1, \dots, y_m$, we conclude that $y_{t+1}$ is a right child, but $y_{t+2}, \dots, y_m, x_\ell$ are all left children. Because $x_i < y_{t+1}, \dots, y_m$, it follows that $x_i$ must belong to $T(x_\ell)$. The latter contradicts again the fact that $T(x_\ell)$ contains exactly $X_{ij}$. Hence $x_i$ is not a descendant of $y_t$. Neither can $x_i$ be an ancestor of $y_t$, because in this case $y_t$ belongs to the left subtree $L$ of $x_i$, implying that $x_\ell > x_i$ belongs to $L$, a contradiction. The remaining possibility is that $x_i$ is neither a descendant nor an ancestor of $y_t$. In this case, let $z$ be the nearest common ancestor of $x_i$ and $y_t$. Denote by $L$ and $R$ the left and right subtrees of $z$, respectively. If $x_i$ is in $L$ then $y_t$ must be in $R$, contradicting $y_t < x_i$. The other case is $x_i$ in $R$ and $y_t$ in $L$. Then $x_\ell$, a descendant of $y_t$, would satisfy $x_\ell < z < x_i$, which is incompatible with $x_i < x_{i+1} \le x_\ell$. Therefore the alternative that $x_i$ is neither a descendant nor an ancestor of $y_t$ cannot occur either. Consequently, $X_i^- \cap Y \ne \emptyset$ implies $x_i \in Y$. The proof that $X_j^+ \cap Y \ne \emptyset$ implies $x_{j+1} \in Y$ is similar.

Conversely, suppose that $y_1, \dots, y_m$ is an $(i,j)$-inc-dec ordering, $0 \le i < j \le n+k$. We construct a tree $T'$ as follows. It contains the path $y_1, \dots, y_m$. $T'$ also contains a subtree $T'(x_\ell)$, having an arbitrary root $x_\ell \in X_{ij}$, and satisfying the following property: $T'(x_\ell)$ is a binary search tree containing exactly the keys of $X_{ij}$.
Finally, make $x_\ell$ the left or right child of $y_m$, according to whether $y_m \in X_j^+$ or $y_m \in X_i^-$, respectively. The construction of $T'$ is completed. Let $Y = \{y_1, \dots, y_m\}$. Since $y_1, \dots, y_m$ is an $(i,j)$-inc-dec ordering, it follows that $Y \cap X_{ij} = \emptyset$. Hence the path $y_1, \dots, y_m$ and $T'(x_\ell)$ are disjoint. The latter assures that $T'$ is indeed a binary tree. We show that it is a binary search tree. Let $z_1, z_2$ be keys of $T'$, $z_1$ belonging to the left subtree $L$ of $z_2$. Consider the possibilities:

Case 1: $z_1, z_2 \in Y$
Since $y_1, \dots, y_m$ is an $(i,j)$-inc-dec ordering, by Lemma 1 it is a min-max ordering. By Theorem 3 it must be a legal path. Hence $z_1$ being in $L$ implies $z_1 < z_2$.

Case 2: $z_1 \in X_{ij}$ and $z_2 \in Y$
Suppose $z_2 = y_m$. Then $x_\ell$ must be the left child of $y_m$. By the construction of $T'$, we conclude that $z_2 \in X_j^+$. Hence $z_1 < z_2$. Suppose now $z_2 \ne y_m$. By Case 1, we conclude that $y_m < z_2$. Suppose $y_m \in X_j^+$. Then $z_1 < y_m$, implying $z_1 < z_2$. Alternatively, consider $y_m \in X_i^-$. In this case, if $z_2 \in X_i^-$ then $z_2, y_m$ must appear in increasing ordering, because $y_1, \dots, y_m$ is an $(i,j)$-inc-dec ordering. Hence $z_2 < y_m$, a contradiction. Consequently, $z_2 \in X_j^+$. That is, $z_1 < z_2$.

Case 3: $z_1 \in Y$ and $z_2 \in X_{ij}$
This case cannot occur, because it implies that $z_2$ is a descendant of $z_1$. This contradicts $z_1$ belonging to $L$.

Case 4: $z_1, z_2 \in X_{ij}$
Since $T'(x_\ell)$ is a binary search tree, $z_1$ being in $L$ implies $z_1 < z_2$.

From the above cases, we can conclude that $z_1$ belonging to $T_L(z_2)$ implies that $z_1 < z_2$, for any $z_1, z_2 \in Y \cup X_{ij}$. Similarly, it can be proved that $z_1$ belonging to $T_R(z_2)$ implies $z_1 > z_2$. Consequently, $T'$ is a binary search tree containing the keys $N(T') = Y \cup X_{ij}$. Let $X' = X \setminus N(T')$. We now include in $T'$ each key of $X'$, as follows. If $Y \cap X_i^- = \emptyset$ and $i > 0$ then include $x_i \in X'$ in $T'$ so that $y_1$ becomes the right child of $x_i$. Similarly, if $Y \cap X_j^+ = \emptyset$ and $j < n+k$ then $x_{j+1} \in X'$ is included in $T'$ in such a way that $y_1$ is the left child of $x_{j+1}$. Note that the above two conditions cannot occur simultaneously. Next, for each key of $X'$ not yet included in the tree, include it according to the rules of binary search tree insertion. Let $T$ be the final tree so obtained. Since $T'$ is a binary search tree, so is $T$. Also, $T'$ is a partial subtree of $T$. Clearly $N(T) = X$ and $y_1, \dots, y_m$ is a path of $T'$. In order to show that $y_1, \dots, y_m$ is $(i,j)$-legal, it remains to prove that $T(x_\ell) = T'(x_\ell)$. Suppose the contrary. Then $T(x_\ell)$ contains some key $z \in X'$. Suppose $z \in X_i^-$.
The following alternatives exist.

Case 1: $Y \cap X_i^- \ne \emptyset$
By the definition of $(i,j)$-inc-dec ordering, it follows that $x_i \in Y$. That is, $x_i$ is a proper ancestor of $x_\ell$ in $T$. Hence, $z \ne x_i$. Since $x_i$ is the maximal key of $X_i^-$, it follows that $z < x_i$. Then the binary search tree insertion procedure would not include $z$ in the right subtree of $x_i$. On the other hand, $x_\ell$ belongs to the right subtree of $x_i$, as $x_i < x_\ell$. Hence $z \notin N(T(x_\ell))$.

Case 2: $Y \cap X_i^- = \emptyset$
If $i = 0$ then $X_i^- = \emptyset$, contradicting $z \in X_i^-$. When $i > 0$, $y_1$ is the right child of $x_i$, by the construction of $T$. Hence $z < x_i$, implying that the binary search tree insertion again could not include $z$ in $T_R(x_i)$. However $x_\ell \in N(T_R(x_i))$. That is, $z \notin N(T(x_\ell))$.

Consequently, $z \in X_i^-$ implies that $z$ is not in $T(x_\ell)$. Similarly, we prove that $z \in X_j^+$ also implies that $z$ cannot be in $T(x_\ell)$. Therefore $T(x_\ell)$ is formed exactly by the keys of $X_{ij}$. Hence $y_1, \dots, y_m$ is an $(i,j)$-legal path, completing the proof of Theorem 4. $\Box$

4 The algorithms and analysis

The algorithms for finding optimal worst case and weighted average case binary search trees can now be described.

The input consists of an integer $k > 0$, a set $\{x_1, \dots, x_n\}$ of keys, $x_q < x_{q+1}$, and a key cost $c(y_1, \dots, y_t)$ for each legal $t$-path, $1 \le t \le k+1$. Alternatively, the input can consist of a function that computes the key costs $c(y_1, \dots, y_t)$ whenever needed. In the latter case we assume that this computation can be done in constant time. In addition, in the weighted average case problem each key $x_q$ is also given a non-negative weight $w(x_q)$.

The algorithms start by defining the dummy keys $\{x_{n+1}, \dots, x_{n+k}\}$. Using (2)-(4), compute the key costs $c(y_1, \dots, y_t)$ for each legal $t$-path $y_1, \dots, y_t$ with at least one dummy key, $1 \le t \le k+1$. Define $w(x_q) = 1$ for each $n+1 \le q \le n+k$. For each $(i,j)$-legal $t$-path $y_1, \dots, y_t$ and $0 \le i \le j \le n$, compute $C_{ij}(y_1, \dots, y_t)$ by (5)-(6) and (7)-(8), respectively for the worst case and weighted average case problems. The final solution is $C_{0n}(x_{n+k}, \dots, x_{n+1})$.

Below, we compute the exact number of steps performed by the worst case minimization algorithm.

To count the total amount of work to do, consider that each different subinterval $(i,j)$ of the array reached through a different legal path must be processed. To process such an interval, we must consider all its positions from $i$ to $j$, and compute the worst-case or expected-case cost at each position. To compute such a cost, we need the cost of some subintervals. Given that those subintervals are already computed, we work $O(|j-i|)$ to solve the subinterval $(i,j)$ given a previous access history of length $k$. Hence, what we have to compute is the sum of $|j-i|$ for all $i \le j$ and all access histories of length $k$ which lead to the subinterval $(i,j)$.

Rethink the access history in this way: instead of considering a sequence of $y_q$ min-max values, consider that the interval to work on, initially $[1,n]$, is reduced $k$ times, by either incrementing its left limit (min value) or decrementing its right limit (max value). Hence, we have a sequence of increments and a sequence of decrements, where the total number of steps is $k$. We can identify the access history with the pair of sequences (accounting also for the form in which they are mixed). When the $k$ steps are done, we work in time proportional to the size of the interval left. See Figure 3.

We use generating functions to count the total amount of work. The function to be used has three variables $z$, $x$, $w$. Let the variable $z$ count the total size of the array ($n$), $x$ count the total number of accesses ($k$) and $w$ the total amount of work. Our generating function is thus

\[
F(z,x,w) = \sum_{n,k,r \ge 0} W_{n,k,r}\, z^n x^k w^r
\]

such that in an array of $n$ elements there are $W_{n,k,r}$ different histories of $k$ steps which lead to an interval of size $r$ (which costs $O(r)$). Therefore the total amount of work is the coefficient of $z^n x^k$ in the function

\[
g(z,x) = \frac{\partial F}{\partial w}(z,x,1) = \sum_{n,k,r \ge 0} r\, W_{n,k,r}\, z^n x^k
\]

This is correct, since $\sum_r r\, W_{n,k,r}$ is the total amount of work to do on an array of size $n$ and histories of length $k$.
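As a brief worked step (added here for illustration, using only the symbols defined above), differentiating the series term by term shows why evaluating the derivative at $w = 1$ yields the weighted count:

```latex
\[
\frac{\partial F}{\partial w}(z,x,w)
  = \sum_{n,k,r \ge 0} r\, W_{n,k,r}\, z^n x^k w^{r-1},
\qquad\text{so}\qquad
\frac{\partial F}{\partial w}(z,x,1)
  = \sum_{n,k \ge 0} \Bigl( \sum_{r \ge 0} r\, W_{n,k,r} \Bigr) z^n x^k .
\]
```

For instance, with hypothetical numbers, if for some fixed $n$ and $k$ there were two histories leaving an interval of size 3 and one history leaving an interval of size 5, the corresponding coefficient would be $2 \cdot 3 + 1 \cdot 5 = 11$.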

[Figure 3: Interpreting legal paths. The array is drawn as a row of $z$'s; the access path marks an $x$ at each access, with the increasing accesses on the left and the decreasing accesses on the right; the central segment between them ("work here") is marked with $w$'s. Variables $z$, $x$ and $w$ correspond to the quantities to be counted.]

There are two important cases here. First, if an access history has increasing and decreasing components, then we do not have to perform a different computation for all the possible original subintervals. For instance, suppose that $n = 100$ and $k = 2$. The access history given by [25,75], for example, which yields the subinterval [26,74] to work on, does not depend on the original interval [1,100]. The final subinterval [26,74] would not need to be recomputed. If, on the other hand, both accesses at 25 and 75 are increasing then the final subinterval is [76,100], which certainly depends on the initial interval [1,100]. Hence, we must sum over all access histories with no regard to the initial subintervals, except for those which have only increasing or only decreasing components.

To keep count of the size of the array (in $z$) and the number of steps (in $x$) at the same time, we consider the number of elements "skipped" in the consecutive increments (see Figure 3). A single increasing step is represented by the function

\[
I(z,x) = \frac{xz}{1-z} = xz + xz^2 + xz^3 + \dots
\]

that is, one access is performed ($x$) after skipping over one or more elements of the array ($z$'s). There is at least one element, which is the array element compared. A sequence of zero or more increasing accesses is represented by

\[
I^*(z,x) = \frac{1}{1 - I(z,x)} = 1 + I(z,x) + I(z,x)^2 + I(z,x)^3 + \dots
\]

and the same formulas hold for $D(z,x) = I(z,x)$ and $D^*(z,x) = I^*(z,x)$. A sequence of intermingled increasing and decreasing accesses corresponds to

\[
ID^*(z,x) = \frac{1}{1 - (I(z,x) + D(z,x))}
\]

and the final sequence of elements of the set where we have to work is represented by

\[
\frac{1}{1 - wz} = 1 + wz + w^2 z^2 + w^3 z^3 + \dots
\]

(where we count one unit of work in $w$ and one element of the array in $z$). On the other hand, a sequence of elements where we do not have to work is simply $1/(1-z) = 1 + z + z^2 + \dots$.
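The combinatorial meaning of these series can be checked on small cases. Expanding $I(z,x)^k = x^k z^k/(1-z)^k$ shows that the coefficient of $x^k z^m$ in $I^*(z,x)$ is $\binom{m-1}{k-1}$, and in $ID^*(z,x)$ it is $2^k \binom{m-1}{k-1}$, for $k \ge 1$. The Python sketch below is an illustration only, with hypothetical function names: it enumerates access histories of $k$ steps, each skipping at least one element, whose skips add up to $m$, and compares the count with these coefficients.

```python
from itertools import product
from math import comb

def count_histories(k, m, n_directions=2):
    """Count k-step histories whose skips (each >= 1) sum to m.

    n_directions = 1 models I*(z, x) (increasing steps only);
    n_directions = 2 models ID*(z, x) (each step increasing or decreasing).
    """
    total = 0
    for skips in product(range(1, m + 1), repeat=k):
        if sum(skips) == m:
            total += n_directions ** k   # free choice of direction per step
    return total

def coefficient(k, m, n_directions=2):
    """Coefficient of x^k z^m in 1 / (1 - n_directions * I(z, x)), for k >= 1."""
    return n_directions ** k * comb(m - 1, k - 1)

for k in range(1, 4):
    for m in range(k, 8):
        assert count_histories(k, m) == coefficient(k, m)
        assert count_histories(k, m, 1) == coefficient(k, m, 1)
```

The enumeration is exponential in $k$ and is meant only as a sanity check of the series on small parameters.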

We are now ready to state the general formula. Since we are disregarding the initial and final ends of the array, we represent the sequence of accesses just by $ID^*(z,x)$. However, for the case of only increasing or only decreasing elements we have to subtract what we have added and replace it by a formula that allows us to consider all the possible initial right extremes (for $I$) and all possible initial left extremes (for $D$). In the case of increments (the decrements are similar), this is obtained by subtracting $I^*(z,x)$ from $ID^*(z,x)$ and then adding $I^*(z,x)/(1-z)$, since this allows us to add an arbitrary number of $z$'s to the right, accounting for all possible positions of the sequence inside the array.

Finally, after a sequence of increasing and decreasing steps, there is a final central segment on which we work. Since we sum $z$'s along all this process, we have in $z$ the length of the resulting array. We add an $x$ per step so we have in $x$ the number of steps. Finally, we have in $w$ the amount of work. At the end, we select those processes which turn out to have $n$ elements ($z^n$) and $k$ steps ($x^k$). The formula is

\[
F(z,x,w) = \left( ID^*(z,x) - I^*(z,x) + I^*(z,x)\,\frac{1}{1-z} - D^*(z,x) + D^*(z,x)\,\frac{1}{1-z} \right) \frac{1}{1-wz}
\]

which is equal to

\[
F(z,x,w) = \left( \frac{1}{1 - \frac{2xz}{1-z}} + \frac{2z/(1-z)}{1 - \frac{xz}{1-z}} \right) \frac{1}{1-wz}
\]

We differentiate the above formula with respect to $w$ and evaluate it at $w = 1$, to obtain

\[
g(z,x) = \frac{z}{(1-z)^2} \left( \frac{1}{1 - \frac{2xz}{1-z}} + \frac{2z/(1-z)}{1 - \frac{xz}{1-z}} \right)
\]

To find the coefficient that corresponds to $x^k$ in $g(z,x)$, notice that the coefficient of $x^k$ in $1/(1-ax)$ is $a^k$. Hence

\[
g_k(z) = \frac{z}{(1-z)^2} \left( \frac{2^k z^k}{(1-z)^k} + \frac{2 z^{k+1}}{(1-z)^{k+1}} \right)
\]

and to obtain the coefficient that corresponds to $z^n$ in $g_k(z)$, notice that the coefficient of $z^n$ in $1/(1-z)^{m+1}$ is $\binom{n+m}{m}$, and that the coefficient of $z^n$ in $z f(z)$ is that of $z^{n-1}$ in $f(z)$. Consequently, the total amount of work is exactly

\[
g_{k,n} = 2^k \binom{n}{k+1} + 2 \binom{n}{k+2}
\]

which for instance shows that for $k = 1$ the amount of work is $g_{1,n} = n^3/3 - n/3$. To obtain a formula that is easier to handle we can simplify the binomial coefficients and conclude that the cost is

\[
g_{k,n} = \left( \frac{2^k n^{k+1}}{(k+1)!} + \frac{2 n^{k+2}}{(k+2)!} \right) (1 + O(k/n)) \le n^{k+2}
\]

In fact, we should also consider the access paths with fewer than $k$ elements, since in the initial accesses we do not have the full history. This is obtained by summing up the above values for $k$ from zero to its maximum value. The result is still upper bounded by $n^{k+2}$.

Notice that we have left aside the case of zero-length sequences, where both ends of the initial subinterval must be considered (not only the rightmost or leftmost). Because of this the analysis does not apply to $k = 0$, which gives

\[
\frac{1}{1-z} \cdot \frac{1}{1-wz} \cdot \frac{1}{1-z}
\]

i.e. $g_{0,n} = n^3/6 + n^2/2 + n/3$.

We consider space now. We have to store one cell for each different access path. To count the number of access paths, we modify the above analysis so that we now count the number of paths in $w$. This time, the central subinterval left is not represented by $1/(1-wz)$, but by $w/(1-z)$ instead (i.e. we count only one $w$, which represents the interval). We have now

\[
F'(z,x,w) = \left( \frac{1}{1 - \frac{2xz}{1-z}} + \frac{2z/(1-z)}{1 - \frac{xz}{1-z}} \right) \frac{w}{1-z}
\]

and hence

\[
g'(z,x) = \frac{1}{1-z} \left( \frac{1}{1 - \frac{2xz}{1-z}} + \frac{2z/(1-z)}{1 - \frac{xz}{1-z}} \right)
\]

which gives

\[
g'_{k,n} = 2^k \binom{n}{k} + 2 \binom{n}{k+1}
\]

and this can be simplified to

\[
g'_{k,n} = \left( \frac{2^k n^k}{k!} + \frac{2 n^{k+1}}{(k+1)!} \right) (1 + O(k/n)) \le n^{k+1}
\]

which again is kept unchanged if we also add up the histories of length less than $k$.

Hence, the total time complexity is $O(n^{k+2})$ and the space complexity is $O(n^{k+1})$.

The analysis of the weighted average case minimization algorithm is similar. The corresponding time and space complexities are $O(k\,n^{k+2})$ and $O(n^{k+1})$.

5 Conclusions

We have described algorithms for finding optimal binary search trees for a given set $\{x_1, \dots, x_n\}$ of keys when the cost of each key $x_q$ depends on the $(k+1)$-path leading to $x_q$. The parameter $k$ is a given arbitrary integer in the range $1 \le k < n$. The optimality refers to a tree having either minimal worst case or minimal weighted average case cost. The algorithms are polynomial in the size of the input, although they are exponential in $k$. Their complexities are $O(n^{k+2})$ and $O(k\,n^{k+2})$, respectively for the worst and weighted average cases. It should be noted that these complexities are $O(n)$ and $O(k\,n)$ times the input size.

The algorithms make use of additional dummy keys $\{x_{n+1}, \dots, x_{n+k}\}$, with costs accordingly defined. It is simple to modify the algorithms to avoid computations with dummy keys. An idea is to impose that whenever $x_p$ and $x_q$ are dummy keys and $x_p$ is a proper ancestor of $x_q$ then $p > q$.

The monotonicity principle by Knuth [2] made it possible to decrease the number of iterations for constructing an optimal binary search tree from $O(n^3)$ to $O(n^2)$. Unfortunately, the principle does not hold for $k > 0$, as shown by the following example. Let $\{x_1, \dots, x_{k+2}\}$ be the given set of keys, all with unit weights and costs defined as:

$c(x_{k+1}, \dots, x_1) = c(x_{k+1}, \dots, x_2) = \dots = c(x_{k+1}) = 0$,
$c(x_1, \dots, x_k, x_{k+2}) = c(x_1, \dots, x_k) = \dots = c(x_1) = 0$,
$c(x_2, \dots, x_k, x_{k+2}, x_{k+1}) = 0$,

while any other key cost is equal to 1. The solution of both minimization problems for the keys $\{x_1, \dots, x_{k+1}\}$ is the tree formed by the single path $x_{k+1}, \dots, x_1$. When adding the key $x_{k+2}$, the optimal tree for $\{x_1, \dots, x_{k+2}\}$ is the path $x_1, \dots, x_k, x_{k+2}, x_{k+1}$, meaning that the principle does not apply for $k > 0$. In fact, it also fails for $k = 0$ and non-uniform key costs.

Finally, we mention that gap weights can be introduced following the techniques introduced in [2].

References

[1] E. N. Gilbert and E. F. Moore, Variable-length binary encoding, Bell System Tech. J. 38 (1959), pp. 933-968.
[2] D. E. Knuth, Optimum binary search trees, Acta Informatica 1 (1971), pp. 14-25.
[3] D. E. Knuth, The Art of Computer Programming 1: Fundamental Algorithms, Addison-Wesley, Reading, MA, 1968, 2nd ed. 1973.
[4] D. E. Knuth, The Art of Computer Programming 3: Sorting and Searching, Addison-Wesley, Reading, MA, 1973.
[5] S. V. Nagaraj, Optimal binary search trees, Theoretical Computer Science 188 (1997), pp. 1-44.
