Structural Joins: A Primitive for Efficient XML Query Pattern Matching Shurug Al-Khalifa, H. V. Jagadish, Nick Koudas, Jignesh M. Patel, Divesh Srivastava, Yuqing Wu Presented by Parag Abhyankar 08305017





Introduction

XQuery expressions specify patterns of selection predicates on elements that have tree-structured relationships, e.g. book[title = ‘XML’] // author[. = ‘jane’]

The primitive tree-structured relationships in this pattern are:
Parent-child: (book, title), (title, XML), (author, jane)
Ancestor-descendant: (book, author)

Finding all occurrences of these relationships is a core operation for XML query processing.


Representing XML Elements : (Background)

Element: (DocId, StartPos : EndPos, LevelNum)
String: (DocId, StartPos, LevelNum)
Inspired by the 'Multi-Predicate Merge Join' of Zhang et al.
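One way such a numbering could be produced is a depth-first walk that hands out a position at every start tag, text value and end tag. The sketch below is only an illustration under stated assumptions: the Region tuple, the number_document function and the position counter are hypothetical names, string values are given EndPos = StartPos for uniformity, and text that follows a closing tag is ignored.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple


class Region(NamedTuple):
    label: str   # tag name or string value (kept only for readability)
    doc_id: int
    start: int
    end: int     # assumption: equals start for string values
    level: int


def number_document(xml_text, doc_id=1):
    """Return regions in document order, i.e. sorted by (doc_id, start)."""
    regions = []
    counter = 0

    def visit(elem, level):
        nonlocal counter
        counter += 1                      # position of the start tag
        start = counter
        slot = len(regions)
        regions.append(None)              # reserve the slot to keep document order
        text = (elem.text or "").strip()
        if text:
            counter += 1                  # position of the string value
            regions.append(Region(text, doc_id, counter, counter, level + 1))
        for child in elem:
            visit(child, level + 1)
        counter += 1                      # position of the end tag
        regions[slot] = Region(elem.tag, doc_id, start, counter, level)

    visit(ET.fromstring(xml_text), 1)
    return regions


if __name__ == "__main__":
    doc = "<book><title>XML</title><author>jane</author></book>"
    for region in number_document(doc):
        print(region)
```

For this document the book element receives the region 1:8 at level 1, and every other node falls strictly inside it.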


Background continued..

Element E1 (D1, S1:E1, L1)
Element E2 (D2, S2:E2, L2)

If D1 = D2, S1 < S2 and E2 < E1, then E1-E2 is an ancestor-descendant pair.

If D1 = D2, S1 < S2, E2 < E1 and L1 + 1 = L2, then E1-E2 is a parent-child pair.
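The two conditions translate directly into predicates. The following is a minimal sketch; the Node tuple and the function names is_ancestor and is_parent are illustrative assumptions, and the sample regions are made-up values for a small book document.

```python
from typing import NamedTuple


class Node(NamedTuple):
    doc_id: int
    start: int
    end: int
    level: int


def is_ancestor(a, d):
    """a is an ancestor of d: same document and d's region lies strictly inside a's."""
    return a.doc_id == d.doc_id and a.start < d.start and d.end < a.end


def is_parent(a, d):
    """a is the parent of d: an ancestor exactly one level above d."""
    return is_ancestor(a, d) and a.level + 1 == d.level


if __name__ == "__main__":
    book = Node(1, 1, 8, 1)     # assumed sample regions
    title = Node(1, 2, 4, 2)
    author = Node(1, 5, 7, 2)
    print(is_parent(book, title))     # True
    print(is_ancestor(title, author)) # False: the regions do not nest
```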


Structural Joins

Join algorithms for matching structural relationships: tree-merge and stack-tree.

Input: lists of tree nodes sorted by (DocId, StartPos).

Output: a sorted list of result pairs, joined according to the desired structural relationship.

Use in XML query pattern matching:
Decompose the query tree pattern into binary structural relationships.
Match each relationship against the XML database.
‘Stitch’ together the basic matches.


Tree-Merge Join (Output Sorted in Ancestor/Parent Order)

AList and DList are lists of potential ancestors and descendants, each in sorted order.

For every node a in AList:
Skip all unmatchable d's (those that start before a).
Output the pair (a, d) for each subsequent d, as long as a ends after d.
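The loop above can be sketched in Python as follows. This is a sketch under assumptions rather than the paper's pseudocode: the Node tuple, the function name tree_merge_anc and the parent_child flag are illustrative, and the sample regions are made-up values for a small Book/Title/Author document.

```python
from typing import NamedTuple


class Node(NamedTuple):
    label: str
    doc_id: int
    start: int
    end: int
    level: int


def tree_merge_anc(alist, dlist, parent_child=False):
    """Join two lists sorted by (doc_id, start); output is in ancestor order."""
    out = []
    begin = 0  # first d that can still match the current or a later a
    for a in alist:
        # Skip d's that start before a; they cannot match this or any later a.
        while begin < len(dlist) and (
            (dlist[begin].doc_id, dlist[begin].start) < (a.doc_id, a.start)
        ):
            begin += 1
        # Scan d's that start inside a's region, without moving `begin`.
        i = begin
        while i < len(dlist) and dlist[i].doc_id == a.doc_id and dlist[i].start < a.end:
            d = dlist[i]
            contained = a.start < d.start and d.end < a.end
            if contained and (not parent_child or a.level + 1 == d.level):
                out.append((a, d))
            i += 1
    return out


if __name__ == "__main__":
    book = Node("Book_1", 1, 1, 8, 1)      # assumed sample regions
    title = Node("Title_1", 1, 2, 4, 2)
    xml_text = Node("XML_1", 1, 3, 3, 3)
    jane = Node("Jane_1", 1, 6, 6, 3)
    pairs = tree_merge_anc([title], [book, xml_text, jane])
    print([(a.label, d.label) for a, d in pairs])  # [('Title_1', 'XML_1')]
```

Because the scan for each a restarts at `begin`, nested ancestors re-read the same descendants; that re-scanning is what the stack-based variants avoid.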


Example

AList = {Title_1}

DList = {Book_1, XML_1, Jane_1}

Title_1:
Skips Book_1, as it starts before Title_1.
Pairs with XML_1.
Does not consider Jane_1, as it ends after Title_1.

[Figure: XML tree with the nodes Book, Title, XML, Author and Jane, shown next to AList and DList.]


Tree-Merge Join: Detailed Algorithm (Output Sorted in Ancestor/Parent Order)


Example

ai pairs with each dj where i <= j <= 2i-1

Worst Case scenario.

Complexity: O(|AList| + |DList| + |OutputList|)


Tree-Merge Join (Output Sorted in Descendant Order)

AList and DList are lists of potential ancestors and descendants, each in sorted order.

For every node d in DList:
Skip all unmatchable a's (those that end before d starts).
Output the pair (a, d) for each subsequent a, as long as a starts before d starts.
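A corresponding Python sketch of the descendant-ordered variant is given below. As before, the Node tuple, the function name tree_merge_desc and the sample regions are illustrative assumptions, not taken from the paper.

```python
from typing import NamedTuple


class Node(NamedTuple):
    label: str
    doc_id: int
    start: int
    end: int
    level: int


def tree_merge_desc(alist, dlist, parent_child=False):
    """Join two lists sorted by (doc_id, start); output is in descendant order."""
    out = []
    begin = 0  # first a that can still match the current or a later d
    for d in dlist:
        # Skip leading a's that end before d starts (or lie in an earlier document).
        while begin < len(alist) and (
            alist[begin].doc_id < d.doc_id
            or (alist[begin].doc_id == d.doc_id and alist[begin].end < d.start)
        ):
            begin += 1
        # Scan a's that start before d; only those that also enclose d are output.
        i = begin
        while i < len(alist) and alist[i].doc_id == d.doc_id and alist[i].start < d.start:
            a = alist[i]
            if d.end < a.end and (not parent_child or a.level + 1 == d.level):
                out.append((a, d))
            i += 1
    return out


if __name__ == "__main__":
    book = Node("Book_1", 1, 1, 8, 1)      # assumed sample regions
    title = Node("Title_1", 1, 2, 4, 2)
    xml_text = Node("XML_1", 1, 3, 3, 3)
    jane = Node("Jane_1", 1, 6, 6, 3)
    pairs = tree_merge_desc([book, title], [book, xml_text, jane])
    print([(a.label, d.label) for a, d in pairs])
```

Note that an enclosing ancestor at the front of AList is never skipped, so every later d re-scans the a's behind it even when they produce no output.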


Example

AList = {Book_1, Title_1}

DList = {Book_1, XML_1, Jane_1}

Book_1: has no matching a.
XML_1: pairs with Book_1 and Title_1.
Jane_1: pairs with Book_1. Title_1 is not paired, as it ends before Jane_1 starts.

[Figure: XML tree with the nodes Book, Title, XML, Author and Jane, shown next to AList and DList.]


Tree-Merge Join Algorithm (Output Sorted in Descendant/Child Order)


Example

di pairs with ai and a0

Worst Case scenario.


Stack-Tree-Desc (Output Sorted by Descendant)

The stack contains the elements that can still be ancestors of the remaining d's.

Consider the elements from AList and DList one by one, in sorted order:
If the top of the stack cannot be an ancestor of the next element, pop it.
If the next element is an 'a', it is a potential ancestor, so push it onto the stack.
Otherwise the next element is a 'd', which pairs with all elements on the stack (bottom to top).
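The steps above can be sketched in Python as follows. This is an illustrative sketch under assumptions: the Node tuple, the helper key and the function name stack_tree_desc are hypothetical, the loop stops once DList is exhausted, and the sample regions describe a made-up nesting a1(d1, a2(d2, d3), d4).

```python
from typing import NamedTuple


class Node(NamedTuple):
    label: str
    doc_id: int
    start: int
    end: int
    level: int


def key(n):
    return (n.doc_id, n.start)


def stack_tree_desc(alist, dlist, parent_child=False):
    """Single merge pass over both sorted lists; output is in descendant order."""
    out, stack = [], []
    i = j = 0
    while j < len(dlist):  # once DList is exhausted no further pairs are possible
        take_a = i < len(alist) and key(alist[i]) < key(dlist[j])
        nxt = alist[i] if take_a else dlist[j]
        # Pop stack entries that end before the next node starts.
        while stack and (stack[-1].doc_id != nxt.doc_id or stack[-1].end < nxt.start):
            stack.pop()
        if take_a:
            stack.append(nxt)   # nested inside everything still on the stack
            i += 1
        else:
            for anc in stack:   # bottom to top
                if not parent_child or anc.level + 1 == nxt.level:
                    out.append((anc, nxt))
            j += 1
    return out


if __name__ == "__main__":
    a1 = Node("a1", 1, 1, 12, 1)   # assumed sample regions
    a2 = Node("a2", 1, 4, 9, 2)
    d1 = Node("d1", 1, 2, 3, 2)
    d2 = Node("d2", 1, 5, 6, 3)
    d3 = Node("d3", 1, 7, 8, 3)
    d4 = Node("d4", 1, 10, 11, 2)
    pairs = stack_tree_desc([a1, a2], [d1, d2, d3, d4])
    print([(a.label, d.label) for a, d in pairs])
```

Each node is pushed and popped at most once, so no input element is scanned twice.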


Stack-Tree-Desc (Output Sorted by Descendant)


Example

AList = {a1, a2, a3, …, an}
DList = {d1, d2, d3, …, d2n}

Current a: a1. Current d: d1.

Stack: empty (only ai's can go on the stack).


Example continued..

AList = {a2, a3, …, an}
DList = {d1, d2, d3, …, d2n}

As a starts before d, a1 goes onto the stack.

Current a: a2. Current d: d1.

Stack: a1


Example continued..

AList = {a2, a3, …, an}
DList = {d2, d3, …, d2n}

As d starts before a, d1 pairs with all elements on the stack.

Current a: a2. Current d: d2.

Stack: a1


Example continued..

AList = {a3, …, an}
DList = {d2, d3, …, d2n}

As a starts before d, a2 goes onto the stack.

Current a: a3. Current d: d2.

Stack (top to bottom): a2, a1


Example continued..

AList = {a3, …, an}
DList = {d3, …, d2n}

As d starts before a, d2 pairs with all elements on the stack.

Current a: a3. Current d: d3.

Stack (top to bottom): a2, a1


Example continued..

AList = {}
DList = {dn+2, …, d2n}

Current d: dn+2.

dn+2 pops an, as an ends before dn+2 starts. The new top is an-1.

Stack (top to bottom): an-1, …, a2, a1


Stack-Tree-Anc (Output Sorted by Ancestor)

Tricky: join results for the top of the stack cannot be added to the output until the join results for its ancestors on the stack have been added.

Two lists are associated with each node on the stack:
self-list is a list of result elements from the join of this node with the appropriate DList elements.
inherit-list is a list of join results involving AList elements that were descendants of the current node on the stack.
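The bookkeeping with the two lists can be sketched in Python as below. This is a sketch under assumptions rather than the paper's pseudocode: the Node tuple, the helpers key and pop, and the function name stack_tree_anc are hypothetical, each stack entry is modelled as [node, self_list, inherit_list], and the sample regions describe a made-up nesting a1(d1, a2(d2, d3), d4).

```python
from typing import NamedTuple


class Node(NamedTuple):
    label: str
    doc_id: int
    start: int
    end: int
    level: int


def key(n):
    return (n.doc_id, n.start)


def stack_tree_anc(alist, dlist):
    """Ancestor-descendant join whose output is sorted by ancestor."""
    out = []
    stack = []  # entries: [node, self_list, inherit_list]

    def pop():
        node, self_list, inherit_list = stack.pop()
        merged = self_list + inherit_list    # own pairs first, then descendants' pairs
        if stack:
            stack[-1][2].extend(merged)      # hand over to the new top's inherit-list
        else:
            out.extend(merged)               # bottom of the stack: safe to output

    i = j = 0
    while j < len(dlist):
        take_a = i < len(alist) and key(alist[i]) < key(dlist[j])
        nxt = alist[i] if take_a else dlist[j]
        # Pop entries whose region ends before the next node starts.
        while stack and (stack[-1][0].doc_id != nxt.doc_id or stack[-1][0].end < nxt.start):
            pop()
        if take_a:
            stack.append([nxt, [], []])
            i += 1
        else:
            for entry in stack:              # d joins every node still on the stack
                entry[1].append((entry[0], nxt))
            j += 1
    while stack:                             # flush what is left at the end
        pop()
    return out


if __name__ == "__main__":
    a1 = Node("a1", 1, 1, 12, 1)   # assumed sample regions
    a2 = Node("a2", 1, 4, 9, 2)
    d1 = Node("d1", 1, 2, 3, 2)
    d2 = Node("d2", 1, 5, 6, 3)
    d3 = Node("d3", 1, 7, 8, 3)
    d4 = Node("d4", 1, 10, 11, 2)
    pairs = stack_tree_anc([a1, a2], [d1, d2, d3, d4])
    print([(a.label, d.label) for a, d in pairs])
```

In this sketch the lists are held in memory; an implementation that spills them to intermediate files would pay extra I/O for large results.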


Stack-Tree-Anc (Output Sorted by Ancestor)


Example

AList = {a1, a2, a3, …, an}
DList = {d1, d2, d3, …, d2n}

Current a: a1. Current d: d1.

Stack: empty (only ai's can go on the stack).


Example continued..

AList = {a2, a3, …, an}
DList = {d1, d2, d3, …, d2n}

As a starts before d, a1 goes onto the stack.

Current a: a2. Current d: d1.

Stack: a1


Example continued..

AList = {a2, a3, …, an}
DList = {d2, d3, …, d2n}

As d starts before a, d1 pairs with all elements on the stack and is added to their self-lists.

Current a: a2. Current d: d2.

Stack:
a1: SL = d1; IL = empty


Example continued..

AList = {a3, …, an}
DList = {d2, d3, …, d2n}

As a starts before d, a2 goes onto the stack.

Current a: a3. Current d: d2.

Stack (top to bottom):
a2
a1: SL = d1; IL = empty


Example continued..

AList = {a3, …, an}
DList = {d3, …, d2n}

As d starts before a, d2 pairs with all elements on the stack and is added to their self-lists.

Current a: a3. Current d: d3.

Stack (top to bottom):
a2: SL = d2; IL = empty
a1: SL = d1, d2; IL = empty


Example continued..

AList = {}
DList = {dn+2, …, d2n}

Current d: dn+2.

dn+2 pops an. an's IL is appended to its SL, and the resulting list is appended to an-1's IL. The new top is an-1.

Stack (top to bottom):
an-1: SL = dn-1, …; IL = (an-dn), …
…
a2: SL = d2, d3, …; IL = empty
a1: SL = d1, d2, …; IL = empty

The last node to come off the stack appends its SL, followed by its IL, to OutputList.


Experimental Evaluation

Results

– STJ-D outperforms the other algorithms.

• Single pass over the input nodes, no intermediate file writes

– STJ-A performed better than TMJ-A and TMJ-D.

– The performance of STJ-A is comparable with that of the TMJs when the result size is large.

• Writing to intermediate files