CSE 636 Data Integration: XML Query Languages: XQuery
Post on 20-Dec-2015
XQuery
• http://www.w3.org/TR/xquery/ (11/05)
• Functional Programming Language
• Operates on XML Sources
• Returns XML
XQuery Components
• XQuery is composed of
  – Path expressions
  – Element constructors
  – FLWOR expressions
  – … and more …
Path Expressions
doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
Evaluate the expression by collecting all elements that satisfy the path.
[Figure: tree of the CUSTOMER_ORDERS document. Customers: Sue, Tom, Ann. Orders: 1897 (CARRIER UPS; items C5 with QTY 2 and P5 with QTY 1), 1878 (CARRIER UPS; item B7 with QTY 2), 1861 (CARRIER FEDEX; item C5 with QTY 1).]
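As a rough illustration of how a path expression collects elements, the following Python sketch walks the same path with the standard library's xml.etree.ElementTree. The inline XML is a hypothetical stand-in for doc("co"): the order numbers and carriers come from the slides, while the assignment of orders to customers is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for doc("co"). Which customer owns which order is assumed.
co = ET.fromstring("""
<CUSTOMER_ORDERS>
  <CUSTOMER><NAME>Sue</NAME>
    <ORDER><NO>1897</NO><CARRIER>UPS</CARRIER></ORDER>
  </CUSTOMER>
  <CUSTOMER><NAME>Tom</NAME>
    <ORDER><NO>1878</NO><CARRIER>UPS</CARRIER></ORDER>
    <ORDER><NO>1861</NO><CARRIER>FEDEX</CARRIER></ORDER>
  </CUSTOMER>
  <CUSTOMER><NAME>Ann</NAME></CUSTOMER>
</CUSTOMER_ORDERS>
""")

# co is the CUSTOMER_ORDERS root element, so the remaining steps are CUSTOMER/ORDER.
orders = co.findall("CUSTOMER/ORDER")

# Every ORDER reachable along the path is collected, in document order.
order_numbers = [order.findtext("NO") for order in orders]
print(order_numbers)
```

Ann has no ORDER children, so she contributes nothing to the result; the path simply finds no match under her CUSTOMER element.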
Element Construction
<ORDERS> {
  doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
} </ORDERS>
A complete, executable query returning the ORDERS tree:
1. Evaluate the expression inside { ... }
2. Connect the result into the tree
[Figure: ORDERS tree containing orders 1861, 1878, and 1897 with their items.]
Introduction to for Expression
Our path query …
<ORDERS> {
  doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
} </ORDERS>
… can be rewritten using a for expression:
<ORDERS> {
  for $order in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
  return $order
} </ORDERS>
[Figure: ORDERS tree containing orders 1861, 1878, and 1897 with their items.]
Topics
• For-Let-Where-Order by-Return Expressions
• Type Conversions
• Variable Bindings
• Joins
• Nested Queries
• Boolean Expressions
• Conditionals
• Aggregations
• Missing Data in Joins and Nested Queries
• Advanced Example
• Sequences
• Query Prolog
Example with where
We take our previous query and add a where clause:
<ORDERS> {
  for $order in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
  where $order/CARRIER = "UPS"
  return $order
} </ORDERS>
The output is the same as in the previous example, except that orders with non-UPS carriers are removed.
FLWOR Expressions: The for Clause
<ORDERS> {
  for $order in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
  where $order/CARRIER = "UPS"
  return $order
} </ORDERS>
The for variable ranges over the result of the in expression: $order is bound in turn to each ORDER element of the CUSTOMER_ORDERS document.
FLWOR Expressions: The where Clause
Selects only the orders with UPS as the carrier.
FLWOR Expressions: The return Clause
Every $order that qualified is added to the return list: orders 1897 and 1878.
FLWOR Expressions: Final Result
The list coming from the FLWOR expression is constructed into the ORDERS element to complete the example.
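The for, where, and return steps of this query can be sketched as an ordinary loop. This is a Python approximation rather than XQuery; the inline XML is a hypothetical stand-in for doc("co"), and the assignment of orders to customers is an assumption.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical stand-in for doc("co"). Which customer owns which order is assumed.
co = ET.fromstring("""
<CUSTOMER_ORDERS>
  <CUSTOMER><NAME>Sue</NAME>
    <ORDER><NO>1897</NO><CARRIER>UPS</CARRIER></ORDER>
  </CUSTOMER>
  <CUSTOMER><NAME>Tom</NAME>
    <ORDER><NO>1878</NO><CARRIER>UPS</CARRIER></ORDER>
    <ORDER><NO>1861</NO><CARRIER>FEDEX</CARRIER></ORDER>
  </CUSTOMER>
</CUSTOMER_ORDERS>
""")

result = ET.Element("ORDERS")                    # <ORDERS> { ... } </ORDERS>
for order in co.findall("CUSTOMER/ORDER"):       # for $order in ...
    if order.findtext("CARRIER") == "UPS":       # where $order/CARRIER = "UPS"
        result.append(copy.deepcopy(order))      # return $order (copied into the new tree)

kept = [order.findtext("NO") for order in result]
print(kept)
```

The deep copy mirrors the fact that an element constructor copies the nodes it encloses instead of moving them out of the source document.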
Example with Element Construction
<ORDERS> {
  for $order in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
  where $order/CARRIER = "UPS"
  return <ID> { data($order/NO) } </ID>
} </ORDERS>
Here, the return statement constructs elements from values:
• The “data” function returns the value of an element
• The return statement also contains tags
• The steps below illustrate how the following result is created:
<ORDERS> <ID>1897</ID> <ID>1878</ID> </ORDERS>
Return – Element Construction
1. Bring in selected items as before
2. Path selection
3. New element construction
4. Connect into tree
FLWOR Expressions: The let Clause
Our previous example …
<ORDERS> {
  for $order in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
  where $order/CARRIER = "UPS"
  return <ID> { data($order/NO) } </ID>
} </ORDERS>
… can be rewritten using extra variable bindings to improve clarity:
<ORDERS> {
  for $order in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
  let $carrier := $order/CARRIER
  let $id := data($order/NO)
  where $carrier = "UPS"
  return <ID> { $id } </ID>
} </ORDERS>
Both queries produce <ORDERS> <ID>1897</ID> <ID>1878</ID> </ORDERS>.
FLWOR Expressions: The order by Clause
For this example, we prepare a list of customers sorted by customer name:
<CUSTOMERS> {
  for $customer in doc("co")/CUSTOMER_ORDERS/CUSTOMER
  let $name := $customer/NAME
  order by $customer/NAME ascending
  return <CUSTOMER> {$customer/NAME} </CUSTOMER>
} </CUSTOMERS>
[Figure: CUSTOMERS tree with one CUSTOMER element each for NAME Ann, NAME Sue, and NAME Tom.]
Type Conversions
• In the context of functions and operators, values are automatically extracted from elements:
<ORDERS> {
  for $order in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
  where $order/CARRIER = "UPS"
  return <ID> { concat("ORDER-", $order/NO) } </ID>
} </ORDERS>
• $order/NO binds to an element
• concat(…) requires a string
• The value of the element is automatically extracted
• The same happens to lists containing a single element or value
All other cases result in errors:
<ORDERS> {
  <ID> { concat("ORDER-", doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER/NO) } </ID>
} </ORDERS>
• The path expression above binds to a list of several NO elements
• A value cannot be extracted from a list of many items!
• The data() function can be used to explicitly extract the value:
<ORDERS> {
  for $order in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
  where $order/CARRIER = "UPS"
  return <ID> { concat("ORDER-", data($order/NO)) } </ID>
} </ORDERS>
• Automatic extraction of values does not occur in element construction: an element placed inside { } is copied into the new element as a whole
• In that case, the data() function is required to obtain only the value:
<ORDERS> {
  for $order in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
  where $order/CARRIER = "UPS"
  return <ID> { data($order/NO) } </ID>
} </ORDERS>
For-Let-Where-Order By-Return (FLWOR)
Let’s take a more in-depth look at the variable bindings in the query developed previously:
<ORDERS> {
  for $order in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
  let $carrier := $order/CARRIER
  let $id := data($order/NO)
  where $carrier = "UPS"
  return <ID> { $id } </ID>
} </ORDERS>
General form of a FLWOR expression:
for $var1 in expr
let $var2 := expr
where expr
order by expr
return expr
• The for and let clauses generate a list of tuples of variable bindings, preserving input order
• The where clause applies a predicate, eliminating some of the tuples
• The order by clause imposes an order on the remaining tuples
• The return clause is executed for each remaining tuple, generating a list of trees
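The tuple pipeline described above can be sketched in Python, with each tuple modelled as a dictionary of variable bindings. The inline XML is a hypothetical stand-in for doc("co"), and the sort step is added purely for illustration, since the slide's query has no order by clause.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for doc("co"). Which customer owns which order is assumed.
co = ET.fromstring("""
<CUSTOMER_ORDERS>
  <CUSTOMER><NAME>Sue</NAME>
    <ORDER><NO>1897</NO><CARRIER>UPS</CARRIER></ORDER>
  </CUSTOMER>
  <CUSTOMER><NAME>Tom</NAME>
    <ORDER><NO>1878</NO><CARRIER>UPS</CARRIER></ORDER>
    <ORDER><NO>1861</NO><CARRIER>FEDEX</CARRIER></ORDER>
  </CUSTOMER>
</CUSTOMER_ORDERS>
""")

# for: one tuple per ORDER, in input order.
tuples = [{"order": order} for order in co.findall("CUSTOMER/ORDER")]

# let: extend every tuple with further bindings.
for t in tuples:
    t["carrier"] = t["order"].findtext("CARRIER")
    t["id"] = t["order"].findtext("NO")

# where: drop the tuples that fail the predicate.
tuples = [t for t in tuples if t["carrier"] == "UPS"]

# order by: reorder the surviving tuples (illustrative addition, not in the slide's query).
tuples.sort(key=lambda t: int(t["id"]))

# return: evaluate the return expression once per remaining tuple.
result = ET.Element("ORDERS")
for t in tuples:
    ET.SubElement(result, "ID").text = t["id"]

ids = [element.text for element in result]
print(ids)
```

Modelling the bindings as explicit tuples makes it visible that where and order by act on whole tuples, not on individual variables.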
FLWOR Variable Bindings
<ORDERS> {
  for $order in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
  let $carrier := $order/CARRIER,
      $id := data($order/NO)
  where $carrier = "UPS"
  return <ID> { $id } </ID>
} </ORDERS>
for/let: $order | $carrier | $id | where | return
ORDER 1897 | CARRIER UPS | 1897 | YES | <ID>1897</ID>
ORDER 1878 | CARRIER UPS | 1878 | YES | <ID>1878</ID>
ORDER 1861 | CARRIER FEDEX | 1861 | NO |
result: <ORDERS> <ID>1897</ID> <ID>1878</ID> </ORDERS>
for vs. let
for
• Binds node variables: iteration
  for $x in expr
  – binds $x to each element in the list expr
let
• Binds collection variables: one value
  let $x := expr
  – binds $x to the entire list expr
  – Useful for common subexpressions and for aggregations
for vs. let
for $order in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
return <result> { $order } </result>
Returns:
<result> <ORDER>…</ORDER> </result>
<result> <ORDER>…</ORDER> </result>
<result> <ORDER>…</ORDER> </result>
…
let $order := doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
return <result> { $order } </result>
Returns:
<result> <ORDER>…</ORDER> <ORDER>…</ORDER> <ORDER>…</ORDER> … </result>
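The difference between the two bindings can be mimicked in Python: for evaluates the return expression once per item, let evaluates it once for the whole list. The helper wrap and the inline XML are hypothetical names and data introduced only for this sketch.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical stand-in for doc("co"), reduced to three orders.
co = ET.fromstring("""
<CUSTOMER_ORDERS>
  <CUSTOMER>
    <ORDER><NO>1897</NO></ORDER>
    <ORDER><NO>1878</NO></ORDER>
    <ORDER><NO>1861</NO></ORDER>
  </CUSTOMER>
</CUSTOMER_ORDERS>
""")
orders = co.findall("CUSTOMER/ORDER")


def wrap(nodes):
    """Build <result> { nodes } </result>, copying the enclosed nodes."""
    result = ET.Element("result")
    result.extend([copy.deepcopy(node) for node in nodes])
    return result


# for $order in ... return <result> { $order } </result>
for_results = [wrap([order]) for order in orders]

# let $order := ... return <result> { $order } </result>
let_result = wrap(orders)

print(len(for_results), len(let_result))
```

The for version yields three result elements with one ORDER each; the let version yields a single result element holding all three.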
for vs. let
<POPULAR_ITEMS> {
  for $sku in distinct-values(doc("co")//ITEM/SKU)
  let $items := doc("co")//ORDER/ITEM[SKU = $sku]
  let $qtyTotal := sum($items/QTY)
  where $qtyTotal > 1
  return <ITEM> { $sku } </ITEM>
} </POPULAR_ITEMS>
• distinct-values
  – a function that eliminates duplicate values
  – can be applied to simple elements and atomic values
• sum
  – an aggregate function that returns the sum of numeric values
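The roles of distinct-values and sum in this query can be sketched in Python. The inline XML is a hypothetical stand-in for doc("co") using the item quantities shown in the slides; the grouping of items into orders is an assumption and does not affect the totals.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for doc("co"); SKUs and quantities follow the slides.
co = ET.fromstring("""
<CUSTOMER_ORDERS>
  <CUSTOMER>
    <ORDER><NO>1897</NO>
      <ITEM><SKU>C5</SKU><QTY>2</QTY></ITEM>
      <ITEM><SKU>P5</SKU><QTY>1</QTY></ITEM>
    </ORDER>
    <ORDER><NO>1878</NO><ITEM><SKU>B7</SKU><QTY>2</QTY></ITEM></ORDER>
    <ORDER><NO>1861</NO><ITEM><SKU>C5</SKU><QTY>1</QTY></ITEM></ORDER>
  </CUSTOMER>
</CUSTOMER_ORDERS>
""")

items = co.findall(".//ORDER/ITEM")

# distinct-values(...): drop duplicate SKU values (here: keep first occurrences).
skus = list(dict.fromkeys(item.findtext("SKU") for item in items))

totals = {}
popular = []
for sku in skus:                                                   # for $sku in ...
    matching = [i for i in items if i.findtext("SKU") == sku]      # let $items := ...[SKU = $sku]
    qty_total = sum(int(i.findtext("QTY")) for i in matching)      # let $qtyTotal := sum(...)
    totals[sku] = qty_total
    if qty_total > 1:                                              # where $qtyTotal > 1
        popular.append(sku)                                        # return <ITEM> { $sku } </ITEM>

print(totals, popular)
```

Note that let binds the whole list of matching items at once, which is what allows the aggregate to be computed per SKU.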
for vs. let
<POPULAR_ITEMS> {
  for $sku in distinct-values(doc("co")//ITEM/SKU)
  let $items := doc("co")//ORDER/ITEM[SKU = $sku]
  let $qtyTotal := sum($items/QTY)
  where $qtyTotal > 1
  return <ITEM> { $sku } </ITEM>
} </POPULAR_ITEMS>
for/let: $sku | $items | $qtyTotal | where | return
C5 | ITEM (SKU C5, QTY 1), ITEM (SKU C5, QTY 2) | 3 | YES | <ITEM>C5</ITEM>
B7 | ITEM (SKU B7, QTY 2) | 2 | YES | <ITEM>B7</ITEM>
P5 | ITEM (SKU P5, QTY 1) | 1 | NO |
result: POPULAR_ITEMS containing <ITEM>C5</ITEM> and <ITEM>B7</ITEM>
for vs. let
Find items whose quantity is larger than average:
let $avgQty := avg(doc("co")//ITEM/QTY)
for $item in doc("co")//ITEM
where $item/QTY > $avgQty
return $item
let: $avgQty = 1.5
for: $item | $avgQty | where | return
ITEM (SKU C5, QTY 1) | 1.5 | NO |
ITEM (SKU B7, QTY 2) | 1.5 | YES | ITEM (SKU B7, QTY 2)
ITEM (SKU C5, QTY 2) | 1.5 | YES | ITEM (SKU C5, QTY 2)
ITEM (SKU P5, QTY 1) | 1.5 | NO |
Joins
• Joins are expressed using a FLWOR with two loop variables
  – two for clauses
• A where condition specifies how the loop variables relate
Join Example
Combine orders …
[Figure: ORDER elements 1861 (CARRIER FEDEX), 1878 (CARRIER UPS), and 1897 (CARRIER UPS) with their items.]
… with shipper info …
[Figure: SHIPPER elements: NAME UPS with PICKUP 5PM; NAME FEDEX with PICKUP 2PM.]
… to produce order deadlines
[Figure: ORDERS tree: ORDER (ID 1878, DEADLINE 5PM), ORDER (ID 1897, DEADLINE 5PM), ORDER (ID 1861, DEADLINE 2PM).]
Join Example Query
<ORDERS> {
  for $order in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
  for $shipper in doc("s")/SHIPPERS/SHIPPER
  let $id := data($order/NO)
  let $time := data($shipper/PICKUP)
  where $order/CARRIER = $shipper/NAME
  return
    <ORDER>
      <ID>{$id}</ID>
      <DEADLINE>{$time}</DEADLINE>
    </ORDER>
} </ORDERS>
• Uses multiple for statements to generate the Cartesian product of tuples
• Uses the where statement to filter the Cartesian product
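The "Cartesian product, then filter" reading of the join can be sketched in Python with itertools.product. Both inline documents are hypothetical stand-ins for doc("co") and doc("s"); the values follow the slides, while the assignment of orders to customers is an assumption.

```python
from itertools import product
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for doc("co") and doc("s").
co = ET.fromstring("""
<CUSTOMER_ORDERS>
  <CUSTOMER><NAME>Sue</NAME>
    <ORDER><NO>1897</NO><CARRIER>UPS</CARRIER></ORDER>
  </CUSTOMER>
  <CUSTOMER><NAME>Tom</NAME>
    <ORDER><NO>1878</NO><CARRIER>UPS</CARRIER></ORDER>
    <ORDER><NO>1861</NO><CARRIER>FEDEX</CARRIER></ORDER>
  </CUSTOMER>
</CUSTOMER_ORDERS>
""")
s = ET.fromstring("""
<SHIPPERS>
  <SHIPPER><NAME>UPS</NAME><PICKUP>5PM</PICKUP></SHIPPER>
  <SHIPPER><NAME>FEDEX</NAME><PICKUP>2PM</PICKUP></SHIPPER>
</SHIPPERS>
""")

orders = co.findall("CUSTOMER/ORDER")
shippers = s.findall("SHIPPER")

# Two for clauses: every (order, shipper) pair, with orders as the outer loop.
pairs = list(product(orders, shippers))

# where $order/CARRIER = $shipper/NAME keeps only the matching pairs.
deadlines = [
    (order.findtext("NO"), shipper.findtext("PICKUP"))
    for order, shipper in pairs
    if order.findtext("CARRIER") == shipper.findtext("NAME")
]
print(len(pairs), deadlines)
```

Three orders and two shippers give six candidate tuples, of which three survive the where condition. A real query processor is free to evaluate the join more cleverly; the sketch only shows the logical meaning.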
Join Conditions
for/let: $order | $shipper | $id | $time | where | return
ORDER 1897 (CARRIER UPS) | SHIPPER (NAME FEDEX, PICKUP 2PM) | 1897 | 2PM | NO |
ORDER 1878 (CARRIER UPS) | SHIPPER (NAME FEDEX, PICKUP 2PM) | 1878 | 2PM | NO |
ORDER 1861 (CARRIER FEDEX) | SHIPPER (NAME FEDEX, PICKUP 2PM) | 1861 | 2PM | YES | ORDER (ID 1861, DEADLINE 2PM)
ORDER 1897 (CARRIER UPS) | SHIPPER (NAME UPS, PICKUP 5PM) | 1897 | 5PM | YES | ORDER (ID 1897, DEADLINE 5PM)
ORDER 1878 (CARRIER UPS) | SHIPPER (NAME UPS, PICKUP 5PM) | 1878 | 5PM | YES | ORDER (ID 1878, DEADLINE 5PM)
ORDER 1861 (CARRIER FEDEX) | SHIPPER (NAME UPS, PICKUP 5PM) | 1861 | 5PM | NO |
Condensed Join Table
In future examples, non-joined rows are removed, as are join where conditions:
<ORDERS> {
  for $order in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER,
      $shipper in doc("s")/SHIPPERS/SHIPPER
  let $id := data($order/NO),
      $time := data($shipper/PICKUP)
  where $order/CARRIER = $shipper/NAME
  return
    <ORDER>
      <ID>{$id}</ID>
      <DEADLINE>{$time}</DEADLINE>
    </ORDER>
} </ORDERS>
for/let: $order | $shipper | $id | $time | return
ORDER 1861 (CARRIER FEDEX) | SHIPPER (NAME FEDEX, PICKUP 2PM) | 1861 | 2PM | ORDER (ID 1861, DEADLINE 2PM)
ORDER 1897 (CARRIER UPS) | SHIPPER (NAME UPS, PICKUP 5PM) | 1897 | 5PM | ORDER (ID 1897, DEADLINE 5PM)
ORDER 1878 (CARRIER UPS) | SHIPPER (NAME UPS, PICKUP 5PM) | 1878 | 5PM | ORDER (ID 1878, DEADLINE 5PM)
Nested Queries
• Nested queries produce hierarchical results
• An outer FLWOR loop contains an inner FLWOR loop
• Typically, a where condition in the inner FLWOR specifies how the loops relate
Nested Query Example
Combine shippers…
[Figure: SHIPPER elements: NAME UPS with PICKUP 5PM; NAME FEDEX with PICKUP 2PM.]
… with orders …
[Figure: ORDER elements 1861 (CARRIER FEDEX), 1878 (CARRIER UPS), and 1897 (CARRIER UPS) with their items.]
… to produce orders for each shipper
[Figure: SHIPPER_ORDERS tree: SHIPPER (NAME UPS; ORDER 1878, ORDER 1897) and SHIPPER (NAME FEDEX; ORDER 1861).]
Nested Query
<SHIPPER_ORDERS> {
  for $shipper in doc("s")/SHIPPERS/SHIPPER
  let $name := $shipper/NAME
  return
    <SHIPPER>
      { $name }
      { for $order in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
        let $id := data($order/NO)
        where $name = $order/CARRIER
        return <ORDER> { $id } </ORDER> }
    </SHIPPER>
} </SHIPPER_ORDERS>
• Outer loop binds $shipper and $name variables
• For each $shipper, $name pair, inner loop binds $order and $id variables
• Inner where clause removes $order, $id pairs that don’t match outer element
• Inner loop constructs elements from inner variables
• Outer loop constructs elements from outer variables and from elements constructed in inner loop
Join Conditions
OUTER LOOP: $shipper | $name
INNER LOOP: $order | $id | where | return

SHIPPER (NAME FEDEX, PICKUP 2PM) | NAME FEDEX
  ORDER 1897 (CARRIER UPS) | 1897 | NO |
  ORDER 1878 (CARRIER UPS) | 1878 | NO |
  ORDER 1861 (CARRIER FEDEX) | 1861 | YES | <ORDER>1861</ORDER>
  OUTER LOOP return: SHIPPER (NAME FEDEX, ORDER 1861)

SHIPPER (NAME UPS, PICKUP 5PM) | NAME UPS
  ORDER 1897 (CARRIER UPS) | 1897 | YES | <ORDER>1897</ORDER>
  ORDER 1878 (CARRIER UPS) | 1878 | YES | <ORDER>1878</ORDER>
  ORDER 1861 (CARRIER FEDEX) | 1861 | NO |
  OUTER LOOP return: SHIPPER (NAME UPS, ORDER 1897, ORDER 1878)
Condensed Nested Query Table
In future examples, non-matched inner rows are removed, as are where conditions:
<SHIPPER_ORDERS> {
  for $shipper in doc("s")/SHIPPERS/SHIPPER
  let $name := $shipper/NAME
  return
    <SHIPPER>
      { $name }
      { for $order in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
        let $id := data($order/NO)
        where $name = $order/CARRIER
        return <ORDER> { $id } </ORDER> }
    </SHIPPER>
} </SHIPPER_ORDERS>
OUTER LOOP: $shipper | $name
INNER LOOP: $order | $id | return

SHIPPER (NAME FEDEX, PICKUP 2PM) | NAME FEDEX
  ORDER 1861 (CARRIER FEDEX) | 1861 | <ORDER>1861</ORDER>
  OUTER LOOP return: SHIPPER (NAME FEDEX, ORDER 1861)

SHIPPER (NAME UPS, PICKUP 5PM) | NAME UPS
  ORDER 1897 (CARRIER UPS) | 1897 | <ORDER>1897</ORDER>
  ORDER 1878 (CARRIER UPS) | 1878 | <ORDER>1878</ORDER>
  OUTER LOOP return: SHIPPER (NAME UPS, ORDER 1897, ORDER 1878)
Boolean Expressions
• In this section we examine various types of Boolean expressions that may appear in WHERE clauses
Functions in Boolean Expressions
<ORDER_IDS> {
  for $order in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
  let $id := data($order/NO),
      $lc := count($order/ITEM)
  where $lc > 1
  return <ID> { $id } </ID>
} </ORDER_IDS>
for/let: $order | $id | $lc | where | return
ORDER 1897 | 1897 | 2 | YES | <ID>1897</ID>
ORDER 1878 | 1878 | 1 | NO |
ORDER 1861 | 1861 | 1 | NO |
result: <ORDER_IDS> <ID>1897</ID> </ORDER_IDS>
Disjunctions
<ORDER_IDS> {
  for $order in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
  let $id := data($order/NO),
      $lc := count($order/ITEM)
  where $lc > 1 or $order/CARRIER = "FEDEX"
  return <ID> { $id } </ID>
} </ORDER_IDS>
for/let: $order | $id | $lc | where | return
ORDER 1897 (CARRIER UPS) | 1897 | 2 | YES | <ID>1897</ID>
ORDER 1878 (CARRIER UPS) | 1878 | 1 | NO |
ORDER 1861 (CARRIER FEDEX) | 1861 | 1 | YES | <ID>1861</ID>
result: ORDER_IDS containing <ID>1897</ID> and <ID>1861</ID>
Existential Quantification
<ORDER_IDS> {
  for $order in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
  let $id := data($order/NO)
  where some $sku in $order/ITEM/SKU satisfies $sku = "C5"
  return <ID> { $id } </ID>
} </ORDER_IDS>
for/let: $order | $id | $sku | where | return
ORDER 1897 | 1897 | SKU C5, SKU P5 | YES | <ID>1897</ID>
ORDER 1878 | 1878 | SKU B7 | NO |
ORDER 1861 | 1861 | SKU C5 | YES | <ID>1861</ID>
result: ORDER_IDS containing <ID>1897</ID> and <ID>1861</ID>
Universal Quantification
<ORDER_IDS> {
  for $order in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
  let $id := data($order/NO)
  where every $sku in $order/ITEM/SKU satisfies $sku = "C5"
  return <ID> { $id } </ID>
} </ORDER_IDS>
for/let: $order | $id | $sku | where | return
ORDER 1897 | 1897 | SKU C5, SKU P5 | NO |
ORDER 1878 | 1878 | SKU B7 | NO |
ORDER 1861 | 1861 | SKU C5 | YES | <ID>1861</ID>
result: <ORDER_IDS> <ID>1861</ID> </ORDER_IDS>
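The two quantifiers correspond closely to Python's built-in any and all. The sketch below uses a plain dictionary as a hypothetical stand-in for the orders and their item SKUs, with the values taken from the slides.

```python
# Hypothetical stand-in for the orders: order number -> SKUs of its items.
order_skus = {
    "1897": ["C5", "P5"],
    "1878": ["B7"],
    "1861": ["C5"],
}

# where some $sku in $order/ITEM/SKU satisfies $sku = "C5"
some_c5 = [no for no, skus in order_skus.items() if any(sku == "C5" for sku in skus)]

# where every $sku in $order/ITEM/SKU satisfies $sku = "C5"
every_c5 = [no for no, skus in order_skus.items() if all(sku == "C5" for sku in skus)]

print(some_c5, every_c5)
```

One detail worth keeping in mind: like all([]) in Python, every is true over an empty sequence, so an order without any items would pass the universal test.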
Conditionals Example Tree
Combine customers …
[Figure: CUSTOMER elements with NAME Tom, NAME Ann, and NAME Sue.]
… with member info …
[Figure: MEMBER elements with NAME Sue, NAME Tom, and NAME Bob, each with a STATUS of GOLD or SILVER.]
… to add MEMBER tag to customer data
[Figure: CUSTOMERS tree: Tom with MEMBER YES, Sue with MEMBER YES, Ann with MEMBER NO.]
Conditionals Example Query
<CUSTOMERS> {
  for $customer in doc("co")/CUSTOMER_ORDERS/CUSTOMER
  let $name := $customer/NAME
  return
    <CUSTOMER>
      {$name}
      { if (some $member in doc("m")/MEMBERS/MEMBER
            satisfies $member/NAME = $name)
        then <MEMBER>YES</MEMBER>
        else <MEMBER>NO</MEMBER> }
    </CUSTOMER>
} </CUSTOMERS>
• For each customer, the existential quantification statement checks for the existence of a matching member
• If a matching member is found, the MEMBER YES tags are output; otherwise, the MEMBER NO tags are output
Conditionals Table
<CUSTOMERS> {
  for $customer in doc("co")/CUSTOMER_ORDERS/CUSTOMER
  let $name := $customer/NAME
  return
    <CUSTOMER>
      {$name}
      { if (some $member in doc("m")/MEMBERS/MEMBER
            satisfies $member/NAME = $name)
        then <MEMBER>YES</MEMBER>
        else <MEMBER>NO</MEMBER> }
    </CUSTOMER>
} </CUSTOMERS>
$customer | $name | some $member | if/then/else result | return
CUSTOMER Sue | NAME Sue | MEMBER (NAME Sue, STATUS SILVER) | MEMBER YES | CUSTOMER (NAME Sue, MEMBER YES)
CUSTOMER Tom | NAME Tom | MEMBER (NAME Tom, STATUS GOLD) | MEMBER YES | CUSTOMER (NAME Tom, MEMBER YES)
CUSTOMER Ann | NAME Ann | (none) | MEMBER NO | CUSTOMER (NAME Ann, MEMBER NO)
Simple Aggregation
<ORDERS> {
  for $order in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
  let $id := data($order/NO)
  let $ic := count($order/ITEM)
  return
    <ORDER>
      <ID> {$id} </ID>
      <IC> {$ic} </IC>
    </ORDER>
} </ORDERS>
for/let: $order | $id | $ic | return
ORDER 1897 | 1897 | 2 | ORDER (ID 1897, IC 2)
ORDER 1878 | 1878 | 1 | ORDER (ID 1878, IC 1)
ORDER 1861 | 1861 | 1 | ORDER (ID 1861, IC 1)
result: ORDERS containing the three ORDER elements above
Conditional Aggregation
<ORDERS> {
  for $order in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
  let $id := data($order/NO)
  let $items := for $i in $order/ITEM
                where $i/SKU = "C5"
                return $i
  let $ic := count($items)
  return
    <ORDER>
      <ID> {$id} </ID>
      <IC> {$ic} </IC>
    </ORDER>
} </ORDERS>
$order | $id | $i | where | $items | $ic | return
ORDER 1897 | 1897 | ITEM (SKU C5), ITEM (SKU P5) | YES, NO | ITEM (SKU C5) | 1 | ORDER (ID 1897, IC 1)
ORDER 1878 | 1878 | ITEM (SKU B7) | NO | (empty) | 0 | ORDER (ID 1878, IC 0)
ORDER 1861 | 1861 | ITEM (SKU C5) | YES | ITEM (SKU C5) | 1 | ORDER (ID 1861, IC 1)
Missing Data Join Example
• We will link CUSTOMER_ORDERS with MEMBERS
• There are customers that are not members
Missing Data Join Trees
Combine customers …
[Figure: CUSTOMER elements with NAME Tom, NAME Ann, and NAME Sue.]
… with member info …
[Figure: MEMBER elements with NAME Sue, NAME Tom, and NAME Bob, each with a STATUS of GOLD or SILVER.]
… to produce prioritized customers
[Figure: CUSTOMERS tree: Tom and Sue each with a PRIORITY element taken from the member STATUS; Ann without a PRIORITY element.]
Missing Data Join Query
<CUSTOMERS> {
  for $customer in doc("co")/CUSTOMER_ORDERS/CUSTOMER
  for $member in doc("m")/MEMBERS/MEMBER
  let $name := $customer/NAME
  let $status := data($member/STATUS)
  where $name = $member/NAME
  return
    <CUSTOMER>
      {$name}
      <PRIORITY>{$status}</PRIORITY>
    </CUSTOMER>
} </CUSTOMERS>
Missing Data Join Table
<CUSTOMERS> {
  for $customer in doc("co")/CUSTOMER_ORDERS/CUSTOMER
  for $member in doc("m")/MEMBERS/MEMBER
  let $name := $customer/NAME
  let $status := data($member/STATUS)
  where $name = $member/NAME
  return
    <CUSTOMER>
      {$name}
      <PRIORITY>{$status}</PRIORITY>
    </CUSTOMER>
} </CUSTOMERS>
for/let/join: $customer | $member | $name | $status | return
CUSTOMER Sue | MEMBER (NAME Sue, STATUS SILVER) | NAME Sue | SILVER | CUSTOMER (NAME Sue, PRIORITY SILVER)
CUSTOMER Tom | MEMBER (NAME Tom, STATUS GOLD) | NAME Tom | GOLD | CUSTOMER (NAME Tom, PRIORITY GOLD)
CUSTOMER Ann | (no matching member) | NAME Ann | |
Result for Ann is missing!
Missing Data Join Problem
Wanted:
[Figure: CUSTOMERS tree with Tom (PRIORITY GOLD), Sue (PRIORITY SILVER), and Ann (no PRIORITY).]
Got:
[Figure: CUSTOMERS tree with only Tom (PRIORITY GOLD) and Sue (PRIORITY SILVER).]
The result we want is analogous to an SQL outer join
Missing Data Join Solution Query
Our join query …
<CUSTOMERS> {
  for $customer in doc("co")/CUSTOMER_ORDERS/CUSTOMER
  for $member in doc("m")/MEMBERS/MEMBER
  let $name := $customer/NAME
  let $status := data($member/STATUS)
  where $name = $member/NAME
  return
    <CUSTOMER>
      {$name}
      <PRIORITY>{$status}</PRIORITY>
    </CUSTOMER>
} </CUSTOMERS>
… can be restructured into a nested query:
<CUSTOMERS> {
  for $customer in doc("co")/CUSTOMER_ORDERS/CUSTOMER
  let $name := $customer/NAME
  return
    <CUSTOMER>
      {$name}
      { for $member in doc("m")/MEMBERS/MEMBER
        let $status := data($member/STATUS)
        where $name = $member/NAME
        return <PRIORITY>{$status}</PRIORITY> }
    </CUSTOMER>
} </CUSTOMERS>
Missing Data Join Solution Table
<CUSTOMERS> {
  for $customer in doc("co")/CUSTOMER_ORDERS/CUSTOMER
  let $name := $customer/NAME
  return
    <CUSTOMER>
      {$name}
      { for $member in doc("m")/MEMBERS/MEMBER
        let $status := data($member/STATUS)
        where $name = $member/NAME
        return <PRIORITY>{$status}</PRIORITY> }
    </CUSTOMER>
} </CUSTOMERS>
OUTER LOOP: $customer | $name
INNER LOOP: $member | $status | return

CUSTOMER Sue | NAME Sue
  MEMBER (NAME Sue, STATUS SILVER) | SILVER | PRIORITY SILVER
  OUTER LOOP return: CUSTOMER (NAME Sue, PRIORITY SILVER)

CUSTOMER Tom | NAME Tom
  MEMBER (NAME Tom, STATUS GOLD) | GOLD | PRIORITY GOLD
  OUTER LOOP return: CUSTOMER (NAME Tom, PRIORITY GOLD)

CUSTOMER Ann | NAME Ann
  (no matching member)
  OUTER LOOP return: CUSTOMER (NAME Ann)
Missing Data Joins vs. Nested Queries
• In joins, tuples with any missing data are eliminated
  – equivalent to an SQL natural or inner join
• In nested queries, tuples are output in spite of missing data
  – equivalent to an SQL outer join
Nested Query Problem
• How to remove tuples that have some missing data
• How to force inner join functionality in a nested query
Missing Data Nested Query Example
• Suppose we want a list, by product, of all items on order– perhaps for pulling the items from stock
• For each product, we want bundles, separate quantities for each order
• We don’t want to list products with no items on order
Missing Data Nested Query Trees
Combine products …
[Figure: PRODUCT elements: SKU C4 (NAME Case), SKU C5 (NAME Cable), SKU B7 (NAME Battery), SKU P5 (NAME Phone).]
… with order items …
[Figure: ITEM elements: SKU C5 with QTY 2, SKU C5 with QTY 1, SKU P5 with QTY 1, SKU B7 with QTY 2.]
… to produce items on order
[Figure: ITEMS_ON_ORDER tree with PRODUCT elements for C5 (Cable), P5 (Phone), and B7 (Battery), each holding one BUNDLE per matching item; C4 (Case) is absent.]
Missing Data Nested Query
<ITEMS_ON_ORDER> {
  for $p in doc("p")/PRODUCTS/PRODUCT
  let $sku := $p/SKU
  let $name := $p/NAME
  return
    <PRODUCT>
      {$sku}
      {$name}
      { for $i in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER/ITEM
        let $qty := data($i/QTY)
        where $sku = $i/SKU
        return <BUNDLE> {$qty} </BUNDLE> }
    </PRODUCT>
} </ITEMS_ON_ORDER>
Missing Data Nested Query Table
OUTER LOOP: $p | $sku | $name
INNER LOOP: $i | $qty | return

PRODUCT (SKU C4, NAME Case) | SKU C4 | NAME Case
  (no matching item)
  OUTER LOOP return: PRODUCT (SKU C4, NAME Case)

PRODUCT (SKU C5, NAME Cable) | SKU C5 | NAME Cable
  ITEM (SKU C5, QTY 2) | 2 | BUNDLE 2
  ITEM (SKU C5, QTY 1) | 1 | BUNDLE 1
  OUTER LOOP return: PRODUCT (SKU C5, NAME Cable, BUNDLE 2, BUNDLE 1)

PRODUCT (SKU B7, NAME Battery) | SKU B7 | NAME Battery
  ITEM (SKU B7, QTY 2) | 2 | BUNDLE 2
  OUTER LOOP return: PRODUCT (SKU B7, NAME Battery, BUNDLE 2)

PRODUCT (SKU P5, NAME Phone) | SKU P5 | NAME Phone
  ITEM (SKU P5, QTY 1) | 1 | BUNDLE 1
  OUTER LOOP return: PRODUCT (SKU P5, NAME Phone, BUNDLE 1)
Missing Data Nested Query Problem
Wanted:
[Figure: ITEMS_ON_ORDER tree with PRODUCT elements for C5 (Cable), P5 (Phone), and B7 (Battery) and their BUNDLE elements.]
Got:
[Figure: the same ITEMS_ON_ORDER tree plus a PRODUCT element for C4 (Case) with no BUNDLE elements.]
The result we want is analogous to an SQL inner (natural) join
Missing Data Nested Query Solution
Our nested query …
<ITEMS_ON_ORDER> {
  for $p in doc("p")/PRODUCTS/PRODUCT
  let $sku := $p/SKU
  let $name := $p/NAME
  return
    <PRODUCT>
      {$sku}
      {$name}
      { for $i in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER/ITEM
        let $qty := data($i/QTY)
        where $sku = $i/SKU
        return <BUNDLE> {$qty} </BUNDLE> }
    </PRODUCT>
} </ITEMS_ON_ORDER>
… can be restructured with the inner for loop moved to a variable in the outer loop, and a where clause can be added to remove outer elements with no inner elements:
<ITEMS_ON_ORDER> {
  for $p in doc("p")/PRODUCTS/PRODUCT
  let $sku := $p/SKU,
      $name := $p/NAME
  let $bundle := for $i in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER/ITEM
                 let $qty := data($i/QTY)
                 where $sku = $i/SKU
                 return <BUNDLE> {$qty} </BUNDLE>
  where not(empty($bundle))
  return
    <PRODUCT>
      {$sku}
      {$name}
      {$bundle}
    </PRODUCT>
} </ITEMS_ON_ORDER>
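The restructuring can be sketched in Python: the inner loop is first bound to a variable, and the outer tuple is kept only if that variable is non-empty. The tuples below are hypothetical stand-ins for the PRODUCT and ITEM elements, with values taken from the slides.

```python
# Hypothetical stand-ins: (SKU, NAME) per product and (SKU, QTY) per order item.
products = [("C4", "Case"), ("C5", "Cable"), ("B7", "Battery"), ("P5", "Phone")]
items = [("C5", 2), ("P5", 1), ("B7", 2), ("C5", 1)]

items_on_order = []
for sku, name in products:                                           # for $p in ...
    # let $bundle := for $i in ... where $sku = $i/SKU return <BUNDLE>...
    bundle = [qty for item_sku, qty in items if item_sku == sku]
    if bundle:                                                       # where not(empty($bundle))
        items_on_order.append((sku, name, bundle))                   # return <PRODUCT>...

print(items_on_order)
```

Product C4 produces an empty bundle list and is dropped, which gives the inner-join-like result while keeping the nested output shape.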
Missing Data Nested Query Solution Table
$p | $sku | $name | $i | $qty | $bundle | where | return
PRODUCT (SKU C4, NAME Case) | SKU C4 | NAME Case | (none) | | (empty) | NO |
PRODUCT (SKU C5, NAME Cable) | SKU C5 | NAME Cable | ITEM (SKU C5, QTY 2), ITEM (SKU C5, QTY 1) | 2, 1 | BUNDLE 2, BUNDLE 1 | YES | PRODUCT (SKU C5, NAME Cable, BUNDLE 2, BUNDLE 1)
PRODUCT (SKU B7, NAME Battery) | SKU B7 | NAME Battery | ITEM (SKU B7, QTY 2) | 2 | BUNDLE 2 | YES | PRODUCT (SKU B7, NAME Battery, BUNDLE 2)
PRODUCT (SKU P5, NAME Phone) | SKU P5 | NAME Phone | ITEM (SKU P5, QTY 1) | 1 | BUNDLE 1 | YES | PRODUCT (SKU P5, NAME Phone, BUNDLE 1)
Advanced Example Trees
Combine products …
[Figure: PRODUCT elements: SKU C4 (NAME Case), SKU C5 (NAME Cable), SKU B7 (NAME Battery), SKU P5 (NAME Phone).]
… with orders …
[Figure: ORDER elements 1861 (CARRIER FEDEX; item C5), 1878 (CARRIER UPS; item B7), and 1897 (CARRIER UPS; items C5 and P5).]
… to produce orders for each product
[Figure: PRODUCT_ORDERS tree: PRODUCT C4 (no orders), PRODUCT C5 (ORDER 1861, ORDER 1897), PRODUCT P5 (ORDER 1897), PRODUCT B7 (ORDER 1878).]
Advanced Example Query
(: By the way, this is a comment :)
<PRODUCT_ORDERS> {
  for $product in doc("p")/PRODUCTS/PRODUCT
  return
    <PRODUCT>
      {$product/SKU}
      { for $order in doc("co")/CUSTOMER_ORDERS/CUSTOMER/ORDER
        let $id := data($order/NO)
        where some $sku in $order/ITEM/SKU
              satisfies $sku = $product/SKU
        return <ORDER> { $id } </ORDER> }
    </PRODUCT>
} </PRODUCT_ORDERS>
• For each product (outer for loop), loop through all orders (inner for loop)
• The where statement filters out orders which don’t contain the product under consideration
Advanced Example Exercises
• Preparation of the query table (table of variable bindings) is left as an exercise
• How can the query be rewritten to
  – eliminate products with no orders?
  – add a <no_orders/> tag to products with no orders?
  – sort by SKU?
  – add a total quantity ordered count under each product?
Sequences
• Ordered lists of items: element, attribute or text nodes, atomic values, or a combination thereof
• Can be constructed in for/let clauses
  for $product in doc("p")/PRODUCTS/PRODUCT
• Or manually in the return clause
  for $product in doc("p")/PRODUCTS/PRODUCT
  return ( <SKU>{data($product/SKU)}</SKU>,
           <NAME>{data($product/NAME)}</NAME> )
• Not needed if a parent element constructor is present
  for $product in doc("p")/PRODUCTS/PRODUCT
  return
    <PRODUCT>
      <SKU>{data($product/SKU)}</SKU>
      <NAME>{data($product/NAME)}</NAME>
    </PRODUCT>
Sequences
• Concatenation
  ($seq1, $seq2)
• Union
  $seq1 union $seq2
  $seq1 | $seq2
  – Example:
    for $product in doc("p")/PRODUCTS/PRODUCT union doc("co")//ITEM
    return $product
• Intersection
  $seq1 intersect $seq2
• Difference
  $seq1 except $seq2
• Union, Intersection and Difference operate on sequences of nodes, remove duplicates, and return their result in document order
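The difference between concatenation and the three set-like operators can be sketched in Python. Nodes are compared by identity, as in XQuery, and the helper in_doc_order is a hypothetical name introduced for this sketch; the inline XML is a stand-in for a small products document.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a products document.
root = ET.fromstring(
    "<PRODUCTS>"
    "<PRODUCT><SKU>C4</SKU></PRODUCT>"
    "<PRODUCT><SKU>C5</SKU></PRODUCT>"
    "<PRODUCT><SKU>B7</SKU></PRODUCT>"
    "</PRODUCTS>"
)
p = root.findall("PRODUCT")
seq1 = [p[1], p[0]]   # C5, C4
seq2 = [p[0], p[2]]   # C4, B7

# Position of every node in document order, keyed by node identity.
position = {id(node): index for index, node in enumerate(root.iter())}


def in_doc_order(nodes):
    """Remove duplicate nodes (by identity) and sort the rest in document order."""
    unique = {id(node): node for node in nodes}
    return sorted(unique.values(), key=lambda node: position[id(node)])


def skus(nodes):
    return [node.findtext("SKU") for node in nodes]


ids2 = {id(node) for node in seq2}
concatenation = seq1 + seq2                                        # ($seq1, $seq2)
union = in_doc_order(seq1 + seq2)                                  # $seq1 union $seq2
intersection = in_doc_order([n for n in seq1 if id(n) in ids2])    # $seq1 intersect $seq2
difference = in_doc_order([n for n in seq1 if id(n) not in ids2])  # $seq1 except $seq2

print(skus(concatenation), skus(union), skus(intersection), skus(difference))
```

Concatenation keeps the duplicate C4 node and the original ordering, whereas the other three operators treat their operands as sets of nodes.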
Query Prolog
User-Defined Functions
• Useful for recursion
declare function local:depth($e as element()) as xs:integer {
  if (empty($e/*))
  then 1
  else max( for $child in $e/*
            return local:depth($child) + 1 )
};
for $a in doc("co")/CUSTOMER_ORDERS
return local:depth($a)
• “local” prefix is reserved for user-defined functions
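The same recursion can be sketched in Python over an ElementTree element. The function name depth and the inline XML are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET


def depth(element):
    """Depth of the element tree: 1 for a leaf, else 1 + the deepest child."""
    children = list(element)              # $e/*
    if not children:                      # if (empty($e/*)) then 1
        return 1
    return max(depth(child) + 1 for child in children)


# Hypothetical stand-in for doc("co").
co = ET.fromstring(
    "<CUSTOMER_ORDERS>"
    "<CUSTOMER><NAME>Ann</NAME></CUSTOMER>"
    "<CUSTOMER><NAME>Sue</NAME>"
    "<ORDER><NO>1897</NO><ITEM><SKU>C5</SKU></ITEM></ORDER>"
    "</CUSTOMER>"
    "</CUSTOMER_ORDERS>"
)

tree_depth = depth(co)
print(tree_depth)
```

The longest chain here is CUSTOMER_ORDERS, CUSTOMER, ORDER, ITEM, SKU, so the depth is five levels.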
Global Variables
• Also declared in the query prolog
declare variable $threshold := 2;
for $order in doc("co")//ORDER
let $totalQty := sum($order//QTY)
where $totalQty > $threshold
return $order
• Can be used to parameterize your queries
XQuery and XML Schemas
• XML Schemas can be used within XQuery to validate:
  – Input documents
  – Query Result
import schema namespace in="http://www.cse.buffalo.edu/in" at "in.xsd";
import schema namespace out="http://www.cse.buffalo.edu/out" at "out.xsd";
validate {
  <out:CUSTOMER_ORDERS> {
    for $custs in doc("co")/in:CUSTOMER_ORDERS/*
    return $custs
  } </out:CUSTOMER_ORDERS>
}