cache-and-query for wide area sensor databases
DESCRIPTION
Cache-and-Query for Wide Area Sensor Databases. Amol Deshpande, UC Berkeley Suman Nath, CMU Phillip Gibbons, Intel Research Pittsburgh Srinivasan Seshan, CMU. Outline. Overview of IrisNet Example application: Parking Space Finder Query processing in IrisNet Data partitioning - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/1.jpg)
IrisNet
Cache-and-Query for Wide Area Cache-and-Query for Wide Area
Sensor DatabasesSensor Databases
Amol Deshpande, UC Berkeley
Suman Nath, CMU
Phillip Gibbons, Intel Research Pittsburgh
Srinivasan Seshan, CMU
![Page 2: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/2.jpg)
June 2003 2IrisNet
OutlineOutline
•Overview of IrisNet
•Example application: Parking Space Finder
•Query processing in IrisNet• Data partitioning
• Distributed query execution
•Conclusions
![Page 3: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/3.jpg)
June 2003 3IrisNet
Internet-scale Resource-intensive Internet-scale Resource-intensive Sensor Network Services (IrisNet)Sensor Network Services (IrisNet)•Motivation
• Proliferation of resource-intensive sensors attached to powerful devices
• Webcams, pressure gauges, microphones
• Rich data sources with high data volumes
• Typically distributed over wide geographical areas
• Useful services utilizing such sensors missing
•IrisNet: An infrastructure to support deployment of
sensor services over such sensors
![Page 4: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/4.jpg)
June 2003 4IrisNet
IrisNet: Design GoalsIrisNet: Design Goals
• Ease of deployment of sensor services• Minimal requirements from the service provider
• Distributed data storage and querying for high
throughputs
• Ease of querying• XML as the data format, XPATH as the query language
• Natural geographical hierarchy on data as well as queries
• Continuously evolving data
• Location transparency
• Logical view of the entire distributed database as a single
centralized XML document
![Page 5: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/5.jpg)
June 2003 5IrisNet
IrisNet ArchitectureIrisNet Architecture
•Sensing Agents (SA)• PDA/PC-class processor, MBs–GBs storage
• Collect & process data from sensors, as dictated by “senselet” code uploaded by OAs
• Processed data sent to the OAs for update in-place
•Organizing Agents (OA)• PC/Server-class processor, GBs storage
• Provide data storage, discovery, querying facilities
• Use an off-the-shelf database to store data locally
• Interface with the local database using XPATH/XSLT
SA
SAOAs
![Page 6: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/6.jpg)
June 2003 6IrisNet
OutlineOutline
•Overview of IrisNet
•Example application: Parking Space Finder
•Query processing in IrisNet• Data partitioning
• Distributed query execution
•Conclusions
![Page 7: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/7.jpg)
June 2003 7IrisNet
Example Application : Parking Example Application : Parking Space Finder (PSF)Space Finder (PSF)• Webcams monitor parking spaces and provide real-time
information about their availability
• Image processing to extract availability information
• Natural geographical hierarchy on the dataCounty (Allegheny)
City (Pittsburgh)
Neighborhood(Oakland)
Block 1 Block 3Block 2
Parkingspace 1 Parkingspace 3Parkingspace 2
![Page 8: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/8.jpg)
June 2003 8IrisNet
Example XML Fragment for PSFExample XML Fragment for PSF<State id=“Pennysylvinia”>
<County id=“Allegheny”>
<City id=“Pittsburgh”>
<Neighborhood id=“Oakland”>
<total-spaces>200</total-spaces>
<Block id=“1”>
<GPS>…</GPS>
<pSpace id=“1”>
<in-use>no</in-use>
<metered>yes</metered>
</pSpace>
<pSpace id=“2”>
…
</pSpace>
</Block>
</Neighborhood>
<Neighborhood id=“Shadyside”>
…
![Page 9: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/9.jpg)
June 2003 9IrisNet
Example XML Fragment for PSFExample XML Fragment for PSF
Cityid = 'Pittsburgh'
Neighborhoodid = 'Oakland'
Blockid =' 1'
pSpaceid = '1'
in-use
no
metered
yes
Neighborhoodid = 'Shadyside'
Blockid =' 2'
Price
GPS pSpaceid = '2'
total-spaces
200
Stateid='Pennsylvania'
Countyid='Allegheny'
![Page 10: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/10.jpg)
June 2003 10IrisNet
Example QueriesExample Queries
• Users issue queries against the document as a whole
• Find all available parking spots in Oakland
/State[@id=“Pennsylvania”]/County[@id=“Allegheny”]/City[@id=“Pittsburgh”] /Neighborhood[@id=“Oakland”]/Block/pSpace[in-use =
“no”]
• Find all blocks in in Allegheny have more than 20 metered parking spots /State[@id=“Pennsylvania”]/County[@id=“Allegheny”]
//Block[count(./pSpace[metered = “yes”]) > 20]
• Find the cheapest parking spot in Oakland Block 1
/State[@id=“Pennsylvania”]/County[@id=“Allegheny”]/City[@id=“Pittsburgh”] /Neighborhood[@id=“Oakland”]/Block[@id=‘1’] /pSpace[not(../pSpace/price > ./price)]
• Challenge : Evaluate arbitrary XPATH queries against the document even though the document may be partitioned across multiple OAs
![Page 11: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/11.jpg)
June 2003 11IrisNet
Data Partitioning and Query Data Partitioning and Query Processing: OverviewProcessing: Overview
• Maintain data partitioning invariants• Used to guarantee that an OA always has sufficient
information to participate correctly in a query
• Use DNS to maintain the data distribution information and to route queries to data
• Convert the XPATH query to an XSLT query that :• Walks the document recursively
• Evaluates part of the query that can be done locally
• Gathers missing information by asking subqueries
![Page 12: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/12.jpg)
June 2003 12IrisNet
OutlineOutline
•Overview of IrisNet
•Example application: Parking Space Finder
•Query processing in IrisNet• Data partitioning
• Distributed query execution
•Conclusions
![Page 13: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/13.jpg)
June 2003 13IrisNet
Partitioning GranularityPartitioning Granularity
•Definition : An IDable node in the document• Has an “id” attribute with value unique among its
siblings
• All its ancestors in the document are IDableCity
id = 'Pittsburgh'
Neighborhoodid = 'Oakland'
Blockid =' 1'
pSpaceid = '1'
in-use
no
metered
yes
Neighborhoodid = 'Shadyside'
Blockid =' 2'
Price
GPS pSpaceid = '2'
total-spaces
200
![Page 14: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/14.jpg)
June 2003 14IrisNet
Partitioning GranularityPartitioning Granularity
•Definition : Local Information of an IDable node• All its attributes and all its non-IDable descendants
• IDs of all its IDable children
Cityid = 'Pittsburgh'
Neighborhoodid = 'Oakland'
Blockid =' 1'
pSpaceid = '1'
in-use
no
metered
yes
Neighborhoodid = 'Shadyside'
Blockid =' 2'
Price
GPS pSpaceid = '2'
total-spaces
200
![Page 15: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/15.jpg)
June 2003 15IrisNet
Partitioning GranularityPartitioning Granularity
•Definition : Local Information of an IDable node• All its attributes and all its non-IDable descendants
• IDs of all its IDable children
Cityid = 'Pittsburgh'
Neighborhoodid = 'Oakland'
Blockid =' 1'
pSpaceid = '1'
in-use
no
metered
yes
Neighborhoodid = 'Shadyside'
Blockid =' 2'
Price
GPS pSpaceid = '2'
total-spaces
200
![Page 16: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/16.jpg)
June 2003 16IrisNet
Data PartitioningData Partitioning
•Data storage, ownership always in units of local
informations corresponding to the IDable nodes in
the document• These form a nearly-disjoint partitioning of the overall
document
• Granularity can be controlled using the “id” attributes
• A partitioning unit can be uniquely identified using the “id”’s on the path to the root of the document
•Data ownership:• Each partitioning unit owned by exactly one OA
![Page 17: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/17.jpg)
June 2003 17IrisNet
Data PartitioningData Partitioning
• Data stored locally at each OA:• A document fragment consisting of union of partitioning units
• Constraints:
• Must store the document fragment it owns• If stored the “id” of an IDable node, must also store the
local information of all its ancestors
• We minimize the amount of information required to store (details in paper)
• Only need to store ID’s of all ancestors, and of their children
• Invariant :
• If an OA has the “id” of an IDable node, it either• Has the local information for the node, or• Has the “id”’s on the path to the root allowing it to locate
the local information for that node
![Page 18: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/18.jpg)
June 2003 18IrisNet
Data Partitioning: ExampleData Partitioning: Example
Neighborhoodid = 'Oakland'
Blockid =' 1'
Blockid =' 2'
pSpaceid = '1'
pSpaceid = '2'
Cityid = 'Pittsburgh'
pSpaceid = '3'
Countyid = 'Allegheny'
Neighborhoodid = 'Shadyside'OA 2 Owns
OA 1 Owns
![Page 19: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/19.jpg)
June 2003 19IrisNet
Data Partitioning: ExampleData Partitioning: Example
Data storage configuration at OA 1
Local information required
Local information optional
Neighborhoodid = 'Oakland'
Blockid =' 1'
Blockid =' 2'
pSpaceid = '1'
pSpaceid = '2'
Cityid = 'Pittsburgh'
pSpaceid = '3'
Countyid = 'Allegheny'
Neighborhoodid = 'Shadyside'
Local information optional
![Page 20: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/20.jpg)
June 2003 20IrisNet
Data Partitioning: ExampleData Partitioning: Example
Data storage configuration at OA 2
Local information required
Local information required
Neighborhoodid = 'Oakland'
Blockid =' 1'
Blockid =' 2'
pSpaceid = '1'
pSpaceid = '2'
Cityid = 'Pittsburgh'
pSpaceid = '3'
Countyid = 'Allegheny'
Neighborhoodid = 'Shadyside'
Local information optional
![Page 21: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/21.jpg)
June 2003 21IrisNet
Mapping Data to OAsMapping Data to OAs
• Mapping of nodes to physical OAs maintained using DNS
• For each IDable node, create a unique DNS-style name by concatenating the IDs on the path to the root
• Mapped to OA 1:• Allegheny-County….iris.net• Pittsburgh-City.Allegheny-County….iris.net
• Mapped to OA 2:• Oakland-Neighborhood.Pittsburgh-City. Allegheny-County….iris.net• 1-Block.Oakland-Neighborhood.Pittsburgh-City.Allegheny-County….iris.net• 1-pSpace.1-Block.Oakland-Neighborhood. Pittsburgh-City.Allegheny-County….iris.net• …
Neighborhoodid = 'Oakland'
Blockid =' 1'
Blockid =' 2'
pSpaceid = '1'
pSpaceid = '2'
Cityid = 'Pittsburgh'
pSpaceid = '3'
Countyid = 'Allegheny'
Neighborhoodid = 'Shadyside'OA 2 Owns
OA 1 Owns
![Page 22: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/22.jpg)
June 2003 22IrisNet
OutlineOutline
•Overview of IrisNet
•Example application: Parking Space Finder
•Query processing in IrisNet• Data partitioning
• Distributed query execution
•Conclusions
![Page 23: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/23.jpg)
June 2003 23IrisNet
Self-Starting Distributed QueriesSelf-Starting Distributed Queries
• Each query has a hierarchical prefix/State[@id=‘Pennsylvania’]/County[@id=‘Allegheny’]
/City[@id=‘Pittsburgh’]/ /Neighborhood[@id=‘Oakland’]/Block/pSpace
• Simple parsing of the query to extract the least
common ancestor (LCA) of the possible query result
• Send the query to Oakland-
Neighborhood.Pittsburgh-City. Allegheny-
County.Pennsylvania-State.parking.intel-iris.net
• Name extracted from query without any global or
per-service state
![Page 24: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/24.jpg)
June 2003 24IrisNet
QEG Details QEG Details
•Nesting depth of an XPATH query• Maximum depth at which a location path that traverses
over IDable nodes occurs in the query
•Examples :• /a[@id=‘x’]/b[@id=‘y’]/c 0
• /a[@id=‘x’]//c 0
• /a[./b/c]/b 1 (if b is IDable)
• /a[count(./b/[./c[@id=‘1’]]) 2
•Complexity of evaluating a query increases with
nesting depth
![Page 25: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/25.jpg)
June 2003 25IrisNet
Queries with Nesting Depth = 0Queries with Nesting Depth = 0
•Any predicate in the query can be evaluated using
just the local information for an IDable node• Example : …/Block[@id=‘1’][./available-spaces > 10]
•Sketch of the XSLT program :• Walk the document recursively
• If local information for the node under consideration available, evaluate the part of the query that refers to that node, otherwise tag the returned answer with the tag “asksubquery”
•Postprocessor finds the missing information by
asking subqueries
![Page 26: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/26.jpg)
June 2003 26IrisNet
CachingCaching
•A site can add to its document any fragment as
long as the data partitioning constraints are
satisfied
•We generalize subqueries to fetch the smallest
superset of the answer that satisfies the
constraints and cache it
•Data time-stamped at the time of caching
•Queries can specify freshness requirements
![Page 27: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/27.jpg)
June 2003 27IrisNet
Further Details in PaperFurther Details in Paper
•Queries with Nesting Depth > 0
•Schema changes
•Data partitioning changes
•Implementation details and experimental study
![Page 28: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/28.jpg)
June 2003 28IrisNet
ConclusionsConclusions
•Identified the challenges in query processing over
a distributed XML document
•Developed formal framework and techniques that • Allow for flexible document partitioning
• Integrate caching seamlessly
• Correctly and efficiently answer XPATH queries
•Experimental results demonstrate the advantages
of flexible data partitioning and caching
![Page 29: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/29.jpg)
June 2003 29IrisNet
Further InformationFurther Information
• IrisNet project website• http://www.intel-iris.net
![Page 30: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/30.jpg)
June 2003 30IrisNet
OutlineOutline
•Overview of IrisNet
•Example application: Parking Space Finder
•Query processing in IrisNet• Data partitioning
• Distributed query execution
•Performance Study
![Page 31: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/31.jpg)
June 2003 31IrisNet
Performance Study SetupPerformance Study Setup• Current prototype written in Java
• A cluster of 9 2GHz Pentium IV machines
• Apache Xindice used as the backend XML database
• Artificially generated database• 2400 parking spaces with 2 cities, 6 neighborhoods and
120 blocks
• Five query workloads• QW-1: Asking for a single block
• QW-2: Asking for two blocks from a single neighborhood
• QW-3: Asking for two blocks from two neighborhoods
• QW-4: Asking for two blocks from two cities
• QW-Mix: 40% of QW-1 and QW-2, 15% QW-3, 5%QW-4
![Page 32: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/32.jpg)
June 2003 32IrisNet
Architectures ComparedArchitectures Compared
Queries
SA Updates
Parking Space
Block
Neighborhood
City
Centralized
Queries
SA Updates
Neigh-borhood
City
Centralized querying,distributed update
Parking Space
Block
![Page 33: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/33.jpg)
June 2003 33IrisNet
CachingCaching
•Architecture already allows for caching data• An OA is allowed to store more data than that it owns
• Data time-stamped at the time of caching
• Queries can specify freshness tolerance
![Page 34: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/34.jpg)
June 2003 34IrisNet
Architectures ComparedArchitectures Compared
Distributed querying/update,fixed two-level organization
SA Updates
Neigh-borhood
City
DNSServer
Parking Space
Block
Distributed querying/updates,hierarchical organization
SA Updates
DNSServer
Neighborhood
City
Parking Space
Block
![Page 35: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/35.jpg)
June 2003 35IrisNet
Query ThroughputsQuery Throughputs
![Page 36: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/36.jpg)
June 2003 36IrisNet
Data Partitioning: Example 2Data Partitioning: Example 2
OA 1 OWNS
OA 2 OWNSNeighborhoodid = 'Oakland'
Blockid =' 1'
Blockid =' 2'
pSpaceid = '1'
pSpaceid = '2'
Cityid = 'Pittsburgh'
pSpaceid = '3'
Countyid = 'Allegheny'
• e.g. OA 2 must store local information of the County(Allegheny) node
![Page 37: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/37.jpg)
June 2003 37IrisNet
ConclusionsConclusions
•Location transparency • distributed DB hidden from user
•Flexible data partitioning
•Low latency queries & Query scalability• Direct query routing to LCA of the answer
• Query-driven caching, supporting partial matches
• Load shedding; No per-service state needed at web servers
•Support query-based consistency
•Use off-the-shelf DB components
![Page 38: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/38.jpg)
June 2003 38IrisNet
Example XML Fragment for PSFExample XML Fragment for PSF
…
<County id=“Allegheny”>
<City id=“Pittsburgh”>
<Neighborhood id=“Oakland”>
<available-spaces>8</available-spaces>
<Block id=“1”>
<pSpace id=“1”>
<in-use>no</in-use>
<metered>yes</metered>
</pSpace>
…
</Block>
</Neighborhood>
</City>
</County>
…
Neighborhoodid = 'Oakland'
Blockid =' 1'
Blockid =' 2'
pSpaceid = '1'
pSpaceid = '2'
Cityid = 'Pittsburgh'
pSpaceid = '3'
Countyid = 'Allegheny'
in-use GPS metered
yesno
![Page 39: Cache-and-Query for Wide Area Sensor Databases](https://reader036.vdocuments.mx/reader036/viewer/2022062422/56813bab550346895da4ddff/html5/thumbnails/39.jpg)
June 2003 39IrisNet
OutlineOutline
•Overview of IrisNet
•Example application: Parking Space Finder
•Query processing in IrisNet• Data partitioning
• Distributed query execution
•Performance Study