Approximating Sensor Network Queries Using In-Network Summaries Alexandra Meliou Carlos Guestrin Joseph Hellerstein

Post on 15-Jan-2016


Page 1

Approximating Sensor Network Queries Using In-Network Summaries

Alexandra Meliou

Carlos Guestrin

Joseph Hellerstein

Page 2

Approximate Answer Queries

Approximate representation of the world:
• Discrete locations
• Lossy communication
• Noisy measurements

Applications do not expect accurate values (tolerance to noise)

Example: Return the temperature at all locations ±1 °C, with 95% confidence

Query Satisfaction: In expectation, the requested portion of sensor values lies within the error range

Page 3

In-network Decisions

Use in-network models to make routing decisions

No centralized planning

Page 4

In-network Summaries

Spanning tree $T(V, E')$ + models $M_v$ for all nodes $v$

$M_v$ represents the whole subtree rooted at $v$.

Page 5

Model Complexity

Need for compression

Gaussian distributions at the leaves:
• good for modeling individual node measurements

Page 6

Talk “outline”

Compression

Traversal

Construction

In-network summaries

Page 7

Collapsing Gaussian Mixtures

Compress an m-size mixture to a k-size mixture.

Look at the simple case (k = 1)

Minimize KL-divergence? The resulting single Gaussian can assign “fake” mass to regions where the mixture has little.
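To make the k = 1 case concrete: for a one-dimensional mixture, the single Gaussian that minimizes KL(mixture ‖ Gaussian) is the one matching the mixture's mean and variance. The sketch below assumes that direction of the KL-divergence (the slide does not say which one is meant), and the function names are illustrative only.

```python
import math

def collapse_to_single_gaussian(weights, means, variances):
    """Moment-match a 1-D Gaussian mixture with one Gaussian.

    Assumption: we minimize KL(mixture || gaussian), whose optimum
    matches the mixture's first two moments.
    """
    total = sum(weights)
    w = [x / total for x in weights]  # normalize the weights
    mean = sum(wi * mi for wi, mi in zip(w, means))
    second = sum(wi * (vi + mi * mi) for wi, mi, vi in zip(w, means, variances))
    return mean, second - mean * mean

def gaussian_mass(mean, var, lo, hi):
    """Probability mass of N(mean, var) on [lo, hi]."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))
    return cdf(hi) - cdf(lo)

# Two well-separated readings, e.g. 10 and 30 degrees, equally weighted.
mean, var = collapse_to_single_gaussian([0.5, 0.5], [10.0, 30.0], [1.0, 1.0])
# Mass the collapsed model places around its own mean, where the
# mixture itself has almost none.
fake = gaussian_mass(mean, var, mean - 1.0, mean + 1.0)
true = 0.5 * gaussian_mass(10.0, 1.0, mean - 1.0, mean + 1.0) \
     + 0.5 * gaussian_mass(30.0, 1.0, mean - 1.0, mean + 1.0)
```

In this example the collapsed Gaussian is centered between the two modes, so `fake` is far larger than `true`, which is the effect the slide labels as fake mass.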

Page 8

Quality of Compression

Depends on query workload

Query with acceptable error window W

Query with acceptable error window W’ < W

Page 9

Compression

Accurate mass inside interval

No guarantee on the tails

$$\max_z \int_{z-w}^{z+w} f(x)\,dx$$

$$\int_{\mu-w}^{\mu+w} \mathcal{N}(\mu,\sigma^2)\,dx = \sum_i \int_{\mu-w}^{\mu+w} \mathcal{N}_i(\mu_i,\sigma_i^2)\,dx$$
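A minimal sketch of this window-based compression, under several assumptions that are not on the slide: the mixture components carry weights that sum to 1, the center μ is taken to be the maximizing z, and the maximization is done by a simple grid search. The function names and the `grid_steps` parameter are hypothetical.

```python
from statistics import NormalDist

def mixture_mass(components, lo, hi):
    """Mass of a weighted Gaussian mixture on [lo, hi].

    components: list of (weight, mean, std); weights are assumed to sum to 1.
    """
    return sum(wt * (NormalDist(m, s).cdf(hi) - NormalDist(m, s).cdf(lo))
               for wt, m, s in components)

def compress_for_window(components, w, grid_steps=2000):
    """Pick mu maximizing the mixture mass in [mu-w, mu+w] (grid search),
    then choose sigma so N(mu, sigma^2) has the same mass in that window."""
    lo = min(m - 3 * s for _, m, s in components)
    hi = max(m + 3 * s for _, m, s in components)
    candidates = [lo + (hi - lo) * i / grid_steps for i in range(grid_steps + 1)]
    mu = max(candidates, key=lambda z: mixture_mass(components, z - w, z + w))
    p = mixture_mass(components, mu - w, mu + w)
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep inv_cdf in its valid domain
    # Mass of N(mu, sigma^2) on [mu-w, mu+w] is 2*Phi(w/sigma) - 1 = p.
    sigma = w / NormalDist().inv_cdf((1.0 + p) / 2.0)
    return mu, sigma, p

components = [(0.5, 10.0, 1.0), (0.5, 30.0, 1.0)]
mu, sigma, p = compress_for_window(components, w=1.0)
single = NormalDist(mu, sigma)
single_mass = single.cdf(mu + 1.0) - single.cdf(mu - 1.0)
```

By construction the single Gaussian reproduces the mixture's mass inside the window; nothing constrains how it behaves outside, which matches the "no guarantee on the tails" remark.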

Page 10

Talk “outline”

Compression

Traversal

Construction

In-network summaries

Page 11

Query Satisfaction

A response $R = \{r_1, \dots, r_n\}$ satisfies query $Q(w, \delta)$ if, in expectation, the values of at least $\delta n$ nodes lie within $[r_i - w, r_i + w]$:

$$\sum_i \int_{r_i-w}^{r_i+w} f_i(x)\,dx \ge \delta n$$

[Figure: a query Q posed to the in-network summary returns the response R = [r1, r2, r3, r4, r5, r6, r7, r8, r9, r10]; legend: within error bounds]
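The satisfaction condition can be checked directly when each node's density f_i is available. The sketch below assumes each f_i is a single Gaussian given as a (mean, std) pair; that representation and the function names are illustrative choices, not something the slides prescribe.

```python
from statistics import NormalDist

def expected_hits(response, models, w):
    """Expected number of nodes whose value lies within w of its response value.

    models: one (mean, std) Gaussian per node (an assumption of this sketch).
    """
    total = 0.0
    for r, (mean, std) in zip(response, models):
        dist = NormalDist(mean, std)
        total += dist.cdf(r + w) - dist.cdf(r - w)
    return total

def satisfies(response, models, w, delta):
    """True if the response meets Q(w, delta) in expectation."""
    return expected_hits(response, models, w) >= delta * len(models)

# Three nodes near 20 degrees and one outlier near 25, all answered with 20.
models = [(20.0, 0.5), (20.0, 0.5), (20.0, 0.5), (25.0, 0.5)]
response = [20.0, 20.0, 20.0, 20.0]
hits = expected_hits(response, models, w=1.0)
```

With these numbers the outlier contributes almost nothing, so the response meets a 70% requirement but not a 95% one.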

Page 12

Optimal Traversal

Given: tree $T = G(V, E)$ and models $M_v$

Find: subtree $G(V', E')$, $E' \subseteq E$, such that

$$\sum_{v \in \mathrm{leaves}(G')} \mathrm{Mass}(M_v, w) \ge \delta n$$

Can be computed with Dynamic Programming

Response: $[\mu_{\mathrm{leaves}}]$
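The slides do not spell out the dynamic program, so the following is only one possible sketch. It assumes the cost of a traversal is the number of tree nodes visited, that Mass(M_v, w) is already stored per node as the expected number of represented sensors inside the window, and that nodes are plain dictionaries with hypothetical `mass` and `children` fields.

```python
def best_mass_by_cost(node):
    """Return {visited_node_count: best total mass} for node's subtree.

    Either stop at this node (cost 1, mass of its own model) or visit it
    and combine the children's tables, knapsack style.
    """
    table = {1: node["mass"]}
    if node["children"]:
        combined = {0: 0.0}
        for child in node["children"]:
            child_table = best_mass_by_cost(child)
            merged = {}
            for c1, m1 in combined.items():
                for c2, m2 in child_table.items():
                    cost, mass = c1 + c2, m1 + m2
                    if mass > merged.get(cost, -1.0):
                        merged[cost] = mass
            combined = merged
        for cost, mass in combined.items():
            if mass > table.get(cost + 1, -1.0):
                table[cost + 1] = mass
    return table

def optimal_traversal_cost(root, delta, n):
    """Fewest visited nodes whose cut has total mass >= delta * n, or None."""
    table = best_mass_by_cost(root)
    feasible = [cost for cost, mass in table.items() if mass >= delta * n]
    return min(feasible) if feasible else None

def leaf(mass):
    return {"mass": mass, "children": []}

# Four sensors; subtree A is homogeneous, subtree B is not.
a = {"mass": 1.9, "children": [leaf(0.95), leaf(0.95)]}
b = {"mass": 1.0, "children": [leaf(0.95), leaf(0.95)]}
root = {"mass": 2.0, "children": [a, b]}
```

In this toy tree a 90% requirement is met most cheaply by answering from A's model and descending only into B, for a cost of five visited nodes.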

Page 13

Greedy Traversal

If the local model satisfies

$$\int_{\mu-w}^{\mu+w} f(x)\,dx \ge \delta$$

return μ; else descend to child node.

More conservative solution: enforces query satisfiability on every subtree instead of the whole tree
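The greedy rule is simple enough to sketch in a few lines. This version assumes every node stores a single Gaussian model (mean and standard deviation) for its subtree, that descending means visiting every child, and that a leaf always answers with its own mean; the dictionary fields are hypothetical.

```python
from statistics import NormalDist

def greedy_traverse(node, w, delta):
    """Return a list of (mean, sensors_covered) answers.

    Stops at the first node on each path whose local model places at
    least delta of its mass within w of its mean.
    """
    dist = NormalDist(node["mean"], node["std"])
    mass = dist.cdf(node["mean"] + w) - dist.cdf(node["mean"] - w)
    if mass >= delta or not node["children"]:
        return [(node["mean"], node["size"])]
    answers = []
    for child in node["children"]:
        answers.extend(greedy_traverse(child, w, delta))
    return answers

def model(mean, std, size, children=()):
    return {"mean": mean, "std": std, "size": size, "children": list(children)}

# A wide root model summarizing two tight subtrees.
tree = model(22.0, 3.0, 4, [model(20.0, 0.4, 2), model(25.0, 0.4, 2)])
narrow = greedy_traverse(tree, w=1.0, delta=0.95)
wide = greedy_traverse(tree, w=10.0, delta=0.95)
```

With the narrow window the root model is too spread out, so the traversal answers from the two children; with the wide window the root alone suffices.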

Page 14

Traversal Evaluation

Page 15

Talk “outline”

Compression

Traversal

Construction

In-network summaries

Page 16

Optimal Tree Construction

Given a structure, we know how to build the models

But how do we pick the structure?

Page 17

Traversal = cut

Theorem: In a fixed-fanout tree, the cost of the traversal is $\frac{F}{F-1}\,(|C| - 1)$, where $|C|$ is the size of the cut and $F$ the fanout.

Intuition: minimize cut size

Group nodes into a minimum number of groups which satisfy the query constraints

Clustering problem
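The traversal-cost theorem on this slide can be sanity-checked numerically. The sketch assumes that cost counts the tree edges followed from the root down to the cut and that every expanded node has exactly F children; both are assumptions of this example, chosen because they reproduce the stated formula.

```python
def traversal_cost(cut_size, fanout):
    """Cost from the theorem: F / (F - 1) * (|C| - 1)."""
    return fanout * (cut_size - 1) / (fanout - 1)

def simulate(fanout, expansions):
    """Grow a traversal from the root by expanding cut nodes one at a time.

    Each expansion replaces one cut node by `fanout` children and follows
    `fanout` new edges, no matter which cut node is chosen.
    """
    cut_size, edges = 1, 0
    for _ in range(expansions):
        cut_size += fanout - 1
        edges += fanout
    return cut_size, edges

cut_size, edges = simulate(fanout=3, expansions=4)
```

Since the cost grows linearly in |C|, a smaller cut always means a cheaper traversal, which is the intuition the slide draws from the theorem.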

Page 18

Optimal Clustering

Given a query Q(w,δ), optimal clustering is NP-hard
• Related to the Group Steiner Tree Problem

Greedy algorithm with factor log(n) approximation
• Greedily pick max size cluster
• Issue: does not enforce connectivity of clusters
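As a rough illustration of "greedily pick max size cluster", the sketch below applies the classic greedy set-cover strategy under strong simplifying assumptions that are not in the slides: every node has a single known value, a cluster is valid when all its members fit in one window of half-width w (that is, δ = 1), and network connectivity is ignored, which is exactly the issue noted above.

```python
def greedy_clusters(values, w):
    """Repeatedly pick the window of width 2*w covering most uncovered nodes.

    Returns a list of clusters, each a sorted list of node indices.
    Candidate windows start at an uncovered node's value, which is enough
    to find a largest window because any window can be shifted right
    until its left edge touches a point.
    """
    uncovered = set(range(len(values)))
    clusters = []
    while uncovered:
        best = set()
        for c in sorted(uncovered):
            lo, hi = values[c], values[c] + 2 * w
            members = {i for i in uncovered if lo <= values[i] <= hi}
            if len(members) > len(best):
                best = members
        clusters.append(sorted(best))
        uncovered -= best
    return clusters

readings = [20.0, 20.5, 21.0, 25.0, 25.2, 30.0]
clusters = greedy_clusters(readings, w=1.0)
```

Each cluster could then be summarized by one model near the window's midpoint; enforcing that its members are also connected in the network requires the extra steps discussed on the next slide.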

Page 19

Greedy Clustering

Include extra nodes to enforce connectivity

Augment clusters only with accessible nodes (losing the log n guarantee)

Page 20

Clustering comparison

Two distributed clustering algorithms are compared to the centralized greedy clustering

Page 21

Talk “outline”

Compression

Traversal

Construction

In-network summaries

Enriched models

Page 22

Enriched models

Support more complex models

k-mixtures
• Compress to a k-size mixture instead of an SGM

Virtual nodes
• Every component of the k-size mixture is stored as a separate “virtual node”

SGMs on multiple windows
• Maintain additional SGMs for different window sizes

More space, more expensive model updates

(SGM = Single Gaussian Model)

Page 23

Evaluation of enriched models

SGM surprisingly effective in representing the underlying data

Page 24

Sensitivity analysis

Talk “outline”

Compression

Traversal

Construction

In-network summaries

Page 25

Tree Construction Parameters and Effect on Performance

Confidence
• Performance for workloads of different confidence than the hierarchy design

Error window
• Broader vs narrower ranges of window sizes
• Assignment of windows across tree levels

Temporal changes
• How often should the models be updated

Page 26

Confidence

Workload of 0.95 confidence

Design confidence does not have a big impact on performance

Page 27

Error windows

A wide range is not always better, because it forces the traversal of more levels

Page 28

Model Updates

Page 29

Conclusions

Analyzed compression schemes for in-network summaries

Evaluated summary traversal

Studied optimal hierarchy construction

Studied increased complexity models

Showed that simple SGMs are sufficient

Analyzed the effect on efficiency of various parameters

[Outline diagram: Compression, Traversal, Construction, In-network summaries, Enriched models, Sensitivity analysis]