TRANSCRIPT
Slide 1: REED: Robust, Efficient Filtering and Event Detection in Sensor Networks
Daniel Abadi, Samuel Madden, Wolfgang Lindner
MIT
United States
VLDB 2005
Slide 2: What Problem Are We Trying To Solve?
• Complex data filtering in sensor networks
Slide 3: Example Filter Query
Sensor Data:

  Timestamp | Temp
  3:05PM    | 74

Predicate Table:

  MinTS  | MaxTS  | MinTemp | MaxTemp
  2:00PM | 2:30PM | 70      | 75
  2:30PM | 3:00PM | 73      | 78
  3:00PM | 3:30PM | 75      | 80
  3:30PM | 4:00PM | 83      | 88
  4:00PM | 4:30PM | 85      | 90
  4:30PM | 5:00PM | 70      | 75
  5:00PM | 5:30PM | 72      | 77
  5:30PM | 6:00PM | 75      | 80

Join Predicate:
TS > MinTS && TS < MaxTS && (Temp < MinTemp || Temp > MaxTemp)
Slide 4: Constraints: Sensor Networks
• Sensor nodes are small, battery-powered devices
• Power conservation is important: sensing and transmitting data typically dominate power usage
Berkeley Mote:
  4 MHz microprocessor
  900 MHz radio (50-100 ft. range)
  4 KB RAM, 128 KB program flash, 512 KB data flash
Slide 5: Sensor Database Motivation
• Programming apps is hard:
  – Limited power budget
  – Lossy, low-bandwidth communication
  – Require long-lived, zero-admin deployments
  – Distributed algorithms
  – Limited tools and debugging interfaces
• Solution: a database-style interface (e.g., TinyDB [Madden 2002], Cougar [Yao 2003])
Slide 6: TinyDB
[Figure: routing tree with root node 0 attached to a main PC controller, and nodes 1-7 below]
How TinyDB works:
1. Form a routing tree
2. Distribute the query to the nodes
3. Every time a node produces a data tuple, filter it by the query expression and pass the result up the tree, aggregating if necessary
Slide 7: Naïve Join Algorithm
• Send all tuples from the data table to the root; perform the join at the root
[Figure: every sensor tuple (A-D) travels up the routing tree to the root, where it is joined against the predicate table]
Slide 8: Ideal Join Algorithm
[Figure: the entire predicate table (tuples A-D) is replicated at every node of the routing tree, so each node joins its own readings locally]
• Send the join table to each node
• At each node, perform the join locally
• Problem: severe node memory constraints
Slide 9: REED Algorithm 1
• Cluster nodes into groups
• Store a portion of the predicate table in each group member
• Send sensor data tuples to every member of the group
[Figure: nodes 1, 3, and 4 form a group; each member stores a different portion of the predicate table (A-D), and each sensor tuple is broadcast to every group member]
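The per-group scheme above can be sketched as follows (hypothetical names; REED itself runs as nesC code on TinyOS motes, not Python). The key invariant is that the union of each member's join against its own slice equals the join against the whole table:

```python
# Each group member stores one slice of the predicate table; a sensor
# tuple broadcast to the group is joined by every member against its
# own slice, so the union of the members' results equals the full join.
def partition(table, n_members):
    """Split the predicate table into n roughly equal slices, one per member."""
    return [table[i::n_members] for i in range(n_members)]

def group_join(data_tuple, slices, matches):
    results = []
    for member_slice in slices:        # one slice per group member
        for row in member_slice:       # each member scans only its slice
            if matches(data_tuple, row):
                results.append(row)
    return results
```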
Slide 10: Group Formation
[Figure: group formation message trace]
1. Node 1 (neighbor list {1, 2, 3, 4, 6}, free space 4) broadcasts "Want to make group"; CurrList: {1}, Potential: {1, 2, 3, 4, 6}
2. Node 4 (neighbor list {1, 3, 4, 6}, free space 4) replies "Choose me!"; CurrList: {1, 4}, Potential: {1, 3, 4, 6}, pooled space: 8
3. Node 3 (neighbor list {1, 3, 4}, free space 2) replies "Choose me!"; CurrList: {1, 3, 4}, Potential: {1, 3, 4}, pooled space: 10
4. Node 1 announces "Group Accepted: {1, 3, 4}"; node 6 is left out because not every member can hear it
Slide 11: Table Distribution
• Group members decide among themselves how the table will be divided across the group
• The predicate table is flooded to the network
Slide 12: Bloom Filter Optimization
Step 1: Hash the domain of joining sensor values onto a Bloom filter
Step 2: Send the Bloom filter to each sensor node
[Figure: a joining value (Temp: 90) hashes to a set bit in the filter (01000010) and is forwarded; a non-joining value (Temp: 20) hashes to an unset bit and is dropped. The same filter bitmap is replicated at every node of the routing tree.]
• Might produce false positives but never false negatives
• Can be used in conjunction with the previous REED algorithm
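A minimal sketch of the Bloom filter idea (filter size, hash scheme, and names are assumptions, not the paper's parameters):

```python
# Values known to possibly join set bits in a small bitmap; a node
# forwards a reading only if all of its hash bits are set. False
# positives are possible (a bit set by some other value), false
# negatives are not.
NBITS = 64          # filter size in bits (illustrative)

def make_filter(joining_values, nhash=3):
    bits = 0
    for v in joining_values:
        for seed in range(nhash):
            bits |= 1 << (hash((seed, v)) % NBITS)
    return bits

def might_join(bits, value, nhash=3):
    # True for every genuinely joining value (no false negatives);
    # occasionally True for a non-joining one (false positive).
    return all((bits >> (hash((seed, value)) % NBITS)) & 1
               for seed in range(nhash))
```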
Slide 13: Cache Diffusion
[Figure: routing tree in which each node caches a few non-joining value ranges (e.g. 11-20, 23-50, 60-70, 81-90); a reading of 24 is suppressed by a node caching 23-50, while readings of 20 and 21 are forwarded]
• Cache non-joining ranges on a per-node basis
• Also produces false positives but no false negatives
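A minimal sketch of cache diffusion (class and method names are illustrative, not from the paper). Like the Bloom filter, the cache can only cause extra transmissions, never a missed join result:

```python
# Each node remembers a few value ranges known NOT to join (learned
# from upstream) and suppresses readings that fall inside them.
class RangeCache:
    def __init__(self, capacity=4):
        self.capacity = capacity
        self.ranges = []                     # (lo, hi) non-joining ranges

    def learn(self, lo, hi):
        """Cache a non-joining range, evicting the oldest when full."""
        self.ranges.append((lo, hi))
        if len(self.ranges) > self.capacity:
            self.ranges.pop(0)

    def should_send(self, value):
        # send unless the value falls inside a cached non-joining range
        return not any(lo <= value <= hi for lo, hi in self.ranges)
```

With the slide's figure in mind: after learning 23-50, a reading of 24 is suppressed while 21 is still sent.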
Slide 14: Results: Experimental Setup
• Ran experiments both in simulation and on real motes
• For simulation, 40 sensor nodes arranged in a grid
• Used the TinyOS packet-level simulator:
  – Models CSMA backoff
  – Carrier-sense packet delivery model
  – Overlap between two receptions corrupts both packets
• Used TinyOS MintRoute for the multihop routing layer
Slide 15: REED Performs Well at Most Selectivities
[Chart: total transmissions (1000s), 0-180, vs. join predicate selectivity, 0-1, for Naïve, REED, and REED + Bloom (.5)]
Slide 16: REED Algorithm Overhead is Negligible
[Chart: number of transmissions (1000s), 0-180, vs. selectivity; the total is broken down into group management overhead, original tuple transmissions, join results, and forwarded messages]
Slide 17: Simulated Results Match Real Results from Motes
[Chart: total transmissions, 0-4000, vs. data selectivity, 0-1, comparing actual results from motes with simulated results]
• Ran the REED algorithm on a simple 5-node sensor network
Slide 18: Conclusion
• Contributions:
  – Complex filters expressed as a join against a table of expressions
  – REED algorithms capable of:
    • Running with limited amounts of RAM
    • Robustness in the face of message loss and node failure
  – Experiments showing the benefits of running complex join-based filters inside the sensor network
Slide 19: Backup Slides
[Two charts: number of transmissions (1000s), 0-180, vs. selectivity (0-0.3 and 0-0.9)]
Slide 20: REED Performs Well Even at Low Average Node Depths
[Chart: total transmissions, 0-160, vs. average node depth, 1-11, for Naïve, REED (s = .5), REED (s = .1), and REED+Bloom (p = .5, s = .1); a second chart plots a 0-9 range against 1.2-2.4]
Slide 21: Cache Diffusion Takes Advantage of Data Locality
[Chart: transmissions, 0-25000, vs. data locality, 0-100, comparing Bloomjoin and Cache Diffusion]
Slide 22: Distributed Join: Group Formation
[Figure: routing tree with root node 0 and nodes 1-7]
Process:
1. Every node maintains a list of the nodes it can hear by listening in on packets
2. After a random interval, a node P that is not in a group broadcasts a form-group request
3. Every node N that hears the request and is not currently in a group replies to P with its list of neighbors and its amount of free space
4. Node P collects the replies and determines who should be in the group; to every node N that replied, P sends either a group-reject or a group-accept message
5. The group-accept message contains the list of nodes in the group
A Group is a set of nodes where every node is in broadcast range of every other node.
[Figure: each node's neighbor list, e.g. {1,2,3,4}, {3,1,4}, {4,1,3,6}, {1,2,5}, {5,2,6,7}, {6,5,7,4}, {7,5,6}]
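The five-step protocol above can be sketched as follows (function and variable names are illustrative; the real protocol runs over lossy radio broadcasts on motes). The initiator admits a replier only if it and every node admitted so far appear in each other's neighbor lists, keeping the group in mutual broadcast range:

```python
# Group formation at the initiator P, processing replies in arrival order.
def form_group(p, p_neighbors, p_space, replies):
    """replies: list of (node_id, neighbor_set, free_space) in arrival order."""
    group = [(p, p_neighbors)]
    space = p_space                      # pooled RAM for the table slice
    for n, neighbors, free in replies:
        # admit n only if n and every current member hear each other
        if all(m in neighbors and n in m_nbrs for m, m_nbrs in group):
            group.append((n, neighbors))
            space += free
    return {m for m, _ in group}, space
```

On the slide-10 trace (node 1 initiates; nodes 4 and 3 reply), this yields group {1, 3, 4} with pooled space 10.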
Slide 23: Distributed Join: Join Table Distribution
[Figure: routing tree with root node 0 and nodes 1-7]
Process:
1. When a node enters a group, it sends a request to the root for join table data
2. Per group, the root gives out non-overlapping segments of the join table to every member
3. Once all the nodes in a group have received join tuples, they begin processing data tuples as a group
[Figure: the root holds join tuples 1-7 and, in response to "Get me some tuples!" requests (with free-space counts 3, 2, and 4), hands out non-overlapping segments such as {1,2}, {3,4}, and {5,6,7} to the group members]
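The root's hand-out step can be sketched as below; the size rule is an assumption (here each member gets a contiguous segment sized to the free space it reported in its request):

```python
# Per group, the root assigns non-overlapping, contiguous segments of
# the join table, one per requesting member.
def distribute(table, requests):
    """requests: list of (node_id, free_space) for one group, in request order."""
    segments, start = {}, 0
    for node, free in requests:
        segments[node] = table[start:start + free]
        start += free
    # a non-empty tail means the group's pooled memory was too small
    return segments, table[start:]
```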
Slide 24: Distributed Join: Operation
[Figure: routing tree with root node 0 and nodes 1-7; group members hold their segments of the join table]
For nodes not in a group:
1. When generating a data tuple or receiving one from a child, pass it on to the parent
2. When receiving a result from a child, pass it on to the parent
For nodes in a group:
1. When generating a data tuple or receiving one from a child, broadcast it to the group (including self)
2. Upon receiving a data tuple broadcast from the group, join it with the stored subset of the join table and pass any results up to the parent
3. When receiving a result from a child, pass it on to the parent
Slide 25: Related Work
• The Gamma [8] and R* [15] systems both looked at ways to horizontally partition a table to perform a distributed join
  – Different optimization goals
• TinyDB [19,20,21] and Cougar [31] both present a range of distributed query processing techniques
  – No joins
• Bonfils and Bonnet [6] propose a scheme for join-operator placement within sensor networks
  – They look at joins of sensor data with sensor data, not with an external table
Slide 26: Motivating Applications
• Industrial process control:
  – Distributed sensors measure environmental variables
  – Want to know if an exceptional condition is reached
• Failure and outlier detection:
  – Look for de-correlated sensor readings
• Power scheduling:
  – Minimize power consumption by distributing work across sensors
Slide 27: Results: Experimental Setup
• Sensor nodes in a 2 x 20 grid
• Used the TinyOS packet-level simulator:
  – Models CSMA backoff
  – Carrier-sense packet delivery model
  – Overlap between two receptions corrupts both packets
• Used TinyOS MintRoute for the multihop routing layer
[Figure: grid layout with the root at one end; nodes 5 feet apart]