Accommodating Bursts in Distributed Stream Processing Systems Yannis Drougas, Vana Kalogeraki Distributed Real-time Systems Lab University of California, Riverside {drougas,vana}@cs.ucr.edu http://www.cs.ucr.edu/~{drougas,vana}


Post on 24-Feb-2021


Page 1: Accommodating Bursts in Distributed Stream Processing Systemsalumni.cs.ucr.edu/~drougas/publications/barre2009ipdps... · 2009. 6. 17. · Yannis Drougas, Vana Kalogeraki Accommodating

Accommodating Bursts in Distributed Stream Processing Systems

Yannis Drougas, Vana Kalogeraki
Distributed Real-time Systems Lab
University of California, Riverside
{drougas,vana}@cs.ucr.edu
http://www.cs.ucr.edu/~{drougas,vana}

Stream Processing Applications

• Large class of emerging applications in which data streams must be processed online

• Example applications include:
– Stock exchange data filtering
– Traffic monitoring
– Surveillance
– Sensor network data processing
– Network monitoring

Distributed Stream Processing Systems

[Figure labels: High-volume, continuous input streams; Processed result streams]

On-line processing functions / continuous query operators implemented on each node: Clustering, Correlation, Filtering, Aggregation, ...

Stream Processing Applications Characteristics

• Data is produced continuously, in large volumes and at high rates

• Data has to be processed in a timely manner, e.g. within a deadline

• Application input rates fluctuate notably and abruptly

[Figure labels: Clustering, Filtering, Join]

Previous Work

• The majority of previous work [FIT, MPRA, Brown@VLDB 2006, Amini@ICDCS 2006, Xia@ICDCS 2007] has focused on optimizing a given utility function
– Some solutions [FIT] employ data admission
– Others [MPRA, Brown@VLDB 2006] consider the optimal placement of tasks on nodes

• The case where load patterns can be predicted has also been studied [Borealis@ICDE 2005]

• QoS management [RTStream] is another solution

Our Problem

• We focus on the problem of addressing bursts of input data rate
– Devise a plan to thwart the burst
– Provision for future bursts

• Benefits:
– Lost data units due to bursts are minimized
– No QoS degradation or data admission
– No under-utilization, dynamic reservation used

• Challenges:
– Highly dynamic / unpredictable environment
– Multiple limiting resource types
– Plan must be applied on time for the burst

Roadmap

• Motivation and Background
• System Architecture
• Burst Handling Mechanism
– Feasible Region & Index Points
– Application-based Reservation
– Online system adjustment
• Experimental Evaluation
• Conclusion

System Architecture

[Figure: layered architecture, top to bottom]
• Application layer: execution of stream processing applications.
• Overlay network consisting of processing nodes, built over a DHT (currently, Pastry).
• The physical (IP) network.

Application Execution

• A stream processing application is executed collaboratively by peers of the system that invoke the appropriate services.

• A service can be instantiated on more than one node.

• A service instantiation on a node is a component.

[Figure: the application submitted by the user (src, s, dest) and the application executed on the system (src, c1, c2, dest)]

System Architecture

[Figure labels: Streams; Services; Components; Application Execution and Burst Handling; Instantiation, Monitoring, Discovery, Scheduling; Operating System]

System Model

• Each component is characterized by its resource requirements:

$$u^{c_i} = \left[ u^{c_i}_1, u^{c_i}_2, \cdots, u^{c_i}_J \right]^T$$

• Selectivity is another component characteristic:

$$sel_{c_i} = \frac{\text{average output rate}}{\text{average input rate}}$$

• Each node is characterized by the availability of its resources:

$$A^n = \left[ A^n_1, A^n_2, \cdots, A^n_J \right]^T$$


Optimization Problem

• Capacity Constraints:

$$\forall n \in N, \quad \sum_{c_i \in n} r_{c_i} \cdot u^{c_i}_j \leq A^n_j, \quad 1 \leq j \leq J$$

• Flow Conservation Constraints:

$$\forall c_i, \quad \sum_{c_j \in D(c_i)} r_{c_j} = sel_{c_i} \cdot r_{c_i}$$

• We need to come up with a plan that satisfies the above constraints and minimizes the likelihood of missing data due to bursts.
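
The capacity and flow conservation constraints can be checked mechanically for any candidate rate assignment. The following is a minimal sketch, not the authors' implementation: the dictionary layout (`rates`, `unit_cost`, `placement`, `capacity`, `downstream`, `selectivity`) and the tolerance `TOL` are assumptions made for illustration.

```python
from typing import Dict, List

TOL = 1e-9  # assumed tolerance for floating-point comparisons


def capacity_ok(rates: Dict[str, float],
                unit_cost: Dict[str, List[float]],
                placement: Dict[str, List[str]],
                capacity: Dict[str, List[float]]) -> bool:
    """Check sum_{c in n} r_c * u^c_j <= A^n_j for every node n and resource j."""
    for node, components in placement.items():
        for j, available in enumerate(capacity[node]):
            used = sum(rates[c] * unit_cost[c][j] for c in components)
            if used > available + TOL:
                return False
    return True


def flow_ok(rates: Dict[str, float],
            downstream: Dict[str, List[str]],
            selectivity: Dict[str, float]) -> bool:
    """Check sum_{c' in D(c)} r_c' == sel_c * r_c for every component c.

    Components with no downstream components (e.g. destinations) are skipped.
    """
    for c, successors in downstream.items():
        if not successors:
            continue
        out_rate = sum(rates[s] for s in successors)
        if abs(out_rate - selectivity[c] * rates[c]) > TOL:
            return False
    return True
```

For example, with hypothetical values `rates = {"src": 0.1, "c1": 0.06, "c2": 0.04}`, both components placed on one node with unit costs of 4 and 6 and a capacity of 1, both checks pass; raising the rate of `c1` to 0.3 violates the capacity constraint.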

Feasible Region

• Assume we have Q applications
• The state of the system at any given time can be described by a point in the Q-dimensional space

[Figure: a point p in the plane. x-axis: Rate of application 1 (ADUs / msec); y-axis: Rate of application 2 (ADUs / msec)]

Feasible Region

• The feasible region is the set of all points (combinations of application input rates) that nodes in the given distributed stream processing system can accommodate without any data unit being dropped.

• The linear form of the constraints suggests that, in the general case of Q applications, the feasible region is a convex polytope.


Feasible Region - Example

[Figure: two applications. Application 1: src1, s1, dest1. Application 2: src2, s2, dest2]

[Figure: deployment. Node A: c1 (u = 4 ms), c2 (u = 6 ms). Node B: c3 (u = 6.67 ms), c4 (u = 10 ms). dest1: u = 4 ms. dest2: u = 5 ms]

Capacity constraints of nodes A and B:

$$r_{c_1} \cdot 4 + r_{c_2} \cdot 6 \leq 1$$
$$r_{c_3} \cdot 6.67 + r_{c_4} \cdot 10 \leq 1$$

dest1 and dest2 capacity constraints:

$$r_{dest_1} \cdot 4 \leq 1$$
$$r_{dest_2} \cdot 5 \leq 1$$

Flow conservation:

$$r_{dest_1} = r_{c_1} + r_{c_3}$$
$$r_{dest_2} = r_{c_2} + r_{c_4}$$

[Figure: the feasible region, bounded by the dest1 capacity constraint, the dest2 capacity constraint, and the capacity constraints of nodes A and B. x-axis: Rate of application 1 (ADUs / msec); y-axis: Rate of application 2 (ADUs / msec)]

Dominance & Pareto Points

• A point p1 dominates a point p2 when, for each application q, $p_1(q) \geq p_2(q)$

• If the current system state is p2, one can apply the rate allocations calculated for p1

• A Pareto point is not dominated by any other point in the feasible region

• Pareto points represent optimal solutions: there is no point that is “better” than a Pareto point

[Figure: points p and p1 to p5 in the feasible region. x-axis: Rate of application 1 (ADUs / msec); y-axis: Rate of application 2 (ADUs / msec)]
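
The dominance test and the Pareto filter follow directly from the definitions above. This is a minimal sketch: points are assumed to be tuples of per-application rates, and the function names are illustrative only.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p1: Sequence[float], p2: Sequence[float]) -> bool:
    """p1 dominates p2 when p1(q) >= p2(q) for every application q."""
    return all(a >= b for a, b in zip(p1, p2))


def pareto_points(points: List[Point]) -> List[Point]:
    """Keep the points that are not dominated by any other (distinct) point."""
    result = []
    for p in points:
        # The definition is non-strict, so a point trivially dominates itself;
        # identical points are therefore excluded from the comparison.
        if not any(q != p and dominates(q, p) for q in points):
            result.append(p)
    return result
```

With the hypothetical set `[(0.25, 0.0), (0.2, 0.1), (0.1, 0.1), (0.0, 0.2)]`, the point `(0.1, 0.1)` is dominated by `(0.2, 0.1)` and is filtered out; the other three are Pareto points of that set.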

Burst Handling

• If the input rate of application q increases by $\delta_q$, the input rate of a component $c_i$ of q will increase by $\frac{\delta_q \cdot r_{c_i}}{r_q}$

• In order for a stream processing system to be able to sustain such an increase, the following must hold for each node:

$$\underbrace{\sum_{c_i \in n} r_{c_i} \cdot u^{c_i}_j}_{\text{Initial resource requirements}} + \underbrace{\sum_{c_i \in n \cap C_q} \frac{\delta_q \cdot r_{c_i}}{r_q} \cdot u^{c_i}_j}_{\text{Additional resource requirements due to single burst}} \leq A^n_j$$
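
Solving the per-node inequality above for the burst size gives the largest increase of application q that a single node can absorb. The sketch below assumes the same symbols as the slide (component rates, per-resource unit costs, node capacity); the function name and argument layout are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def max_burst_on_node(rates: Dict[str, float],
                      unit_cost: Dict[str, List[float]],
                      node_components: Iterable[str],
                      app_components: Set[str],
                      r_q: float,
                      capacity: List[float]) -> float:
    """Largest delta_q satisfying, for every resource j of the node:

        sum_{c in n} r_c*u^c_j + sum_{c in n & C_q} (delta_q*r_c/r_q)*u^c_j <= A^n_j
    """
    node_components = list(node_components)
    best = float("inf")  # stays infinite if the node hosts no component of q
    for j, available in enumerate(capacity):
        base = sum(rates[c] * unit_cost[c][j] for c in node_components)
        app_load = sum(rates[c] * unit_cost[c][j]
                       for c in node_components if c in app_components)
        if app_load > 0:
            best = min(best, r_q * (available - base) / app_load)
    return max(best, 0.0)  # an already overloaded node cannot absorb any burst
```

The burst an application can sustain system-wide would then be the minimum of this value over all nodes hosting its components.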

Optimization Objective

• To minimize the amount of dropped data, we wish to maximize $\sum \delta_q$

• If the current input rates are represented by p, we need to configure the system for:

$$p' = p + \delta = \left[ r_1 + \delta_1, \cdots, r_Q + \delta_Q \right]^T$$

• We assume each application has equal probability for a burst to appear. So, the $\delta_q$’s must be as equal as possible:

$$p' \rightarrow \left[ r_1 + c, \cdots, r_Q + c \right]^T \Rightarrow \delta = \left[ \delta_1, \cdots, \delta_Q \right]^T \rightarrow \left[ c, \cdots, c \right]^T$$

Optimization Objective - Example

[Figure, first panel: points p, p’, p1 and p2, with the increases δ1 and δ2. x-axis: Rate of application 1 (ADUs / msec); y-axis: Rate of application 2 (ADUs / msec)]

The optimal point p’ is the one for which δ1 = δ2

[Figure, second panel: points p, p’, p1 and p2, with the increases δ1, δ2 and δ′2. Same axes]

The optimal point is p2, since δ′2 > δ2 ⇒ p2 → p′

BARRE

• Incorporating bursts makes capacity constraints non-linear.

• Re-calculating the optimal component rate assignment when a burst appears would be too slow.

• Instead, our solution:
– Pre-calculates a small number of component rate assignment plans during an offline phase.
– Monitors application incoming rates and resource availability during runtime.
– When bursts occur, assigns component rates based on the pre-calculated plans.

Feasible Region Determination

• We need to select some index points and pre-calculate their optimal rate assignment

• Index points are a subset of Pareto points: they are the “vertices” of the feasible region.

• These component rate assignments will be used to construct the appropriate component rate allocation in the event of a burst.

• For each index point, a list of the feasible region’s sides that are adjacent to it is kept.

Identifying the Index Points

• Find the maximum rate for each application
– By solving a max-flow problem, under the capacity and flow conservation constraints

• Consider the resulting points as the vertices of a side with (Q-1) dimensions
– This is because the Q-th dimension is a linear combination of the others

• Find the mid-point $p_0$ of the side, as well as the normal (perpendicular) vector $d_0$

• Move $p_0$ in the direction of $d_0$ as long as constraints are satisfied

• Repeat as long as mid-points can be moved

Identifying Index Points - Example

• Step 1: Find the maximum possible rate for each application
– Solve a max-flow problem, with the rates of all but one application set to 0.

[Figure: Step 1 result. x-axis: Rate of application 1 (ADUs / msec); y-axis: Rate of application 2 (ADUs / msec)]
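
For the two-application example from the Feasible Region slides, Step 1 reduces to simple arithmetic, because with the other application idle each component has its node to itself. The sketch below is a shortcut for that specific topology, not a general max-flow solver; it assumes every component of the application sits on a different node with capacity normalized to 1, as in the example's constraints, and the function name is hypothetical.

```python
from typing import Sequence


def max_single_app_rate(component_costs: Sequence[float], dest_cost: float) -> float:
    """Maximum rate of one application when all other applications have rate 0.

    Each component c may then run at up to 1/u_c on its own node, and the
    destination caps the total at 1/u_dest (flow conservation, selectivity 1).
    """
    components_total = sum(1.0 / u for u in component_costs)
    return min(components_total, 1.0 / dest_cost)


# Unit costs (ms per data unit) taken from the example deployment.
max_rate_app1 = max_single_app_rate([4.0, 6.67], dest_cost=4.0)   # c1, c3, dest1
max_rate_app2 = max_single_app_rate([6.0, 10.0], dest_cost=5.0)   # c2, c4, dest2
```

Under these assumptions both applications are limited by their destination, giving 0.25 and 0.2 ADUs / msec respectively.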

Identifying Index Points - Example

• Step 2: Find the mid-point of the resulting plane and see how far it can go
– Solve max-flow by also constraining direction

[Figure: Step 2 result. x-axis: Rate of application 1 (ADUs / msec); y-axis: Rate of application 2 (ADUs / msec)]

Identifying Index Points - Example

• Step 3: For each of the resulting sides, find its mid-point and repeat the previous step
– The upper left and lower right points are now dominated by the newly found index points

[Figure: Step 3 result. x-axis: Rate of application 1 (ADUs / msec); y-axis: Rate of application 2 (ADUs / msec)]

Identifying Index Points - Example

• Step 4: This is the resulting feasible region
– The mid-points of the sides cannot improve without breaking the capacity constraints

• We end up with triplets: $\langle p, sol(p), F(p) \rangle$

[Figure: the resulting feasible region with index points p1, p2 and p3. x-axis: Rate of application 1 (ADUs / msec); y-axis: Rate of application 2 (ADUs / msec)]

Storing Index Points

• Index points implicitly split the feasible region into subregions
– For each subregion, there is one index point p
– That index point is the closest to the optimal solution for points (application input rate combinations) in the subregion

• To identify the appropriate index point for any given application input rate, all points are projected on plane [1,1, ..., 1]
– This way, the implicit split of the feasible region is brought out

Storing Index Points

• The closer the projection of an index point (here, p3) is to proj(p), the closer that index point is to the optimal solution.
– Search utilizes an R-tree indexed by the projections of the index points

[Figure: points p, p’, p1, p2 and p3 with the projections proj(p), proj(p1) and proj(p3). x-axis: Rate of application 1 (ADUs / msec); y-axis: Rate of application 2 (ADUs / msec)]
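
The projection-based lookup can be sketched as follows. Two assumptions are made here: "plane [1,1, ..., 1]" is read as the hyperplane through the origin whose normal vector is [1, ..., 1] (shifting the plane in parallel does not change distances between projections), and a linear scan stands in for the R-tree mentioned on the slide.

```python
import math
from typing import List, Sequence


def project(p: Sequence[float]) -> List[float]:
    """Orthogonal projection of p onto the hyperplane with normal [1, ..., 1]."""
    mean = sum(p) / len(p)
    return [x - mean for x in p]


def closest_index_point(p: Sequence[float],
                        index_points: List[Sequence[float]]) -> Sequence[float]:
    """Return the index point whose projection lies nearest to proj(p)."""
    target = project(p)
    return min(index_points, key=lambda ip: math.dist(project(ip), target))
```

With hypothetical index points `(0.25, 0.05)`, `(0.15, 0.15)` and `(0.05, 0.2)`, the state `(0.1, 0.12)` maps to `(0.15, 0.15)`, since both lie close to the diagonal direction.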

Online Component Rate Assignment

• Upon a burst, the new application input rates are represented by a point p. We determine the appropriate component input rate allocation plan for p that will maximize $\sum \delta_q$ while respecting the constraints
– There is a Pareto point p’, the optimal solution of which is the same as the one for p
– Any Pareto point can be expressed as a linear combination of (at most) Q index points
– The optimal solution of a Pareto point is thus a linear combination of the optimal solutions of (at most) Q index points

Rate Assignment Algorithm

• Given the current system state p:
– Find the index point $p_i$ closest to p
– If $p_i$ is equal to p, return the solution pre-calculated for $p_i$
– Otherwise, retrieve the sides of the feasible region for which $p_i$ is a vertex.
– The optimal point p’ is a linear combination of the index points of one of these sides:

$$p' = a_1 \cdot p_1 + \ldots + a_Q \cdot p_Q, \quad 0 \leq a_q \leq 1, \quad \sum_{q=1}^{Q} a_q = 1$$

– The solution is also a linear combination of their solutions:

$$sol(p) = sol(p') = a_1 \cdot sol(p_1) + \ldots + a_Q \cdot sol(p_Q)$$
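
The final combination step can be sketched as below. It assumes each pre-calculated solution is stored as a mapping from component name to rate and that the coefficients have already been determined; how the coefficients are found is not covered here, and the names are illustrative.

```python
from typing import Dict, List, Sequence


def combine_solutions(coeffs: Sequence[float],
                      solutions: List[Dict[str, float]]) -> Dict[str, float]:
    """Compute sol(p') = a_1*sol(p_1) + ... + a_Q*sol(p_Q).

    The coefficients must satisfy 0 <= a_q <= 1 and sum to 1.
    """
    if len(coeffs) != len(solutions):
        raise ValueError("need one coefficient per index-point solution")
    if any(a < 0 or a > 1 for a in coeffs) or abs(sum(coeffs) - 1.0) > 1e-9:
        raise ValueError("coefficients must lie in [0, 1] and sum to 1")
    components = solutions[0].keys()
    return {c: sum(a * sol[c] for a, sol in zip(coeffs, solutions))
            for c in components}
```

Because the constraints are linear in the component rates, a convex combination of assignments that each satisfy the capacity constraints satisfies them as well, which is what makes this step cheap at runtime.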

Experimental Setup

• Implementation over Synergy
• Used FreePastry for service discovery and collection of statistics
• 5 repetitions of each experiment
• 10 unique services in the system, 5 services on a single node
• Each application included 4 to 6 services
• Comparison with:
– A burst-unaware method
– Static Reservation of 20% of node resources
– Simple Dynamic Adaptation, from previous work
– A combination of the above

Index Point Pre-Calculation

[Figure: Index Point Database Creation Time (sec) vs. Number of applications]

Index point calculation time grows exponentially, which is why index points need to be pre-calculated.

Index Point Database Size

[Figure: Index Point Database Size vs. Number of applications]

The size of the Index Point Database increases linearly with the number of applications.

Optimal Point Search Time

[Figure: Search Time (msec) vs. Number of applications]

Combining multiple plans speeds up search, because of the small number of index points.

BARRE Operation

[Figure: Total number of missed data units vs. time (sec), for BARRE, Dynamic Adaptation, and Dyn. Adaptation + Reservation. Marked events: Start of burst (reconfiguration), End of reconfiguration, End of burst]

BARRE avoids missing data units during the burst and minimizes lost data units at the beginning of the burst. Also, the resulting plan is “safer”, so it prevents future drops.

Missed Data Units

[Figure: Percentage of missed data units vs. Burst intensity, for BARRE, Dynamic Adaptation, Dynamic Adaptation + Reservation, No Burst Handling, and Reservation]

BARRE results in fewer data units dropped. It can sustain up to about 80% (vs. about 20%) application rate increase without dropping any data unit.

Data Units Delivered On Time

[Figure: Percentage of delivered data units vs. Burst intensity, for BARRE, Dynamic Adaptation, Dynamic Adaptation + Reservation, No Burst Handling, and Reservation]

Despite splitting the work among many components, the arrival order of data units is not affected.

Average Delay

[Figure: Average delay (msec) vs. Burst intensity, for BARRE, Dynamic Adaptation, Dynamic Adaptation + Reservation, No Burst Handling, and Reservation]

End-to-end delay is also decreased, since nodes are not overloaded. Other methods overload either powerful or underpowered nodes.

Conclusions - Future Work

• We proposed BARRE, a dynamic reservation scheme to address bursts in a dynamic stream processing system.
– Reservation is based on application needs and readjusted according to system dynamics

• BARRE utilizes multiple nodes for each processing component, splitting the input rate among them whenever needed

• In our future work, we plan to evaluate BARRE in a fully dynamic environment with nodes entering / leaving the system.

Thank You!

Yannis Drougas, Vana Kalogeraki
Distributed Real-time Systems Lab
University of California, Riverside
{drougas,vana}@cs.ucr.edu
http://www.cs.ucr.edu/~{drougas,vana}