Development of a Distributed Stream Processing System

DESCRIPTION
Development of a Distributed Stream Processing System (DSPS) in Node.js and ZeroMQ, and a demonstration of a trending-topics application with a dataset from Twitter.

TRANSCRIPT
Development of a Distributed Stream Processing System
Maycon Viana Bordin, Final Assignment
Instituto de Informática
Universidade Federal do Rio Grande do Sul
CMP157 – PDP 2013/2, Claudio Geyer
What's Stream Processing?
Stream source: emits data continuously and sequentially.
Operators: count, join, filter, map.
Data streams flow from the source through operators to a sink.
A data stream is a sequence of tuples, e.g. ("word", 55).
Tuples are ordered by a timestamp or other attribute.
Data from the stream source may or may not be structured.
The amount of data is usually unbounded in size.
The input rate is variable and typically unpredictable.
Operators
An operator receives one or more data streams and sends one or more data streams.
Operators Classification
Stateless (map, filter)
Stateful:
  Non-Blocking (count, sum)
  Blocking (join, freq. itemset)
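The distinction can be sketched in plain JavaScript (the operator interface below is illustrative, not the system's actual API): a stateless map transforms each tuple independently, while a stateful non-blocking count must keep a counter between tuples and can emit an updated result after every input.

```javascript
// Stateless operator: output depends only on the current tuple.
const toUpper = (tuple) => [tuple[0].toUpperCase(), tuple[1]];

// Stateful non-blocking operator: keeps a running count per key
// and emits an updated result after every tuple.
function makeCounter() {
  const counts = new Map();
  return (tuple) => {
    const key = tuple[0];
    counts.set(key, (counts.get(key) || 0) + 1);
    return [key, counts.get(key)];
  };
}

const count = makeCounter();
console.log(toUpper(["foo", 1])); // ["FOO", 1]
console.log(count(["foo", 1]));   // ["foo", 1]
console.log(count(["foo", 1]));   // ["foo", 2]
```

A blocking operator such as join cannot be written this way: it would need to see the whole (unbounded) stream before producing output, which is what motivates windows below.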
Blocking operators need all of their input in order to generate a result, but that's not possible since data streams are unbounded.
To solve this issue, tuples are grouped into windows, delimited by a window start (ws) and a window end (we).
The range of a window is given in time units or in number of tuples.
As the window advances, the old [ws, we] interval slides forward to a new one.
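A count-based sliding window can be sketched as follows (a minimal illustration, not the system's implementation): `range` is the window size in tuples and `advance` is how far the window slides each step.

```javascript
// Count-based sliding window: `range` = window size in tuples,
// `advance` = how many tuples the window slides each step.
function makeWindow(range, advance, onWindow) {
  const buffer = [];
  let ws = 0; // window start (index of the first tuple in the window)
  let seen = 0;
  return (tuple) => {
    buffer.push(tuple);
    seen++;
    // window end (we) reached: emit the window, then slide by `advance`
    if (seen - ws >= range) {
      onWindow(buffer.slice());  // tuples currently in [ws, we)
      buffer.splice(0, advance); // drop tuples that left the window
      ws += advance;
    }
  };
}

const results = [];
const win = makeWindow(3, 2, (tuples) => results.push(tuples.length));
[1, 2, 3, 4, 5].forEach((t) => win(t));
console.log(results); // [3, 3] — windows emitted after tuples 3 and 5
```

With `advance` smaller than `range` consecutive windows overlap; with `advance` equal to `range` they tumble without overlap.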
Implementation Architecture
A master node coordinates a set of slaves; a client submits, starts, and stops applications through the master.
Each slave runs worker threads and periodically sends a heartbeat to the master.
The heartbeat carries the status of each worker in the slave: tuples processed, throughput, and latency.
Implementation Application
Applications are composed as a DAG (Directed Acyclic Graph).
To illustrate, let's look at the graph of a Trending Topics application:

stream -> extract -> hashtags -> countmin sketch -> File Sink

stream: the stream source emits tweets in JSON format.
extract: extracts the text from the tweet and adds a timestamp to each tuple.
hashtags: extracts and emits each #hashtag in the tweet.
countmin sketch: constant time and space approximate frequency counting [Cormode and Muthukrishnan, 2005]. Without a window, it will emit all top-k items each time a hashtag is received. With a window, the number of tuples emitted is reduced, but the latency is increased.
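A compact count-min sketch looks roughly like this (an illustrative sketch, not the system's implementation; a production version would use stronger pairwise-independent hash functions):

```javascript
// Count-min sketch: `depth` hash rows of `width` counters each.
// The frequency estimate for an item is the minimum over its
// counters; it can overestimate (collisions) but never underestimate.
class CountMinSketch {
  constructor(width, depth) {
    this.width = width;
    this.depth = depth;
    this.table = Array.from({ length: depth }, () => new Array(width).fill(0));
  }
  // Simple seeded FNV-style string hash, one seed per row.
  hash(item, seed) {
    let h = 2166136261 ^ seed;
    for (let i = 0; i < item.length; i++) {
      h = Math.imul(h ^ item.charCodeAt(i), 16777619);
    }
    return (h >>> 0) % this.width;
  }
  add(item, count = 1) {
    for (let d = 0; d < this.depth; d++) {
      this.table[d][this.hash(item, d)] += count;
    }
  }
  estimate(item) {
    let min = Infinity;
    for (let d = 0; d < this.depth; d++) {
      min = Math.min(min, this.table[d][this.hash(item, d)]);
    }
    return min;
  }
}

const cms = new CountMinSketch(256, 4);
for (let i = 0; i < 10; i++) cms.add("#nodejs");
cms.add("#zeromq");
console.log(cms.estimate("#nodejs")); // at least 10
```

Space is fixed by `width` x `depth` regardless of how many distinct hashtags the stream contains, which is why it suits unbounded streams.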
The second step in building an application is to set the number of instances of each operator, e.g. 1 stream, 4 extract, 5 countmin, 1 File Sink.
But the user has to choose how tuples are going to be partitioned among the operator instances.
All-to-All Partitioning: every tuple is delivered to all instances of the downstream operator.

Round-Robin Partitioning: each tuple goes to the next instance in turn, spreading the load evenly.

Field Partitioning: tuples are routed by the value of a chosen field, so a tuple like ("foo", 1) always reaches the same instance.
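Round-robin and field partitioning can be sketched as small routing functions (illustrative only; function names here are not the framework's API):

```javascript
// Round-robin: each tuple goes to the next instance in turn.
function roundRobin(numInstances) {
  let next = 0;
  return () => next++ % numInstances;
}

// Field partitioning: hash a chosen field so that tuples with the
// same key always land on the same instance.
function byField(numInstances, field) {
  return (tuple) => {
    const key = String(tuple[field]);
    let h = 0;
    for (let i = 0; i < key.length; i++) {
      h = (Math.imul(h, 31) + key.charCodeAt(i)) >>> 0;
    }
    return h % numInstances;
  };
}

const rr = roundRobin(5);
console.log([rr(), rr(), rr(), rr(), rr(), rr()]); // [0, 1, 2, 3, 4, 0]

const fp = byField(5, 0);
// the same key always maps to the same instance
console.log(fp(["foo", 1]) === fp(["foo", 2])); // true
```

Field partitioning is what makes a stateful operator like countmin correct when parallelized: all occurrences of one hashtag reach the same instance, so its counts stay consistent.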
The communication between operators is done with the pub/sub pattern.
Each operator subscribes to all of its upstream operators, with its own ID as a filter, so it will only receive tuples carrying its ID as a prefix.
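ZeroMQ pub/sub sockets filter messages by prefix, so prepending the destination operator's ID to each message gives cheap routing. The filtering logic can be simulated in plain JavaScript (the real system would use zmq SUB sockets; the bus and message layout below are illustrative):

```javascript
// Each subscriber registers the ID prefix it wants; publish()
// delivers a message only to subscribers whose prefix matches,
// mimicking ZeroMQ's prefix-based subscription filtering.
function makeBus() {
  const subs = [];
  return {
    subscribe(prefix, handler) {
      subs.push({ prefix, handler });
    },
    publish(message) {
      for (const { prefix, handler } of subs) {
        if (message.startsWith(prefix)) handler(message);
      }
    },
  };
}

const bus = makeBus();
const received = [];
// operator instance "countmin-2" subscribes with its ID as the filter
bus.subscribe("countmin-2|", (msg) => received.push(msg));
bus.publish('countmin-2|{"hashtag":"#nodejs"}'); // delivered
bus.publish('countmin-1|{"hashtag":"#zeromq"}'); // filtered out
console.log(received.length); // 1
```

Because the filter is applied on the subscriber side by prefix match, the upstream operator just picks a destination ID (via its partitioner) and publishes once.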
The last step is to take each operator instance from the graph and assign it to a node (node-0, node-1, node-2).
Currently the scheduler is static and only balances the number of operators per node.
Implementation Framework
trending-topics.js
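The code of trending-topics.js did not survive transcription; the following is a hypothetical sketch of what composing the DAG might look like. All names here (`Topology`, `addOperator`, `connect`) are invented for illustration and are not the framework's real API.

```javascript
// Hypothetical DAG composition for Trending Topics; the Topology
// class below is a toy single-process stand-in for the framework.
class Topology {
  constructor() { this.ops = new Map(); this.edges = new Map(); }
  addOperator(id, fn, instances = 1) { // `instances` kept for illustration
    this.ops.set(id, fn);
    this.edges.set(id, []);
    return this;
  }
  connect(from, to) { this.edges.get(from).push(to); return this; }
  // Push one input through the graph, collecting sink outputs.
  run(sourceId, input, outputs = []) {
    const emit = (id, tuple) => {
      const downstream = this.edges.get(id);
      if (downstream.length === 0) outputs.push(tuple);
      for (const next of downstream) {
        for (const out of this.ops.get(next)(tuple)) emit(next, out);
      }
    };
    for (const t of this.ops.get(sourceId)(input)) emit(sourceId, t);
    return outputs;
  }
}

const topo = new Topology()
  .addOperator("stream", (json) => [JSON.parse(json)])
  .addOperator("extract", (t) => [{ text: t.text, ts: Date.now() }], 4)
  .addOperator("hashtags", (t) => t.text.match(/#\w+/g) || [], 4)
  .addOperator("filesink", (tag) => [tag]);
topo.connect("stream", "extract")
    .connect("extract", "hashtags")
    .connect("hashtags", "filesink");

console.log(topo.run("stream", '{"text":"hello #nodejs #zeromq"}'));
// ["#nodejs", "#zeromq"]
```

In the real system each operator instance would run as a worker on some node and the edges would be ZeroMQ pub/sub connections, but the declarative shape of the application is the same.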
Tests Specification
Application: Trending Topics, with a 40 GB dataset from Twitter.
Test Environment: GridRS - PUCRS; 3 nodes, each with 4 x 3.52 GHz (Intel Xeon), 2 GB RAM, Linux 2.6.32-5-amd64, Gigabit Ethernet.
Variables: number of nodes, number of operator instances, window size.
Metrics:
  Runtime
  Latency: time for a tuple to traverse the graph
  Throughput: no. of tuples processed per sec.
  Loss of tuples
Methodology: 5 runs per test. Every 3 s each operator sends its status with the no. of tuples processed. The PerfMon sink collects a tuple every 100 ms, and sends the average latency every 3 s (and cleans up the collected tuples).
Tests Number of Nodes
[Chart: Runtime vs Latency — runtime (minutes) and latency (ms) for 1, 2, and 3 nodes]
[Chart: Runtime vs Stream Rate — runtime (minutes) and stream rate (MB/s) for 1, 2, and 3 nodes]
[Chart: Throughput — tuples per second for 1, 2, and 3 nodes, per operator: stream, extractor, countmin, filesink, perfmon]
[Chart: Loss of Tuples — lost tuples for 1, 2, and 3 nodes, per operator: stream, extractor, countmin, filesink, perfmon]
[Chart: Throughput and Latency Over Time (nodes=3, instances=5, window=20) — throughput (tuples/s) per operator (stream, extractor, countmin, filesink) and latency (ms) over time (seconds)]
Tests Window Size
[Chart: Runtime vs Latency — runtime (minutes) and latency (ms) for window sizes 20, 80, 120, and 200]
Tests No. of Instances
[Chart: Runtime vs Stream Rate — runtime (minutes) and stream rate (MB/s) for 1 and 5 instances]
Conclusions
The system was able to process more data as more nodes were added.
On the other hand, distributing the load increased the latency.
The scheduler has to reduce the network communication.
The communication between workers on the same node has to happen through main memory.
References
Chakravarthy, Sharma. Stream Data Processing: A Quality of Service Perspective: Modeling, Scheduling, Load Shedding, and Complex Event Processing. Vol. 36. Springer, 2009.
Cormode, Graham, and S. Muthukrishnan. "An Improved Data Stream Summary: The Count-Min Sketch and Its Applications." Journal of Algorithms 55.1 (2005): 58-75.
Gulisano, Vincenzo Massimiliano, Ricardo Jiménez-Peris, and Patrick Valduriez. StreamCloud: An Elastic Parallel-Distributed Stream Processing Engine. Diss. Informatica, 2012.
Source code @ github.com/mayconbordin/tempest