lab e streambase aggregate operator - tibco software · 2020. 10. 25. · lab e: aggregate operator...

Lab E: Aggregate Operator

TIBCO Software Inc. Page 1

Lab E StreamBase Aggregate Operator

Overview

In this lab, you extend your StreamBase application with an Aggregate operator. As the application runs, you will aggregate the values sent to compute an average, maximum, and minimum of one of the fields in the stream.

Objectives

Configure an aggregate operator

Prerequisites

Completion of Lab D, or access to the Lab D solution: LabD-FeedSimLabSolution.zip

Directions

Complete the exercises that follow


Developing EventFlow Applications with TIBCO Streaming 10.x

Exercise 1: Add an Aggregate

Steps

1. Add Aggregate Operator

Return to StreamingAnalytics.sbapp

From the Operators and Adapters drawer of the Palette view, drag an Aggregate operator onto the canvas and drop it exactly on top of the arc between the HighValues Filter and the HighValueOutput stream

Analysis: If you drop the new operator directly onto the arc between the

HighValues Filter and the HighValueOutput output stream, the new

operator is inserted between those two existing components.

If you drop anywhere else on the canvas, you will need to:

- disconnect the arc from Filter to output,

- reconnect Filter output to Aggregate input, and

- connect the Aggregate’s output port to the HighValueOutput input port

Tip: It can be a little tricky to place the dropped component exactly on

the arc – zoom in on the Editor canvas to increase your chances.

Rearrange the components on the canvas (Ctrl-L, or the corresponding button on the Toolbar)

When the Aggregate is connected and the canvas re-arranged, you should see this:

2. Configure the Aggregate operator

Set the focus on the Aggregate operator to configure its properties:

Dimensions tab:

♦ Click Add…

Name: TimeDim

Type: Select Field from the Type pulldown

Do not select Time from the Type pulldown

Field: Select Time from the Field pulldown

Opening Policy: Select the Do not open window based on this

dimension radio button

Window size: Select Close and emit after and enter 5 seconds

Emission policy: Select No intermediate emissions based on this

dimension

Optional windows: Select Open only a single window for the first event

or following a gap in values

♦ Click OK

Note: These settings cause non-overlapping windows to be opened and

closed for every five seconds’ worth of tuples. The time values that

control the windows’ behavior come from the Time field of the input

tuple – time is not sampled from the system clock (often referred to as

wall clock time) in this configuration.
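The windowing behavior described in this note can be sketched in Python. This is a simplified model, not StreamBase's implementation: it assumes the Time field is a plain number of seconds, that a window opens on the first tuple it sees, and that it closes when a tuple arrives whose Time is 5 or more seconds past the window's first tuple. The function name closed_windows and the sample data are hypothetical.

```python
def closed_windows(tuples, size_seconds=5.0, time_field="Time"):
    """Yield non-overlapping windows, driven only by each tuple's own Time field.

    Simplified model (assumption): a window closes when a tuple arrives whose
    Time is at least size_seconds past the window's first tuple. The system
    clock is never consulted.
    """
    window, start = [], None
    for t in tuples:
        ts = t[time_field]
        if start is not None and ts - start >= size_seconds:
            yield window              # window closes and is emitted
            window, start = [], None
        if start is None:
            start = ts                # this tuple opens the next window
        window.append(t)
    # A window that never sees a closing tuple stays open and is not emitted.


# Hypothetical input: 2 tuples per second for 12 seconds of event time.
sample = [{"Time": i * 0.5} for i in range(25)]
sizes = [len(w) for w in closed_windows(sample)]
print(sizes)  # [10, 10]
```

The last five sample tuples (Time 10.0 through 12.0) remain in an open window in this model, because no later tuple arrives to close it.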

Aggregate Functions Tab:

♦ There should already be a row in the Additional Expressions grid that looks like

the following:

Action: Add

Field name: *

Expression: lastval(*)

This setting causes the Aggregate to emit the most recent value processed in the window (lastval) for every input field. Thus, the output stream will have fields named resp_any_Clothing_WomensClothing, hist_qty_Clothing_Accessories, etc. No matter how many tuples fall into this window, the last-processed value for each field is what will be sent further downstream.

♦ Click the green + to add an additional expression

Action: Add

Field name: MinHistSales

Expression: min(hist_qty_Clothing_WomensClothing)

♦ Click the green + to add an additional expression

Action: Add

Field name: MaxHistSales

Expression: max(hist_qty_Clothing_WomensClothing)

♦ Click the green + to add an additional expression

Action: Add

Field name: AvgHistSales

Expression: avg(hist_qty_Clothing_WomensClothing)

Group Options Tab:

♦ Leave empty

Concurrency Tab:

♦ Leave as-is

Note: These settings cause the min, max, and average values to be

computed over all the values of the field

hist_qty_Clothing_WomensClothing that are processed by a window. For

these property settings, the Aggregate operator emits a result tuple

containing the calculated values when the window closes.
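As a rough illustration of what the four expressions on the Aggregate Functions tab compute for a single window, here is a minimal Python sketch. It is not StreamBase code: tuples are modeled as dictionaries, lastval(*) is approximated by copying the last tuple in the window, and the function name aggregate_window and the sample values are hypothetical.

```python
def aggregate_window(window, field="hist_qty_Clothing_WomensClothing"):
    """Build the single result tuple emitted when a window closes."""
    if not window:
        raise ValueError("window must contain at least one tuple")
    values = [t[field] for t in window]
    result = dict(window[-1])                      # lastval(*): last-processed value of every field
    result["MinHistSales"] = min(values)           # min(hist_qty_Clothing_WomensClothing)
    result["MaxHistSales"] = max(values)           # max(hist_qty_Clothing_WomensClothing)
    result["AvgHistSales"] = sum(values) / len(values)  # avg(hist_qty_Clothing_WomensClothing)
    return result


# Hypothetical window of three tuples (values are made up for illustration).
window = [
    {"Time": 0.0, "hist_qty_Clothing_WomensClothing": 2},
    {"Time": 0.5, "hist_qty_Clothing_WomensClothing": 6},
    {"Time": 1.0, "hist_qty_Clothing_WomensClothing": 4},
]
result = aggregate_window(window)
print(result["MinHistSales"], result["MaxHistSales"], result["AvgHistSales"])  # 2 6 4.0
```

However many tuples the window holds, the sketch returns exactly one result, which mirrors the single emission on window close described in the note.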

3. Save

Studio shows an asterisk on the tab of StreamingAnalytics.sbapp. This is a visual

indicator that there are changes not yet saved.

Select File > Save from the Studio menu bar.

4. Run/Test

Run your StreamingAnalytics.sbapp (Run As > EventFlow Fragment or Run As > Run

Configurations…)

Select the feed simulation and click Run

Allow the feed simulation to run until you see a tuple output on the HighValueOutput

stream.

Tip: To avoid being visually overwhelmed by many tuples, use the Stream: dropdown in the Output Streams view to display a single output stream.

The application is set up to only allow High Values to pass through the Filter. Then, once

through the Filter, the values are aggregated over periods of time (5 seconds).

Once you see three or more tuples output on the HighValueOutput stream, pause the feed simulation. Click one of the HighValueOutput tuples in the Output Streams view to show its fields in the field pane.

Analysis: We configured the feedsim to produce 2 tuples per second, and we configured the Aggregate operator to keep each window open for 5 seconds. Thus, there should be 10 tuples processed by each window created by this Aggregate operator. The average value of hist_qty_Clothing_WomensClothing for those 10 tuples is 0.25, the maximum is 1, and the minimum is 0.
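The count of 10 tuples per window can be checked with a few lines of Python. This is only an arithmetic sketch under two assumptions not stated in the lab: tuples are evenly spaced, and the first tuple's timestamp coincides with the moment the window opens. The function name timestamps_in_first_window is hypothetical.

```python
def timestamps_in_first_window(rate_per_second=2, window_seconds=5):
    """List the offsets (in seconds) of evenly spaced tuples that fall in [0, window_seconds)."""
    period = 1.0 / rate_per_second
    stamps = []
    i = 0
    while i * period < window_seconds:
        stamps.append(i * period)
        i += 1
    return stamps


stamps = timestamps_in_first_window()
print(len(stamps), stamps[-1])  # 10 4.5
```

The tuple at offset 5.0 is not counted here; in this model it belongs to the next window, which is why the total is 10 rather than 11.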

Because we are using a data file to feed the feed simulation, and because we

are using the timestamp assigned to the tuple as the determination for the

size of the window (vs using wall clock), we should all have the same results.

The same 10 tuples should be in my Aggregate window as are in your

aggregate window.

At much higher tuple rates or volumes, batching and buffering between the feedsim client and the StreamBase server might cause the 5-second windowing results to vary relative to the system clock sample we took in the AddTime Map. For truly repeatable aggregation results over time, the timestamps have to come from outside the StreamBase application: they have to be part of the arriving input stream data.
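The repeatability argument can be illustrated with a small Python sketch. It is a toy model rather than StreamBase behavior: each tuple carries a hypothetical event_time taken from the input data and a hypothetical arrival_time that includes a buffering delay, and windows are simply numbered in fixed 5-second steps from the earliest timestamp. All names and delay values are assumptions chosen for illustration.

```python
def window_index_by(tuples, key, size_seconds=5.0):
    """Map each tuple id to a window number computed from the chosen timestamp key."""
    first = min(t[key] for t in tuples)
    return {t["id"]: int((t[key] - first) // size_seconds) for t in tuples}


def make_run(delays):
    """Build one run: event times are fixed in the data, arrival times vary with delay."""
    return [
        {"id": i, "event_time": i * 0.5, "arrival_time": i * 0.5 + d}
        for i, d in enumerate(delays)
    ]


run_a = make_run([0.0] * 12)               # no buffering delay
run_b = make_run([0.0] * 9 + [0.7] * 3)    # last three tuples delayed by 0.7 s

# Timestamps carried in the data give identical windows in both runs.
same_by_event = window_index_by(run_a, "event_time") == window_index_by(run_b, "event_time")
# Timestamps sampled on arrival do not: tuple 9 moves to the next window in run_b.
same_by_arrival = window_index_by(run_a, "arrival_time") == window_index_by(run_b, "arrival_time")
print(same_by_event, same_by_arrival)  # True False
```

In this toy model, the only tuple whose window changes is the one delayed across a 5-second boundary, which is enough to change the min, max, and average of two adjacent windows.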

Click the double red square icon on the Toolbar along the top of Studio (Terminate EventFlow Fragment)