lab e streambase aggregate operator - tibco software · 2020. 10. 25. · lab e: aggregate operator...
Lab E: Aggregate Operator
TIBCO Software Inc. Page 1
Lab E StreamBase Aggregate Operator
Overview: In this lab, you extend your StreamBase application with an Aggregate operator. As the
application runs, you will aggregate the values sent to compute an average, maximum, and
minimum of one of the fields in the stream.
Objectives: Configure an Aggregate operator
Prerequisites: Completion of Lab D, or access to the Lab D solution:
LabD-FeedSimLabSolution.zip
Directions: Complete the exercises that follow
Developing EventFlow Applications with TIBCO Streaming 10.x
Page 2
Exercise 1: Add an Aggregate
Steps
1. Add Aggregate Operator
Return to StreamingAnalytics.sbapp
From the Operators and Adapters drawer of the Palette view, drag an
Aggregate operator onto the canvas and drop it exactly on top of the arc between
the HighValues Filter and the HighValueOutput stream
Analysis: If you drop the new operator directly onto the arc between the
HighValues Filter and the HighValueOutput output stream, the new
operator is inserted between those two existing components.
If you drop anywhere else on the canvas, you will need to:
- disconnect the arc from Filter to output,
- reconnect Filter output to Aggregate input, and
- connect the Aggregate’s output port to the HighValueOutput input port
Tip: It can be a little tricky to place the dropped component exactly on
the arc – zoom in on the Editor canvas to increase your chances.
Rearrange the components on the canvas (Ctrl-L, or the equivalent button on the Toolbar)
When the Aggregate is connected and the canvas re-arranged, you should see this:
2. Configure the Aggregate operator
Set the focus on the Aggregate operator to configure its properties:
Dimensions tab:
♦ Click Add…
Name: TimeDim
Type: Select Field from the Type pulldown
Do not select Time from the Type pulldown
Field: Select Time from the Field pulldown
Opening Policy: Select the Do not open window based on this
dimension radio button
Window size: Select Close and emit after and enter 5 seconds
Emission policy: Select No intermediate emissions based on this
dimension
Optional windows: Select Open only a single window for the first event
or following a gap in values
♦ Click OK
Note: These settings cause non-overlapping windows to be opened and
closed for every five seconds’ worth of tuples. The time values that
control the windows’ behavior come from the Time field of the input
tuple – time is not sampled from the system clock (often referred to as
wall clock time) in this configuration.
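The windowing behavior described in this Note can be sketched in Python. This is only a simplified model under stated assumptions, not the Aggregate operator's actual implementation: the function name tumbling_windows and the sample Time values are hypothetical, and the sketch assumes that a window opens at the first Time value seen and that the first value at or beyond five seconds later closes it and opens the next one. The real operator may align window boundaries differently.

```python
from typing import Iterable, List, Optional


def tumbling_windows(times: Iterable[float], size: float = 5.0) -> List[List[float]]:
    """Group ordered Time field values into consecutive, non-overlapping windows.

    Assumption: a window starts at the first value it receives and covers
    [start, start + size). Time comes from the tuple, never from the system clock.
    """
    windows: List[List[float]] = []
    window_start: Optional[float] = None
    for t in times:
        # Open a window for the first value, or when the current window has expired.
        if window_start is None or t >= window_start + size:
            window_start = t
            windows.append([])
        windows[-1].append(t)
    return windows


# Hypothetical Time values (in seconds) carried by the input tuples.
sample_times = [0.0, 1.5, 4.9, 5.0, 7.2, 12.0]
print(tumbling_windows(sample_times))
# [[0.0, 1.5, 4.9], [5.0, 7.2], [12.0]]
```

Because the grouping depends only on the Time values in the tuples, replaying the same input produces the same windows regardless of how fast the tuples arrive.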
Aggregate Functions Tab:
♦ There should already be a row in the Additional Expressions grid that looks like
the following:
Action: Add
Field name: *
Expression: lastval(*)
This setting causes the Aggregate to emit, for every input field,
the most recent value processed in the window (lastval). Thus,
the output stream will have fields named
resp_any_Clothing_WomensClothing,
hist_qty_Clothing_Accessories, etc. No matter how many tuples
fall into this window, the last-processed value for each field is
what will be sent further downstream.
♦ Click the green + to add an additional expression
Action: Add
Field name: MinHistSales
Expression: min(hist_qty_Clothing_WomensClothing)
♦ Click the green + to add an additional expression
Action: Add
Field name: MaxHistSales
Expression: max(hist_qty_Clothing_WomensClothing)
♦ Click the green + to add an additional expression
Action: Add
Field name: AvgHistSales
Expression: avg(hist_qty_Clothing_WomensClothing)
Group Options Tab:
♦ Leave empty
Concurrency Tab:
♦ Leave as-is
Note: These settings cause the min, max, and average values to be
computed over all the values of the field
hist_qty_Clothing_WomensClothing that are processed by a window. For
these property settings, the Aggregate operator emits a result tuple
containing the calculated values when the window closes.
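As a rough illustration of what these four expressions compute when a window closes, the following Python sketch mimics lastval(*), min, max, and avg over the tuples collected in one window. It is a sketch under assumptions, not StreamBase code: the function aggregate_window and the sample rows are hypothetical, and the values are made up for illustration rather than taken from the lab's data file.

```python
from statistics import mean
from typing import Any, Dict, List

FIELD = "hist_qty_Clothing_WomensClothing"


def aggregate_window(rows: List[Dict[str, Any]], field: str = FIELD) -> Dict[str, Any]:
    """Build one result tuple from all tuples processed by a single window."""
    if not rows:
        raise ValueError("a window must contain at least one tuple")
    # lastval(*): start from a copy of the last tuple processed in the window.
    result = dict(rows[-1])
    values = [row[field] for row in rows]
    result["MinHistSales"] = min(values)   # min(hist_qty_Clothing_WomensClothing)
    result["MaxHistSales"] = max(values)   # max(hist_qty_Clothing_WomensClothing)
    result["AvgHistSales"] = mean(values)  # avg(hist_qty_Clothing_WomensClothing)
    return result


# Hypothetical tuples that fell into one window (illustrative values only).
window_rows = [
    {"Time": 0.0, FIELD: 0.0},
    {"Time": 1.0, FIELD: 1.0},
    {"Time": 2.0, FIELD: 0.0},
    {"Time": 3.0, FIELD: 0.0},
]
emitted = aggregate_window(window_rows)
print(emitted["MinHistSales"], emitted["MaxHistSales"], emitted["AvgHistSales"])
# 0.0 1.0 0.25
```

Only one result tuple is produced per window, which matches the "No intermediate emissions" setting chosen on the Dimensions tab.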
3. Save
Studio shows an asterisk on the tab of StreamingAnalytics.sbapp. This is a visual
indicator that there are changes not yet saved.
Select File > Save from the Studio menu bar.
4. Run/Test
Run your StreamingAnalytics.sbapp (Run As > EventFlow Fragment or Run As > Run
Configurations…)
Select the feed simulation and click Run
Allow the feed simulation to run until you see a tuple output on the HighValueOutput
stream.
Tip: To avoid being visually overwhelmed by many tuples, use the
Stream: dropdown in the Output Streams view to display a single
output stream.
The application is set up to only allow High Values to pass through the Filter. Then, once
through the Filter, the values are aggregated over periods of time (5 seconds).
Once you see three or more tuples output on the HighValueOutput stream, pause the
feed simulation. Click on one of the HighValueOutput tuples in the Output Streams view
to show its fields in the field pane:
Analysis: We configured the feedsim to produce 2 tuples per second. We
configured the Aggregate operator to open a window for 5 seconds. Thus,
there should be 10 tuples processed by each window created by this
Aggregate operator. For those 10 tuples, the average value of
hist_qty_Clothing_WomensClothing is 0.25, the maximum is 1, and the
minimum is 0.
Because we are using a data file to feed the feed simulation, and because we
are using the timestamp assigned to the tuple to determine the size of the
window (rather than the wall clock), we should all have the same results.
The same 10 tuples should be in my Aggregate window as are in your
Aggregate window.
At much higher tuple rates or volumes, there might be batching and
buffering going on between the feedsim client and the StreamBase server
that would cause the 5 second windowing results to vary versus the system
clock sample we took in the AddTime Map. For truly repeatable aggregation
results over time, the timestamps have to come from outside the
StreamBase application – they have to be part of the arriving input stream
data.
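The two points in this analysis (10 tuples per window at 2 tuples per second, and repeatability when the timestamps travel with the data) can be checked with a small Python sketch. It is a simplified model under assumptions: the function window_counts is hypothetical, windows are aligned to multiples of five seconds for simplicity, and the 0.7-second delay is an invented stand-in for batching or buffering, not a measured value.

```python
from typing import Dict, Iterable


def window_counts(times: Iterable[float], size: float = 5.0) -> Dict[int, int]:
    """Count tuples per window, bucketing each time value into [k*size, (k+1)*size)."""
    counts: Dict[int, int] = {}
    for t in times:
        index = int(t // size)
        counts[index] = counts.get(index, 0) + 1
    return counts


rate = 2  # tuples per second, as configured in the feed simulation
# Timestamps carried in the data: 0.0, 0.5, ..., 14.5 (15 seconds of input).
event_times = [i / rate for i in range(30)]

# Assumed disturbance: every tenth tuple is delayed by 0.7 s before a clock
# inside the application would stamp it.
sampled_times = [t + (0.7 if i % 10 == 9 else 0.0) for i, t in enumerate(event_times)]

by_event_time = window_counts(event_times)
by_sampled_time = window_counts(sampled_times)
print(by_event_time)    # {0: 10, 1: 10, 2: 10}
print(by_sampled_time)  # {0: 9, 1: 10, 2: 10, 3: 1}
```

Windowing on the timestamps in the data always yields rate times window size, here 2 x 5 = 10 tuples per window, while windowing on a clock sampled after a delay can shift tuples into neighboring windows.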
Click the double red square icon on the Toolbar along the top of Studio (Terminate
EventFlow Fragment)