Microsoft StreamInsight

SQL Server 2008 R2 StreamInsight. Speaker: Mark Simms, Microsoft SQLCAT. Silicon Valley SQL Server User Group, May 2010. Mark Ginnebaugh, User Group Leader, [email protected]

Upload: mark-ginnebaugh

Post on 18-Nov-2014


DESCRIPTION

Microsoft StreamInsight, part of the recent SQL Server 2008 R2 release, is a new platform for building rich applications that can process high volumes of event stream data with near-zero latency. Mark Simms of Microsoft's SQLCAT will demonstrate the core skill sets and technologies needed to deliver StreamInsight-enabled solutions, and discuss some of the core scenarios. Mark will provide a detailed walkthrough of the three major components of StreamInsight: input and output adapters, the StreamInsight engine runtime, and the semantics of the continuous standing queries hosted in the StreamInsight engine. This presentation includes hands-on demos, including building out a real-time data processing solution that interacts with SQL Server and SharePoint.

You will learn:

• The new capabilities StreamInsight brings to data processing and analytics, unlocking the ability to extract real-time business intelligence from streaming data.

• How StreamInsight interacts with and complements other components of SQL Server and the rest of the Microsoft technology stack.

• How to ramp up on the skills and technology necessary to build out end-to-end solutions that use streaming data sources.

TRANSCRIPT

Page 1: Microsoft StreamInsight

SQL Server 2008 R2

StreamInsight

Speaker: Mark Simms

Microsoft SQLCAT

Silicon Valley SQL Server User Group

May, 2010

Mark Ginnebaugh, User Group Leader,

[email protected]

Page 3: Microsoft StreamInsight
Page 4: Microsoft StreamInsight

[Figure: load rate in facts/sec. (100, 1000, 10000, 100000) plotted against time of interest (years, months, days, hrs, min, sec, present). Regions shown: Traditional DW Analytics, Active DW Analytics. Axis note: load time in ETL.]

Custom-built solutions that carry huge development and customization costs.

The load barrier is dictated by the current choices of the solution, e.g., loading into databases or persisting into files. This is intrinsic because, in current approaches, no processing can be done until the data is loaded.

Page 5: Microsoft StreamInsight

Analytical results need to reflect important changes in business reality immediately and enable responses to them with minimal latency.

                  Database Applications              Event-driven Applications
Query Paradigm    Ad-hoc queries or requests         Continuous standing queries
Latency           Seconds, hours, days               Milliseconds or less
Data Rate         Hundreds of events/sec             Tens of thousands of events/sec or more
Query Semantics   Declarative relational analytics   Declarative relational and temporal analytics

[Diagram: database applications exchange a request and a response; event-driven applications consume an input stream of events and produce an output stream.]

Page 6: Microsoft StreamInsight

StreamInsight Target Scenarios

[Figure: application areas positioned by latency and data rate. Vertical axis: latency (months, days, hours, minutes, seconds, 100 ms, < 1 ms). Horizontal axis: aggregate data rate in events/sec. (0, 10, 100, 1000, 10000, 100000, ~1 million). Areas shown: Relational Database Applications; Data Warehousing Applications; Operational Analytics Applications, e.g., Logistics, etc.; Web Analytics Applications; Manufacturing Applications; Monitoring Applications; Financial Trading Applications.]

Page 7: Microsoft StreamInsight

[Diagram: data streams from four kinds of sources flow into the StreamInsight Engine, which feeds downstream uses.]

Asset Instrumentation for Data Acquisition, Subscriptions to Data Feeds

Power Utilities:
• Energy consumption
• Outages
• Smart grids
• 100,000 events/sec

Web Analytics:
• Click-stream data
• Online customer behavior
• Page layout
• 100,000 events/sec

Manufacturing:
• Sensor on plant floor
• React through device controllers
• Aggregated data
• 10,000 events/sec

Financial Services:
• Stock & news feeds
• Algorithmic trading
• Patterns over time
• Super-low latency
• 100,000 events/sec

StreamInsight Engine
• Threshold queries
• Event correlation from multiple sources
• Pattern queries

Lookup: Asset Specs & Parameters

Stream Data Store & Archive

Outputs:
• Visual trend-line and KPI monitoring
• Batch & product management
• Automated anomaly detection
• Real-time customer segmentation
• Algorithmic trading
• Proactive condition-based maintenance

Page 8: Microsoft StreamInsight

Industry trends

• Data acquisition costs are negligible

• Raw storage costs are small and continue to decrease

• Processing costs are non-negligible

• Data loading costs continue to be significant

StreamInsight advantage

• Process data incrementally, i.e., while it is in flight

• Avoid loading while still doing the processing you want

• Seamless querying for monitoring, managing and mining

[Diagram: cycle of Monitor KPIs, Record raw data (history), Mine historical data, Devise new KPIs, Manage business via KPI-triggered actions.]

Page 9: Microsoft StreamInsight

StreamInsight Application at Runtime

[Diagram: Event sources, then Input Adapters, then StreamInsight Engine (Standing Queries, Query Logic), then Output Adapters, then Event targets.]

Event sources:
• Devices, Sensors
• Web servers
• Event stores & Databases
• Stock ticker, news feeds

Event targets:
• Pagers & Monitoring devices
• KPI Dashboards, SharePoint UI
• Trading stations
• Event stores & Databases

StreamInsight Application Development

Page 10: Microsoft StreamInsight
Page 11: Microsoft StreamInsight
Page 12: Microsoft StreamInsight
Page 13: Microsoft StreamInsight
Page 14: Microsoft StreamInsight
Page 15: Microsoft StreamInsight
Page 16: Microsoft StreamInsight

SELECT COUNT(*) FROM ParkingLot
WHERE type = 'AUTO'
AND color = 'RED'

Page 17: Microsoft StreamInsight

red cars

last hour

Doesn’t seem like a great solution…

Page 18: Microsoft StreamInsight

This is the streaming data paradigm in a nutshell: ask questions about data in flight.
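To make "asking questions about data in flight" concrete, here is a minimal plain-Python sketch of a standing query that keeps a running answer to "how many red cars in the last hour?" as each event arrives, instead of storing everything and querying later. It does not use the StreamInsight API; the class name `RedCarsLastHour`, the event shape, and the assumption that events arrive in timestamp order are all illustrative.

```python
from collections import deque


class RedCarsLastHour:
    """Standing query (illustrative): red cars seen in the last hour."""

    WINDOW_SECONDS = 3600

    def __init__(self):
        # Arrival times of red cars that are still inside the window.
        self._timestamps = deque()

    def on_event(self, timestamp, color):
        """Feed one car event (timestamp in seconds); return the current count.

        Assumes events arrive in non-decreasing timestamp order.
        """
        if color == "RED":
            self._timestamps.append(timestamp)
        # Expire red cars that are now an hour old or older.
        cutoff = timestamp - self.WINDOW_SECONDS
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)


query = RedCarsLastHour()
events = [(0, "RED"), (1800, "BLUE"), (2400, "RED"), (3600, "RED"), (6100, "BLUE")]
answers = [query.on_event(t, c) for t, c in events]
print(answers)
```

The answer is updated incrementally on every event, so nothing has to be loaded into a table before the question can be asked.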

Page 19: Microsoft StreamInsight

Engine

Adapters

Queries

Extensions

Host

Visual debugger

API

Page 20: Microsoft StreamInsight

expressed

question

data

Page 21: Microsoft StreamInsight

Tell me just the color of each car that passes.

var result = from car in carStream
             select new
             {
                 car.Color
             };

Page 22: Microsoft StreamInsight

Give me only trucks.

var result = from car in carStream
             where car.Type == "Truck"
             select car;

Page 23: Microsoft StreamInsight

Tell me the number of cars passed every 10 seconds.

var result = from win in carStream.TumblingWindow(
                 TimeSpan.FromSeconds(10))
             select new
             {
                 count = win.Count()
             };
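The tumbling-window count above can be pictured with a small plain-Python simulation. This is only a sketch of the windowing idea, not the StreamInsight engine: the function name `tumbling_counts`, the use of seconds as plain numbers, and aligning windows to multiples of the window size are assumptions made for illustration.

```python
from collections import Counter


def tumbling_counts(timestamps, window_seconds=10):
    """Count events per fixed, non-overlapping window.

    Returns {window_start: count}; windows cover [start, start + window_seconds).
    """
    counts = Counter()
    for ts in timestamps:
        # Each event belongs to exactly one window.
        window_start = int(ts // window_seconds) * window_seconds
        counts[window_start] += 1
    return dict(sorted(counts.items()))


car_arrivals = [1.0, 4.5, 9.9, 10.0, 12.3, 31.7]
window_counts = tumbling_counts(car_arrivals)
print(window_counts)
```

In this sketch a window that contains no events (here, 20 to 30 seconds) simply produces no entry.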

Page 24: Microsoft StreamInsight

var result = from win in carStream.TumblingWindow(
                 TimeSpan.FromSeconds(10))
             select new
             {
                 count = win.Count()
             };

Page 25: Microsoft StreamInsight
Page 26: Microsoft StreamInsight

Count the number of cars for each make separately every 10 seconds.

var result = from car in carStream
             group car by car.make into eachGroup
             from win in eachGroup.TumblingWindow(
                 TimeSpan.FromSeconds(10))
             select new
             {
                 make = eachGroup.Key,
                 count = win.Count()
             };
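Grouping before windowing means each make gets its own independent count per window. A plain-Python sketch of that behavior follows; it is not StreamInsight code, and the function name `counts_per_make`, the tuple event shape, and the sample makes are assumptions chosen for illustration.

```python
from collections import defaultdict


def counts_per_make(events, window_seconds=10):
    """events: iterable of (timestamp_seconds, make).

    Returns {(window_start, make): count}, one counter per make per window.
    """
    counts = defaultdict(int)
    for ts, make in events:
        window_start = int(ts // window_seconds) * window_seconds
        counts[(window_start, make)] += 1
    return dict(counts)


cars = [(0.5, "Ford"), (3.0, "Toyota"), (7.2, "Ford"), (11.0, "Toyota"), (19.9, "Toyota")]
result = counts_per_make(cars)
for (window_start, make), count in sorted(result.items()):
    print(window_start, make, count)
```

The composite key (window start, make) plays the role that the group key plus the window plays in the query above.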

Page 27: Microsoft StreamInsight

application time

Current Time Indicators

Page 28: Microsoft StreamInsight
Page 29: Microsoft StreamInsight
Page 30: Microsoft StreamInsight

public void EnqueueEvent(SourceData d)
{
    var ev = CreateInsertEvent();
    ev.Payload = new MouseEvent { Id = d.id, Value = d.value };
    ev.StartTime = d.timestamp;
    Enqueue(ref ev);
}

Page 31: Microsoft StreamInsight

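public void EnqueueEvent(SourceData d)
{
    // The slide shows only "if AdapterState ... return"; the exact condition is
    // truncated, so the check for the Running state is an assumed completion.
    if (AdapterState != AdapterState.Running)
        return;
    var ev = CreateInsertEvent();
    ev.Payload = new MouseEvent { Id = d.id, Value = d.value };
    ev.StartTime = d.timestamp;
    Enqueue(ref ev);
}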

Page 32: Microsoft StreamInsight

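public void EnqueueEvent(SourceData d)
{
    // The slide shows only "if AdapterState ... return"; the exact condition is
    // truncated, so the check for the Running state is an assumed completion.
    if (AdapterState != AdapterState.Running)
        return;
    var ev = CreateInsertEvent();
    // No event could be allocated, so there is nothing to enqueue.
    if (ev == null) return;
    ev.Payload = new MouseEvent { Id = d.id, Value = d.value };
    ev.StartTime = d.timestamp;
    Enqueue(ref ev);
}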

Page 33: Microsoft StreamInsight

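public void EnqueueEvent(SourceData d)
{
    // The slide shows only "if AdapterState ... return"; the exact condition is
    // truncated, so the check for the Running state is an assumed completion.
    if (AdapterState != AdapterState.Running)
        return;
    var ev = CreateInsertEvent();
    // No event could be allocated, so there is nothing to enqueue.
    if (ev == null) return;
    ev.Payload = new MouseEvent { Id = d.id, Value = d.value };
    ev.StartTime = d.timestamp;
    // The engine's queue is full: signal readiness to resume and back off.
    if (Enqueue(ref ev) == EnqueueOperationResult.Full)
    {
        Ready();
        return;
    }
}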
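The three guards built up on the preceding slides (adapter state, event allocation, a full queue) amount to back-pressure handling between a producer and a bounded queue. The following plain-Python sketch shows that general pattern only; it is not the StreamInsight adapter API, and `SketchInputAdapter`, its `pending` slot, and `drain_one` are hypothetical names introduced for illustration.

```python
import queue


class SketchInputAdapter:
    """Hypothetical stand-in for an input adapter feeding a bounded queue."""

    def __init__(self, capacity):
        self._queue = queue.Queue(maxsize=capacity)
        self.running = True
        self.pending = None  # event held back while the queue is full

    def enqueue_event(self, payload, timestamp):
        """Return True if the event was queued, False if it was refused or held."""
        if not self.running:
            return False  # guard 1: do nothing unless the adapter is running
        event = {"payload": payload, "start_time": timestamp}
        try:
            self._queue.put_nowait(event)
        except queue.Full:
            # guard 3: the queue is full, so keep the event and back off
            self.pending = event
            return False
        return True

    def drain_one(self):
        """Simulate the consumer taking one event, then retry a held-back event."""
        item = self._queue.get_nowait()
        if self.pending is not None:
            self._queue.put_nowait(self.pending)
            self.pending = None
        return item


adapter = SketchInputAdapter(capacity=2)
outcomes = [adapter.enqueue_event(v, t) for t, v in enumerate([10, 20, 30])]
first = adapter.drain_one()
print(outcomes, first["payload"])
```

Holding on to the refused event, rather than discarding it, is a design choice of this sketch so that no data is lost once the consumer catches up.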

Page 34: Microsoft StreamInsight

Use them wisely!

Page 35: Microsoft StreamInsight

public class TimeWeightedAverage : CepTimeSensitiveAggregate<double, double>
{
    public override double GenerateOutput(IEnumerable<IntervalEvent<double>> events,
                                          WindowDescriptor windowDescriptor)
    {
        // Sum of each payload weighted by the duration of its event.
        double avg = 0;
        foreach (IntervalEvent<double> ev in events)
        {
            avg += ev.Payload * (ev.EndTime - ev.StartTime).Ticks;
        }
        // Normalize by the duration of the window.
        return avg / (windowDescriptor.EndTime - windowDescriptor.StartTime).Ticks;
    }
}
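The arithmetic of the time-weighted average can be checked with a small plain-Python sketch. It is not the StreamInsight extension API: the function name `time_weighted_average`, the `(start, end, value)` tuple shape, and the explicit clipping of each interval to the window are assumptions of this example (the slide code relies on the event lifetimes it is given).

```python
def time_weighted_average(events, window_start, window_end):
    """events: iterable of (start, end, value) with times in seconds.

    Each value is weighted by how long its interval overlaps the window,
    then the sum is divided by the window length.
    """
    weighted_sum = 0.0
    for start, end, value in events:
        # Assumption: clip the interval to the window before weighting.
        overlap = min(end, window_end) - max(start, window_start)
        if overlap > 0:
            weighted_sum += value * overlap
    return weighted_sum / (window_end - window_start)


# A reading of 10.0 held for 4 seconds, then 20.0 held for 6 seconds.
readings = [(0, 4, 10.0), (4, 10, 20.0)]
twa = time_weighted_average(readings, 0, 10)
print(twa)
```

Here the result is (10.0 × 4 + 20.0 × 6) / 10 = 16.0, whereas a plain average of the two readings would give 15.0; the longer-lived value counts for more.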

Page 36: Microsoft StreamInsight
Page 37: Microsoft StreamInsight

To learn more or inquire about speaking opportunities, please contact:

Mark Ginnebaugh, User Group Leader

[email protected]