
Spark the future.

May 4 – 8, 2015, Chicago, IL

Operational Analytics in SQL Server
Sunil Agarwal
Principal Program Manager, SQL Server

BRK4552

Agenda

• Definition and Value Prop
• Operational Analytics with Disk-Based Tables
• Operational Analytics with In-Memory OLTP

What is Operational?

Refers to Operational Workload (i.e. OLTP)

Examples
• Enterprise Resource Planning (ERP): Inventory, Order, Sales
• Machine Data: data from machine operations on the factory floor
• Online Stores (e.g. Amazon, Expedia)
• Stock/Security trades

Mission Critical
• No downtime (High Availability): downtime impacts revenue
• Low latency and high transaction throughput

What is Analytics?

Analytics
• Studying past data (e.g. operational, social media) to identify potential trends
• Analyzing the effects of certain decisions or events (e.g. an ad campaign)
• Analyzing past/current data to predict outcomes (e.g. credit score)

Goals
• Enhance the business by gaining knowledge to make improvements or changes.

Source: MIT Sloan Management Review

Traditional Operational/Analytics Architecture

[Diagram: Presentation Layer and Application Tier (IIS Server) on a SQL Server Database; ETL (hourly, daily, weekly) into a SQL Server Relational DW Database; SQL Server Analysis Server; BI and analytics]

Key Issues
• Complex Implementation
• Requires two Servers (CapEx and OpEx)
• Data Latency in Analytics
• More businesses demand/require real-time Analytics

Minimizing Data Latency for Analytics

[Diagram: Presentation Layer and Application Tier (IIS Server) on a SQL Server Database; SQL Server Analysis Server and BI and analytics read from the same database]

Benefits
• No Data Latency
• No ETL
• No Separate DW

Challenges
• Analytics queries are resource intensive and can cause blocking
• How to minimize impact on Operational workload
• Sub-optimal execution of Analytics on relational schema
• Add analytics specific indexes

This is OPERATIONAL ANALYTICS

Operational Analytics

Ability to run Analytics Queries concurrently with Operational workload using the same schema.

Not a replacement for
• Extreme analytics query performance, which is possible using customized schemas (e.g. Star/Snowflake) and pre-aggregated cubes
• Data coming from non-relational sources
• Data coming from multiple relational sources requiring integrated analytics

Operational Analytics: SQL Server 2016

SQL Server 2016: Operational Analytics

Goals
• Minimal impact on Operational Workload with concurrent analytics
• Performant Analytics on operational schema

Achieved using columnstore Index

Columnstore Index: Why?

Data stored as rows
• Ideal for OLTP
• Efficient operation on small set of rows

Data stored as columns (C1 C2 C3 C4 C5)
• Ideal for DW Workload
• Improved compression: data from the same domain compresses better
• Reduced I/O: fetch only columns needed
• Improved Performance: more data fits in memory; optimized for CPU utilization
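As a rough illustration of the compression and I/O points on this slide, the sketch below stores the same rows row-wise and column-wise, aggregates one column, and run-length encodes a low-cardinality column. It is not SQL Server's storage format; the rows, column names, and the `run_length_encode` helper are assumptions made up for this example.

```python
from itertools import groupby

# Hypothetical order rows: (order_id, status, amount).
rows = [
    (1, "SHIPPED", 10.0),
    (2, "SHIPPED", 25.0),
    (3, "SHIPPED", 5.0),
    (4, "OPEN", 40.0),
    (5, "OPEN", 15.0),
]

# Column-wise layout: one list per column.
columns = {
    "order_id": [r[0] for r in rows],
    "status": [r[1] for r in rows],
    "amount": [r[2] for r in rows],
}


def run_length_encode(values):
    """Collapse consecutive equal values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Reduced I/O: an aggregate over one column touches only that column's values,
# while a row-wise scan touches every field of every row.
values_read_columnar = len(columns["amount"])
values_read_rowwise = sum(len(r) for r in rows)
total_amount = sum(columns["amount"])

# Improved compression: values from the same domain repeat and encode compactly.
encoded_status = run_length_encode(columns["status"])
```

Run-length encoding is used here only because it is easy to read; the slide does not say which encodings the columnstore applies.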


Clustered Columnstore Performance: TPC-H

Operational Analytics: Disk-Based Tables

Operational Analytics: With Columnstore Index

Key Points
• Create an updateable nonclustered columnstore index (NCCI) for analytics queries
• Drop all other indexes that were created for analytics
• No application changes
• Columnstore index is maintained just like any other index
• Query Optimizer will choose the columnstore index where needed

[Diagram: Relational Table (Clustered Index/Heap) with a Btree Index and a Nonclustered columnstore index (NCCI); the NCCI has a Delete bitmap and Delta rowgroups]

Operational Analytics: Columnstore Index Overhead
DML Operations on OLTP workload

Operation | BTREE (NCI) | Nonclustered Columnstore Index (NCCI)
Insert | Insert row into btree | Insert row into btree (delta store)
Delete | (a) Seek row(s) to be deleted (b) Delete the row | (a) Seek for the row in the delta stores (there can be multiple) (b) If row found, then delete (c) Otherwise insert the key into delete row buffer
Update | (a) Seek the row(s) (b) Update | (a) Delete the row (steps same as above) (b) Insert the updated row into delta store
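The NCCI column of the table above can be sketched as a toy model. This is only an illustration of the listed steps, not SQL Server internals: the `NcciModel` class, its attribute names, and the use of plain dictionaries in place of btrees are all assumptions.

```python
class NcciModel:
    """Toy model of the NCCI steps in the table above (not SQL Server internals)."""

    def __init__(self, compressed_rows=None):
        # Rows already compressed into columnstore rowgroups, keyed by row key.
        self.compressed = dict(compressed_rows or {})
        # One or more delta stores (btrees on the slide; plain dicts here).
        self.delta_stores = [{}]
        # Keys of compressed rows that are logically deleted.
        self.delete_buffer = set()

    def insert(self, key, row):
        # Insert: the row goes into a delta store.
        self.delta_stores[-1][key] = row

    def delete(self, key):
        # (a) Seek the row in every delta store; (b) delete it if found.
        for store in self.delta_stores:
            if key in store:
                del store[key]
                return
        # (c) Otherwise record the key in the delete buffer.
        self.delete_buffer.add(key)

    def update(self, key, row):
        # Update = delete (same steps as above) + insert into the delta store.
        self.delete(key)
        self.insert(key, row)

    def scan(self):
        # Analytics scan: compressed rows minus deleted keys, plus delta rows.
        live = {k: v for k, v in self.compressed.items()
                if k not in self.delete_buffer}
        for store in self.delta_stores:
            live.update(store)
        return live


index = NcciModel({1: "a", 2: "b"})
index.insert(3, "c")
index.delete(3)        # found in a delta store, removed there
index.delete(1)        # compressed row: key goes to the delete buffer
index.update(2, "B")   # delete-buffer entry plus a new delta row
```

The extra cost relative to a btree index shows up in `delete`: a row that is not in any delta store cannot be removed in place, so its key is recorded and filtered out at scan time.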


Operational Analytics: Minimizing Columnstore overhead

Key Points
• Create Columnstore only on cold data, using a filtered predicate to minimize maintenance
• Analytics query accesses both columnstore and 'hot' data transparently
• Example: Order Management Application: create nonclustered columnstore index ….. where order_status = 'SHIPPED'

[Diagram: DML Operations against a Relational Table (Clustered Index/Heap) with a Btree Index over HOT data and a filtered Nonclustered columnstore index (NCCI); the NCCI has a Delete bitmap and Delta rowgroups]

Operational Analytics: Minimizing Columnstore overhead

Key Points
• Mission Critical Operational Workloads are typically configured for High Availability using AlwaysOn Availability Groups
• You can offload analytics to a readable secondary replica

[Diagram: AlwaysOn Availability Group; the Primary Replica ships Log records to three Secondary Replicas; Analytic Queries run against a Secondary Replica]

Demo

Sunil Agarwal

Operational Analytics: In-Memory Tables

Operational Analytics: Columnstore on In-Memory Tables

SQL Server 2016 CTP2 limitation
• You can create columnstore index on empty table
• All columns must be included in the columnstore

No explicit delta rowgroup
• Rows (tail) not in columnstore stay in the in-memory OLTP table
• No columnstore index overhead when operating on tail
• Background task migrates rows from tail to columnstore in chunks of 1 million rows not changed in the last 1 hour

Deleted Rows Table (DRT)
• Tracks deleted rows

Columnstore data fully resident in memory
• Persisted together with operational data
• No application changes required
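The migration rule on this slide (rows unchanged for the last hour, moved in chunks of 1 million) can be sketched as a selection function. The function name, the `tail` mapping of row key to last-modified time, the oldest-first ordering, and the choice to wait for a full chunk are all assumptions of this sketch; the slide states only the chunk size and the one-hour threshold.

```python
from datetime import datetime, timedelta

CHUNK_ROWS = 1_000_000            # chunk size stated on the slide
COLD_AFTER = timedelta(hours=1)   # "not changed in last 1 hour"


def select_rows_to_migrate(tail, now, chunk_rows=CHUNK_ROWS, cold_after=COLD_AFTER):
    """Return keys of tail rows to migrate, or [] if no full chunk is available.

    `tail` maps row key -> last-modified time. Waiting for a full chunk and
    picking the oldest rows first are assumptions of this sketch.
    """
    cold = sorted(
        (key for key, changed in tail.items() if now - changed >= cold_after),
        key=tail.get,
    )
    if len(cold) < chunk_rows:
        return []
    return cold[:chunk_rows]


# Small chunk size so the example is easy to follow.
now = datetime(2015, 5, 4, 12, 0)
tail = {
    "r1": now - timedelta(hours=3),
    "r2": now - timedelta(minutes=5),   # changed recently, stays in the tail
    "r3": now - timedelta(hours=2),
}
picked = select_rows_to_migrate(tail, now, chunk_rows=2)
not_enough = select_rows_to_migrate(tail, now, chunk_rows=3)
```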

[Diagram: In-Memory OLTP Table with Hash Index and Range Index; Updateable CCI with DRT; Hot Tail (like Delta rowgroup)]

Operational Analytics: Columnstore Overhead
DML Operations on In-Memory OLTP

Operation | Hash or Range Index | HK-CCI
Insert | Insert row into HK | Insert row into HK
Delete | (a) Seek row(s) to be deleted (b) Delete the row | (a) Seek row(s) to be deleted (b) Delete the row in HK (c) If row in TAIL then return, else insert <colstore-RID> into DRT
Update | (a) Seek the row(s) (b) Update (delete/insert) | (a) Seek the row(s) (b) Update (delete/insert) in HK (c) If row in TAIL then return, else insert <colstore-RID> into DRT
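The HK-CCI column of the table above can be modeled in a few lines. This is a toy illustration of the listed steps only: the `HkCciModel` class, its attributes, and the explicit `migrate` call standing in for the background task are assumptions, not the engine's actual structures.

```python
class HkCciModel:
    """Toy model of the HK-CCI steps in the table above (not SQL Server internals)."""

    def __init__(self):
        self.table = {}          # in-memory OLTP table: key -> row
        self.colstore_rid = {}   # key -> columnstore row id, migrated rows only
        self.drt = set()         # Deleted Rows Table: columnstore row ids
        self._next_rid = 0

    def insert(self, key, row):
        # Insert: the row goes into the in-memory table and starts in the tail.
        self.table[key] = row

    def migrate(self, key):
        # Stand-in for the background task: give a tail row a columnstore RID.
        self.colstore_rid[key] = self._next_rid
        self._next_rid += 1

    def delete(self, key):
        # (a) Seek and (b) delete the row in the in-memory table.
        del self.table[key]
        # (c) If the row was only in the tail, return; else add its RID to the DRT.
        rid = self.colstore_rid.pop(key, None)
        if rid is not None:
            self.drt.add(rid)

    def update(self, key, row):
        # Update = delete + insert; the new version starts in the tail.
        self.delete(key)
        self.insert(key, row)


t = HkCciModel()
t.insert("a", 1)
t.insert("b", 2)
t.migrate("a")
t.update("b", 20)   # tail row: no DRT entry
t.update("a", 10)   # migrated row: RID 0 goes to the DRT
```

Operations on tail rows never touch the DRT, which matches the slide's point that there is no columnstore overhead while a row is still hot.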


Operational Analytics: Minimizing Columnstore overhead

[Diagram: DML Operations against an In-Memory OLTP Table with Hash Index and Range Index; Updateable CCI with DRT; Hot Tail (like Delta rowgroup)]

DML Operations
• Keep only hot data in in-memory tables
• Example: data stays hot for 1 day, 1 week…

CTP2 Work-Around
• Use trace flag (TF) 9975 to disable auto-compression
• Force compression using the stored procedure sp_memory_optimized_cs_migration

Analytics Queries
• Offload Analytics to AlwaysOn Readable Secondary

Summary – Operational Analytics

• Analytics in real-time with no data latency
• Rich set of options to control impact on Operational workload
• Industry leading solution integrating in-memory OLTP with in-memory Analytics
• No Application changes required


© 2015 Microsoft Corporation. All rights reserved.


Operational Analytics: with CCI

[Diagram: CCI with delta rowgroup; Btree Index over HOT DATA]

SQL V.Next
• Allows creating one or more NCIs on CCI
• Allows row-level locking for updates/deletes
• Ability to delay compression of rows in delta rowgroup

Some Limitations
• No Triggers
• No transactional replication
• Cursor based access not allowed

Comparison with NCCI
• Seek of the row in the compressed store is comparatively expensive
• Short-range scans comparatively expensive
• Lower concurrency: while a rowgroup is being compressed by the tuple mover (TM), it is not available for UPDATE/DELETE