Petabytes for Peanuts! Making sense of “Ambient Data”

Petabytes for Peanuts! Making sense of “Ambient Data” Dave Campbell & Friends Microsoft Corporation SVC04



Page 1: Petabytes for Peanuts!  Making sense of “Ambient Data”

Petabytes for Peanuts! Making sense of “Ambient Data”
Dave Campbell & Friends
Microsoft Corporation

SVC04

Page 2: Petabytes for Peanuts!  Making sense of “Ambient Data”

Key Takeaways…
> Massive shift in how we process data
> Incredible data volumes
> Remaking how we discover
> Changing the Scientific Method
> Reducing latency & impedance
> Extreme Scale Data Processing
> Stream Processing (Several Views)
> From “programs” to “queries”

> What’s up with this “anti-SQL” stuff anyhow?

Page 3: Petabytes for Peanuts!  Making sense of “Ambient Data”

“Free” Storage Power

1982: Storage Cost: ~$2,000; Transfer Time: 1 day
1997: Storage Cost: ~$1.00; Transfer Time: ½ hour
2009: Storage Cost: ~0.1¢; Transfer Time: 8 sec.

Page 4: Petabytes for Peanuts!  Making sense of “Ambient Data”

Ambient Data?

Over 84 percent of Americans have cell phones, according to Steve Largent, president and CEO of CTIA. Two trillion minutes were used in 2007, an 18 percent increase over 2006 talk times.

More than 48 billion text messages were sent in the month of December 2007, an average 1.6 billion messages per day. The rate of text messaging represented a 157 percent increase over December 2006 texting. http://www.clickz.com/3628985

Text Message Traffic in US: 160GB / day, 58TB / year

Voice traffic in US (GSM encoding): 200PB / year
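The traffic figures above can be reproduced with a back-of-envelope calculation. The sketch below is not from the slides: it assumes roughly 100 bytes per text message and the 13 kbit/s GSM full-rate voice codec, uses decimal units, and the function names are made up for illustration.

```python
# Back-of-envelope check of the slide's traffic figures.
# Assumptions (not stated on the slide): ~100 bytes per text message,
# 13 kbit/s GSM full-rate voice, decimal units (1 GB = 1e9 bytes).

MESSAGES_PER_DAY = 1.6e9        # from the slide
BYTES_PER_MESSAGE = 100         # assumed average message size
VOICE_MINUTES_PER_YEAR = 2e12   # from the slide (2007 talk time)
GSM_BITS_PER_SECOND = 13_000    # assumed codec rate (GSM full rate)


def text_bytes_per_day():
    """Daily text message volume in bytes."""
    return MESSAGES_PER_DAY * BYTES_PER_MESSAGE


def text_bytes_per_year():
    """Yearly text message volume in bytes, assuming 365 days."""
    return text_bytes_per_day() * 365


def voice_bytes_per_year():
    """Yearly voice volume in bytes at the assumed codec rate."""
    bytes_per_minute = GSM_BITS_PER_SECOND / 8 * 60
    return VOICE_MINUTES_PER_YEAR * bytes_per_minute


if __name__ == "__main__":
    print(f"text:  {text_bytes_per_day() / 1e9:.0f} GB/day")
    print(f"text:  {text_bytes_per_year() / 1e12:.1f} TB/year")
    print(f"voice: {voice_bytes_per_year() / 1e15:.0f} PB/year")
```

Under these assumptions the results come out at 160 GB per day and about 58 TB per year for text, and about 195 PB per year for voice, which the slide rounds to 200PB.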

Page 5: Petabytes for Peanuts!  Making sense of “Ambient Data”

The Old World
> Data volumes constrained by human typing speed
> App & Data formed closed system

App
DB

Assume 200M people in US typing 8 hr / day @ 10K keystrokes / hour:

2TB/hr or ~6PB / year
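The typing estimate can be checked the same way. This is a minimal sketch that assumes one byte per keystroke and typing on all 365 days of the year; neither assumption is stated on the slide, and the function names are illustrative.

```python
# Upper bound on human-typed data, using the slide's inputs.
# Assumptions (not on the slide): 1 byte per keystroke, 365 typing days.

PEOPLE = 200e6                  # from the slide
HOURS_PER_DAY = 8               # from the slide
KEYSTROKES_PER_HOUR = 10_000    # from the slide
BYTES_PER_KEYSTROKE = 1         # assumed


def typed_bytes_per_hour():
    """Bytes produced per hour if everyone types at once."""
    return PEOPLE * KEYSTROKES_PER_HOUR * BYTES_PER_KEYSTROKE


def typed_bytes_per_year():
    """Bytes produced per year at 8 typing hours per day."""
    return typed_bytes_per_hour() * HOURS_PER_DAY * 365


if __name__ == "__main__":
    print(f"{typed_bytes_per_hour() / 1e12:.0f} TB/hour")
    print(f"{typed_bytes_per_year() / 1e15:.2f} PB/year")
```

With these inputs the hourly figure is 2 TB and the yearly figure is 5.84 PB, consistent with the slide's "~6PB / year".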

Page 6: Petabytes for Peanuts!  Making sense of “Ambient Data”

The Old New World

Available Data
Questions to Answer
Design Schema
Design ETL
DW Nirvana!

Available data exploded
What data should we throw out?
What if we have a new question?

Page 7: Petabytes for Peanuts!  Making sense of “Ambient Data”

The New World of Abundant Data

Save All Available Data
New Question to Answer
Algorithmic Processing

Interesting Read: The Petabyte Age: Because More Isn't Just More — More Is Different
http://www.wired.com/science/discoveries/magazine/16-07/pb_intro

Hypothesize Theorize Test

Correlation is Enough!

Run “query” over data…
Analyze reduced data
Exploit Correlation…

The CMS front end of the Large Hadron Collider records 1TB/sec!

http://blogs.discovermagazine.com/cosmicvariance/2006/09/27/lhc-factoids/

Page 8: Petabytes for Peanuts!  Making sense of “Ambient Data”

Analyze Model Monitor

1. Event Stream both stored and processed
2. Analysis produces event correlation models
3. Models installed in event processing engine
4. Produce real time alerts and action

Diagram labels: Analysis, Event Stream, Correlation Model, Event Processing Engine, Alerts & Action
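The four steps can be sketched in a few lines of Python. This is an illustration only, not StreamInsight code: the "model" here is a simple mean and standard deviation threshold standing in for the event correlation model on the slide, and `build_model`, `process_stream` and the sample values are all hypothetical.

```python
from statistics import mean, stdev


def build_model(stored_events, k=3.0):
    """Step 2: derive a model from the stored copy of the stream.

    A threshold band of k standard deviations around the mean is a
    stand-in for a real correlation model.
    """
    mu = mean(stored_events)
    sigma = stdev(stored_events)
    return {"low": mu - k * sigma, "high": mu + k * sigma}


def process_stream(events, model):
    """Steps 3 and 4: apply the installed model to live events.

    Yields (index, value) for every event outside the model's band.
    """
    for index, value in enumerate(events):
        if not (model["low"] <= value <= model["high"]):
            yield (index, value)


if __name__ == "__main__":
    # Step 1: the stream is stored as well as processed.
    history = [10.0, 11.0, 9.0, 10.5, 9.5, 10.0]
    model = build_model(history)
    live = [10.2, 9.8, 25.0, 10.1]
    for index, value in process_stream(live, model):
        print(f"alert: event {index} has value {value}")
```

Because `process_stream` is a generator, alerts are produced as events arrive, without waiting for the stream to end.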

Page 9: Petabytes for Peanuts!  Making sense of “Ambient Data”

StreamInsight

Roman Schindlauer
Program Manager
SQL Data Stream Engine

demo

Page 10: Petabytes for Peanuts!  Making sense of “Ambient Data”

Extreme Scale Data Processing

[Diagram comparing two pipelines]

Traditional Data Warehouse: Source (x5), ETL, DW, Analysis / Reporting
1. Majority of data filtered or discarded

Extreme Scale Data Processing: Source (x2), Non-traditional Sources, Analysis, DW, Analysis / Reporting
2. All data retained and reprocessed
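The difference between the two pipelines can be illustrated with a toy example. In the sketch below, which assumes hypothetical event fields (`user`, `action`, `ms`) and made-up function names, the ETL-style path keeps only the aggregate its schema was designed for, while the retained raw events can still answer a question nobody planned for.

```python
# Hypothetical raw events; the field names are illustrative only.
RAW_EVENTS = [
    {"user": "a", "action": "view", "ms": 120},
    {"user": "b", "action": "view", "ms": 950},
    {"user": "a", "action": "buy", "ms": 300},
    {"user": "c", "action": "view", "ms": 1100},
]


def etl_summary(events):
    """Traditional path: load only a per-action count, discard the rest."""
    counts = {}
    for event in events:
        counts[event["action"]] = counts.get(event["action"], 0) + 1
    return counts


def query(events, predicate, project):
    """Extreme-scale path: run an ad hoc query over the retained events."""
    return [project(event) for event in events if predicate(event)]


if __name__ == "__main__":
    # The warehouse can answer only the question it was designed for.
    print(etl_summary(RAW_EVENTS))

    # A new question ("which users saw slow page views?") needs the
    # latency field, which the summary above has already thrown away.
    slow_users = query(
        RAW_EVENTS,
        lambda e: e["action"] == "view" and e["ms"] > 500,
        lambda e: e["user"],
    )
    print(slow_users)
```

The `query` helper takes a predicate and a projection, which mirrors the "from programs to queries" idea: the new question is expressed as data passed to a generic operator rather than as a new pipeline.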

Page 11: Petabytes for Peanuts!  Making sense of “Ambient Data”

LINQ to “whatever”…

Erik Meijer
Architect (& more…)
BPD Cloud Programmability Team

demo

Page 12: Petabytes for Peanuts!  Making sense of “Ambient Data”

YOUR FEEDBACK IS IMPORTANT TO US!
Please fill out session evaluation forms online at MicrosoftPDC.com

Page 13: Petabytes for Peanuts!  Making sense of “Ambient Data”

Learn More On Channel 9
> Expand your PDC experience through Channel 9
> Explore videos, hands-on labs, sample code and demos through the new Channel 9 training courses

channel9.msdn.com/learn
Built by Developers for Developers….

Page 14: Petabytes for Peanuts!  Making sense of “Ambient Data”

© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
