Think Big - How to Design a Big Data Information Architecture

Upload: inside-analysis

Post on 03-Jun-2015

Category: Technology


DESCRIPTION

Exploratory Webcast for the Big Data Information Architecture Research Project. Live Webcast Jan. 22, 2014. Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=32304b307fc5359a2f97b173166ea07b

Big Data is everywhere -- that's for sure. But the big question for today's savvy enterprise is where, exactly, should it fit within the Information Architecture? Making that decision correctly can save a lot of money while adding significant value to any number of enterprise operations. Business processes can be improved with critical new data sets; marketing can excel at hitting the right targets quickly; sales can hit home runs by having a much deeper understanding of key prospects; and senior executives can see the big picture more clearly than ever before.

Register for this Exploratory Webcast to hear veteran Analyst Dr. Robin Bloor outline the current landscape of Big Data, and offer guidance for today's organizations to determine how, when and where to deploy this powerful if unwieldy information asset.

This event will kick off The Bloor Group's Interactive Research Report for 2014, which will focus on illuminating optimal Big Data Information Architectures. The series will include a dozen interviews with today's Big Data visionaries, plus three interactive Webcasts and a detailed findings report. Visit InsideAnalysis.com for more information.

TRANSCRIPT

Page 1: Think Big - How to Design a Big Data Information Architecture

Grab some coffee and enjoy the pre-show banter before the top of the hour!

Page 2: Think Big - How to Design a Big Data Information Architecture

“Think Big: How to Design a Big Data Information Architecture” Exploratory Webcast | January 22, 2014

Page 3: Think Big - How to Design a Big Data Information Architecture

Guests

Robin Bloor, Chief Analyst, The Bloor Group, @robinbloor

Eric Kavanagh, CEO, The Bloor Group, @eric_kavanagh

Page 4: Think Big - How to Design a Big Data Information Architecture

Big Data Information Architecture

Exploratory Webcast: January 22, 2014

Roundtable Webcast: April 9, 2014

Findings Webcast: June 25, 2014

#BigDataArch

Page 5: Think Big - How to Design a Big Data Information Architecture
Page 6: Think Big - How to Design a Big Data Information Architecture

Big Data Information Architecture

Page 7: Think Big - How to Design a Big Data Information Architecture

In Three Segments

PART ONE: The Big Data Curve?

PART TWO: Technology Disruption

PART THREE: Data Flow

Page 8: Think Big - How to Design a Big Data Information Architecture

Part 1: The Big Data Curve

Page 9: Think Big - How to Design a Big Data Information Architecture

The Visible “Big Data” Trend

• Corporate data volumes grow at about 55% per annum - exponentially

• Data has been growing at this rate for, maybe, 40 years

• There is nothing new about big data. It clings to an established exponential trend
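
To make the 55% figure concrete, the short sketch below converts it into a doubling time and a cumulative growth factor. It assumes steady compound growth at exactly the rate and duration quoted on the slide, which the slide itself only states approximately; the function names are illustrative.

```python
import math

ANNUAL_GROWTH = 0.55  # "about 55% per annum", as quoted on the slide
YEARS = 40            # "maybe, 40 years", as quoted on the slide


def doubling_time_years(rate: float) -> float:
    """Years for a quantity compounding at `rate` per year to double."""
    return math.log(2) / math.log(1 + rate)


def growth_factor(rate: float, years: float) -> float:
    """Total multiplication after `years` of compound growth at `rate`."""
    return (1 + rate) ** years


if __name__ == "__main__":
    print(f"Doubling time: {doubling_time_years(ANNUAL_GROWTH):.2f} years")
    print(f"Growth over {YEARS} years: {growth_factor(ANNUAL_GROWTH, YEARS):.2e}x")
```

Under these assumptions the volume doubles roughly every 1.6 years, and 40 years of compounding multiplies it by a factor in the tens of millions.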

Page 10: Think Big - How to Design a Big Data Information Architecture

The Invisible Trend: Moore’s Law Cubed

• The biggest databases are new databases

• They grow at the cube of Moore’s Law

• Moore’s Law = 10x every 6 years

• VLDB: 1000x every 6 years
  – 1991/2 megabytes
  – 1997/8 gigabytes
  – 2003/4 terabytes
  – 2009/10 petabytes
  – 2015/16 exabytes
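
The arithmetic behind "Moore's Law cubed" can be checked with a few lines. The sketch below uses only the two factors quoted on the slide (10x and 1000x per six years); the helper names and the choice of 1991 as the starting year are illustrative assumptions.

```python
MOORE_FACTOR_PER_6_YEARS = 10    # from the slide
VLDB_FACTOR_PER_6_YEARS = 1000   # from the slide: 10 cubed

# Each step of 1000x moves the largest databases up one unit prefix.
UNITS = ["megabytes", "gigabytes", "terabytes", "petabytes", "exabytes"]


def annual_factor(factor: float, years: float) -> float:
    """Equivalent compound growth factor per year."""
    return factor ** (1 / years)


def vldb_scale(start_year: int = 1991, step_years: int = 6):
    """Pair each unit with the first year of its six-year window."""
    return [(start_year + i * step_years, unit) for i, unit in enumerate(UNITS)]


if __name__ == "__main__":
    print(f"Moore's Law per year: {annual_factor(MOORE_FACTOR_PER_6_YEARS, 6):.2f}x")
    print(f"VLDB per year:        {annual_factor(VLDB_FACTOR_PER_6_YEARS, 6):.2f}x")
    for year, unit in vldb_scale():
        print(year, unit)
```

Expressed per year, 10x every six years is a factor of about 1.47, while 1000x every six years is about 3.16, so the largest databases would more than triple annually if the slide's trend holds.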

Page 11: Think Big - How to Design a Big Data Information Architecture

Technology Evolution (Bloor Curve)

[Figure: Bloor Curve. Labels: “The Area of As-Yet-Unrealized Applications”, “Application Migration”. Source: The Bloor Group]

Page 12: Think Big - How to Design a Big Data Information Architecture

The Traditional Force of Disruption

• Software architectures change: centralized, C/S, 3 tier/web, SOA, etc.

• Applications migrate according to latencies

• Dominant applications and software brands can die via “The innovator’s dilemma”

• Wholly new applications appear because of lower latencies, e.g., VMs, CEP

[Figure: Bloor Curve. Labels: “The Area of As-Yet-Unrealized Applications”, “Application Migration”. Source: The Bloor Group]

Page 13: Think Big - How to Design a Big Data Information Architecture

This Curve is Compromised

[Figure: Bloor Curve. Labels: “The Area of As-Yet-Unrealized Applications”, “Application Migration”. Source: The Bloor Group]

Two DISRUPTIVE forces have changed the curve: PARALLELISM and the CLOUD

Page 14: Think Big - How to Design a Big Data Information Architecture

It’s not really about

Big Data???

It’s about

Page 15: Think Big - How to Design a Big Data Information Architecture

Part 2: Technology Disruption

Page 16: Think Big - How to Design a Big Data Information Architecture

It’s Over for Spinning Disk

• SSD is now on the Moore’s Law curve

• Disk is not and never was (in respect of seek time)

• All traditional databases were engineered for spinning disk and not for scale-out

• This explains the new DBMS products…

Page 17: Think Big - How to Design a Big Data Information Architecture

In-Memory Disruption

• Memory may gradually become the primary store for data (this impacts data flows)

• Almost all applications are poorly built for this

• Memory is an accelerator – as is CPU cache. This is becoming a factor

Page 18: Think Big - How to Design a Big Data Information Architecture

The Memory Cascade

• On-chip speed vs RAM
  – L1 (32K) = 100x
  – L2 (256K) = 30x
  – L3 (8-20MB) = 8.6x

• RAM vs SSD
  – RAM = 300x

• SSD vs disk
  – SSD = 10x

Note: Vector instructions and data compression
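
The ratios above are quoted tier by tier. As a rough illustration, the sketch below chains them to express every tier relative to spinning disk. Simply multiplying the ratios is an assumption made here for illustration; the slide's figures are approximate and real gains depend on the access pattern.

```python
# Ratios quoted on the slide; each means "times faster than the tier below".
SSD_VS_DISK = 10
RAM_VS_SSD = 300
CACHE_VS_RAM = {"L1": 100, "L2": 30, "L3": 8.6}


def speed_relative_to_disk() -> dict:
    """Chain the per-tier ratios so that spinning disk is the baseline (1x)."""
    ssd = float(SSD_VS_DISK)
    ram = RAM_VS_SSD * ssd
    tiers = {"disk": 1.0, "SSD": ssd, "RAM": ram}
    for name, ratio in CACHE_VS_RAM.items():
        # Cache ratios are quoted against RAM, so scale them by RAM's ratio.
        tiers[name] = ratio * ram
    return tiers


if __name__ == "__main__":
    for tier, factor in sorted(speed_relative_to_disk().items(), key=lambda kv: kv[1]):
        print(f"{tier:>4}: {factor:>12,.0f}x disk")
```

On these numbers RAM comes out around 3,000 times faster than disk and L1 cache around 300,000 times faster, which is why the deck treats memory and cache as architectural factors.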

Page 19: Think Big - How to Design a Big Data Information Architecture

Tech Revolutions

TECH REVOLUTION      ARCHITECTURE
Computer             Batch
On-line              Centralized
PC                   Client/server
Internet             Multi-tier
Mobile               Service Orientation
Internet of things   Event Driven/Big Data

Page 20: Think Big - How to Design a Big Data Information Architecture

Event Driven/Big Data Architecture?

Page 21: Think Big - How to Design a Big Data Information Architecture

The Open Source Picture

• The R Language
  – Over 1 million users

• Hadoop and its Ecosystem
  – Reduced latency for analytics

• Machine Learning Algorithms
  – Raw power

None of these are engineered for performance

Page 22: Think Big - How to Design a Big Data Information Architecture

Part 3: Data Flow

Page 23: Think Big - How to Design a Big Data Information Architecture

What Is A Data Scientist?

• Project manager

• Qualified statistician

• Domain business expert

• Experienced data architect

• Software engineer

(IT’S A TEAM)

Page 24: Think Big - How to Design a Big Data Information Architecture

A Process, Not an Activity

• Data Analytics is a multi-disciplinary end-to-end process

• Until recently it was a walled garden. But recently the walls were torn down by…
  – Data availability
  – Scalable technology
  – Open source tools

Page 25: Think Big - How to Design a Big Data Information Architecture

The CRITICAL Workload Issue

• Previously, we viewed database workloads as an I/O optimization problem

• With analytics the workload is a highly variable mix of I/O and calculation

• No databases were built precisely for this – not even Big Data databases

Page 26: Think Big - How to Design a Big Data Information Architecture

Take Note

You can know more about a BUSINESS from its data than by any other means

Page 27: Think Big - How to Design a Big Data Information Architecture

The Biological System

• Our human control system works at different speeds:
  – Almost instant reflex
  – Swift response
  – Considered response

• Organizations will gradually implement similar control systems

• This suggests a data-flow-based architecture

Page 28: Think Big - How to Design a Big Data Information Architecture

The Corporate Biological System

• Right now this division into two different data flows is already occurring

• Currently we can distinguish between:
  – Real-time/Business time applications
  – Analytical applications

• We should build specific architectures for this
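
One way to picture the two data flows is a router that handles each event twice: once immediately, for the real-time flow, and once by retaining it for later analysis. The sketch below is a minimal illustration under that assumption, not a design taken from the webcast; the `Event` fields, the `TwoFlowRouter` class and the alert rule are all hypothetical names.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical event record; the field names are illustrative only."""
    source: str
    kind: str
    value: float


Rule = Callable[[Event], Optional[str]]


@dataclass
class TwoFlowRouter:
    rules: List[Rule]
    actions: List[str] = field(default_factory=list)    # output of the real-time flow
    history: List[Event] = field(default_factory=list)  # input to the analytical flow

    def publish(self, event: Event) -> None:
        # Real-time flow: evaluate every rule as soon as the event arrives.
        for rule in self.rules:
            action = rule(event)
            if action is not None:
                self.actions.append(action)
        # Analytical flow: keep the event so it can be analysed later.
        self.history.append(event)

    def average_by_kind(self) -> Dict[str, float]:
        """A 'considered response': aggregate over everything retained so far."""
        grouped: Dict[str, List[float]] = {}
        for e in self.history:
            grouped.setdefault(e.kind, []).append(e.value)
        return {kind: mean(values) for kind, values in grouped.items()}


def large_payment_alert(event: Event) -> Optional[str]:
    """Hypothetical reflex rule: flag unusually large payments at once."""
    if event.kind == "payment" and event.value > 10_000:
        return f"review payment from {event.source}"
    return None


router = TwoFlowRouter(rules=[large_payment_alert])
for e in [
    Event("web", "payment", 120.0),
    Event("web", "payment", 25_000.0),
    Event("pos", "refund", 40.0),
]:
    router.publish(e)

print(router.actions)
print(router.average_by_kind())
```

Keeping the two paths separate means the fast path never waits on the aggregate, and the analytical path can be rebuilt or rerun without touching the rules.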

Page 29: Think Big - How to Design a Big Data Information Architecture

Some Architectural Principles

• The new atom of data is the event

• SUSO, scale up before scale out

• Take the processing to the data, if you can

• Hadoop is a component, not a solution
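
"Take the processing to the data" can be illustrated with a partial aggregation: each node summarises its own partition and only the small summaries travel to a coordinator. The sketch below is an assumption-laden toy, with in-memory lists standing in for nodes and hypothetical names throughout; it is not tied to Hadoop or to any product named in the webcast.

```python
from typing import Dict, List, Tuple

Partition = List[float]
Summary = Tuple[float, int]  # (sum, count)


def local_summary(partition: Partition) -> Summary:
    """Runs where the data lives and returns only two numbers."""
    return (sum(partition), len(partition))


def combine(summaries: List[Summary]) -> float:
    """Runs at the coordinator; computes the global mean from the summaries."""
    total = sum(s for s, _ in summaries)
    count = sum(n for _, n in summaries)
    if count == 0:
        raise ValueError("no data in any partition")
    return total / count


# Hypothetical partitions standing in for data held on two separate nodes.
partitions: Dict[str, Partition] = {
    "node-a": [float(i) for i in range(1000)],
    "node-b": [float(i) for i in range(1000, 3000)],
}

summaries = [local_summary(p) for p in partitions.values()]
global_mean = combine(summaries)

# Values crossing the network under each approach.
values_if_data_moved = sum(len(p) for p in partitions.values())
values_if_processing_moved = 2 * len(summaries)

print(global_mean, values_if_data_moved, values_if_processing_moved)
```

In this toy case shipping the computation moves 4 numbers instead of 3,000. The "if you can" on the slide matters: the trick works here because a mean decomposes into sums and counts, which is not true of every calculation.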

Page 30: Think Big - How to Design a Big Data Information Architecture

In Conclusion

PART ONE: The Big Data Curve?

PART TWO: Technology Disruption

PART THREE: Data Flow

Page 31: Think Big - How to Design a Big Data Information Architecture

Questions?

#BigDataArch or USE THE Q&A

Page 32: Think Big - How to Design a Big Data Information Architecture

THANK YOU!

REGISTER FOR BDIA WEBCASTS AT: http://insideanalysis.com/research/big-data-information-architecture