bdia findings

Grab some coffee and enjoy the pre-show banter before the top of the hour!

Upload: inside-analysis

Post on 16-Jul-2015

TRANSCRIPT

“Making Way For Big Data” Findings Webcast | June 25, 2014

Featuring

Eric Kavanagh CEO, The Bloor Group

Robin Bloor Chief Analyst, The Bloor Group

Findings Webcast June 25, 2014

Big Data Information Architecture

Roundtable Webcast April 9, 2014

Exploratory Webcast January 22, 2014

#BigDataArch

The Sequence of Topics

The Great Disruption

Events

New Architectures for OLD

1. The Great Disruption

Moore’s Law Cubed

• The biggest databases are NEW databases

• They grow at the cube of Moore’s Law

• Moore’s Law = 10x every 6 years

• VLDB: 1000x every 6 years
  • 1991/2: megabytes
  • 1997/8: gigabytes
  • 2003/4: terabytes
  • 2009/10: petabytes
  • 2015/16: exabytes
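The arithmetic behind "Moore's Law cubed" can be checked with a short sketch. It assumes only the two rates quoted on the slide (10x and 1000x per six years); the function name `vldb_scale` and the 1991 base year are illustrative choices, not part of the original deck.

```python
# Growth factors per six-year period, as quoted on the slide.
MOORE_FACTOR_PER_6Y = 10                       # "Moore's Law = 10x every 6 years"
VLDB_FACTOR_PER_6Y = MOORE_FACTOR_PER_6Y ** 3  # the cube: 1000x every 6 years

# Each 1000x step moves the largest databases up one unit prefix.
UNITS = ["megabytes", "gigabytes", "terabytes", "petabytes", "exabytes"]


def annual_factor(six_year_factor: float) -> float:
    """Convert a six-year growth factor into the equivalent yearly factor."""
    return six_year_factor ** (1 / 6)


def vldb_scale(year: int, base_year: int = 1991) -> str:
    """Return the unit of the largest databases in a given year.

    Hypothetical helper: counts whole six-year steps since base_year and
    clamps the result to the range of units listed on the slide.
    """
    steps = (year - base_year) // 6
    steps = max(0, min(steps, len(UNITS) - 1))
    return UNITS[steps]


if __name__ == "__main__":
    for year in (1991, 1997, 2003, 2009, 2015):
        print(year, vldb_scale(year))
    print("Moore's Law per year:", round(annual_factor(MOORE_FACTOR_PER_6Y), 2))
    print("VLDB growth per year:", round(annual_factor(VLDB_FACTOR_PER_6Y), 2))
```

Read as yearly rates, the slide's figures mean hardware improving by roughly 1.5x a year while the largest databases grow by roughly 3.2x a year.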

Technology Evolution

Observations…

• Software architectures change: centralized, C/S, 3 tier/web, SOA, etc.

• Applications migrate according to latencies

• Wholly new applications appear because of lower latencies, e.g., VMs, CEP

• THIS CURVE IS NO LONGER VALID…

Memory is Becoming Hierarchical Store

• On-chip speed v RAM
  • L1 (32K) = 100x
  • L2 (256K) = 30x
  • L3 (8-20MB) = 8.6x

• RAM v SSD
  • RAM = 300x

• SSD v Disk
  • SSD = 10x

• Disk will soon turn up its toes

Note: Vector instructions and data compression
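To see how the quoted ratios compound across the hierarchy, the sketch below chains them into a single scale relative to disk. The ratios are the ones on the slide; treating them as multiplicative from tier to tier is an assumption made for illustration, and real latencies vary by hardware.

```python
# Speed-ups quoted on the slide, each relative to the next tier down.
SSD_VS_DISK = 10
RAM_VS_SSD = 300
CACHE_VS_RAM = {"L1": 100, "L2": 30, "L3": 8.6}


def speed_relative_to_disk() -> dict:
    """Chain the per-tier ratios into one scale where disk = 1.

    Assumes the ratios multiply from tier to tier, which is a
    simplification of how real memory hierarchies behave.
    """
    ssd = float(SSD_VS_DISK)
    ram = ssd * RAM_VS_SSD
    tiers = {"disk": 1.0, "SSD": ssd, "RAM": ram}
    for name, factor in CACHE_VS_RAM.items():
        tiers[name] = ram * factor
    return tiers


if __name__ == "__main__":
    # Print the tiers from slowest to fastest.
    for tier, speed in sorted(speed_relative_to_disk().items(), key=lambda kv: kv[1]):
        print(f"{tier:>4}: {speed:>10,.0f}x disk")
```

On these figures RAM comes out about 3,000 times faster than disk and L1 cache about 300,000 times faster, which is the gap behind the "hierarchical store" framing.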

Putting a SoC in IT

• It’s possible that the CPU/Memory split will vanish, possibly soon

• This requires the emergence of the commodity SoC

• There are already ARM SoCs that run Linux

• Grids of SoCs would replace grids of servers

Parallelism: The Imp is Out of the Bottle

• Multicore chips enabled parallelism

• It has changed the whole performance equation

• It enabled Big Data

• Big Data is really Big Processing

2. Events

Tech Revolutions

TECH REVOLUTION        ARCHITECTURE
Computer               Batch
On-line                Centralized
PC                     Client/server
Internet               Multi-tier
Mobile                 Service orientation
Internet of things     Event driven/Big Data

Event Types

• Instantiation Event

• A State Report

• A Trigger Event

• A Correction Event

We also need to consider: Data Refinement | Aggregations | Homogeneous collections | Derived Data
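One way to make the four event types concrete is a small data model. The slide only names the types, so the semantics used here are assumptions for illustration: an instantiation creates an entity, state reports and corrections update its fields, and triggers are queued for action. All class, field and function names are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class EventType(Enum):
    """The four event types named on the slide."""
    INSTANTIATION = "instantiation"
    STATE_REPORT = "state report"
    TRIGGER = "trigger"
    CORRECTION = "correction"


@dataclass
class Event:
    entity_id: str
    event_type: EventType
    payload: dict = field(default_factory=dict)


def apply_event(state: dict, event: Event) -> None:
    """Fold one event into a per-entity state table (assumed semantics)."""
    if event.event_type is EventType.INSTANTIATION:
        # A new entity comes into existence with its initial fields.
        state[event.entity_id] = {"fields": dict(event.payload), "triggers": []}
        return
    entity = state[event.entity_id]  # raises KeyError if never instantiated
    if event.event_type in (EventType.STATE_REPORT, EventType.CORRECTION):
        # Reports add newer values; corrections overwrite wrong ones.
        entity["fields"].update(event.payload)
    elif event.event_type is EventType.TRIGGER:
        # Triggers request an action rather than change recorded state.
        entity["triggers"].append(event.payload)


if __name__ == "__main__":
    state = {}
    for ev in (
        Event("sensor-1", EventType.INSTANTIATION, {"temp": 20}),
        Event("sensor-1", EventType.STATE_REPORT, {"temp": 25}),
        Event("sensor-1", EventType.CORRECTION, {"temp": 24}),
        Event("sensor-1", EventType.TRIGGER, {"action": "alert"}),
    ):
        apply_event(state, ev)
    print(state)
```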

The Traffic Cop

The Evolution of Hadoop

• Hadoop is far too useful and popular to fade away

• YARN and Tez have changed the picture

• Hadoop will become the default scale out file system

• And a critical component of the DATA HUB

Hadoop as a Clip-On

3. New Architecture For Old?

Big Data and Analytics

There MAY be some Big Data applications that are not about data analytics.

If so, nobody is talking about them…

A Process, Not an Activity

• Data Analytics is a multi-disciplinary end-to-end process

• Until recently it was a walled garden. But the walls were torn down by:
  • Data availability
  • Scalable technology
  • Open source tools

Data Flow (The Paradox)

Our architectures need to cater for DATA FLOW, not data at rest

However, DO NOT MOVE THE DATA unless you absolutely have to
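The paradox can be illustrated by sending the computation to the data rather than the data to the computation. This is a minimal sketch under stated assumptions: `Partition` is a hypothetical stand-in for rows held on one node, and `run` represents executing a function where those rows live, so that only a small result crosses the network.

```python
class Partition:
    """Hypothetical stand-in for the rows stored on one node."""

    def __init__(self, node: str, rows: list):
        self.node = node
        self.rows = rows

    def run(self, fn):
        """Execute fn where the rows live; only its result leaves the node."""
        return fn(self.rows)


def partial_total(rows: list) -> float:
    """Work done locally on each node: sum the 'amount' column."""
    return sum(row["amount"] for row in rows)


def total_sales(partitions: list) -> float:
    """Combine one small partial result per node instead of moving raw rows."""
    return sum(p.run(partial_total) for p in partitions)


if __name__ == "__main__":
    partitions = [
        Partition("node-a", [{"amount": 10.0}, {"amount": 5.0}]),
        Partition("node-b", [{"amount": 7.5}]),
    ]
    print(total_sales(partitions))
```

The data still "flows" in the sense that results move downstream, but the bulk rows stay where they were written.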

The Corporate Data Flows

• There need to be two data flows (at minimum)

• Currently we can distinguish between:
  • Real-time/business-time applications
  • Analytical applications

• We will build specific architectures for each
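The split into two flows might be sketched as a simple router. The routing rule is an assumption: here every record is retained for the analytical flow, and records carrying a hypothetical `urgent` flag are also handed to a real-time handler immediately. The class and method names are illustrative only.

```python
from collections import deque


class TwoFlowRouter:
    """Route incoming records into a real-time flow and an analytical flow."""

    def __init__(self, on_realtime):
        self.on_realtime = on_realtime    # callback for business-time handling
        self.analytical_buffer = deque()  # records awaiting batch analysis

    def ingest(self, record: dict) -> None:
        # Assumed rule: urgent records are acted on at once...
        if record.get("urgent"):
            self.on_realtime(record)
        # ...and every record is kept for later analytical processing.
        self.analytical_buffer.append(record)

    def drain_batch(self) -> list:
        """Hand the buffered records to the analytical flow and reset."""
        batch = list(self.analytical_buffer)
        self.analytical_buffer.clear()
        return batch


if __name__ == "__main__":
    alerts = []
    router = TwoFlowRouter(on_realtime=alerts.append)
    router.ingest({"id": 1, "urgent": True})
    router.ingest({"id": 2})
    print("real-time:", alerts)
    print("analytical batch:", router.drain_batch())
```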

Data Flow

The role of Hadoop is as the STAGING AREA FOR REFINEMENT

And also as a SCALE-OUT FILE SYSTEM

A BDIA in Overview

Think Logical, Implement Physical

The BDIA: Two Data Flows

Within the Data Hub

The CRITICAL Workload Issue

• Previously, we viewed database workloads as an I/O optimization problem

• With the BDIA the workload is a variable MIX of I/O, transformation and calculation

• No databases were built precisely for this – not even Big Data databases
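The "variable mix" can be expressed as a simple profile of how a workload's time divides across the three kinds of work. This is only an illustrative model: the class name, the use of fractional shares summing to 1, and the `dominant` helper are assumptions, not something defined in the deck.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadMix:
    """Share of a workload spent on each kind of work (fractions of 1.0)."""
    io: float
    transformation: float
    calculation: float

    def __post_init__(self):
        total = self.io + self.transformation + self.calculation
        if abs(total - 1.0) > 1e-9:
            raise ValueError(f"shares must sum to 1.0, got {total}")

    def dominant(self) -> str:
        """Name the component that takes the largest share."""
        shares = {
            "io": self.io,
            "transformation": self.transformation,
            "calculation": self.calculation,
        }
        return max(shares, key=shares.get)


if __name__ == "__main__":
    # A classic database workload versus a hypothetical analytical one.
    print(WorkloadMix(io=0.8, transformation=0.1, calculation=0.1).dominant())
    print(WorkloadMix(io=0.2, transformation=0.3, calculation=0.5).dominant())
```

An engine tuned only for the first profile has no particular advantage on the second, which is the point the slide makes about existing databases.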

In Summary…

The Great Disruption

Events

New Architectures for OLD

Questions?

#BigDataArch or USE THE Q&A

THANK YOU!

FIND ALL BDIA WEBCASTS & RESEARCH AT: http://insideanalysis.com/research/big-data-information-architecture