BUSINESS DATA LAKE

FADI FAKHOURI, SR. SYSTEMS ENGINEER, ISILON SPECIALIST

© Copyright 2016 EMC Corporation. All rights reserved.


Post on 25-Jun-2018


UNSTRUCTURED DATA GROWTH

Total Capacity Shipped, Worldwide, and % of Unstructured Data (Source: IDC)

2015: 71 EB shipped, 75% unstructured

2016: 106 EB shipped, 78% unstructured

2017: 133 EB shipped, 80% unstructured
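The chart figures can be turned into absolute volumes with simple arithmetic. The following is a minimal sketch, assuming the three percentages pair with 2015, 2016, and 2017 in the order they appear on the slide; the function and variable names are illustrative and not part of the original deck.

```python
# Figures as printed on the slide (Source: IDC).
# Assumption: 75%, 78%, 80% belong to 2015, 2016, 2017 respectively.
SHIPPED_EB = {2015: 71, 2016: 106, 2017: 133}
UNSTRUCTURED_SHARE = {2015: 0.75, 2016: 0.78, 2017: 0.80}


def unstructured_eb(year):
    """Exabytes of unstructured data implied for the given year."""
    return SHIPPED_EB[year] * UNSTRUCTURED_SHARE[year]


def yoy_growth(year):
    """Year-over-year growth of total shipped capacity, as a fraction."""
    return SHIPPED_EB[year] / SHIPPED_EB[year - 1] - 1


if __name__ == "__main__":
    for year in sorted(SHIPPED_EB):
        print(year, round(unstructured_eb(year), 2), "EB unstructured")
```

Under that pairing, the implied unstructured volume is roughly 53 EB in 2015 and roughly 106 EB in 2017, about double in two years.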

SCALE-OUT DATA LAKES MAKE ANALYTICS EFFICIENT

A DATA LAKE LETS YOU BRING YOUR ANALYTICS TOOLS TO YOUR DATA. DATA IS SHARED BETWEEN PROJECTS WITH CENTRALIZED CONTROL & STANDARDIZED ANALYTICS TOOLING.

Ingest: Capture data from a wide range of sources, traditional and new.

Store: Store everything in one environment for cross data-set analysis.

Analyze: Use advanced algorithms to discover new, predictive patterns.

Surface: Share insight with business domain experts.

Act: Build data-driven applications that meet business needs.
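The five stages (Ingest, Store, Analyze, Surface, Act) can be pictured as a chain of functions over shared data. This is a toy sketch only: every function name and the record format are hypothetical, and the "analysis" is a simple count standing in for real algorithms.

```python
from functools import reduce


def ingest(sources):
    # Capture records from every source, traditional and new.
    return [record for source in sources for record in source]


def store(records):
    # Keep everything in one place so data sets can be cross-analyzed.
    return {"lake": list(records)}


def analyze(lake):
    # Stand-in for "advanced algorithms": count records per key.
    counts = {}
    for key, _value in lake["lake"]:
        counts[key] = counts.get(key, 0) + 1
    return counts


def surface(insight):
    # Present results in a ranked form a domain expert can read.
    return sorted(insight.items(), key=lambda kv: (-kv[1], kv[0]))


def act(report):
    # A data-driven application acts on the top finding, if any.
    return report[0][0] if report else None


STAGES = [ingest, store, analyze, surface, act]


def run_pipeline(sources):
    """Feed the output of each stage into the next one."""
    return reduce(lambda data, stage: stage(data), STAGES, sources)
```

Calling `run_pipeline([[("web", 1), ("crm", 2)], [("web", 3)]])` returns `"web"`, the most frequent key across both sources.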

THE BUSINESS DATA LAKE JOURNEY

START: DISPARATE DATA SILOS

STEP 1: CONSOLIDATE DATA

STEP 2: ADD HADOOP

STEP 3: IMPLEMENT ANALYTICS

STEP 4: INTEGRATE APP DEV

STEP 5: BUSINESS DATA LAKE

EMC BIG DATA STORAGE

Install Hadoop (OR multiple distros)

Implement Analytics (OR ALT. TOOLS)

PIVOTAL CLOUD FOUNDRY

BIG DATA VISION WORKSHOP

PROOF OF VALUE

CONSULTING FAST TRACK

TECHNOLOGY

APPS

ANALYTICS

EMC ENGINEERED SOFTWARE

A UNIQUE FEDERATION OF COMPANIES DELIVERING THE SOFTWARE-DEFINED ENTERPRISE. SOLUTIONS & CHOICE

BIG DATA SOLUTIONS

PLATFORM AS A SERVICE

AGILE APPLICATION DEVELOPMENT

ENTERPRISE MOBILITY

SOFTWARE-DEFINED DATA CENTER

INFORMATION INFRASTRUCTURE

CONVERGED INFRASTRUCTURE

PLATFORM AS A SERVICE

VIRTUAL WORKSPACE

BUSINESS DATA LAKE

SECURITY ANALYTICS

SOFTWARE DEFINED DATA CENTER

Partners

vCloud Hybrid Service

SERVICE PROVIDER

ENTERPRISE DATA CENTER

ADVANCED SECURITY

Data Lake Use Cases

Unstructured Data/Content

Archive/Compliance

VMware / Info Archive

application retirement

File Shares and Home directories

Cloud/Object

Video/Surveillance

Hadoop - Bigdata

Mobile Apps

Call Centre CVR

Splunk/M2M Log Files

SQL/DB Dumps

Broadcast/Content Streaming

Backup

VDI

BLOBS

Social Media Feeds

EDW – ETL Offload

HPC / Genomic Sequencing

NEXT-GEN ACCESS METHODS

FILE

HPC

Backup/Archive

Analytics

Mobile

File Shares

Cloud Apps

ISILON DATA LAKE – ENTERPRISE GRADE FEATURES

EMC ISILON SCALE-OUT NAS

DATA PROTECTION

DATA SECURITY

PERFORMANCE MANAGEMENT

DATA MANAGEMENT

Isilon Data Lake

S-Series, X-Series, NL-Series, HD-Series

3RD PLATFORM CLOUD INNOVATION

Isilon CloudPools

S-Series: Performance

X-Series: Throughput

NL-Series: Archive

HD-Series: Deep archive

Cloud: Cold archive

[Chart axes: Capacity versus $/TB, from High to Low]

EXPANDED DATA LAKE

FROM EDGE TO CORE TO CLOUD

EDGE, CORE, CLOUD

EXPAND DATA LAKE TO THE EDGE… AND TO THE CLOUD

EMC ISILON BUSINESS MOMENTUM

>7000 Customers

>20% YoY Growth

#1 Data Lake

#1 Scale-Out NAS Market Leader

>2000 Big Data Analytics Customers

#1 Hadoop Shared Storage

ISILON ONEFS OPERATING SYSTEM

Single Volume/File System

Unmatched Efficiency

Simplicity & Ease of Use

Linear Scalability

Easy Growth

High Performance

THE ISILON ADVANTAGE FOR HADOOP

SCALE-OUT STORAGE WITH NATIVE HADOOP INTEGRATION

In-place analytics

• Native integration speeds time to insight

Enterprise data protection

• Fast snapshots, backup, and data recovery

• Simple, efficient data replication for disaster recovery

Lower costs

• Eliminates the need for dedicated Hadoop infrastructure

• Much more efficient than DAS-based approach

Increase flexibility

• Simultaneous support for any Apache-compliant Hadoop distribution

• Ambari integration for management, monitoring, and provisioning

HADOOP ARCHITECTURE – DAS VS ISILON

[Diagram, DAS: one NameNode and six Data Node + Compute Node servers connected over Ethernet.]

[Diagram, Isilon: six Compute Nodes connected over Ethernet to Isilon storage labeled name node and data node.]

Traditional “Share-Nothing” Hadoop

Existing Virtualized Data Center

SHARE-NOTHING Hadoop Infrastructure

Unstructured Data

Existing Primary Storage

[Diagram labels: copy 1 on existing primary storage; copies 2, 3, 4 repeated across the Hadoop infrastructure.]

• Hadoop on a Stick (R=3) means 5 data copies ($$$$)

• Data has to copy to the Hadoop cluster before analysis can begin (Time to Results)

How will you maintain data consistency when a file changes on your primary storage?
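The slide does not break down the "5 data copies" figure. One possible reading, offered here purely as an assumption, is one copy on primary storage plus one staging copy plus the three HDFS replicas implied by R=3. The sketch below encodes that reading; the parameter names and defaults are hypothetical.

```python
def total_copies(primary=1, staging=1, hdfs_replication=3):
    """Count the copies of a data set across the whole workflow.

    Assumption: the slide's figure of 5 is primary + staging + R,
    which the deck itself does not spell out.
    """
    return primary + staging + hdfs_replication


def raw_capacity_tb(dataset_tb, **copy_counts):
    """Raw capacity consumed when every copy is stored in full."""
    return dataset_tb * total_copies(**copy_counts)


if __name__ == "__main__":
    print(total_copies())        # copies under the assumed breakdown
    print(raw_capacity_tb(100))  # TB consumed by a 100 TB data set
```

Under this reading, a 100 TB data set occupies 500 TB across primary storage, staging, and the Hadoop cluster. Dropping the staging copy (`staging=0`) gives 4 copies instead.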

Isilon “Share-Everything” Hadoop

Existing Virtualized Data Center

Existing Primary Storage

Unstructured Data

New Hadoop Compute Nodes

Use Native HDFS Protocol

[Diagram label: 1, a single copy of the data]

Analysis Can Begin with the 1st VM

Start using Hadoop NOW with unused processing and RAM available in your VMware environment

No replication required (Use your existing data)

Access to same data via NAS and HDFS protocols

Time to results extremely fast using already existing data with NO COPIES or wasted $$$$
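Because the same files are reachable over both NAS and HDFS protocols, a path seen by an NFS client and a URI seen by a Hadoop job can refer to one object. The sketch below shows that mapping in plain Python. The mount point, host name, and port are hypothetical placeholders, and it assumes the NFS export and the HDFS root point at the same directory, which depends on how a given cluster is configured.

```python
import posixpath
from urllib.parse import urlsplit

NFS_MOUNT = "/mnt/datalake"                 # hypothetical client mount point
HDFS_AUTHORITY = "isilon.example.com:8020"  # hypothetical host and port


def nfs_to_hdfs(nfs_path):
    """Translate a path under the NFS mount into an hdfs:// URI."""
    rel = posixpath.relpath(posixpath.normpath(nfs_path), NFS_MOUNT)
    if rel == ".." or rel.startswith("../"):
        raise ValueError(f"{nfs_path} is outside {NFS_MOUNT}")
    if rel == ".":
        return f"hdfs://{HDFS_AUTHORITY}/"
    return f"hdfs://{HDFS_AUTHORITY}/{rel}"


def hdfs_to_nfs(hdfs_uri):
    """Translate an hdfs:// URI back into a path under the NFS mount."""
    parts = urlsplit(hdfs_uri)
    if parts.scheme != "hdfs":
        raise ValueError(f"not an HDFS URI: {hdfs_uri}")
    return posixpath.join(NFS_MOUNT, parts.path.lstrip("/"))
```

For example, a log file written by an application to `/mnt/datalake/logs/a.log` would be addressed by a Hadoop job as `hdfs://isilon.example.com:8020/logs/a.log`, with no copy step in between.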

HADOOP JOB CYCLE TIMES: TIME TO RESULTS

Traditional Hadoop + DAS Workflow

Original Data, Stage Data, Map Data, Reduce Data, Write Results, Copy Results, View Results

Ingest Data into HDFS, 3x Mirror, Delete data from HDFS

VERSUS

Isilon Enabled Hadoop Workflow

Original Data, Acquire, Map Data, Reduce Data, Write Results, View Results

Acquire Data handling not required on Isilon

Reusable and extensible to:
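The difference between the two workflows can be made explicit by listing the steps and computing which ones drop out. This is a small sketch: the step names come from the slide, but the ordering of the traditional workflow is an assumption, since the slide layout did not survive extraction.

```python
# Step names are from the slide; the ordering is assumed.
TRADITIONAL_DAS = [
    "Original Data",
    "Stage Data",
    "Ingest Data into HDFS",
    "3x Mirror",
    "Map Data",
    "Reduce Data",
    "Write Results",
    "Copy Results",
    "View Results",
    "Delete data from HDFS",
]

ISILON_ENABLED = [
    "Original Data",
    "Acquire",
    "Map Data",
    "Reduce Data",
    "Write Results",
    "View Results",
]


def steps_removed(before, after):
    """Steps present in `before` that no longer appear in `after`."""
    kept = set(after)
    return [step for step in before if step not in kept]


if __name__ == "__main__":
    for step in steps_removed(TRADITIONAL_DAS, ISILON_ENABLED):
        print("no longer needed:", step)
```

With these lists, five data-handling steps disappear: staging, ingest into HDFS, 3x mirroring, copying results back, and deleting data from HDFS.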

SUPPORT FOR MULTIPLE ANALYTICS APPLICATIONS

[Diagram: six MapReduce clients and other applications reach the same data over SMB, NFS, HTTP, FTP, and HDFS; the storage nodes are labeled name node and data node, each with a "Node reply".]

HADOOP WITH ISILON SCALE-OUT NAS STORAGE

1. Multi Protocol Scale-Out Storage Platform

– NFS, CIFS, FTP, HTTP, HDFS

2. Highly resilient, Predictable Scalability

– Distributed NameNode & DataNode

3. Enterprise Data Protection & Governance

– SnapshotIQ, SyncIQ, SmartLock, ACLs..

4. Industry-Leading Storage Efficiency

– >80% Storage Utilization

5. Independent Scalability with Optimized QoS

– Optimally Scale Storage & Compute

6. Consolidate Data Silos

– Industry Standard Protocols

– Bring Applications to Shared Data

ISILON SCALE-OUT NAS

Simple to manage: Single file system, single volume, global namespace

Massively scalable: From 16 TB to over 50 PB in a single cluster, or to Cloud-scale

Unmatched efficiency: Over 80% storage utilization, automated tiering and SmartDedupe

Enterprise data protection: Efficient backup and disaster recovery, and N+1 thru N+4 redundancy

Robust security and compliance options: RBAC, WORM, SEDs, auditing, STIG, FIPS, CAC/PIV

Operational flexibility: Multi-protocol support as well as Object and OpenStack Swift

Deployment flexibility: Edge to Core to Cloud
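The efficiency claim can be put in context with a back-of-the-envelope comparison between 3x replication and the "over 80%" utilization quoted in the deck. The sketch below uses a generic N+M parity model (N data units protected by M extra units) as a simplification. It is not a description of how OneFS actually lays out data, and the 8+2 stripe is an illustrative choice.

```python
def replication_utilization(replicas):
    """Usable fraction of raw capacity when every block is stored `replicas` times."""
    return 1 / replicas


def parity_utilization(data_units, protection_units):
    """Usable fraction under a generic N+M protection scheme (simplified model)."""
    return data_units / (data_units + protection_units)


def raw_needed_tb(usable_tb, utilization):
    """Raw capacity required to hold `usable_tb` at the given utilization."""
    return usable_tb / utilization


if __name__ == "__main__":
    usable = 100  # TB of actual data, illustrative
    print("3x replication:", round(raw_needed_tb(usable, replication_utilization(3))), "TB raw")
    print("80% utilization:", round(raw_needed_tb(usable, 0.80)), "TB raw")
    print("8+2 parity utilization:", parity_utilization(8, 2))
```

For 100 TB of data, 3x replication needs about 300 TB of raw capacity, while 80% utilization needs about 125 TB.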