mt30 best practices for data lake adoption

MT30 Best practices: Data lake adoption Matt Maccaux, Global Big Data Practice Lead

Upload: dell-emc-world

Post on 17-Feb-2017

Page 1: MT30 Best practices for data lake adoption

MT30

Best practices: Data lake adoption

Matt Maccaux, Global Big Data Practice Lead

Page 2: MT30 Best practices for data lake adoption

Dell - Internal Use - Confidential

Agenda

• Two models for big data

• Big data anti-patterns

• Big data best practices

• How to get started?

• Your questions

Page 3: MT30 Best practices for data lake adoption

Two models for big data

Exploratory analytics

• Full data set – batch

• Explore, test, refine, iterate

• The output is an algorithm that will be integrated into new or existing applications.

Operationalization

• Limited data set – streaming

• The algorithm is integrated into applications that drive business decisions.
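
The two models above can be illustrated with a minimal Python sketch. It is only an assumption about what "the output is an algorithm" might look like in practice: the exploratory step scans a full batch of historical values and returns a simple three-sigma rule, and the operational step applies that rule to records as they arrive. The names `explore` and `operationalize`, the sample values, and the three-sigma threshold are hypothetical and do not come from the slides.

```python
from statistics import mean, stdev
from typing import Callable, Iterable, Iterator


def explore(history: list[float], k: float = 3.0) -> Callable[[float], bool]:
    """Exploratory analytics: scan the full batch and return a rule (the 'algorithm')."""
    mu = mean(history)
    sigma = stdev(history)

    def is_anomaly(value: float) -> bool:
        # Flag values more than k standard deviations away from the batch mean.
        return abs(value - mu) > k * sigma

    return is_anomaly


def operationalize(
    rule: Callable[[float], bool], stream: Iterable[float]
) -> Iterator[tuple[float, bool]]:
    """Operationalization: apply the rule to one record at a time as it streams in."""
    for value in stream:
        yield value, rule(value)


# Hypothetical data: a historical batch, then a short stream of new readings.
history = [10.0, 11.0, 9.0, 10.5, 9.5, 10.0, 10.2, 9.8]
rule = explore(history)
decisions = list(operationalize(rule, [10.1, 25.0, 9.9]))
```

The split mirrors the slide: `explore` needs the whole data set and can be rerun and refined, while `operationalize` only ever sees the current record plus the rule it was given.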

Page 4: MT30 Best practices for data lake adoption

Big data anti-patterns

Page 5: MT30 Best practices for data lake adoption

© Copyright 2016 EMC Corporation. All rights reserved.

Page 10: MT30 Best practices for data lake adoption

Big data best practices

Page 11: MT30 Best practices for data lake adoption

me:~>_

CONTINUUM

Page 12: MT30 Best practices for data lake adoption

Analytics Request Portal

TOOL CATALOG

• Hadoop

• Spark

• Tableau

• Python

DATA CATALOG

• Customer

• Alert

• Bills

• Social

Duration

Performance

Normal

Sample Data

Non-Sample Data

Page 13: MT30 Best practices for data lake adoption

Data Lake

Discover/Map

Transform

Organize/Tag

CATALOG AND PROVISION

ENTERPRISE LOG ANALYSIS

Virtualisation

Page 14: MT30 Best practices for data lake adoption

Virtualised Compute Pool

Page 15: MT30 Best practices for data lake adoption

Data Pool

Meta-data Tagging

Governance

Anonymise

Encryption

Pool n

Pool n

Pool n

Copy
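
As a rough illustration of the "Anonymise" and "Copy" labels on the Data Pool slide, the sketch below copies records into a pool and replaces sensitive fields with keyed hashes on the way. This is an assumption about one way such a step could work, not the method shown in the talk: the field names, the `anonymise` and `copy_to_pool` helpers, and the demo key are all hypothetical, and keyed hashing is strictly pseudonymisation rather than full anonymisation.

```python
import hashlib
import hmac

# Hypothetical list of fields to mask; a real deployment would take this
# from its own governance and meta-data tagging rules.
SENSITIVE_FIELDS = {"customer_id", "email"}


def anonymise(record: dict, key: bytes) -> dict:
    """Return a copy of the record with sensitive fields replaced by keyed hashes."""
    masked = {}
    for field, value in record.items():
        if field in SENSITIVE_FIELDS:
            digest = hmac.new(key, str(value).encode("utf-8"), hashlib.sha256)
            # Same input and key give the same token, so joins across records still work.
            masked[field] = digest.hexdigest()[:16]
        else:
            masked[field] = value
    return masked


def copy_to_pool(source: list[dict], key: bytes) -> list[dict]:
    """Copy records into a pool, masking them on the way; the source is left untouched."""
    return [anonymise(record, key) for record in source]


# Hypothetical records standing in for data held in the lake.
lake = [
    {"customer_id": 42, "email": "a@example.com", "bill": 31.5},
    {"customer_id": 42, "email": "a@example.com", "bill": 12.0},
]
pool = copy_to_pool(lake, b"demo-key")
```

Because the pool holds a masked copy, analysts can still group the two bills by the same customer token without seeing the original identifier.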

Page 16: MT30 Best practices for data lake adoption

Virtualised Compute Pool

Page 17: MT30 Best practices for data lake adoption

CD \>_

CONTINUUM

Data Pool

Governance

Anonymise

Encryption

Pool n

Pool n

Pool n

Copy

Virtualised Compute Pool

Page 18: MT30 Best practices for data lake adoption

How to get started?

Big Data Technology Advisory

• Interview stakeholders including business users and technical/functional experts

• Document requirements and gaps

• Define a future-state reference architecture

• Provide a plan/roadmap for implementation

Page 19: MT30 Best practices for data lake adoption

Q&A
