MT30
Best practices: Data lake adoption
Matt Maccaux, Global Big Data Practice Lead
Dell - Internal Use - Confidential
Agenda
• Two models for big data
• Big data anti-patterns
• Big data best practice
• How to get started?
• Your questions
Two models for big data
Exploratory analytics
• Full data set – batch
• Explore, test, refine, iterate
• The output is an algorithm that will be integrated into new or existing applications.
Operationalization
• Limited data set – streaming
• The algorithm is integrated into applications that drive business decisions.
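The split between the two models can be illustrated with a minimal Python sketch. It is not taken from the slides: the function names `explore` and `operationalize`, the mean-plus-k-standard-deviations rule, and the sample numbers are all assumptions chosen only to show the shape of the hand-off. The exploratory step scans a full historical batch and returns an algorithm; the operational step applies that algorithm to records arriving one at a time.

```python
from statistics import mean, stdev
from typing import Callable, Iterable, Iterator, List, Tuple


def explore(history: List[float], k: float = 3.0) -> Callable[[float], bool]:
    """Exploratory analytics: use the full batch to derive a scoring rule.

    The rule here (flag values above mean + k * stdev) is a hypothetical
    stand-in for whatever algorithm the exploration phase produces.
    """
    threshold = mean(history) + k * stdev(history)

    def is_anomaly(value: float) -> bool:
        return value > threshold

    return is_anomaly


def operationalize(
    rule: Callable[[float], bool], stream: Iterable[float]
) -> Iterator[Tuple[float, bool]]:
    """Operationalization: apply the finished rule to each streamed record."""
    for value in stream:
        yield value, rule(value)


# Full data set, processed as a batch (illustrative numbers only).
history = [10.0, 12.0, 11.0, 9.0, 10.5, 11.5, 9.5, 10.0]
rule = explore(history, k=3.0)

# Limited data set, processed record by record as it arrives.
decisions = list(operationalize(rule, [10.2, 25.0, 11.0]))
```

In this sketch the only thing that crosses from the exploratory side to the operational side is the `rule` callable, which mirrors the slide's point that the output of exploration is an algorithm rather than a report.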
Big data anti-patterns
© Copyright 2016 EMC Corporation. All rights reserved.
Big data best practices
me:~>_
CONTINUUM
Analytics Request Portal
• Tool catalog: Hadoop, Spark, Tableau, Python
• Data catalog: Customer, Alert, Bills, Social
• Duration, Performance, Normal
• Non-sample data, Sample data
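A request submitted through a portal like this could be modelled roughly as follows. This is a hypothetical sketch: the slide only shows the catalog entries and the labels Duration, Performance, Normal and Sample Data, so the `AnalyticsRequest` fields, the `validate` function and the validation rules are assumptions, not a description of any actual product.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

# Catalog entries as listed on the slide.
TOOL_CATALOG = frozenset({"Hadoop", "Spark", "Tableau", "Python"})
DATA_CATALOG = frozenset({"Customer", "Alert", "Bills", "Social"})


@dataclass(frozen=True)
class AnalyticsRequest:
    """Hypothetical shape of a self-service environment request."""
    tools: FrozenSet[str]
    datasets: FrozenSet[str]
    duration_days: int            # assumed unit for "Duration"
    performance: str = "Normal"   # only "Normal" appears on the slide
    sample_data: bool = True      # sample vs. non-sample data


def validate(request: AnalyticsRequest) -> List[str]:
    """Return a list of problems; an empty list means the request is valid."""
    problems = []
    unknown_tools = request.tools - TOOL_CATALOG
    if unknown_tools:
        problems.append("unknown tools: " + ", ".join(sorted(unknown_tools)))
    unknown_data = request.datasets - DATA_CATALOG
    if unknown_data:
        problems.append("unknown data sets: " + ", ".join(sorted(unknown_data)))
    if request.duration_days <= 0:
        problems.append("duration must be positive")
    return problems


ok_request = AnalyticsRequest(
    tools=frozenset({"Spark", "Python"}),
    datasets=frozenset({"Customer", "Bills"}),
    duration_days=30,
)
bad_request = AnalyticsRequest(
    tools=frozenset({"Spark", "Excel"}),
    datasets=frozenset({"Payroll"}),
    duration_days=0,
)
```

Validating against the catalogs before provisioning keeps requests limited to tools and data sets that have already been registered.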
Data Lake
Discover/Map
Transform
Organize/Tag
CATALOG AND PROVISION
ENTERPRISE LOG ANALYSIS
Virtualisation
Virtualised Compute Pool
Data Pool
Meta-data Tagging
Governance
Anonymise
Encryption
Pool n
Pool n
Pool n
Copy
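The Data Pool labels (tagging, governance, anonymise, copy) suggest that analysts work on a governed copy rather than on the source data. A minimal Python sketch of that idea follows. Everything in it is an assumption for illustration: the `provision_copy` function, the `_tags` field and the record layout are hypothetical, and the keyed hash shown is pseudonymisation, which is weaker than full anonymisation. Encryption of the pool is left out of the sketch.

```python
import copy
import hashlib
import hmac
from typing import Dict, Iterable, List


def provision_copy(
    records: List[Dict[str, object]],
    sensitive_fields: Iterable[str],
    key: bytes,
    tags: Iterable[str],
) -> List[Dict[str, object]]:
    """Return a tagged, pseudonymised copy; the source records are untouched."""
    sensitive = list(sensitive_fields)
    tag_list = list(tags)
    provisioned = []
    for record in records:
        clone = copy.deepcopy(record)  # "Copy": never mutate the source
        for field in sensitive:
            if field in clone:
                # "Anonymise": replace the value with a keyed SHA-256 token.
                token = hmac.new(
                    key, str(clone[field]).encode("utf-8"), hashlib.sha256
                )
                clone[field] = token.hexdigest()
        clone["_tags"] = list(tag_list)  # "Meta-data tagging" (hypothetical field)
        provisioned.append(clone)
    return provisioned


source = [
    {"customer_id": "C-1001", "bill": 42.5},
    {"customer_id": "C-1001", "bill": 17.0},
]
pool = provision_copy(
    source, ["customer_id"], key=b"demo-key", tags=["billing", "sample"]
)
```

Because the same key yields the same token for the same input, records for one customer can still be joined inside the pool without exposing the original identifier.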
Virtualised Compute Pool
CD \>_
CONTINUUM
Data Pool
Governance
Anonymise
Encryption
Pool n
Pool n
Pool n
Copy
Virtualised Compute Pool
How to get started?
Big Data Technology Advisory
• Interview stakeholders, including business users and technical/functional experts
• Document requirements and gaps
• Define a future-state reference architecture
• Provide a plan/roadmap for implementation
Q&A