Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)


Upload: spark-summit

Post on 16-Aug-2015


TRANSCRIPT

Page 1: Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)

© Cloudera, Inc. All rights reserved.

Mike Olson | co-founder and chief strategy officer

Spark in the Hadoop Ecosystem

Page 2: Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)

Hadoop: From MapReduce to an Enterprise Data Hub

Hadoop delivers:
• One place for unlimited data
• Unified, multi-framework data access

Enterprises require:
• Leading Performance
• Open Source, Open Standards
• Enterprise Security
• Data Governance
• Complete Management

Security and Administration

Unlimited Storage

Process, Discover, Model, Serve

Deployment Flexibility

On-Premises, Appliances, Engineered Systems

Public Cloud, Private Cloud, Hybrid Cloud

A modern data platform plus what the enterprise requires.

Page 3: Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)

Where Spark Fits in the Hadoop Ecosystem

YARN: Shared resource management

HDFS and HBase: Shared storage

Impala Hive Pig

MapReduce2

Search

Spark

Spark Streaming

Hive (beta)

Pig (beta) …  

With common
• Security
• Data governance
• Configuration, deployment and operations
across all components in the stack

Page 4: Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)

Process millions of equity and bond market positions, and evaluate against future scenarios in minutes, versus days with MapReduce.

Major Global Financial Institution

Page 5: Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)

Monitor online user activity and optimize content delivery and search results in real time.

Large Consumer Company

Page 6: Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)

Ingest and analyze complex data from a variety of sources continually, building new risk and value models in real time.

Page 7: Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)

Combine genomic and phenotype data with other data sources to understand disease onset and progression.

Page 8: Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)

Spark extends the Hadoop ecosystem with new analytic and processing capabilities.

Page 9: Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)

Thank you!

Mike Olson, chief strategy officer | [email protected] | @mikeolson