m7 - enterprise-grade nosql

14
M7 – EnterpriseGrade NoSQL

Upload: sandisk

Post on 08-May-2015

379 views

Category:

Technology


1 download

DESCRIPTION

This presentation was given at 2013 Hadoop Summit North America by MapR Technologies.

TRANSCRIPT

Page 1: M7 - Enterprise-Grade NoSQL

1  ©MapR  Technologies  -­‐  Confiden6al  

M7  –  Enterprise-­‐Grade  NoSQL  

Page 2: M7 - Enterprise-Grade NoSQL

2  ©MapR  Technologies  -­‐  Confiden6al  

One  Pla9orm  for  Big  Data  

99.999%  HA  

Data  Protec6on  

Disaster  Recovery  

Scalability    &  

Performance  Enterprise  Integra6on  

Mul6-­‐tenancy  

Map  Reduce  

File-­‐Based  Applica6ons   SQL   Database   Search   Stream  

Processing  

Batch   Interac6ve   Real-­‐6me  

Batch  Log  file  Analysis  

Data  Warehouse  Offload  Fraud  Detec6on  

Clickstream  Analy6cs  

Real-­‐Time  Sensor  Analysis  “TwiUerscraping”  

Telema6cs  Process  Op6miza6on  

InteracEve  Forensic  Analysis  Analy6c  Modeling  BI  User  Focus  

Page 3: M7 - Enterprise-Grade NoSQL

3  ©MapR  Technologies  -­‐  Confiden6al  

One  Pla9orm  for  Big  Data  

99.999%  HA  

Data  Protec6on  

Disaster  Recovery  

Scalability    &  

Performance  Enterprise  Integra6on  

Mul6-­‐tenancy  

Map  Reduce  

File-­‐Based  Applica6ons   SQL   Search   Stream  

Processing  

Batch   Interac6ve   Real-­‐6me  

Batch  Log  file  Analysis  

Data  Warehouse  Offload  Fraud  Detec6on  

Clickstream  Analy6cs  

Real-­‐Time  Sensor  Analysis  “TwiUerscraping”  

Telema6cs  Process  Op6miza6on  

InteracEve  Forensic  Analysis  Analy6c  Modeling  BI  User  Focus  

Database  

Page 4: M7 - Enterprise-Grade NoSQL

4  ©MapR  Technologies  -­‐  Confiden6al  

M7  –  Enterprise-­‐Grade  NoSQL  on  Hadoop  

§  NoSQL  Columnar  Store  §  HBase  API  §  Integrated  with  Hadoop  

HBase  

JVM  

HDFS  

JVM  

ext3/ext4  

Disks  

Other  Distros  

Tables/Files  

Disks  

MapR  M7  

Page 5: M7 - Enterprise-Grade NoSQL

5  ©MapR  Technologies  -­‐  Confiden6al  

Performance   Easy  Administra6on  

Tradeoffs  with  Other  NoSQL  SoluEons  

Con6nuous  low  latency  with  horizontal  scaling  

Easy  day-­‐to-­‐day  management  with  minimal  learning  curve  

24x7  applica6ons  with  strong  data  consistency  

Reliability  

Page 6: M7 - Enterprise-Grade NoSQL

6  ©MapR  Technologies  -­‐  Confiden6al  

Reliability   Easy  Administra6on  Performance  Performance   Reliability   Easy  Administra6on  

Bullet-­‐proof  NoSQL  with  Zero  AdministraEon  

Benefit   Features  

High  Performance   Over  1  million  ops/sec  with  10  node  cluster    

ConEnuous  Low  Latency   No  I/O  storms,  No  compac6ons  

24x7  ApplicaEons   Instant  recovery,  online  schema  modifica6on,  snapshots,  mirroring  

Zero  AdministraEon   No  processes  to  manage,  automated  splits,  self-­‐tuning    

High  Scalability   1  trillion  tables  

Low  TCO   Files  and  tables  on  one  plaeorm  

Page 7: M7 - Enterprise-Grade NoSQL

7  ©MapR  Technologies  -­‐  Confiden6al  

M7:  Tables  for  Developers  

•  Users  can  create  and  manage  their  own  tables  –  Unlimited  #  of  tables  –  First  copy  local  

•  Tables  can  be  created  in  any  directory  –  Tables  count  towards  volume  and  user  quotas  

•  No  admin  interven6on  needed  –  Perform  tasks  on  the  fly  –  No  stopping/restar6ng  of  servers  

•  Automa6c  data  protec6on  and  disaster  recovery  –  Users  can  recover  from  snapshots/mirrors  on  their  own  

Page 8: M7 - Enterprise-Grade NoSQL

8  ©MapR  Technologies  -­‐  Confiden6al  

M7:  Volume  Based  Data  Management  

Page 9: M7 - Enterprise-Grade NoSQL

9  ©MapR  Technologies  -­‐  Confiden6al  

Case  Study  –  24x7  ApplicaEons  

Web  2.0  company  that  op6mizes  email,  mobile  &  social  campaigns  

•  42  hours  of  compac6on  every  weekend  

•  Long  cold-­‐starts  •  “Engineering  resources  stuck  fixing  HBase  problems”  

•  No  compac6ons  •  Instant  recovery  •  Easy  development  and  administra6on  

Service  Disrup6ons   24x7  Up6me  

Apache  HBase  

Page 10: M7 - Enterprise-Grade NoSQL

10  ©MapR  Technologies  -­‐  Confiden6al  

Case  Study  –  Cost  EffecEve  Scalability  

Managed  Security  Services  company  that  provides  solu6ons  for  data  security  and  compliance  

•  Limited  analy6cal  tools  •  No  machine  learning  capability  

•  Extended  analy6cal  eco-­‐system:  Machine  Learning,  Solr  and  Hive  

•  Similar  Reliability  

Expensive  Data  Store   Cost  Effec6ve  Scalability  

ORACLE  

Page 11: M7 - Enterprise-Grade NoSQL

11  ©MapR  Technologies  -­‐  Confiden6al  

þ  

Case  Study  –  Consistent  Superior  Performance  

Cloud-­‐based  predic6ve  analy6cs  plaeorm  

•  Compac6ons  •  Manual  administra6on  •  Poor  reliability  

•  No  Compac6ons  •  Zero  administra6on  •  Strong  consistency  •  2x  Cassandra  performance    •  3x  HBase  performance  

ý  

Apache  HBase   Cassandra  

•  Compac6ons  •  Manual  administra6on  •  Eventual  consistency  

Sociocast  conducted  a  POC  with  the  three  solu6ons  

ý    

Page 12: M7 - Enterprise-Grade NoSQL

12  ©MapR  Technologies  -­‐  Confiden6al  

YCSB  Benchmark                          (ops/sec/node)  

MapR  3.0      (M7)    

CDH  4.2.0  (HBase)   M7  Advantage  

50%  read,  50%  update   4227   1695   2.5x  

95%  read,    5%  update   3323   602   5.5x  

Read     5018   764   6.6x  

Scan  (50  rows)   922   161   5.7x  

Hardware  ConfiguraEon  

CPU  :  Intel®  Xeon®  CPU  E5645    2.40GHz    12  cores  x2  RAM  :  48  GB  Data  Disk  :    12x    3TB    (7200  rpm)  Size  –  record  size  =  1k,  data  size  =  2TB  OS  :      CentOS  Release  6.2  (Final)  

10-­‐Node  Cluster  

High  Performance  across  Varying  Workloads  

Page 13: M7 - Enterprise-Grade NoSQL

13  ©MapR  Technologies  -­‐  Confiden6al  

YCSB  Benchmark    (ops/sec/node)  

MapR  3.0        (M7)    

CDH  4.2.1  (HBase)   M7  Advantage  

50%  read,  50%  update   6188   2547   2.4x  

95%  read,    5%  update   13064   2660   4.9x  

Read   18840   1605   11.8x  

Scan  (50  rows)   1135   116   9.8x  

Hardware  ConfiguraEon  

CPU  :  Intel®  Xeon®  CPU  E5620    2.40GHz    8  cores  x2  Memory:  24  GB  Data  Disk  :    1x  1.2TB    –  Fusion  I/O    ioDrive2  Size  –  record  size  =  1k,  data  size  =  600GB  OS  :      CentOS  Release  6.3  (Final)  

5-­‐Node  Cluster  

10x  Faster  Reads  and  Scans  with  Flash  Memory  

Page 14: M7 - Enterprise-Grade NoSQL

14  ©MapR  Technologies  -­‐  Confiden6al  

M7   OTHERS  99.999%  High  Availability   ü      X  

Instant  Recovery  from  Failures   ü      X  Con6nuous  Low  Latency  (No  Compac6ons)   ü      X  Zero  Administra6on  (No  Processes  to  Manage,  Self-­‐tuning)   ü      X  Online  Data  Protec6on  (Snapshots,  Mirroring)   ü      X  Scalability  (Number  of  Tables  Supported)   Trillion   Hundreds  

MapR  Advantages