Hadoop vs MongoDB

Performance Evaluation of a MongoDB and Hadoop Platform for Scientific Data Analysis. E. Dede, M. Govindaraju (Binghamton University), D. Gunter, R. Canon, L. Ramakrishnan (Lawrence Berkeley National Lab). http://dl.acm.org/citation.cfm?id=2465849

Upload: rischan-mafrur

Post on 20-Jan-2016


DESCRIPTION

Hadoop vs MongoDB performance

TRANSCRIPT

Page 1: Hadoop vs MongoDB

Performance Evaluation of a MongoDB and Hadoop Platform for Scientific Data Analysis

E. Dede, M. Govindaraju (Binghamton University), D. Gunter, R. Canon, L. Ramakrishnan (Lawrence Berkeley National Lab)

http://dl.acm.org/citation.cfm?id=2465849

Page 2: Hadoop vs MongoDB

Introduction

● MongoDB: NoSQL DBMS
● Hadoop: framework for big data processing
● HDFS: Hadoop file system
● MapReduce: for scalable parallel analysis
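As an illustration of the MapReduce model named above, here is a minimal single-process sketch in plain Python. It is not Hadoop's or MongoDB's API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `run_job`) and the word-count job are assumptions chosen for the example.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and collect the emitted (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle_phase(pairs):
    """Group values by key, the step a framework performs between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def run_job(records, mapper, reducer):
    """Run map, shuffle, and reduce in sequence (hypothetical helper)."""
    return reduce_phase(shuffle_phase(map_phase(records, mapper)), reducer)


def word_mapper(line):
    """Emit (word, 1) for every whitespace-separated word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def count_reducer(key, values):
    """Sum the counts emitted for one word."""
    return sum(values)


if __name__ == "__main__":
    lines = ["Hadoop stores data in HDFS", "MongoDB stores documents"]
    print(run_job(lines, word_mapper, count_reducer))
```

In a real cluster the map and reduce calls run in parallel on many nodes, and the shuffle moves data across the network; the sketch only shows the data flow.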

Page 3: Hadoop vs MongoDB

Motivation
● The Advanced Light Source (ALS) and the Joint Genome Institute have a project: the Materials Project.
● ALS and the Materials Project currently use MongoDB + MapReduce.
● Hadoop is the most popular FOSS implementation of MapReduce.
● What if we compare the two? Evaluate:
● performance
● scalability
● fault tolerance

Page 4: Hadoop vs MongoDB

System Overview
ALS currently has a system built on native MongoDB + MapReduce, and they want to compare Hadoop with MongoDB (MongoDB-Hadoop) against native Hadoop (Hadoop with HDFS).

Page 5: Hadoop vs MongoDB

Experiment Setup
● Hadoop: standard setup
● Data: US Census data, 300 GB
● MongoDB: run in 2 modes, single and sharded
● Hopper machine: Cray XE6 with 153,216 compute cores, 217 TB RAM, and 2 petabytes of disk. Each node has 24 cores (2 twelve-core AMD 2.1 GHz processors) and 32 GB RAM.
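The core counts above imply the size of the machine in nodes. A minimal sketch of that arithmetic, using only the figures on this slide (the constant and function names are assumptions made for the example):

```python
# Figures taken from the slide; names are illustrative only.
TOTAL_CORES = 153_216
CORES_PER_NODE = 24


def node_count(total_cores, cores_per_node):
    """Return the number of nodes, assuming every node has the same core count."""
    nodes, remainder = divmod(total_cores, cores_per_node)
    if remainder:
        raise ValueError("core total is not a whole multiple of cores per node")
    return nodes


if __name__ == "__main__":
    print(node_count(TOTAL_CORES, CORES_PER_NODE))  # 6384
```

So the 32-node runs described in the fault tolerance results use only a small part of the full system.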

Page 6: Hadoop vs MongoDB

Result

● Evaluate performance
● Scalability
● Fault tolerance

Page 7: Hadoop vs MongoDB

Evaluate Performance

Page 8: Hadoop vs MongoDB

Scalability: 37.2 million input records

Page 9: Hadoop vs MongoDB

Fault Tolerance
A total of 32 nodes processing 37 million input records.

Hadoop: after 8 nodes fail, Hadoop loses too many data nodes and fails to complete the MapReduce job.

MongoDB-Hadoop can finish the job even with half of the cluster lost.
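To put the two failure levels side by side, here is a small sketch that expresses them as fractions of the 32-node cluster. Only the numbers 32, 8, and "half" come from the slide; the names are assumptions for the example.

```python
TOTAL_NODES = 32          # cluster size from the slide
HADOOP_FAILED_NODES = 8   # point at which the native Hadoop job fails


def failed_fraction(failed_nodes, total_nodes):
    """Return the share of the cluster that has been lost."""
    if not 0 <= failed_nodes <= total_nodes:
        raise ValueError("failed_nodes must be between 0 and total_nodes")
    return failed_nodes / total_nodes


if __name__ == "__main__":
    half_cluster = TOTAL_NODES // 2
    print(failed_fraction(HADOOP_FAILED_NODES, TOTAL_NODES))  # 0.25
    print(failed_fraction(half_cluster, TOTAL_NODES))         # 0.5
```

In other words, native Hadoop stops completing the job once a quarter of the nodes are lost, while MongoDB-Hadoop still finishes with 16 of the 32 nodes gone.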

Page 10: Hadoop vs MongoDB

Conclusion & Contribution

Case: scientific big data processing
● HDFS-Hadoop is better than MongoDB for read and write performance.
● HDFS-Hadoop is better than MongoDB-Hadoop for scalability.
● MongoDB-Hadoop is better than HDFS-Hadoop for fault tolerance.

Page 11: Hadoop vs MongoDB

Thank you

Any Questions?