Big Data, Fast Data - MapReduce in Hazelcast
Post on 19-Aug-2015
BIG DATA - FAST DATA
USING MAPREDUCE IN HAZELCAST
Source: http://www.newscientist.com/gallery/dn17805-computer-museums-of-the-world/11
www.hazelcast.com
WHO AM I
Christoph Engelbert (@noctarius2k)
8+ years of Java weirdness
Performance, GC, traffic topics
Apache DirectMemory PMC
Previous companies incl. Ubisoft and HRS
CastMapR: MapReduce for Hazelcast 3
TOPICS
Hazelcast
Distributed Computing
Map & Reduce
Demonstration
Questions
HAZELCAST
A SHORT SPACE TRIP
WHAT IS HAZELCAST?
In-Memory Data Grid
Data Partitioning (Sharding)
Java Collections Implementation
Distributed Computing Platform
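To make the partitioning (sharding) bullet concrete, here is a minimal sketch of how a key could be assigned to a partition and a partition to a cluster member. It is a simplification under stated assumptions: plain `hashCode()` stands in for the grid's real hash function, the round-robin owner policy and the names `PartitionSketch`, `partitionId` and `ownerOf` are hypothetical, and 271 is used because it is Hazelcast's default partition count.

```java
import java.util.Arrays;
import java.util.List;

// Simplified illustration of key-based partitioning (sharding).
// Assumption: hashCode() replaces the data grid's actual hash function.
final class PartitionSketch {
    // Hazelcast's default partition count, used here as a plain constant.
    static final int PARTITION_COUNT = 271;

    // Maps a key to a partition id in the range [0, PARTITION_COUNT).
    static int partitionId(Object key) {
        return Math.floorMod(key.hashCode(), PARTITION_COUNT);
    }

    // Assigns a partition to a member round-robin (hypothetical policy).
    static String ownerOf(int partitionId, List<String> members) {
        return members.get(partitionId % members.size());
    }

    public static void main(String[] args) {
        List<String> members = Arrays.asList("node-1", "node-2", "node-3");
        for (String key : Arrays.asList("alice", "bob", "carol")) {
            int id = partitionId(key);
            System.out.println(key + " -> partition " + id + " on " + ownerOf(id, members));
        }
    }
}
```

The point of the sketch is that the same key always lands in the same partition, so any member can compute where an entry lives without asking a central directory.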
WHY HAZELCAST?
WHY IN-MEMORY COMPUTING?
TREND OF PRICES
Data Source: http://www.jcmit.com/memoryprice.htm
SPEED DIFFERENCE
Data Source: http://i.imgur.com/ykOjTVw.png
DISTRIBUTED COMPUTING
OR
MULTICORE CPU ON STEROIDS
THE IDEA OF DISTRIBUTED COMPUTING
Source: https://www.flickr.com/photos/stefan_ledwina/1853508040
THE BEGINNING
Source: http://en.wikipedia.org/wiki/File:KL_Advanced_Micro_Devices_AM9080.jpg
MULTICORE IS NOT NEW
Source: http://en.wikipedia.org/wiki/File:80386with387.JPG
CLUSTER IT
Source: http://rarecpus.com/images2/cpu_cluster.jpg
SUPER COMPUTER
Source: http://www.dkrz.de/about/aufgaben/dkrz-geschichte/rechnerhistorie-1
CLOUD COMPUTING
Source: https://farm6.staticflickr.com/5523/11407118963_e0e0870846_b_d.jpg
MAP & REDUCE
THE BLACK MAGIC FROM PLANET GOOGLE
USE CASES
Log Analysis
Data Querying
Aggregation and summing
Distributed Sort
ETL (Extract Transform Load)
and more...
SIMPLE STEPS
Read
Map / Transform
Reduce
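The three simple steps can be sketched on a single machine with plain Java streams. This is only an illustration of the read, map and reduce stages using word counting; the class name `SimpleSteps` and the method `wordCount` are made up for this sketch, and nothing here is distributed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Single-machine sketch of Read -> Map / Transform -> Reduce.
final class SimpleSteps {
    static Map<String, Long> wordCount(List<String> lines) {
        return lines.stream()                                                     // 1. Read the input records
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+"))) // 2. Map each line to its words
                .filter(word -> !word.isEmpty())                                  //    (skip empty tokens)
                .collect(Collectors.groupingBy(                                   // 3. Reduce: count per word
                        word -> word, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(wordCount(Arrays.asList("to be or not", "to be")));
    }
}
```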
FULL STEPS
Read
Map / Transform
Combining
Grouping / Shuffling
Reduce
Collating
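The six full steps can be walked through in a small in-process simulation, where each "node" is just a list of local documents. This is a sketch under stated assumptions, not Hazelcast's API: `FullSteps` and its method names are hypothetical, and the collating step is assumed to fold the reduced results into one final value (here, the total number of words).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-process simulation of the full MapReduce steps for word counting.
final class FullSteps {
    // Steps 1-3, executed per node: read local documents, map every word
    // to (word, 1), and combine the pairs into local partial sums.
    static Map<String, Integer> mapAndCombine(List<String> localDocuments) {
        Map<String, Integer> partial = new HashMap<>();
        for (String document : localDocuments) {
            for (String word : document.split("\\s+")) {
                if (!word.isEmpty()) {
                    partial.merge(word, 1, Integer::sum);
                }
            }
        }
        return partial;
    }

    // Step 4: group the partial results of all nodes by key.
    static Map<String, List<Integer>> shuffle(List<Map<String, Integer>> partials) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map<String, Integer> partial : partials) {
            for (Map.Entry<String, Integer> entry : partial.entrySet()) {
                grouped.computeIfAbsent(entry.getKey(), k -> new ArrayList<>()).add(entry.getValue());
            }
        }
        return grouped;
    }

    // Step 5: reduce every group of partial sums to a single total.
    static Map<String, Integer> reduce(Map<String, List<Integer>> grouped) {
        Map<String, Integer> totals = new TreeMap<>();
        grouped.forEach((word, counts) ->
                totals.put(word, counts.stream().mapToInt(Integer::intValue).sum()));
        return totals;
    }

    // Step 6: collate the reduced results into one final value,
    // here the overall number of words (assumed example).
    static int collate(Map<String, Integer> totals) {
        return totals.values().stream().mapToInt(Integer::intValue).sum();
    }
}
```

Combining before shuffling is what keeps the data sent between nodes small: each node ships one partial sum per word instead of one pair per occurrence.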
MAPREDUCE WORKFLOW
SOME PSEUDO CODE (1/3)
MAPPING
Data are mapped / transformed into a set of key-value pairs
map( key:String, document:String ):Void ->
    for each w:word in document:
        emit( w, 1 )
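The mapping pseudocode translates almost line by line into Java. In this sketch the `emitter` callback is a hypothetical stand-in for whatever context object a framework hands to the mapper; `WordMapper` is likewise a made-up name and not part of any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

// Java rendering of the mapping pseudocode.
final class WordMapper {
    // Emits (word, 1) for every word in the document.
    // The key (e.g. a document id) is not needed for word counting.
    static void map(String key, String document, BiConsumer<String, Integer> emitter) {
        for (String word : document.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                emitter.accept(word, 1);
            }
        }
    }

    public static void main(String[] args) {
        List<String> emitted = new ArrayList<>();
        map("doc-1", "to be or not to be", (word, one) -> emitted.add(word + "=" + one));
        System.out.println(emitted);
    }
}
```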
SOME PSEUDO CODE (2/3)
COMBINING
Multiple values are combined into an intermediate result to reduce network traffic
combine( word:String, counts:List[Int] ):Void ->
    emit( word, sum( counts ) )
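A combiner runs on the node that produced the pairs, before anything goes over the wire. The sketch below sums the emitted counts per word on one node; `WordCombiner`, `pair` and `combine` are illustrative names only, and the list of emitted pairs is sample data.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Java rendering of the combining pseudocode for one node.
final class WordCombiner {
    // Small helper to build a (word, count) pair.
    static Map.Entry<String, Integer> pair(String word, int count) {
        return new AbstractMap.SimpleEntry<>(word, count);
    }

    // Sums all counts emitted for the same word on this node.
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> emittedOnNode) {
        Map<String, Integer> combined = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> emitted : emittedOnNode) {
            combined.merge(emitted.getKey(), emitted.getValue(), Integer::sum);
        }
        return combined;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> emitted = Arrays.asList(
                pair("to", 1), pair("be", 1), pair("or", 1),
                pair("not", 1), pair("to", 1), pair("be", 1));
        Map<String, Integer> combined = combine(emitted);
        System.out.println(emitted.size() + " pairs emitted, "
                + combined.size() + " pairs sent: " + combined);
    }
}
```

In the sample run six emitted pairs shrink to four, one per distinct word; on real text with many repeated words the saving is much larger.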
SOME PSEUDO CODE (3/3)
REDUCING
Values are reduced / aggregated into the requested result
reduce( word:String, counts:List[Int] ):Int ->
    return sum( counts )
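The reducing pseudocode is equally short in Java. In this sketch the input list is assumed to hold the partial sums that arrive from the different nodes for one word; `WordReducer` is a hypothetical name.

```java
import java.util.Arrays;
import java.util.List;

// Java rendering of the reducing pseudocode.
final class WordReducer {
    // Adds up all (partial) counts that arrived for one word.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int count : counts) {
            sum += count;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Partial sums for "be" as they might arrive from three nodes.
        System.out.println("be=" + reduce("be", Arrays.asList(2, 1, 3)));
    }
}
```

Because addition is associative and commutative, it does not matter whether the reducer sees raw 1s from mappers or partial sums from combiners.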
FOR MATHEMATICIANS
Process: (K × V)* → (L × W)* ⇒ [(l1, w1), …, (lm, wm)]
Mapping: (K × V) → (L × W)* ⇒ (k, v) → [(l1, w1), …, (ln, wn)]
Reducing: L × W* → X* ⇒ (l, [w1, …, wn]) → [x1, …, xn]
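For readers who prefer types to formulas, the mapping and reducing signatures can be written as Java generics using the same letters K, V, L, W and X. This is a single-process sketch: `TypedMapReduce`, `run` and `wordCount` are hypothetical names, and the `Comparable` bound on L exists only so the sketch can keep its output sorted.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiFunction;

// The formulas as generic Java signatures.
final class TypedMapReduce {
    // mapper:  (K x V) -> (L x W)*
    // reducer: L x W*  -> X*
    static <K, V, L extends Comparable<L>, W, X> Map<L, List<X>> run(
            Map<K, V> input,
            BiFunction<K, V, List<Map.Entry<L, W>>> mapper,
            BiFunction<L, List<W>, List<X>> reducer) {
        // Apply the mapper to every (k, v) and group the emitted w by l.
        Map<L, List<W>> grouped = new TreeMap<>();
        for (Map.Entry<K, V> record : input.entrySet()) {
            for (Map.Entry<L, W> pair : mapper.apply(record.getKey(), record.getValue())) {
                grouped.computeIfAbsent(pair.getKey(), l -> new ArrayList<>()).add(pair.getValue());
            }
        }
        // Apply the reducer to every (l, [w1, ..., wn]).
        Map<L, List<X>> result = new TreeMap<>();
        grouped.forEach((l, ws) -> result.put(l, reducer.apply(l, ws)));
        return result;
    }

    // Word count with K = V = L = String and W = X = Integer.
    static Map<String, List<Integer>> wordCount(Map<String, String> documents) {
        return TypedMapReduce.<String, String, String, Integer, Integer>run(
                documents,
                (id, text) -> {
                    List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
                    for (String word : text.split("\\s+")) {
                        if (!word.isEmpty()) {
                            pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
                        }
                    }
                    return pairs;
                },
                (word, counts) -> Collections.singletonList(
                        counts.stream().mapToInt(Integer::intValue).sum()));
    }
}
```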
MAPREDUCE PROGRAMS IN GOOGLE SOURCE TREE
Source: http://research.google.com/archive/mapreduce-osdi04-slides/index-auto-0005.html
DEMONSTRATION
THANK YOU!
ANY QUESTIONS?
@noctarius2k
@hazelcast
http://www.sourceprojects.com
http://github.com/noctarius
Images: All images are licensed under Creative Commons