Introduction to SARA's Hadoop Hackathon - Dec 7th 2010
DESCRIPTION
This was the first of two introductory presentations at the first Hadoop Hackathon at SARA, the Dutch center for High Performance Computing and Networking.
TRANSCRIPT
![Page 1: Introduction to SARA's Hadoop Hackathon - dec 7th 2010](https://reader034.vdocuments.mx/reader034/viewer/2022042515/555ab249d8b42ad0538b5126/html5/thumbnails/1.jpg)
SARA Hadoop Hackathon, December 7, 2010
![Page 2: Introduction to SARA's Hadoop Hackathon - dec 7th 2010](https://reader034.vdocuments.mx/reader034/viewer/2022042515/555ab249d8b42ad0538b5126/html5/thumbnails/2.jpg)
SARA Hadoop Hackathon, December 7, 2010
DJOERD HIEMSTRA (UTwente)
EDGAR MEIJ (UvA)
![Page 3: Introduction to SARA's Hadoop Hackathon - dec 7th 2010](https://reader034.vdocuments.mx/reader034/viewer/2022042515/555ab249d8b42ad0538b5126/html5/thumbnails/3.jpg)
2002: Nutch*
2004: MR/GFS**
2006: Hadoop
* http://nutch.apache.org/
** http://labs.google.com/papers/mapreduce.html
   http://labs.google.com/papers/gfs.html
![Page 4: Introduction to SARA's Hadoop Hackathon - dec 7th 2010](https://reader034.vdocuments.mx/reader034/viewer/2022042515/555ab249d8b42ad0538b5126/html5/thumbnails/4.jpg)
http://wiki.apache.org/hadoop/PoweredBy
2010: A Hype in Production
![Page 5: Introduction to SARA's Hadoop Hackathon - dec 7th 2010](https://reader034.vdocuments.mx/reader034/viewer/2022042515/555ab249d8b42ad0538b5126/html5/thumbnails/5.jpg)
Super computing
Cluster computing
Grid computing
Cloud computing
GPU computing
http://www.sara.nl/
![Page 6: Introduction to SARA's Hadoop Hackathon - dec 7th 2010](https://reader034.vdocuments.mx/reader034/viewer/2022042515/555ab249d8b42ad0538b5126/html5/thumbnails/6.jpg)
Computation: Expensive! :-(
Data: Cheaper! :-)
(Diagram labels: Data, Computation)
Ref: Luiz André Barroso and Urs Hölzle, Google Inc. The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines
![Page 7: Introduction to SARA's Hadoop Hackathon - dec 7th 2010](https://reader034.vdocuments.mx/reader034/viewer/2022042515/555ab249d8b42ad0538b5126/html5/thumbnails/7.jpg)
NameNode, JobTracker
DN TT  DN TT  DN TT  DN TT
DN TT  DN TT  DN TT  DN TT
DN = DataNode
TT = TaskTracker
![Page 8: Introduction to SARA's Hadoop Hackathon - dec 7th 2010](https://reader034.vdocuments.mx/reader034/viewer/2022042515/555ab249d8b42ad0538b5126/html5/thumbnails/8.jpg)
File, Map, Shuffle, Reduce, Output
$ echo "${email#*@}, ${name}"
$ sort
$ wc -l
ewi.utwente.nl, 1
gmail.com, 2
nbic.nl, 1
nikhef.nl, 3
sara.nl, 1
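The shell analogy on this slide (echo as the map step, sort as the shuffle, wc as the reduce) can be sketched in plain Python. This is only an illustrative, single-machine sketch, not Hadoop code and not taken from the presentation: the function names and the participant records are hypothetical, and a real job would distribute each phase across the cluster.

```python
from itertools import groupby


def map_phase(records):
    """Emit (domain, name) pairs, mirroring echo "${email#*@}, ${name}"."""
    for name, email in records:
        # Drop everything up to and including the first "@".
        yield email.split("@", 1)[1], name


def shuffle_phase(pairs):
    """Sort by key so that equal domains end up adjacent, mirroring sort."""
    return sorted(pairs, key=lambda kv: kv[0])


def reduce_phase(sorted_pairs):
    """Count the values for each key, mirroring wc -l applied per domain."""
    for domain, group in groupby(sorted_pairs, key=lambda kv: kv[0]):
        yield domain, sum(1 for _ in group)


def count_domains(records):
    """Run map, shuffle and reduce in sequence on an in-memory list."""
    return list(reduce_phase(shuffle_phase(map_phase(records))))


if __name__ == "__main__":
    # Hypothetical input: these names and addresses are invented.
    participants = [
        ("alice", "alice@sara.nl"),
        ("bob", "bob@gmail.com"),
        ("carol", "carol@gmail.com"),
    ]
    for domain, count in count_domains(participants):
        print(f"{domain}, {count}")
```

The sort in the shuffle step is what allows the reduce step to see all values for one key together, which is the same reason sort sits between echo and wc in the shell version.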
![Page 9: Introduction to SARA's Hadoop Hackathon - dec 7th 2010](https://reader034.vdocuments.mx/reader034/viewer/2022042515/555ab249d8b42ad0538b5126/html5/thumbnails/9.jpg)
From: Hadoop: The Definitive Guide (2nd Edition), Tom White
![Page 10: Introduction to SARA's Hadoop Hackathon - dec 7th 2010](https://reader034.vdocuments.mx/reader034/viewer/2022042515/555ab249d8b42ad0538b5126/html5/thumbnails/10.jpg)
Today
09.30 - 09.50 Welcome & Introduction
09.50 - 10.15 Map/Reduce @ University of Twente
10.15 - 10.30 Kick-off hackathon
14.00 - 15.00 Optional: SARA tour
10.30 - 17.00 Hackathon
17.00 - 17.30 Results and closing