Cloud Camp Chicago Dec 2012 Slides
DESCRIPTION
Slides from the December 2012 Cloud Camp Chicago, including the talks from our speakers: Dave Falck, Model Metrics: node.js on AWS; Paul Mantz, CohesiveFT: Working with APIs; Bob Chojnacki, Jellyvision Labs: Hadoop on AWS; Karl Zimmerman, Steadfast: Keep control with the Private Cloud
TRANSCRIPT
![Page 1: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/1.jpg)
Welcome to Cloud Chicago
Live Tweet on the second screen by using: #cloudcamp @cloudcamp_chi
Sponsored by
Hosted by
Thursday, December 13, 12
![Page 2: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/2.jpg)
Agenda
6:00pm Registration, Food, Drinks and Networking
6:30 Opening Remarks, Patrick Kerpan, CohesiveFT
6:45 Lightning Talks
Dave Falck, Model Metrics: node.js on AWS
Paul Mantz, CohesiveFT: Working with APIs
Bob Chojnacki, Jellyvision Labs: Hadoop on AWS
Karl Zimmerman, Steadfast: Keep control with the Private Cloud
7:45 Unpanel: “Who’s in Control of Your Cloud? Security and Visibility”
Emceed by Mike Dorosh, IBM & Patrick Kerpan, CohesiveFT
8:30 Breakout Sessions
9:00 Wrap Up - Drinks, anyone?
![Page 3: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/3.jpg)
Dave Falck, Customer Solutions Engineer
![Page 4: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/4.jpg)
Node.js + AWS @davidfalck
![Page 5: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/5.jpg)
Why the Node.js Buzz?
* LinkedIn’s entire mobile software stack is completely built in Node
* Why? Scale.
* Huge performance gains compared to what they were using before (Ruby on Rails)
* Went from running 15 servers with 15 instances (virtual servers) on each physical machine, to just four instances that can handle double the traffic.
![Page 6: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/6.jpg)
What is Node.js?
* JavaScript platform based on Google Chrome's V8 JS engine
* Ryan Dahl (Joyent)
* Event-driven, non-blocking I/O model to allow your applications to scale while keeping you from having to deal with threads, polling, timeouts, and event loops
* FAST
* Used for real-time, data-intensive apps (mobile!)
* POPULAR
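The event-driven model described on this slide can be illustrated with Node's built-in `events` module. This is only a minimal sketch: the `orders` emitter and the `order` event name are hypothetical and do not come from the talk.

```javascript
const { EventEmitter } = require('events');

// Hypothetical emitter and event name, used only for illustration.
const orders = new EventEmitter();
const received = [];

// Register a handler; it runs each time an 'order' event is emitted.
orders.on('order', (item) => {
  received.push(item);
});

orders.emit('order', 'coffee');
orders.emit('order', 'bagel');

console.log(received);
```

The application code only declares what should happen when an event arrives; it never manages threads or polls for work itself.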
![Page 7: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/7.jpg)
Node.js on GitHub
![Page 8: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/8.jpg)
Hello World

    var http = require('http');
    http.createServer(function (req, res) {
      res.writeHead(200, {'Content-Type': 'text/plain'});
      res.end('Hello World\n');
    }).listen(1337, '127.0.0.1');
![Page 9: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/9.jpg)
What makes Node.js so fast?
* Thread-based networking is inefficient and difficult
* Node shows much better memory efficiency under high loads than systems which allocate 2 MB thread stacks for each connection.
* Users of Node are free from worries of dead-locking the process (*there are no locks*)
* Almost no function in Node directly performs I/O, so the process never blocks.
* Because nothing blocks, less-than-expert programmers are able to develop fast systems
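A small sketch of what "the process never blocks" looks like in practice, assuming a scratch file in the system temp directory (the file name is made up for this example). The read is requested, the program keeps going, and the callback runs later.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Scratch file created only so the example has something to read.
// (This setup step is synchronous on purpose; the read below is not.)
const file = path.join(os.tmpdir(), 'node-nonblocking-demo.txt');
fs.writeFileSync(file, 'hello from disk\n');

const order = [];

order.push('read requested');
fs.readFile(file, 'utf8', (err, text) => {
  // Runs later, after the read completes and the current code has returned.
  order.push(err ? 'read failed' : 'read finished');
  console.log(order);
});
order.push('still running');
```

At the point where the last line runs, `order` holds only the two synchronous entries; the callback's entry is appended afterwards.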
![Page 10: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/10.jpg)
Under the Node.js hood
JavaScript?
![Page 11: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/11.jpg)
Under the Node.js hood
* JavaScript!
  * Platform independent
  * Easy to use
  * Ubiquitous
* Google Chrome’s V8 JavaScript Engine
  * Translates JS into machine code (not interpreted)
![Page 12: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/12.jpg)
When not to use Node.js
* Node.js is not ideal for CPU-intensive jobs like sorting, transformations, number crunching, analytics…
* Traditional CRUD web apps that need to be highly concurrent: performance degradation will occur when the data needs to be transformed…
* You can offload processing to another language that is better at making use of the CPU
* Cultural fit? Too new? You decide…
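To see why CPU-intensive work is a poor fit, here is a hedged sketch: a timer is scheduled to fire as soon as possible, but it cannot run while synchronous number crunching occupies the single thread. The `busySum` function is a hypothetical stand-in for real sorting or analytics work.

```javascript
let timerFired = false;

// Ask for a callback as soon as possible.
setTimeout(() => {
  timerFired = true;
}, 0);

// Deliberately CPU-heavy, synchronous work (hypothetical stand-in for
// sorting or number crunching).
function busySum(n) {
  let total = 0;
  for (let i = 0; i < n; i++) {
    total += i % 7;
  }
  return total;
}

const total = busySum(20000000);

// The timer callback cannot run until the synchronous work returns.
console.log('timer fired during CPU work?', timerFired);
```

Every other connection served by the same process would wait in the same way, which is the reason for offloading such work elsewhere.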
![Page 13: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/13.jpg)
Node.js + AWS
* Dec 6th: AWS released developer preview of node.js libraries to access AWS:
  * DynamoDB
  * S3
  * EC2
  * SWF
* Allows you to manage parallel calls to several AWS web services
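The idea of managing parallel calls to several services can be sketched without the SDK itself. The `fakeCall` helper below is a hypothetical stand-in and is not the AWS SDK API; it only mimics a network round trip with a timer. The sketch uses JavaScript promises, which are standard in current Node.js releases.

```javascript
// Hypothetical stand-ins for service calls; they are NOT the AWS SDK API.
// Each resolves after a short delay to mimic a network round trip.
function fakeCall(service, delayMs) {
  return new Promise((resolve) => {
    setTimeout(() => resolve(`${service}: ok`), delayMs);
  });
}

// Start all three calls at once and wait for every result.
function callAllServices() {
  return Promise.all([
    fakeCall('DynamoDB', 30),
    fakeCall('S3', 20),
    fakeCall('EC2', 10),
  ]);
}

callAllServices().then((results) => {
  // Results keep the order of the input array, not the order of completion.
  console.log(results);
});
```

Because the calls overlap, the total wait is roughly that of the slowest call rather than the sum of all three.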
![Page 14: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/14.jpg)
Node.js + Other Clouds
* Azure
* Joyent
* EngineYard
* Heroku
![Page 15: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/15.jpg)
More info
* http://nodejs.org
* http://en.wikipedia.org/wiki/Nodejs
* http://aws.typepad.com/aws/2012/12/aws-sdk-for-nodejs-now-available-in-preview-form.html
* http://www.jamesward.com/2011/06/21/getting-started-with-node-js-on-the-cloud/
* http://venturebeat.com/2011/08/16/linkedin-node/
![Page 16: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/16.jpg)
Paul Mantz, Software Engineer
![Page 17: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/17.jpg)
Copyright CohesiveFT - Dec 13, 2012
APIs in Cloud Environments
Paul Mantz
![Page 18: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/18.jpg)
API Command-Line Clients
• Benefits of Creating API Command-Line Clients
• Lowers the barrier to entry
• Familiar to technical consumers
• Advanced use cases
• Integrates into existing toolsets
![Page 19: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/19.jpg)
API Command-Line Clients
Excellent Internal Developer Tool
• Excellent for testing and rapid development
• Useful operations tool
![Page 20: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/20.jpg)
API Command-Line Clients
Reference Implementation
• Gives developers an example to integrate the API
• Helps users model workflows
• DSL
![Page 21: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/21.jpg)
API Command-Line Clients
Excellent Demo Tool
• Quick installation, often one file
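As an illustration of how small a single-file API command-line client can be, here is a hedged sketch of the argument-parsing core. The REST layout, resource names, and the `parseCommand` helper are all hypothetical; no real API from the talk is being described.

```javascript
// Hypothetical REST API layout; resource names and paths are assumptions.
const ROUTES = new Map([
  ['list', (resource) => ({ method: 'GET', path: `/${resource}` })],
  ['show', (resource, id) => ({ method: 'GET', path: `/${resource}/${id}` })],
  ['delete', (resource, id) => ({ method: 'DELETE', path: `/${resource}/${id}` })],
]);

// Turn command-line words such as ["servers", "show", "42"] into a request.
function parseCommand(args) {
  const [resource, action, id] = args;
  const route = ROUTES.get(action);
  if (!resource || !route) {
    throw new Error('usage: client <resource> <list|show|delete> [id]');
  }
  if (action !== 'list' && !id) {
    throw new Error(`"${action}" needs an id`);
  }
  return route(resource, id);
}

// A real client would pass process.argv.slice(2); a fixed demo command is
// used here so the sketch runs the same way everywhere.
console.log(parseCommand(['servers', 'show', '42']));
```

The returned object could then be handed to any HTTP library, which keeps the command grammar easy to test on its own.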
![Page 22: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/22.jpg)
Bob Chojnacki, Programmer
![Page 23: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/23.jpg)
Big Data in the Cloud
A Journey into the unknown
![Page 24: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/24.jpg)
Who Jellyvision is and why analytics are important to us
• We create interactive experiences
  – Desktop
  – Mobile
• … which ask questions, inform people, generate leads
• “Virtual Advisors”
• We also collect analytics in real time to generate reports about:
  – How people answered a question
  – Where they dropped out
  – Lots of impressive stats!
![Page 25: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/25.jpg)
The Problem
• Longer term projects and high volume projects causing MySQL to bust at the seams
• Some types of reports taking too long, or causing MySQL to crash if we include too much data
• In all fairness, we could probably tune MySQL, throw it on bigger servers, more memory
• Diminishing returns
• MySQL is fine for collecting the data…
![Page 26: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/26.jpg)
The Solution
• Hadoop!
• Why Hadoop? Lots of possibilities out there, but which one to use? Cassandra, CouchDB, Hadoop, Membase, MongoDB, Neo4j, …
• Big Data meetups tended to have lots of people using Hadoop
• And I knew others using it.
• And Hortonworks had a fancy point and click solution I could use to get started quickly
![Page 27: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/27.jpg)
Options with options
• Now that I picked Hadoop, I had several options, and options within options to use to analyze my data:
  – Hive, Pig, MapReduce, Java, R
• I knew Java
• MapReduce seemed to make sense
• I’ll probably play with Hive and Pig next
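For readers new to the MapReduce model, the following is a minimal in-memory sketch of its three steps (map, shuffle, reduce). It is not Hadoop code and not the speaker's Java job; the event records and field names are invented for illustration.

```javascript
// Hypothetical event records; the field names are assumptions.
const events = [
  { question: 'q1', answer: 'yes' },
  { question: 'q1', answer: 'no' },
  { question: 'q2', answer: 'yes' },
  { question: 'q1', answer: 'yes' },
];

// Map: emit one [key, 1] pair per event.
function mapEvent(event) {
  return [`${event.question}:${event.answer}`, 1];
}

// Shuffle: group the emitted values by key.
function shuffle(pairs) {
  const groups = new Map();
  for (const [key, value] of pairs) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  return groups;
}

// Reduce: collapse each key's values into a single count.
function reduceCounts(groups) {
  const counts = {};
  for (const [key, values] of groups) {
    counts[key] = values.reduce((sum, v) => sum + v, 0);
  }
  return counts;
}

const counts = reduceCounts(shuffle(events.map(mapEvent)));
console.log(counts);
```

In a real cluster the map and reduce steps run on many machines and the framework performs the shuffle; the shape of the computation is the same.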
![Page 28: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/28.jpg)
It’s All About The Data
• Visit data
• Event data
• Denormalization of data
• Generated a ton of fake data:
  – Started with 600K visits, 3M events
  – Moved up to 1.8M visits, 60M events
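Denormalization here presumably means copying visit attributes onto each event so that later processing needs no join. A minimal sketch under that assumption, with invented record shapes and field names:

```javascript
// Hypothetical records; field names are assumptions for illustration.
const visits = [
  { visitId: 1, device: 'desktop' },
  { visitId: 2, device: 'mobile' },
];
const events = [
  { visitId: 1, type: 'start' },
  { visitId: 1, type: 'answer' },
  { visitId: 2, type: 'start' },
];

// Index visits by id so each event lookup is a single Map access.
const visitsById = new Map(visits.map((v) => [v.visitId, v]));

// Copy the visit fields onto every event so no join is needed later.
const denormalized = events.map((e) => ({ ...visitsById.get(e.visitId), ...e }));

console.log(denormalized);
```

Each output record is self-contained, which suits line-oriented batch processing at the cost of repeating the visit fields.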
![Page 29: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/29.jpg)
Make it so
• First experience: Hortonworks Virtual Sandbox
  – Single node AMI at Amazon
  – Hadoop 1.0
  – 600K visits, 3M events
• On our existing platform we needed to break reports up into smaller chunks for some data because MySQL could not handle it.
• Results! What would have taken hours took only 5 minutes on a single node Hadoop “cluster”
• In reality, some of the queries I could also run with command-line tools (wc, grep, awk) on the data considerably faster than even Hadoop.
• Important lessons learned so far:
  – Think outside the RDBMS: they are great, but they may not make sense for all types of data
![Page 30: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/30.jpg)
Looking at more real data
• Now, let's generate data that is much closer to some of our products
• Instead of one question and answer, how about 15 questions? Adding in some other events gives a total of 34 events.
• Throw in some people returning, some of them multiple times
• Throw in some people who don't start the conversation, etc.
• Run my little auto-data-generator and BOOM! 20 million events and 4.4GB later I have my data…
• … which took up too much disk space to run on the demo system I was using. Might as well turbo-charge this puppy...
![Page 31: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/31.jpg)
More disk space!
• Full install of Hadoop (Hortonworks HDP)
• Single node
• 600K visits, 20M events
  – 6m 29s, ~30s after map phase completed
• 1.8M visits, 60M events
  – 18m 3s, ~90s after map phase completed
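The two single-node timings can be turned into a rough throughput figure. The sketch below only does the arithmetic on the numbers from the slide; the helper names are made up.

```javascript
// Convert "minutes and seconds" into seconds.
function toSeconds(minutes, seconds) {
  return minutes * 60 + seconds;
}

// Events processed per second for one run.
function eventsPerSecond(eventCount, elapsedSeconds) {
  return eventCount / elapsedSeconds;
}

const smallRun = eventsPerSecond(20e6, toSeconds(6, 29)); // 20M events in 6m 29s
const largeRun = eventsPerSecond(60e6, toSeconds(18, 3)); // 60M events in 18m 3s

console.log(Math.round(smallRun), Math.round(largeRun));
```

Both runs work out to a little over 50,000 events per second, so on a single node the run time grew roughly in proportion to the data size.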
![Page 32: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/32.jpg)
More nodes
• 3 nodes: 11m
• 4 nodes: 9m 16s
• Yay! Nodes!
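A rough way to read these numbers is as speedup over a single node. The slides do not say which data set the multi-node runs used, so the sketch below assumes it was the 60M-event set with its 18m 3s single-node time; treat the results as illustrative only.

```javascript
// ASSUMPTION: the multi-node runs used the 60M-event data set, so the
// single-node baseline is the 18m 3s run. The slides do not state this.
const baselineSeconds = 18 * 60 + 3;

const runs = [
  { nodes: 3, seconds: 11 * 60 },
  { nodes: 4, seconds: 9 * 60 + 16 },
];

const report = runs.map(({ nodes, seconds }) => {
  const speedup = baselineSeconds / seconds;
  return {
    nodes,
    speedup: speedup.toFixed(2),
    efficiency: (speedup / nodes).toFixed(2), // 1.00 would be perfect scaling
  };
});

console.log(report);
```

Under that assumption, adding nodes helps but well short of linearly, which is consistent with the caveat that the job was not tuned.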
![Page 33: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/33.jpg)
Caveats
• Not using Hadoop to its fullest / basically a weekend job
• Algorithms employed in this example probably won't end up in a book alongside Knuth’s
![Page 34: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/34.jpg)
Next steps
• Make sure results on real data line up
• Integrate with team to generate reports they need
![Page 35: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/35.jpg)
End stuff
• Thanks to the folks at Hortonworks who answered my frantic and spastic questions.
![Page 36: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/36.jpg)
Karl Zimmerman, President
![Page 37: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/37.jpg)
Keep Your Control. Private Cloud with Karl Zimmerman, CEO of Steadfast.
![Page 38: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/38.jpg)
Private Cloud: What do we mean?
Private cloud is a form of cloud computing where the customer has some control/ownership of the service implementation. It is a scalable, elastic IaaS solution based on cloud computing but with more control over resources.
![Page 39: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/39.jpg)
Private Cloud: What are the advantages?
Security
Availability
No vendor lock-in
Ease of management
![Page 40: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/40.jpg)
Private Cloud: Security
Dedicated & segregated resources
More options to integrate with existing security
![Page 41: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/41.jpg)
Private Cloud: Availability
Understanding and control of the infrastructure
Get the resources you need, when you need them
You're not subject to the whims of other users
![Page 42: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/42.jpg)
Private Cloud: Vendor Lock-In
No "secret sauce."
Utilize true open source
![Page 43: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/43.jpg)
Private Cloud: Management
Easier to find employees with general IT knowledge
Utilize a broader array of tools and software
Get support/assistance from multiple levels
![Page 44: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/44.jpg)
Private Cloud: To Summarize
Private cloud can deliver what you need from a public cloud while giving you more control. Worries about losing control over security and availability, and issues like vendor lock-in and management, vanish into thin air like, well, a cloud. And the fact that it doesn’t have to cost you more is a plus, too.
![Page 45: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/45.jpg)
Unpanel: “Who’s in Control of Your Cloud? Security and Visibility”
Emceed by: Mike Dorosh, Program Manager, Cloud Technical Partnerships, IBM
& Patrick Kerpan, CEO, CohesiveFT
![Page 46: Cloud Camp Chicago Dec 2012 Slides](https://reader034.vdocuments.mx/reader034/viewer/2022051514/54833ac8b4af9f870d8b498b/html5/thumbnails/46.jpg)