Cloud Computing: What it is, DOs and DON'Ts. Svet Ivantchev, eFaber. Fourth Workshop on Advanced Computing Techniques in the Microworld, April 2011. Sunday, 1 May 2011.

Upload: svet-ivantchev

Post on 16-Jan-2015


TRANSCRIPT

Page 1: Cloud Computing: What it is, DOs and DON'Ts

Cloud Computing: What it is, DOs and DON'Ts

Svet Ivantchev, eFaber

Fourth Workshop on Advanced Computing Techniques in the Microworld,

April 2011

Sunday, 1 May 2011

Page 2: Cloud Computing: What it is, DOs and DON'Ts

Our plan for today

• What Is Cloud Computing?

• Enabling technologies

• Public vs Private Clouds

• Idea of MapReduce with two examples


Page 3: Cloud Computing: What it is, DOs and DON'Ts

Our plan for tomorrow

• Create an HPC cluster with:

• 184 GB RAM

• 13 TB local disk space and 800 GB persistent storage

• 64 cores @ 2.9 GHz, Intel Nehalem = 268 ECUs (roughly 268 Xeons of 2007 at 1.2 GHz)

• 10 Gbit/s network connection between them
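The totals above are what you get by adding up identical nodes. A small sketch of that arithmetic follows; the node count and the per-node figures are assumptions chosen to reproduce the slide's totals (they are not stated in the talk), so treat them as illustrative only.

```python
# Assumed per-node figures (NOT from the slide): picked so that eight
# identical nodes add up to the cluster totals listed above.
NODES = 8
RAM_GB_PER_NODE = 23
CORES_PER_NODE = 8
ECU_PER_NODE = 33.5
LOCAL_DISK_GB_PER_NODE = 1690

total_ram_gb = NODES * RAM_GB_PER_NODE
total_cores = NODES * CORES_PER_NODE
total_ecu = NODES * ECU_PER_NODE
total_disk_tb = NODES * LOCAL_DISK_GB_PER_NODE / 1000

print("RAM  :", total_ram_gb, "GB")
print("Cores:", total_cores)
print("ECUs :", total_ecu)
print("Disk :", total_disk_tb, "TB")
```

Under these assumptions the local disk comes out slightly above the 13 TB quoted on the slide, which suggests the slide rounds down.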


Page 4: Cloud Computing: What it is, DOs and DON'Ts

(Kind of) Evolution

• Grid Computing

• Utility Computing

• Cloud Computing

• Software as a Service (SaaS)


Page 5: Cloud Computing: What it is, DOs and DON'Ts

Grid Computing

Grid computing is a term referring to the combination of computer resources from multiple administrative domains to reach a common goal. The grid can be thought of as a distributed system with non-interactive workloads that involve a large number of files.

http://en.wikipedia.org/wiki/Grid_computing


Page 7: Cloud Computing: What it is, DOs and DON'Ts

Cloud Computing

McKinsey & Co. Report


Page 8: Cloud Computing: What it is, DOs and DON'Ts

Cloud Computing

Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.

NIST


Page 9: Cloud Computing: What it is, DOs and DON'Ts

Cloud Computing

1. The illusion of infinite computing resources...

2. The elimination of an up-front commitment...

3. The ability to pay for use ... as needed.

UC Berkeley RAD Lab


Page 10: Cloud Computing: What it is, DOs and DON'Ts

So, what is it?

• Pay-per-use

• Resources are abstracted (virtualized)

• Scale up and down on demand

• Self-service interface (API included)


Page 11: Cloud Computing: What it is, DOs and DON'Ts

Enabling technologies

• Virtualisation

• Virtualised Storage

• Web Services


Page 12: Cloud Computing: What it is, DOs and DON'Ts

Virtualisation

• Xen

• KVM

• VMware

• more...


Page 13: Cloud Computing: What it is, DOs and DON'Ts

Abstracted Storage

• Distributed File Systems; examples:

• Amazon S3

• Rackspace Cloud Files

• HDFS


Page 14: Cloud Computing: What it is, DOs and DON'Ts

Stack

Software as a Service (SaaS)

Platform as a Service (PaaS)

Infrastructure as a Service (IaaS)

Cloud Enabler(s)

Hardware


Page 15: Cloud Computing: What it is, DOs and DON'Ts

Public Cloud Services

• Amazon EC2

• RackSpace

• 100s more ...


Page 17: Cloud Computing: What it is, DOs and DON'Ts

Amazon Web Services (AWS)


Page 18: Cloud Computing: What it is, DOs and DON'Ts

AWS EC2 Prices

• on demand instances

• reserved instances

• spot instances


Page 19: Cloud Computing: What it is, DOs and DON'Ts

AWS EC2 prices


Page 20: Cloud Computing: What it is, DOs and DON'Ts

Spot Instances


Page 25: Cloud Computing: What it is, DOs and DON'Ts

Private

• Eucalyptus

• OpenNebula

• Nimbus

• OpenStack

• Hadoop & friends


Page 26: Cloud Computing: What it is, DOs and DON'Ts

Public or private? Better mixed


Page 27: Cloud Computing: What it is, DOs and DON'Ts

MapReduce

• High level vs low level languages

• Example: MPI/PVM vs MapReduce


Page 28: Cloud Computing: What it is, DOs and DON'Ts

MapReduce “Hello world”, Unix-style

“en un lugar de la Mancha de cuyo nombre no quiero acordarme no ha mucho tiempo que vivía un hidalgo ...”

$ cat i.txt | tr ' ' '\n' | sort | uniq -c

1 Mancha
1 acordarme
1 cuyo
2 de
...
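For comparison, the same count can be sketched in Python. This only illustrates what the pipeline does, assuming whitespace-separated words; the variable names are made up for the example.

```python
from collections import Counter

# The sentence from the slide (it is truncated there as well).
text = ("en un lugar de la Mancha de cuyo nombre no quiero acordarme "
        "no ha mucho tiempo que vivía un hidalgo")

# tr ' ' '\n'   -> one word per item
words = text.split()

# sort | uniq -c -> count identical words
counts = Counter(words)

# Print in "count word" form, like uniq -c, ordered by word.
for word in sorted(counts):
    print(counts[word], word)
```

Splitting into words plays the role of the map step, and counting identical words plays the role of the reduce step.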


Page 30: Cloud Computing: What it is, DOs and DON'Ts

Google Books

• 129 000 000 books published so far

• 15 000 000 books scanned (1700-2010)

• 5 000 000 classified and with metadata

Science, Vol. 331, no. 6014, pp. 176-182 (Jan 14, 2011)


Page 31: Cloud Computing: What it is, DOs and DON'Ts

http://ngrams.googlelabs.com/


Page 33: Cloud Computing: What it is, DOs and DON'Ts

MapReduce

map: (k1, v1) → list(k2, v2)

reduce: (k2, list(v2)) → list(v2)
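The two signatures can be read as the interface of a tiny driver. Below is a minimal single-process sketch of that idea; `run_mapreduce` and the two example functions are hypothetical names for illustration, not part of Hadoop or of the talk, and a real framework distributes each phase over many machines.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, List, Tuple, TypeVar

K1 = TypeVar("K1")
V1 = TypeVar("V1")
K2 = TypeVar("K2", bound=Hashable)
V2 = TypeVar("V2")


def run_mapreduce(
    records: Iterable[Tuple[K1, V1]],
    mapper: Callable[[K1, V1], Iterable[Tuple[K2, V2]]],
    reducer: Callable[[K2, List[V2]], List[V2]],
) -> Dict[K2, List[V2]]:
    """Hypothetical in-memory driver: map, group by key, reduce."""
    groups: Dict[K2, List[V2]] = defaultdict(list)
    for k1, v1 in records:                 # map: (k1, v1) -> list(k2, v2)
        for k2, v2 in mapper(k1, v1):
            groups[k2].append(v2)          # shuffle: collect values per key
    # reduce: (k2, list(v2)) -> list(v2)
    return {k2: reducer(k2, values) for k2, values in groups.items()}


# Example: group the words of a document by their length.
def words_by_length(name: str, contents: str):
    for word in contents.split():
        yield len(word), word


def unique_sorted(length: int, words: List[str]) -> List[str]:
    return sorted(set(words))


result = run_mapreduce([("i.txt", "en un lugar de la Mancha")],
                       words_by_length, unique_sorted)
print(result)
```

The user only writes the mapper and the reducer; grouping by key is the framework's job.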


Page 34: Cloud Computing: What it is, DOs and DON'Ts

MapReduce: Mapper

map(String key, String value):
    // key: document name
    // value: document contents
    for each word w in value:
        EmitIntermediate(w, 1);

“en un lugar de la Mancha de cuyo nombre no quiero acordarme no ha mucho tiempo que vivía un hidalgo”

“en”, 1
“un”, 1
“lugar”, 1
“de”, 1
“la”, 1
“Mancha”, 1
“de”, 1
...


Page 35: Cloud Computing: What it is, DOs and DON'Ts

MapReduce: Reducer

reduce(String key, Iterator values):
    // key: a word
    // values: a list of counts
    result = 0;
    for each v in values:
        result += v;
    Emit(result);

“en”, [1]
“un”, [1,1]
“lugar”, [1]
“de”, [1,1]
...

“en”, 1
“un”, 2
“lugar”, 1
“de”, 2
...
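The mapper and reducer pseudocode translate almost line by line into Python. The sketch below runs both on the sample sentence and does the grouping by sorting, which is also how Hadoop brings equal keys together; the function names are illustrative, and everything runs in one process.

```python
from itertools import groupby
from operator import itemgetter


def map_fn(key, value):
    """key: document name, value: document contents."""
    for word in value.split():
        yield word, 1                  # EmitIntermediate(w, 1)


def reduce_fn(key, values):
    """key: a word, values: an iterable of counts."""
    return sum(values)                 # result += v for each v


document = ("en un lugar de la Mancha de cuyo nombre no quiero acordarme "
            "no ha mucho tiempo que vivía un hidalgo")

# Map phase.
pairs = list(map_fn("i.txt", document))

# Shuffle phase: sort by key so that equal words become adjacent,
# then hand each group of values to the reducer.
pairs.sort(key=itemgetter(0))
counts = {word: reduce_fn(word, (v for _, v in group))
          for word, group in groupby(pairs, key=itemgetter(0))}

print(counts["en"], counts["un"], counts["de"])
```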


Page 36: Cloud Computing: What it is, DOs and DON'Ts

Dean, J. and Ghemawat, S., Comm. ACM, Vol. 51, pp. 107-113 (2008)


Page 37: Cloud Computing: What it is, DOs and DON'Ts

Our input

$ ls -l donquijote_s?.txt
-rw-r--r-- 1 svet staff 1037413 23 abr 18:26 donquijote_s1.txt
-rw-r--r-- 1 svet staff 1099078 23 abr 18:22 donquijote_s2.txt

$ head -6 donquijote_s1.txt

El ingenioso hidalgo don Quijote de la Mancha

TASA

Yo, Juan Gallo de Andrada, escribano de Camara del Rey nuestro senor, de los que residen en su Consejo, certifico y doy fe que, habiendo visto por los senores del un libro


Page 38: Cloud Computing: What it is, DOs and DON'Ts

Python Mapper

#!/usr/bin/env python3
import re
import sys

# A word starts with a letter, followed by letters or digits.
WORD = re.compile(r"[a-zA-Z][a-zA-Z0-9]*")


def main():
    # Emit one "LongValueSum:<word>\t1" record per word. The prefix asks
    # Hadoop's aggregate reducer to sum the values for each word.
    for line in sys.stdin:
        for word in WORD.findall(line):
            print("LongValueSum:" + word.lower() + "\t1")


if __name__ == "__main__":
    main()


Page 39: Cloud Computing: What it is, DOs and DON'Ts

Test the mapper

$ cat donquijote_s1.txt | ./wsplit.py

LongValueSum:el	1
LongValueSum:ingenioso	1
LongValueSum:hidalgo	1
LongValueSum:don	1
LongValueSum:quijote	1
LongValueSum:de	1
LongValueSum:la	1
LongValueSum:mancha	1
LongValueSum:tasa	1
LongValueSum:yo	1
LongValueSum:juan	1
LongValueSum:gallo	1
LongValueSum:de	1
LongValueSum:andrada	1
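Before going to the cluster, the reduce side can be checked locally too. The sketch below is a stand-in for the summing behaviour that the "LongValueSum:" prefix requests from Hadoop's aggregate reducer. The function name is hypothetical, and the real reducer handles more record types than this.

```python
from collections import defaultdict

PREFIX = "LongValueSum:"


def aggregate_long_value_sum(lines):
    """Sum the values of 'LongValueSum:<key>\\t<value>' records per key.

    A local stand-in (an assumption of this sketch) for the summing done
    by the aggregate reducer; records without the prefix are ignored.
    """
    totals = defaultdict(int)
    for line in lines:
        line = line.rstrip("\n")
        if not line.startswith(PREFIX):
            continue
        key, _, value = line[len(PREFIX):].partition("\t")
        totals[key] += int(value)
    return dict(totals)


# A few records in the format printed by the mapper test.
sample = ["LongValueSum:el\t1", "LongValueSum:ingenioso\t1",
          "LongValueSum:de\t1", "LongValueSum:la\t1",
          "LongValueSum:de\t1"]

for word, total in sorted(aggregate_long_value_sum(sample).items()):
    print(word + "\t" + str(total))
```

In a shell, the same check would be the mapper piped through `sort` and then through a script like this one reading `sys.stdin`.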


Page 40: Cloud Computing: What it is, DOs and DON'Ts

Preparing S3


Page 44: Cloud Computing: What it is, DOs and DON'Ts

Run


Page 58: Cloud Computing: What it is, DOs and DON'Ts

Final result

$ awk '{print $2 " " $1}' part-00000 | sort -r -n

21477 que
18297 de
18189 y
10363 la
9824 a
9490 el
8243 en
6335 no
5079 se
4748 los
4202 con
3940 por
3468 las
3461 lo
3398 le
3352 su
2647 don
2623 del
2539 como
2345 me
2312 si
2284 mas
2207 mi
2175 quijote
2148 sancho
2142 es
2077 yo
1938 un
1808 dijo
1740 al
1463 para
1400 porque
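The awk and sort step can also be sketched in Python, for instance when the ranking is needed inside a larger script. The sketch assumes each line of part-00000 is "<word><tab><count>", which is what the awk command implies; `top_words` is a made-up helper name.

```python
def top_words(lines, n=5):
    """Return (count, word) pairs, largest count first.

    Each input line is expected to look like '<word>\\t<count>', the
    layout of part-00000 assumed in this sketch.
    """
    pairs = []
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue                   # skip blank or malformed lines
        word, count = fields
        pairs.append((int(count), word))
    pairs.sort(reverse=True)
    return pairs[:n]


# A few records taken from the result on the slide, in arbitrary order.
sample = ["la\t10363", "que\t21477", "y\t18189", "de\t18297"]
for count, word in top_words(sample):
    print(count, word)
```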


Page 59: Cloud Computing: What it is, DOs and DON'Ts

Command-line alternative

$ elastic-mapreduce --create \
    --stream \
    --input s3n://mrbg/input \
    --mapper s3://mrbg/prog/wsplit.py \
    --output s3n://mrbg/output/run2

$ elastic-mapreduce --create


Page 60: Cloud Computing: What it is, DOs and DON'Ts

MapReduce, example 2

Pi ≈ 4*M/N, where M of the N random points in the unit square fall inside the quarter circle of radius 1
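The quarter circle has area Pi/4 and the unit square has area 1, so the fraction M/N of random points that land inside approaches Pi/4. A minimal single-process sketch of the estimate follows; the seed and the number of steps are arbitrary choices for the example.

```python
import math
import random


def count_inside(steps, rng):
    """Count how many of `steps` random points in the unit square
    fall inside the quarter circle of radius 1 (this is M)."""
    inside = 0
    for _ in range(steps):
        x, y = rng.random(), rng.random()
        if math.hypot(x, y) < 1.0:
            inside += 1
    return inside


rng = random.Random(2011)      # fixed seed, only to make runs repeatable
n = 100_000                    # N, chosen arbitrarily for this sketch
m = count_inside(n, rng)       # M
pi_estimate = 4 * m / n
print(pi_estimate)
```

Because every point is independent, the N steps can be split across as many mappers as you like, and only the counts have to be added up at the end.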

Page 61: Cloud Computing: What it is, DOs and DON'Ts

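MapReduce: Mapper

#!/usr/bin/ruby

# Each input line holds a number of Monte Carlo steps.
# For each line, print how many random points fell inside the quarter circle.
ARGF.each do |line|
  mcsteps = line.strip
  unless mcsteps.length == 0
    begin
      inside = 0
      mcsteps.to_i.times do
        x, y = rand, rand
        inside += 1 if Math.hypot(x, y) < 1.0
      end
      puts inside.to_s
    rescue
      # couldn't parse mc steps
    end
  end
end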


Page 62: Cloud Computing: What it is, DOs and DON'Ts

Pi

$ cat mcs.txt

1000

... create more mcs.txt files:

200_000_000

$ cat mcs.txt | ./mc-pi-mr.rb

776

200_000_000

Page 63: Cloud Computing: What it is, DOs and DON'Ts

MapReduce: Reducer

#!/usr/bin/ruby

# Add up the counts printed by all the mappers.
count = 0
ARGF.each do |line|
  count += line.to_i
end

puts "#{count} points inside"


Page 64: Cloud Computing: What it is, DOs and DON'Ts

Prepare the EMR

• upload mcsnn.txt to mrbg/mcinput/

• upload mc-mapper.rb to mrbg/prog/

• upload mc-reducer.rb to mrbg/prog/


Page 66: Cloud Computing: What it is, DOs and DON'Ts

Estimate: 109955955/140000000*4 = 3.14159871
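The arithmetic can be checked in a few lines, together with a rough idea of how much statistical error to expect from that many steps. The error formula is the usual binomial standard deviation; it is an addition for illustration and was not part of the talk.

```python
import math

inside = 109_955_955       # M: the reducer output quoted on the slide
total = 140_000_000        # N: total number of Monte Carlo steps

p = inside / total                     # estimate of Pi/4
estimate = 4 * p
# One standard deviation of the estimate, from the binomial variance of M.
std_error = 4 * math.sqrt(p * (1 - p) / total)

print(f"estimate  = {estimate:.8f}")
print(f"std error = {std_error:.6f}")
print(f"deviation = {abs(estimate - math.pi):.6f}")
```

The error shrinks only as 1/sqrt(N), so each extra decimal digit costs about 100 times more steps.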

Page 67: Cloud Computing: What it is, DOs and DON'Ts

• Hadoop Common

• HDFS

• MapReduce


Page 68: Cloud Computing: What it is, DOs and DON'Ts

Thank you

Q & A
