Python, Go, and the Cost of Concurrency in the Cloud | AWS re:Invent
TRANSCRIPT
Python, Go, and the Cost of Concurrency in the Cloud
• Quick discussion of moving from Python to Go
• Explaining key differences between Go and other “no semicolons” languages
• Show an example application illustrating why those key differences matter for your app’s bottom line
Goal of this talk
• Daily Python hacker
• Came from C/C++, UNIX systems hacking background
• PhD on P2P/crypto research, more C++ & Python
• Six months experience in Go
• Co-founder, Tracelytics; Chief Architect, AppNeta
About me
Chris Erway, AppNeta Chief Architect
• Fun to program — “Zen of Python”
• Built-in maps, sets, arrays, tuples
• Good library support
• Simple duck typing (as opposed to strict OO)
• A little code goes a long way
Things I like about Python
• Performance: not too slow, but not too fast either
• Dependencies can be a pain (virtualenv, pip, etc)
• The dreaded Global Interpreter Lock (GIL)
• Lack of typed function signatures can make reading code difficult
Things I don’t like about Python
• Announced 2009
• Creators: Ken Thompson (B, Plan 9 from Bell Labs), Rob Pike (Plan 9), Robert Griesemer (V8 engine)
• “all three of us had to be talked into every feature in the language, so there was no extraneous garbage put into the language for any reason”
• Statically typed, garbage-collected
• Fast compilation, static linking
Go
• Easy to read, no semicolons
• Built-in maps, arrays, strings
• Both support calling into C code when necessary
• Interfaces based on duck typing
• No virtual inheritance
• Statically typed, but automatic type inference and interfaces give it a “dynamic feel”
Similarities between Go and Python
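As a minimal sketch of the "interfaces based on duck typing" and type inference points above: any Go type with the right methods satisfies an interface, with no explicit declaration. The names Quacker, Duck, Robot, and Chorus are made up for this illustration and are not from the talk.

```go
package main

import "fmt"

// Quacker is satisfied by any type that has a Quack method;
// no "implements" declaration is needed.
type Quacker interface {
	Quack() string
}

// Duck and Robot are unrelated types that both happen to quack.
type Duck struct{}

func (Duck) Quack() string { return "Quack!" }

type Robot struct{ Model string }

func (r Robot) Quack() string { return r.Model + " says: quack" }

// Chorus accepts anything that quacks; the check happens at compile time.
func Chorus(qs ...Quacker) []string {
	out := make([]string, 0, len(qs))
	for _, q := range qs {
		out = append(out, q.Quack())
	}
	return out
}

func main() {
	// := infers the type of r, which is part of the "dynamic feel".
	r := Robot{Model: "R2"}
	for _, line := range Chorus(Duck{}, r) {
		fmt.Println(line)
	}
}
```

Unlike Python, passing a type without a Quack method to Chorus fails at compile time rather than at run time.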
• Go is compiled to native machine code
• Fast compiler, single static binary
• Go is fast; memory usage depends on size of structs
• No per-object dictionaries, as in Python
• Go has concurrency features built into the language: goroutines, channels, runtime scheduler
• Go has curly braces
Differences between Go and Python
Language comparison
• Lots of resources online for learning Go!
• Following 4 slides from “Go for Pythonistas” by Francesc Campoy Flores, Google
• See also: The Go Programming Language Blog, blog.golang.org
For more on Go ...
• Goroutines: very light processing actors (the gophers)
• Channels: typed, synchronized, thread-safe pipes (the arrows)
Go concurrency
Based on goroutines and channels
Fibonacci with Python generators
“Generator” goroutines
A function that returns a channel:
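The slide's code is not preserved in this transcript. A sketch of what a "generator" goroutine for Fibonacci numbers might look like, assuming a function named fib that yields the first n values, is:

```go
package main

import "fmt"

// fib returns a channel that yields the first n Fibonacci numbers.
// The goroutine closes the channel when done so range loops terminate.
func fib(n int) <-chan int {
	c := make(chan int)
	go func() {
		defer close(c)
		a, b := 0, 1
		for i := 0; i < n; i++ {
			c <- a
			a, b = b, a+b
		}
	}()
	return c
}

func main() {
	// Ranging over the channel plays the role of iterating a Python generator.
	for x := range fib(10) {
		fmt.Println(x)
	}
}
```

The caller consumes values with a plain range loop, much as it would iterate over a Python generator, while the producer runs in its own goroutine.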
Concurrency across languages
(From Brad Fitzpatrick’s Gocon Tokyo 2014 talk)
So who cares?
You do!
• You do — concurrency is very important in the modern computing environment
• Programming for “the cloud” or for “SOA” or “microservices” is fundamentally different than writing a LAMP/MEAN/Rails app
• Assumptions on latency, throughput, scale all change
• The language you pick can cost you time & money
Modern systems demand concurrency
• Pre-cloud systems and databases generally use pools of long-lived, low-latency connections
• Cloud & SOA/microservices often rely on HTTPS-based APIs
• “Infinite” scale, but with more latency
• For example: Amazon SQS vs RabbitMQ
• HTTP-based APIs have inherently higher latencies
• Amazon DynamoDB: 5-10 ms latency
• Amazon Kinesis PutRecords, S3, SQS: 10-100 ms latency
• Usage-based pricing
• Higher throughput = more concurrent HTTP connections
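The link between throughput and concurrent connections can be made concrete with Little's law (requests in flight = throughput x latency). The sketch below applies it to the latency ranges quoted above; the function name inFlight and the 1000 requests/sec figure are illustrative assumptions.

```go
package main

import "fmt"

// inFlight estimates how many requests are outstanding at once,
// using Little's law: concurrency = throughput x latency.
func inFlight(reqPerSec, latencyMs float64) float64 {
	return reqPerSec * latencyMs / 1000
}

func main() {
	// Latencies taken from the ranges quoted on the slide.
	for _, ms := range []float64{5, 10, 100} {
		fmt.Printf("1000 req/s at %3.0f ms latency: ~%.0f requests in flight\n",
			ms, inFlight(1000, ms))
	}
}
```

At 100 ms per call, sustaining 1000 requests per second needs roughly 100 requests outstanding at any moment, which is where the concurrency model starts to matter.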
What about my async code for Python, Ruby, Node?
• Async I/O makes network, disk reads & writes asynchronous
• Used by Python’s gevent, Tornado, Twisted
• Ruby EventMachine, Celluloid
• However, Python, Ruby, Node.js all still use a single-threaded interpreter
• Interpreter can switch to another execution context/greenlet/thread while I/O is pending
• Go: thousands of goroutines are mapped to all available cores
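As a rough illustration of the last point, the sketch below starts one goroutine per input value and lets the Go runtime schedule them across the available cores. The function name sumSquares and the workload are made up for the example.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// sumSquares starts one goroutine per input value and collects the results.
// The runtime scheduler multiplexes these goroutines onto OS threads.
func sumSquares(n int) int {
	var wg sync.WaitGroup
	results := make(chan int, n) // buffered so no sender blocks
	for i := 1; i <= n; i++ {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			results <- v * v
		}(i)
	}
	wg.Wait()
	close(results)

	total := 0
	for r := range results {
		total += r
	}
	return total
}

func main() {
	fmt.Println("cores available:", runtime.NumCPU())
	fmt.Println("sum of squares 1..10000:", sumSquares(10000))
}
```

Ten thousand goroutines is an unremarkable number here; no thread pool or process manager is involved.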
Cloud APIs require compute-heavy RPCs
• HTTP-based APIs with authenticated JSON/XML
• Encryption: TLS/SSL key exchange, negotiation
• Authentication: AWS, Google request signature schemes
• Serialization: Convert data to JSON, base64, etc
• Not as simple as binary data over raw sockets
• Not pure disk/network I/O — not as easy to use async I/O
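To give a feel for the CPU work involved before a request even reaches the network, the sketch below serializes a payload to JSON and signs it with HMAC-SHA256. It is a stand-in for the kind of work listed above, not the actual AWS or Google signing scheme; the Item type, the signItem function, and the key are hypothetical.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// Item is a hypothetical payload type used only for illustration.
type Item struct {
	ID   int    `json:"id"`
	Body string `json:"body"`
}

// signItem serializes an item to JSON and returns the body plus a
// base64-encoded HMAC-SHA256 signature. This is NOT a real cloud
// provider's signing scheme, only the same kind of CPU-bound work.
func signItem(key []byte, it Item) (body []byte, sig string, err error) {
	body, err = json.Marshal(it)
	if err != nil {
		return nil, "", err
	}
	mac := hmac.New(sha256.New, key)
	mac.Write(body)
	return body, base64.StdEncoding.EncodeToString(mac.Sum(nil)), nil
}

func main() {
	body, sig, err := signItem([]byte("demo-key"), Item{ID: 1, Body: "hello"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("body: %s\nsignature: %s\n", body, sig)
}
```

Every request pays this serialization and hashing cost, so a single-threaded interpreter spends CPU time here that async I/O alone cannot hide.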
Increasing prevalence of multi-core architectures
• Quad-core, 8-core, 16-core, 20-core, 32-core, 40-core …
• How will you use all those CPUs?
• Strong opinion: Docker, containerization sometimes used as a crutch for horizontally scaling services written in single-threaded languages
Motivating example
• ~1000 items analyzed each second
• ~1000 Amazon S3 PUTs/sec, ~70KB each
• ~1000 Amazon DynamoDB item writes/sec
Cloud storage, queue, and log costs
Why not queue the writes for later?
• ~1000 items analyzed each second
• ~1000 Amazon S3 PUTs/sec, ~70KB each
• ~1000 Amazon DynamoDB item writes/sec
Batch writes for fewer PUTs
• Read data objects from Amazon SQS
• Batch into larger files and store in Amazon S3
Baseline: Amazon SQS + S3 + DynamoDB, no batching
Amazon SQS + S3 + DynamoDB, BATCH_SZ=10
Amazon SQS + S3 + DynamoDB, BATCH_SZ=100
Amazon SQS + S3 + DynamoDB, BATCH_SZ=1000
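The cost tables from these slides are not preserved in the transcript. As a rough sketch of how batch size drives the S3 request bill alone, the code below computes PUTs per 30-day month at ~1000 items/sec. The price of $0.005 per 1,000 PUT requests is an assumption for illustration and should be checked against current S3 pricing; SQS, DynamoDB, and storage costs are ignored.

```go
package main

import "fmt"

// putsPerMonth returns the number of S3 PUT requests in a 30-day month
// when itemsPerSec items are grouped batchSize to a PUT.
func putsPerMonth(itemsPerSec, batchSize float64) float64 {
	const secondsPerMonth = 30 * 24 * 60 * 60
	return itemsPerSec / batchSize * secondsPerMonth
}

// monthlyPutCost multiplies the request count by pricePer1000, an
// ASSUMED request price in dollars supplied by the caller.
func monthlyPutCost(itemsPerSec, batchSize, pricePer1000 float64) float64 {
	return putsPerMonth(itemsPerSec, batchSize) / 1000 * pricePer1000
}

func main() {
	const assumedPricePer1000 = 0.005 // assumption, not taken from the talk
	for _, b := range []float64{1, 10, 100, 1000} {
		fmt.Printf("BATCH_SZ=%-5.0f PUTs/month=%-11.0f request cost=$%.2f\n",
			b, putsPerMonth(1000, b), monthlyPutCost(1000, b, assumedPricePer1000))
	}
}
```

Under that assumed price, each tenfold increase in batch size cuts the request cost tenfold, which is why filling batches matters later in the talk.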
Basic algorithm
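The algorithm slide itself did not survive extraction. One plausible shape for it, given the earlier bullets (read messages, batch them, write the batch), is a batcher that flushes when the buffer reaches a size limit or when a timer fires. The function name batch and its parameters are assumptions.

```go
package main

import (
	"fmt"
	"time"
)

// batch reads messages from in and emits slices of up to size messages.
// A partial batch is flushed on every tick of the timeout interval and
// when in is closed.
func batch(in <-chan string, size int, timeout time.Duration) <-chan []string {
	out := make(chan []string)
	go func() {
		defer close(out)
		buf := make([]string, 0, size)
		ticker := time.NewTicker(timeout)
		defer ticker.Stop()

		flush := func() {
			if len(buf) > 0 {
				out <- buf
				buf = make([]string, 0, size)
			}
		}
		for {
			select {
			case m, ok := <-in:
				if !ok { // input exhausted: emit what is left and stop
					flush()
					return
				}
				buf = append(buf, m)
				if len(buf) == size {
					flush()
				}
			case <-ticker.C:
				flush() // time-based flush so messages do not wait forever
			}
		}
	}()
	return out
}

func main() {
	in := make(chan string)
	go func() {
		defer close(in)
		for i := 0; i < 25; i++ {
			in <- fmt.Sprintf("msg-%d", i)
		}
	}()
	for b := range batch(in, 10, time.Second) {
		fmt.Println(len(b), "messages in batch") // a real writer would PUT to S3 here
	}
}
```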
Implementation difficulties
Single Python process: ~50 messages/sec
Multiple Python processes: 4 procs = ~200 messages/sec
Process-based scaling leads to suboptimal cost performance
• Impossible to scale number of Amazon SQS pollers and S3 writers independently
• One batch buffer per process: smaller batches than optimal, hard to “max out” S3 batch size before timeout
• Hard to “max out” the 10 messages allowed per SQS read
• Hard to detect when system is falling behind, problematic if write latency > read latency
Go implementation
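The implementation shown on this slide is not in the transcript. The sketch below shows one way such a design could look in Go: pollers, a single shared batcher, and writers are separate goroutine stages joined by channels, so the number of pollers and writers can be chosen independently. The poll and write callbacks are hypothetical stand-ins for SQS reads and S3 PUTs, and the time-based flush is left out for brevity.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runPipeline wires pollers -> one shared batcher -> writers.
// poll and write stand in for SQS receive and S3 PUT calls.
func runPipeline(pollers, writers, batchSize int,
	poll func(id int) []string, write func(batch []string)) {

	msgs := make(chan string, 1024)
	batches := make(chan []string, writers)

	// Stage 1: any number of pollers feed one message channel.
	var pw sync.WaitGroup
	for i := 0; i < pollers; i++ {
		pw.Add(1)
		go func(id int) {
			defer pw.Done()
			for _, m := range poll(id) {
				msgs <- m
			}
		}(i)
	}
	go func() { pw.Wait(); close(msgs) }()

	// Stage 2: one batcher, so batches fill from all pollers at once.
	go func() {
		defer close(batches)
		buf := make([]string, 0, batchSize)
		for m := range msgs {
			buf = append(buf, m)
			if len(buf) == batchSize {
				batches <- buf
				buf = make([]string, 0, batchSize)
			}
		}
		if len(buf) > 0 {
			batches <- buf
		}
	}()

	// Stage 3: any number of writers drain the batch channel.
	var ww sync.WaitGroup
	for i := 0; i < writers; i++ {
		ww.Add(1)
		go func() {
			defer ww.Done()
			for b := range batches {
				write(b)
			}
		}()
	}
	ww.Wait()
}

func main() {
	var written, puts int64
	poll := func(id int) []string { // pretend each poller receives 250 messages
		out := make([]string, 250)
		for i := range out {
			out[i] = fmt.Sprintf("poller-%d-msg-%d", id, i)
		}
		return out
	}
	write := func(b []string) { // count instead of calling S3
		atomic.AddInt64(&puts, 1)
		atomic.AddInt64(&written, int64(len(b)))
	}
	runPipeline(8, 3, 100, poll, write)
	fmt.Printf("%d messages written in %d PUTs\n", written, puts)
}
```

Because all pollers share one batch buffer, batches fill to the configured size regardless of how many pollers are running, which is the property the process-per-worker design lacks.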
Concurrency costs money
• The concurrency model your language provides is very important when your code combines lots of high-latency API calls / RPCs
• Ruby, Python, Node.js all require lots of concurrent processes to achieve good concurrency
• Result: over-provisioning, over-polling, IPC when you don’t need to
• Result: suboptimal cost when using usage-priced APIs
Couldn’t I just use {C, C++, Java, Scala, Clojure, Erlang, Haskell} for concurrency?
• Yes, but:
• it may still be such a pain to spawn & use threads in your language that you don’t do it enough (e.g. in Java, C/C++ vs. just typing “go func()”)
• Lock-based synchronized memory access is more complicated than channels
• C/C++ and Java have pretty heavyweight thread sizes, typically can only support 1K-10K threads
• Go (and Erlang) have very lightweight threads and can support millions of goroutines
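To illustrate the locks-versus-channels point, here is the same tally written both ways. The function names are made up for the example; which style is simpler in practice depends on the problem.

```go
package main

import (
	"fmt"
	"sync"
)

// countWithMutex guards a shared counter with a lock; every access
// to total must remember to take it.
func countWithMutex(workers, perWorker int) int {
	var mu sync.Mutex
	var wg sync.WaitGroup
	total := 0
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				mu.Lock()
				total++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return total
}

// countWithChannel lets only the calling goroutine touch total;
// workers communicate increments over a channel instead.
func countWithChannel(workers, perWorker int) int {
	incs := make(chan int)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perWorker; j++ {
				incs <- 1
			}
		}()
	}
	go func() { wg.Wait(); close(incs) }()

	total := 0
	for n := range incs {
		total += n
	}
	return total
}

func main() {
	fmt.Println(countWithMutex(100, 100), countWithChannel(100, 100))
}
```

In the channel version no state is shared between goroutines, so there is no lock to forget.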
Does this apply to me?
• Increasingly, yes
• More cores, more cloud, all the time
• SOA, “microservices”
• Do you have code that calls multiple independent services serially?
• Why?
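As a sketch of the alternative to serial calls, the code below issues several independent calls concurrently, so total latency is close to that of the slowest call rather than the sum of all of them. The call function simulates an RPC with a sleep; the service names and latency are placeholders.

```go
package main

import (
	"fmt"
	"time"
)

// call is a hypothetical stand-in for an RPC that takes latency to answer.
func call(name string, latency time.Duration) string {
	time.Sleep(latency)
	return name + ": ok"
}

// fetchAll calls every service concurrently and returns results in input order.
func fetchAll(names []string, latency time.Duration) []string {
	results := make([]string, len(names))
	done := make(chan struct{})
	for i, n := range names {
		go func(i int, n string) {
			results[i] = call(n, latency) // each goroutine writes only its own slot
			done <- struct{}{}
		}(i, n)
	}
	for range names {
		<-done // wait for every call to finish
	}
	return results
}

func main() {
	start := time.Now()
	res := fetchAll([]string{"s3", "dynamodb", "sqs"}, 50*time.Millisecond)
	fmt.Println(res, "in about", time.Since(start).Round(10*time.Millisecond))
}
```

Called serially, three 50 ms calls would take about 150 ms; run concurrently they finish in roughly the time of one.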
Thank you!
• Hope this was useful and interesting!
• Win a BB-8 droid at our booth #131!
• Next drawing is at 3:15PM today
• AppNeta is hiring! Engineering roles in Providence, Boston, Vancouver
• http://www.appneta.com/about/careers/
Backup slides (if potential Q&A question comes up)
Amazon Kinesis + S3 + DynamoDB: provisioned throughput per shard + PUT units
RabbitMQ + Amazon S3 + DynamoDB: self-managed queue instances