![Page 1: Chipy Dan Griffin](https://reader036.vdocuments.mx/reader036/viewer/2022062519/56815168550346895dbf987e/html5/thumbnails/1.jpg)
OpDemand.com
Concurrency in Python (and other languages)
Chipy
Dan Griffin
![Page 2: Chipy Dan Griffin](https://reader036.vdocuments.mx/reader036/viewer/2022062519/56815168550346895dbf987e/html5/thumbnails/2.jpg)
Why Am I Here?
1. Concurrency is actually really simple
2. Python has support for just about everything
3. See why other languages specialized
4. Realize they are mostly the same as Python
5. Share how OpDemand has solved problems
6. General tips on writing code that "scales"
![Page 3: Chipy Dan Griffin](https://reader036.vdocuments.mx/reader036/viewer/2022062519/56815168550346895dbf987e/html5/thumbnails/3.jpg)
OpDemand
1-click cloud deploys
Dynamic configuration
Automatic and customizable app monitoring
Real time log feedback
Complete audit trail
Easy collaboration with other users
EC2, Heroku and soon OpenStack!
Simple Cloud Management
![Page 4: Chipy Dan Griffin](https://reader036.vdocuments.mx/reader036/viewer/2022062519/56815168550346895dbf987e/html5/thumbnails/4.jpg)
Two Reasons for Concurrency
1. I want to use the time that I spend waiting for IO or other events. Problems that are IO bound.
2. I want to do a lot of work as fast as I can. Problems that are CPU bound and can be parallelized.
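A minimal sketch of the first case, using only the standard library. The `fake_io_call` function is a hypothetical stand-in for a real network or disk call; the point is that threads let the waits overlap. CPU-bound work would instead be split across processes.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_call(request_id, delay=0.2):
    """Hypothetical stand-in for a network or disk call: the thread just waits."""
    time.sleep(delay)
    return request_id


request_ids = [1, 2, 3, 4]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(request_ids)) as pool:
    # map() preserves input order even though the waits overlap.
    results = list(pool.map(fake_io_call, request_ids))
elapsed = time.perf_counter() - start

print(results, "in %.2fs" % elapsed)
```

Run one after another, the four calls would need about 0.8 seconds; with one thread per call the waits overlap.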
![Page 5: Chipy Dan Griffin](https://reader036.vdocuments.mx/reader036/viewer/2022062519/56815168550346895dbf987e/html5/thumbnails/5.jpg)
Event loops (Twisted, Asyncore, etc.)
Every significant bit of work slows everything down
Still constrained to 1 process
Library compatibility is terrible
Callback hell:
    d.addCallback(lambda _: self)
Inline deferreds are better:
    a = yield db.find(id)
Let the Operating System tell you when you have work to do. Usually based on select, poll, kqueue.
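One turn of such a loop can be sketched with the standard library's `selectors` module, which picks epoll, kqueue, poll or select depending on the platform. This is not Twisted's reactor; the `on_readable` callback and the socket pair are illustrative assumptions.

```python
import selectors
import socket

# DefaultSelector chooses the best mechanism the OS offers.
selector = selectors.DefaultSelector()


def on_readable(sock):
    """Callback run once the OS reports that sock has data waiting."""
    return sock.recv(1024)


# A connected socket pair stands in for a real client connection.
client, server = socket.socketpair()
server.setblocking(False)
selector.register(server, selectors.EVENT_READ, data=on_readable)

client.sendall(b"ping")

received = []
# One turn of the loop: block until the OS says something is ready.
for key, _events in selector.select(timeout=1.0):
    callback = key.data
    received.append(callback(key.fileobj))

selector.unregister(server)
selector.close()
client.close()
server.close()
```

A real event loop repeats the `select()` call forever, which is why any slow callback delays every other connection.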
![Page 6: Chipy Dan Griffin](https://reader036.vdocuments.mx/reader036/viewer/2022062519/56815168550346895dbf987e/html5/thumbnails/6.jpg)
deferToThread
CouchDB-Python
    d = threads.deferToThread(template_model.assemble, serv)
Use blocking libraries in Twisted by deferring them to threads
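The same idea can be sketched with the standard library's asyncio instead of Twisted: `loop.run_in_executor` plays the role of `deferToThread`. The `blocking_assemble` function is a hypothetical stand-in for a blocking library call.

```python
import asyncio
import time


def blocking_assemble(name):
    """Hypothetical stand-in for a blocking library call."""
    time.sleep(0.1)
    return "assembled %s" % name


async def main():
    loop = asyncio.get_running_loop()
    # Hand the blocking call to the default thread pool; the loop stays free.
    future = loop.run_in_executor(None, blocking_assemble, "serv")
    ticks = 0
    while not future.done():
        ticks += 1  # the event loop keeps doing other work meanwhile
        await asyncio.sleep(0.01)
    return await future, ticks


result, ticks = asyncio.run(main())
print(result, "after", ticks, "loop ticks")
```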
![Page 7: Chipy Dan Griffin](https://reader036.vdocuments.mx/reader036/viewer/2022062519/56815168550346895dbf987e/html5/thumbnails/7.jpg)
Processes
The root of “real” concurrency for Python systems
Process per core + 1 to distribute work and collect results
Fork - create a copy of current process and continue execution
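A minimal sketch of the fork model, assuming a Unix-like system where `os.fork` is available. The helper `fork_map` and the `sum_squares` workload are hypothetical names for illustration: the parent forks one child per chunk of work (one chunk per core here) and collects each result over a pipe.

```python
import os
import pickle


def fork_map(func, chunks):
    """Run func(chunk) in one forked child per chunk and collect the results."""
    children = []
    for chunk in chunks:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: a copy of the parent that continues from this point.
            try:
                os.close(read_fd)
                with os.fdopen(write_fd, "wb") as pipe:
                    pickle.dump(func(chunk), pipe)
            finally:
                os._exit(0)  # never fall through into the parent's code
        # Parent: keep the read end and move on to the next chunk.
        os.close(write_fd)
        children.append((pid, read_fd))

    results = []
    for pid, read_fd in children:
        with os.fdopen(read_fd, "rb") as pipe:
            results.append(pickle.load(pipe))
        os.waitpid(pid, 0)  # reap the child
    return results


def sum_squares(numbers):
    return sum(n * n for n in numbers)


workers = os.cpu_count() or 1
data = list(range(1000))
chunks = [data[i::workers] for i in range(workers)]
total = sum(fork_map(sum_squares, chunks))
print(total)
```

In practice the `multiprocessing` module wraps this pattern and handles errors and non-Unix platforms.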
![Page 8: Chipy Dan Griffin](https://reader036.vdocuments.mx/reader036/viewer/2022062519/56815168550346895dbf987e/html5/thumbnails/8.jpg)
The Celery Project
Parent process forks n workers
Relies on RabbitMQ and multiprocessing to handle concurrency
Celery is a perfect example
![Page 9: Chipy Dan Griffin](https://reader036.vdocuments.mx/reader036/viewer/2022062519/56815168550346895dbf987e/html5/thumbnails/9.jpg)
Threads
Shared memory
Mutation with locks (hopefully)
Everyone knows about the GIL
Still useful in Python
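A small sketch of mutating shared memory under a lock. The `Counter` class is a hypothetical example; the GIL does not make `value += 1` atomic, so the lock is still needed.

```python
import threading


class Counter:
    """Shared state guarded by a lock (illustrative example)."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Read-modify-write is not atomic, even with the GIL.
        with self._lock:
            self.value += 1


counter = Counter()


def work(n):
    for _ in range(n):
        counter.increment()


threads = [threading.Thread(target=work, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value)
```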
![Page 10: Chipy Dan Griffin](https://reader036.vdocuments.mx/reader036/viewer/2022062519/56815168550346895dbf987e/html5/thumbnails/10.jpg)
A Quick Clojure Detour
Software Transactional Memory - SQL-like transactions for modifying data from different threads
Embracing mutation of shared data
Everything is based on threads; you can dosync, send, promise and deliver
Mostly immutable BUT you can change refs with ref-set inside transactions
![Page 11: Chipy Dan Griffin](https://reader036.vdocuments.mx/reader036/viewer/2022062519/56815168550346895dbf987e/html5/thumbnails/11.jpg)
Why Does Erlang Exist?
Wraps all the concepts into one heavy-duty package
Pins schedulers to different cores
Uses thread pools
Has transparent inter-process/server communication
Makes use of OS event loops
You would never want to write many common tasks in it
![Page 12: Chipy Dan Griffin](https://reader036.vdocuments.mx/reader036/viewer/2022062519/56815168550346895dbf987e/html5/thumbnails/12.jpg)
How OpDemand Works
[Architecture diagram. Component labels: Client, Node Proxy, Twisted (shown three times), RabbitMQ, Celery Monitor]
![Page 13: Chipy Dan Griffin](https://reader036.vdocuments.mx/reader036/viewer/2022062519/56815168550346895dbf987e/html5/thumbnails/13.jpg)
What does Node do?
Reference SocketIO Implementation
Take service updates and log output from ZMQ and re-publish over SocketIO
Serve static content
Round robin HTTP requests between reactors
Replace with Python or Nginx soon hopefully
![Page 14: Chipy Dan Griffin](https://reader036.vdocuments.mx/reader036/viewer/2022062519/56815168550346895dbf987e/html5/thumbnails/14.jpg)
Explicitly Saving, Implicitly Publishing
    d = defer.Deferred()
    d.addCallback(self.transition_state, core_fsm.DEPLOYING)
    d.addCallback(self._set_status_detail, 'deploy in progress')
    d.addCallback(self._save_obj, **kwargs)
    d.addCallback(self._start_interval, context, 'deploy')
    d.addCallback(self._deploy, context, **kwargs)
    d.addCallback(self._set_time, 'deploy')
    d.addCallback(self._set_interval, context, 'deploy')
    d.addCallback(self.transition_state, core_fsm.ACTIVE)
    d.addCallback(self._set_status_detail, 'deploy operation successful')
    d.addCallback(self._save_obj, **kwargs)
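To show how such a chain threads one object through every step, here is a toy, synchronous stand-in for a Deferred. `MiniDeferred` and the three helper functions are hypothetical and much simpler than Twisted (no errbacks, no nested deferreds); they only illustrate that each callback receives the previous result as its first argument and that saves happen where the chain says so.

```python
class MiniDeferred:
    """Toy, synchronous stand-in for a Deferred (hypothetical; no errbacks)."""

    def __init__(self):
        self._callbacks = []

    def addCallback(self, func, *args, **kwargs):
        self._callbacks.append((func, args, kwargs))
        return self

    def callback(self, result):
        # Each callback gets the previous result as its first argument.
        for func, args, kwargs in self._callbacks:
            result = func(result, *args, **kwargs)
        return result


saved = []  # stands in for the database


def transition_state(obj, state):
    return dict(obj, state=state)


def set_status_detail(obj, detail):
    return dict(obj, status_detail=detail)


def save_obj(obj):
    saved.append(dict(obj))  # explicit save point in the chain
    return obj


d = MiniDeferred()
d.addCallback(transition_state, "DEPLOYING")
d.addCallback(set_status_detail, "deploy in progress")
d.addCallback(save_obj)
d.addCallback(transition_state, "ACTIVE")
d.addCallback(set_status_detail, "deploy operation successful")
d.addCallback(save_obj)
final = d.callback({"_id": "service-1"})
```

Two snapshots end up in `saved`: one while deploying and one after the operation succeeds.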
![Page 15: Chipy Dan Griffin](https://reader036.vdocuments.mx/reader036/viewer/2022062519/56815168550346895dbf987e/html5/thumbnails/15.jpg)
Real-time Publishing
    def save_obj(self, this, ctx, **kwargs):
        # here is where we save to couch
        saved_obj = self.db.save(this)
        if ctx and "service" in ctx:
            if settings.ZMQ_PUBLISHER:
                tag = 'service-%s' % ctx["service"]["_id"]
                settings.ZMQ_PUBLISHER.publish(
                    view.to_json(saved_obj), tag=str(tag))
Publish documents over ZMQ when they are saved
![Page 16: Chipy Dan Griffin](https://reader036.vdocuments.mx/reader036/viewer/2022062519/56815168550346895dbf987e/html5/thumbnails/16.jpg)
Wrapping Celery in Twisted
A "polling" deferred using twisted.internet.task
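    from twisted.internet.task import cooperate

    def _do_poll():
        # cooperate() needs an iterator, so yield between checks; finishing
        # the loop ends the generator (raising StopIteration inside a
        # generator is a RuntimeError since Python 3.7)
        while not celery_task.ready():
            yield

    task = cooperate(_do_poll())
    return task.whenDone()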
Essentially launch Celery tasks and poll for completion
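The launch-and-poll idea can be sketched without Twisted or Celery, using asyncio from the standard library. `FakeTask` is a hypothetical stand-in that only mimics the `ready()` and `get()` methods of a Celery result, and `wait_for_task` is an assumed helper name.

```python
import asyncio
import threading
import time


class FakeTask:
    """Hypothetical stand-in with the ready()/get() shape of a Celery result."""

    def __init__(self, duration, value):
        self._done = threading.Event()
        self._value = value
        threading.Thread(target=self._run, args=(duration,), daemon=True).start()

    def _run(self, duration):
        time.sleep(duration)  # pretend a worker is busy
        self._done.set()

    def ready(self):
        return self._done.is_set()

    def get(self):
        return self._value


async def wait_for_task(task, interval=0.01):
    """Poll task.ready() without blocking the event loop."""
    polls = 0
    while not task.ready():
        polls += 1
        await asyncio.sleep(interval)  # let other work run between checks
    return task.get(), polls


value, polls = asyncio.run(wait_for_task(FakeTask(0.2, "refreshed")))
print(value, "after", polls, "polls")
```

Sleeping between checks keeps the polling cheap; the interval trades latency against wasted wakeups.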
![Page 17: Chipy Dan Griffin](https://reader036.vdocuments.mx/reader036/viewer/2022062519/56815168550346895dbf987e/html5/thumbnails/17.jpg)
A Common Interface for Celery Tasks
    # Celery Task Definition
    @aws_celery.task
    def refresh(comp, config, creds):
        doctype = comp.get("doctype")
        if doctype == "server":
            i = Instance()
            return i.refresh(comp, config)
Celery transforms a component and its configuration
![Page 18: Chipy Dan Griffin](https://reader036.vdocuments.mx/reader036/viewer/2022062519/56815168550346895dbf987e/html5/thumbnails/18.jpg)
Returning the Finished Product
    # AWS Instance Code
    def refresh(self, comp, config, **kwargs):
        boto = self.get_boto(comp, config)
        comp, config = self.sync(comp, config, boto)
        return comp, config
The Provider code returns the new Comp and Config
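One way to picture the common interface is a dispatch table from doctype to provider class. This is a sketch under assumptions, not the OpDemand code: the `PROVIDERS` mapping is invented for illustration and the `Instance.refresh` body fakes the sync step. Each provider takes a component and its config and returns new versions of both.

```python
class Instance:
    """Hypothetical provider; the real one would talk to the cloud API."""

    def refresh(self, comp, config, **kwargs):
        # Return new objects instead of mutating the inputs.
        comp = dict(comp, state="running")
        config = dict(config, refreshed=True)
        return comp, config


# doctype -> provider class (assumed mapping for illustration)
PROVIDERS = {"server": Instance}


def refresh(comp, config, creds=None):
    doctype = comp.get("doctype")
    if doctype not in PROVIDERS:
        raise ValueError("no provider for doctype %r" % doctype)
    provider = PROVIDERS[doctype]()
    return provider.refresh(comp, config)


old_comp = {"doctype": "server"}
old_config = {}
new_comp, new_config = refresh(old_comp, old_config)
```

Because every task has the same signature and return shape, the caller can save and publish the result without knowing which provider ran.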
![Page 19: Chipy Dan Griffin](https://reader036.vdocuments.mx/reader036/viewer/2022062519/56815168550346895dbf987e/html5/thumbnails/19.jpg)
Why bother with Celery?
Code from the first AWS provider using Twisted
    # this is one path through this
    d = threads.deferToThread(self.conn.get_all_images, [dc['image_id']])
    d.addErrback(self._handle_error)
    d.addCallback(self.__get_image)
    d.addCallback(self.__create_reservation,
                  self.__prepare_kwargs(context, kwargs, resolved))
    d.addCallback(self.__construct_instances, context, resolved)
    d.addCallback(self.__sync_instances, context)
    d.addCallback(self._save_obj, **kwargs)
    d.addCallback(self._poll_state, context, 'running', **kwargs)
    if 'elastic_ip' in dc and dc['elastic_ip'] is not None:
        d.addCallback(self.__associate_address, context)
        d.addCallback(self._save_obj, **kwargs)
        d.addCallback(self.__poll_address, context, **kwargs)
        d.addCallback(self._save_obj, **kwargs)
    if not context.config.get("server/instance_id"):
        d.addCallback(self._poll_signal, context, 22, **kwargs)
    # transition the server to built state so it gets destroyed
    # I cut like 20 more lines of code
![Page 20: Chipy Dan Griffin](https://reader036.vdocuments.mx/reader036/viewer/2022062519/56815168550346895dbf987e/html5/thumbnails/20.jpg)
Using Celery
Much better
    image_id = self._get_image_id(config)
    images = conn.get_all_images([image_id])
    if len(images) != 1:
        raise LookupError('Could not find AMI: %s' % image_id)
    image = images[0]
    kwargs = self._prepare_run_kwargs(config)
    reservation = image.run(**kwargs)
    instances = reservation.instances
    boto = instances[0]
    config['ec2-instance/id'] = boto.id
    config['ec2-instance/region_name'] = boto.region.name
    config['ec2-instance/zone_name'] = boto._placement.zone
    return comp, config
![Page 21: Chipy Dan Griffin](https://reader036.vdocuments.mx/reader036/viewer/2022062519/56815168550346895dbf987e/html5/thumbnails/21.jpg)
Using Pika
    mq.create_async_subscriber("c2-service", "service", handle_service_updates)

    def create_async_subscriber(exchange, queue, callback, amqtype="topic"):
        tw = TwistedHandler(exchange, queue, callback, amqtype=amqtype)
        connection = TwistedConnection(
            pika.ConnectionParameters(
                host=settings.RABBITMQ_HOST,
                port=settings.RABBITMQ_PORT,
                virtual_host=settings.RABBITMQ_VHOST),
            tw.on_connected)
        return tw
Modified from Pika repository (maybe HEAD works now?)
Subscribe with a Twisted handler
![Page 22: Chipy Dan Griffin](https://reader036.vdocuments.mx/reader036/viewer/2022062519/56815168550346895dbf987e/html5/thumbnails/22.jpg)