taming pythons with zookeeper (pyconfi edition)

Post on 24-Apr-2015

Taming Pythons with ZooKeeper

PyCon Finland 2012

@nailor

Content Squad

• Consume boatloads of XML

• Build database out of it

• Transcode audio

• Enrich data

• Build indexes

• Ship indexes

Day to day operations

Shipping

Pre-defined order

• Central orchestrator

• Assume all machines are running

• Run remote commands

• We need to know all the quirks

• Fail fast

The naïve solution

Guess what? It breaks

...and replacing isn't simple

• No central orchestrator

• Assume some machines are always down

• Run locally

• Shift responsibility to system owners

• Fail gracefully

New tool is needed

CAP Theorem

The CAP Theorem

1. Consistency

2. Availability

3. Partition tolerance

Pick two. Any two will do.

We went for CA

• Users are mostly contained in single DC

• Inside a single DC connections are quite robust

• Remember the order? We need consistency

• Availability is everything

Why?

How?

Apache ZooKeeper

• A distributed tree-like data structure

• Simple primitives

• Automatic leader elections

• Guaranteed hard consistency

• Ephemeral nodes

What?

Tree-like structure

[Diagram: a tree of nodes rooted at /, showing the paths /dir/, /subdir/ and /dir2/]
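To make the tree idea concrete, here is a minimal in-memory sketch. It is a toy model, not a ZooKeeper client: the class name `ZNodeTree` and its methods are made up for illustration, and placing `subdir` under `/dir` is an assumption about the diagram. The one rule it tries to capture is that a node can only be created under a parent that already exists.

```python
class ZNodeTree:
    """Toy in-memory model of a ZooKeeper-style tree (hypothetical, not a client)."""

    def __init__(self):
        # The root node "/" always exists.
        self._data = {"/": b""}

    @staticmethod
    def _normalize(path):
        # Treat "/dir/" and "/dir" as the same node.
        return path.rstrip("/") or "/"

    @staticmethod
    def _parent(path):
        head = path.rsplit("/", 1)[0]
        return head or "/"

    def create(self, path, value=b""):
        path = self._normalize(path)
        if path in self._data:
            raise KeyError("node already exists: " + path)
        if self._parent(path) not in self._data:
            raise KeyError("parent does not exist for: " + path)
        self._data[path] = value

    def get(self, path):
        return self._data[self._normalize(path)]

    def get_children(self, path):
        path = self._normalize(path)
        # Children are the nodes whose direct parent is `path`.
        return sorted(
            p.rsplit("/", 1)[1]
            for p in self._data
            if p != "/" and self._parent(p) == path
        )
```

Every node can hold a small payload and have children at the same time, which is what separates this from a plain filesystem.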

Simple primitives

● Guaranteed atomic operations

● Counters

● Change notifications
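A rough sketch of how these three primitives fit together, again as a single-process toy model rather than real ZooKeeper code. The names `VersionedNode` and `increment` are hypothetical. The assumptions are that writes are conditional on a version number (compare-and-set), that a counter is a retry loop on top of that, and that a change notification fires once and must then be re-registered.

```python
class VersionedNode:
    """Toy node with a version number and one-shot watchers (hypothetical)."""

    def __init__(self, value=0):
        self.value = value
        self.version = 0
        self._watchers = []

    def watch(self, callback):
        # Register a callback for the next change only.
        self._watchers.append(callback)

    def set(self, value, expected_version):
        # Atomic compare-and-set: refuse the write if someone else got there first.
        if expected_version != self.version:
            return False
        self.value = value
        self.version += 1
        # Fire and clear the watchers: each one is notified at most once.
        watchers, self._watchers = self._watchers, []
        for callback in watchers:
            callback(value)
        return True


def increment(node):
    """Counter built on compare-and-set: read, try to write, retry on conflict."""
    while True:
        value, version = node.value, node.version
        if node.set(value + 1, version):
            return value + 1
```

The point of the version check is that two clients incrementing at the same time cannot silently overwrite each other. The loser of the race sees a failed write and retries with fresh data.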

Automatic leader election

● Nodes know who is the most up to date

● If no leader can be picked, ZooKeeper refuses to work
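The two bullets can be sketched as one small function. This is a heavily simplified model, not ZooKeeper's actual election protocol: `elect_leader` is a hypothetical name, each server is reduced to an id plus the id of the last transaction it has seen, and ties are broken by the higher server id as an assumption.

```python
def elect_leader(ensemble_size, reachable):
    """Pick a leader among reachable servers, or None if there is no majority.

    `reachable` maps server id -> id of the last transaction that server has seen.
    """
    quorum = ensemble_size // 2 + 1
    if len(reachable) < quorum:
        # Too few servers can talk to each other: refuse to work.
        return None
    # The most up-to-date server wins; ties go to the higher server id.
    return max(reachable, key=lambda server_id: (reachable[server_id], server_id))


# Five-server ensemble, three servers reachable: a leader can be picked.
print(elect_leader(5, {1: 10, 2: 12, 3: 12}))  # 3
# Only two of five reachable: no majority, so no leader.
print(elect_leader(5, {1: 10, 2: 12}))  # None
```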

Guaranteed hard consistency

● Every change is sent to every node!

● Quorum for all operations is always required
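The quorum requirement comes down to simple arithmetic. Assuming a quorum means a strict majority of the ensemble, the sketch below shows how many servers must agree and how many can be lost. The function names are made up for illustration.

```python
def quorum_size(ensemble_size):
    """Smallest strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can be down while a quorum is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
# 3 2 1
# 4 3 1
# 5 3 2
```

Going from three to four servers buys no extra fault tolerance, which is why ensembles are usually sized with an odd number of servers.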

Ephemeral nodes

● Node is present only if the client is alive
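As a toy illustration of the ephemeral idea, the sketch below ties each node to the session that created it and drops the node when that session ends. `EphemeralRegistry`, the session ids and the `/workers/...` paths are all hypothetical, and nothing here talks to a real server.

```python
class EphemeralRegistry:
    """Toy model: nodes live only as long as the session that created them."""

    def __init__(self):
        self._owner = {}  # path -> id of the owning session

    def create_ephemeral(self, session_id, path):
        if path in self._owner:
            raise KeyError("node already exists: " + path)
        self._owner[path] = session_id

    def exists(self, path):
        return path in self._owner

    def expire_session(self, session_id):
        # When a session ends, every node it owned disappears with it.
        removed = sorted(p for p, s in self._owner.items() if s == session_id)
        for path in removed:
            del self._owner[path]
        return removed
```

In a real deployment the server notices a dead client only after its session times out, so the node lingers for a short while after a crash.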

Library: zkPython

The Good:

● Thin

● Comes with ZooKeeper

● Maintained by the Apache ZooKeeper project

The bad:

● Thin

● C bindings only, no PyPy for you

The ugly:

● No documentation :(

There are others: Kazoo

The Good:

● Pure Python

● Recipes implemented

● Used by many (Quora, Mozilla, reddit, Zope)

The bad:

● Not many recipes done

● Not owned by the mainline

The ugly:

● Own implementation of the protocol
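For a feel of what Kazoo code looks like, here is a small sketch of worker registration using ephemeral nodes. The `/workers` path, the worker names and the helper functions are assumptions made for this example, not something from the talk. The helpers take any client object that offers Kazoo's `ensure_path`, `create` and `get_children` calls, and `demo` shows how a real `KazooClient` might be wired in. `demo` needs the `kazoo` package and a reachable ZooKeeper server, so it is defined but never called here.

```python
def register_worker(zk, name, base="/workers"):
    """Announce a worker by creating an ephemeral node under `base`."""
    zk.ensure_path(base)
    # The node vanishes on its own when this client's session ends.
    return zk.create(base + "/" + name, b"", ephemeral=True)


def live_workers(zk, base="/workers"):
    """List the workers that currently hold a node under `base`."""
    return sorted(zk.get_children(base))


def demo(hosts="127.0.0.1:2181"):
    """Not called here: requires kazoo and a running ZooKeeper (assumed address)."""
    from kazoo.client import KazooClient

    zk = KazooClient(hosts=hosts)
    zk.start()
    try:
        register_worker(zk, "worker-1")
        print(live_workers(zk))
    finally:
        zk.stop()
```

Keeping the client as a parameter makes the helpers easy to exercise with a stand-in object in tests, without a server.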

Dos and Don'ts

Don't ship large chunks

Monitor the ZooKeeper

Don't write there all the time

Stay in one DC
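Two of these rules can be sketched in a few lines. The size guard assumes ZooKeeper's default `jute.maxbuffer` setting of 0xfffff bytes (just under 1 MB) as the hard ceiling; in practice payloads should stay far smaller. The health check assumes the server accepts the `ruok` four-letter command, which newer ZooKeeper releases only allow when it is whitelisted in the server config. The function names and the default address are made up for this example.

```python
import socket

# Default jute.maxbuffer in ZooKeeper: just under 1 MB.
DEFAULT_MAX_ZNODE_BYTES = 0xFFFFF


def check_payload(data, limit=DEFAULT_MAX_ZNODE_BYTES):
    """Refuse to ship a payload that is too large for a single node."""
    if len(data) > limit:
        raise ValueError(
            "payload is %d bytes, limit is %d" % (len(data), limit)
        )
    return data


def is_ok(host="127.0.0.1", port=2181, timeout=2.0):
    """Send 'ruok' to a server and report whether it answers 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = sock.recv(16)
    except OSError:
        # Connection refused, timed out, or otherwise unreachable.
        return False
    return reply == b"imok"
```

A check like `is_ok` only says the process is answering. It says nothing about whether the ensemble currently has a quorum, so it is a starting point for monitoring, not the whole of it.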

Spotify & ZooKeeper

Summary time!

Concurrency == hard

Distributed consistency == hard

No partitions? Go ZooKeeper!

Pick your weapon library

Remember tradeoffs

Thank you
