streaming way to webscale: how we scale bitly via streaming

Streaming Your Way to Web Scale: Scaling Bitly via Stream-Based Processing. All Things Open, October 23, 2014, 4:15pm. 85 slides, 23 images, 21 diagrams.

Upload: all-things-open

Post on 14-Jul-2015


TRANSCRIPT

Page 1

Streaming Your Way to Web Scale: Scaling Bitly via Stream-Based Processing
All Things Open
October 23, 2014, 4:15pm

Page 2

Peter Herndon
[email protected]
@tpherndon

backend developer for Bitly

Page 3

Scaling Bitly via Stream-Based Processing

Page 4

or

Page 5

Streaming Your Way to Web Scale

Page 6
Page 7

http://www.mongodb-is-web-scale.com/
(NSFW due to language)

Page 8

Defining streaming: a public API surface puts events into message queues, and web services consume those queues.

Page 9

Services need to be asynchronous.

Page 10

We use Tornado

Page 11

and Go.

Page 12

We had an older queue, but replaced it with a newer one,

Page 13

written in Go.

Page 14

Command vs. event messages

Page 15

An event message is a notification that something happened.

Page 16

That allows interested listeners to respond as they require.
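
To make the command vs. event distinction concrete, here is a minimal in-process sketch. Everything in it is illustrative: the message fields, the `subscribe`/`publish` helpers and the listener registry are hypothetical stand-ins for a real message queue, not Bitly's code.

```python
from collections import defaultdict

# A command names a recipient and tells it what to do.
command = {"to": "email_service", "do": "send_spam_report", "link": "bit.ly/abc"}

# An event only announces that something happened; it names no recipient.
event = {"event": "link_flagged", "link": "bit.ly/abc"}

# Hypothetical in-process registry standing in for a message queue.
listeners = defaultdict(list)


def subscribe(event_name, handler):
    """Register a handler that wants to hear about event_name."""
    listeners[event_name].append(handler)


def publish(message):
    """Notify every listener of the event and collect their responses."""
    return [handler(message) for handler in listeners[message["event"]]]


# Two independent listeners respond to the same event in their own way.
subscribe("link_flagged", lambda m: "counted " + m["link"])
subscribe("link_flagged", lambda m: "archived " + m["link"])

responses = publish(event)
print(responses)
```

The publisher never needs to know who is listening, so a new listener can be added without touching the code that emits the event.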

Page 17

Lambda (λ) architecture: batch + stream processing

Page 18

Part Two: NSQ

Page 19

NSQ

http://nsq.io

Page 20

Look, Ma, real code!

https://github.com/bitly/nsq
https://github.com/bitly/go-nsq
https://github.com/bitly/pynsq
https://github.com/jehiah/nsqauth-contrib

Page 21

Parts is parts! (part 1)

Servers
• nsqd
• nsqlookupd
• nsqadmin
• pynsqauthd

Page 22

Parts is parts! (part 2)

Clients
• go-nsq
• pynsq

Page 23

Parts is parts! (part 3)

Utilities
• nsq_tail
• nsq_to_file
• nsq_to_nsq
• nsq_stat

Page 24

The two basic building blocks of distributed NSQ:

Page 25

nsqd & nsqlookupd

To put them into context, look at the evolution of a website.

Page 26

Basic Web App

[Diagram: Web App, Database]

A basic web app. In Python: Django + Postgres, Flask + Postgres, or Tornado + Postgres.

Page 27

Scaling the Mountain (of Load)

[Diagram: three Web Apps, one Database]

First bottleneck: the web layer.

Page 28

Cache Rules Everything Around Me

[Diagram: three Web Apps, Cache, Database]

With the web layer bottleneck removed, the next bottleneck is the database, so add a caching layer.

Page 29

You Want Me to Replicate You

[Diagram: three Web Apps, Cache, two Databases]

That works for a while, but database requests still take too long, so replicate

Page 30

Shards Here, Shards There

[Diagram: three Web Apps, Cache, six Databases]

…and then shard.

Page 31

It’s Off to Work I Go!

[Diagram: Web App, Cache, Database, Queue, Worker]

But individual requests still take too long because each one does too much work, so add a message queue and a worker. In Python: Celery.

Page 32

Message From a Bottle(.py)

[Diagram: Web App, Cache, Database, Queue, Worker]

The web app sends messages to the queue.

Page 33

Working the Event

[Diagram: Web App, Cache, Database, Queue, Worker]

The worker pulls a message off the queue and processes it.

Page 34

Write Here, Write Now

[Diagram: Web App, Cache, Database, Queue, Worker]

The worker writes results to the database, file system, etc.
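
The queue-and-worker pattern from the last four slides can be sketched with nothing but the Python standard library. This is only an in-process illustration: a `queue.Queue` stands in for the message queue, a list stands in for the database, and `handle_request` and `worker` are hypothetical names.

```python
import queue
import threading

work_queue = queue.Queue()
results = []  # stands in for the database or file system


def handle_request(url):
    """Web app: enqueue the slow work and return right away."""
    work_queue.put({"url": url})
    return "accepted"


def worker():
    """Worker: pull messages off the queue, process them, write results."""
    while True:
        message = work_queue.get()
        if message is None:  # sentinel value: shut down
            work_queue.task_done()
            break
        results.append(message["url"].upper())  # placeholder for real work
        work_queue.task_done()


thread = threading.Thread(target=worker)
thread.start()

status = handle_request("http://example.com/a")
handle_request("http://example.com/b")

work_queue.put(None)  # ask the worker to stop once the queue is drained
work_queue.join()
thread.join()
print(status, results)
```

The request handler returns as soon as the message is enqueued, so the time a request takes no longer depends on how long the work itself takes.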

Page 35

Write Here, Write Now (redux)

[Diagram: Web App, Database, Worker, Queue]

Instead, imagine the worker writes results

Page 36

Sending Out an SMS

[Diagram: Web App, Database, Worker, Queue]

and the web app writes event messages to a queue local to the web service,

Page 37

Listen, listen, LISTEN

[Diagram: Web App, Database, Worker, two Queues]

but the worker is listening to a queue running on another server.

Page 38

Workin’ On a Chain(ed) Gang

[Diagram: three services, each with its own Web App, Database, Worker and Queue]

Page 39

Look it up!

[Diagram: Web App, Database, Worker, two Queues]

The worker finds the queue that carries its topic via nsqlookupd.

Page 40

How do I find things?

/<service>/queuereader_<service>.py

    import tornado.options
    # settings, logatron_client, Reader and run come from modules not shown on the slide.

    if __name__ == "__main__":
        tornado.options.parse_command_line()
        logatron_client.setup()

        Reader(
            topic=settings.get('nsqd_output_topic'),
            channel='queuereader_spam_metrics',
            validate_method=validate_message,
            message_handler=count_spam_actions,
            # nsqlookupd addresses: how the reader finds the nsqd instances that carry the topic
            lookupd_http_addresses=settings.get('nsq_lookupd')
        )
        run()

Page 41

Sending Out an SMS

[Diagram: Web App, Database, Worker, Queue with topic 'spam_api']

The first time the app writes to a TOPIC in the local nsqd,

Page 42

Sending Out an SMS

[Diagram: Web App, Database, Worker, Queue with topic 'spam_api', nsqlookupd]

nsqd creates the topic and registers it with nsqlookupd.
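
As a rough illustration of what "writing to a topic in the local nsqd" can look like, the sketch below builds an HTTP publish request against nsqd's `/pub` endpoint. The address (nsqd's default HTTP port, 4151, on localhost), the helper name `build_publish_request` and the payload values are assumptions for the example; the slides do not show how Bitly's services publish.

```python
import json
import urllib.parse
import urllib.request

NSQD_HTTP = "http://127.0.0.1:4151"  # assumed address of the local nsqd


def build_publish_request(topic, payload, nsqd_http=NSQD_HTTP):
    """Build (but do not send) a POST to nsqd's /pub endpoint."""
    url = nsqd_http + "/pub?" + urllib.parse.urlencode({"topic": topic})
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


# The 'o' and 'l' field names appear on later slides; the values are made up.
request = build_publish_request("spam_api", {"o": "+", "l": "example"})
print(request.full_url)

# To actually publish against a running nsqd:
# urllib.request.urlopen(request, timeout=2).read()
```

Because the topic is created on first write, the publisher needs no setup step beyond knowing the topic name.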

Page 43

Where Am I Again?

[Diagram: Web App, Database, Worker; one nsqd with topic 'spam_counter'; one nsqd with topic 'spam_api'; nsqlookupd being asked for topic 'spam_api']

A worker in another service that is looking for a topic asks nsqlookupd, which replies with the address.
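
The lookup step can be sketched as a query against nsqlookupd's HTTP `/lookup` endpoint followed by reading the producer addresses out of the JSON reply. The nsqlookupd address (its default HTTP port, 4161, on localhost), the helper names and the sample response are assumptions; the response shape has varied between NSQ releases, so the parser below accepts both the bare form and the older form wrapped in a `data` key.

```python
import json
import urllib.parse

LOOKUPD_HTTP = "http://127.0.0.1:4161"  # assumed nsqlookupd address


def lookup_url(topic, lookupd_http=LOOKUPD_HTTP):
    """URL a worker would GET to ask which nsqd instances have the topic."""
    return lookupd_http + "/lookup?" + urllib.parse.urlencode({"topic": topic})


def producer_addresses(response_text):
    """Extract (host, tcp_port) pairs from a /lookup response body."""
    data = json.loads(response_text)
    data = data.get("data", data)  # older releases wrapped the payload in "data"
    return [(p["broadcast_address"], p["tcp_port"])
            for p in data.get("producers", [])]


# Illustrative response; the host name is made up.
sample = json.dumps({
    "channels": ["spam_counter"],
    "producers": [
        {"broadcast_address": "web01", "tcp_port": 4150, "http_port": 4151},
    ],
})

print(lookup_url("spam_api"))
print(producer_addresses(sample))
```

Client libraries such as pynsq do this polling for you when given `lookupd_http_addresses`, so application code rarely needs to call the endpoint directly.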

Page 44

Talkin’ ‘Bout Something

[Diagram: Web App, Database, Worker; one nsqd with topic 'spam_counter'; one nsqd with topic 'spam_api' and channel 'spam_counter']

The queuereader connects to nsqd and registers a channel.

Page 45

Cross-Town Traffic

[Diagram: nsqd with topic 'spam_api'; channel 'spam_counter' feeding three Workers]

Messages on a channel are divided among that channel's subscribers, which allows horizontal scaling.

Page 46

Channeling the Ghost

[Diagram: nsqd with topic 'spam_api'; channel 'spam_counter' feeding three Workers; channel 'nsq_to_file' feeding one Worker]

Each channel receives a full copy of all messages.
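
The two rules above (every channel gets every message; within a channel, messages are split among subscribers) can be modeled in a few lines. This is a toy model, not NSQ itself: the `Topic` class is hypothetical, and strict round-robin is a simplification of how nsqd hands messages to whichever consumers are ready.

```python
import itertools


class Topic:
    """Toy model of an NSQ topic: each channel gets every message."""

    def __init__(self):
        self.channels = {}

    def subscribe(self, channel, worker_inbox):
        """Attach a worker's inbox (a plain list) to a channel."""
        self.channels.setdefault(channel, []).append(worker_inbox)

    def publish(self, messages):
        """Copy the messages to every channel, splitting them within each."""
        for inboxes in self.channels.values():
            # Within a channel, rotate through the subscribed workers.
            targets = itertools.cycle(inboxes)
            for message in messages:
                next(targets).append(message)


topic = Topic()
counters = [[], [], []]  # three workers sharing channel 'spam_counter'
archive = []             # one worker on channel 'nsq_to_file'

for inbox in counters:
    topic.subscribe("spam_counter", inbox)
topic.subscribe("nsq_to_file", archive)

topic.publish(list(range(6)))
print(counters, archive)
```

Adding a fourth worker to 'spam_counter' lowers the share each counter handles, while the archiving channel still sees the whole stream.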

Page 47
Page 48

nsqadmin

Page 49

nsqadmin

How we manage our message queues

Page 50

nsqadmin

Page 51

nsqadmin

Page 52
Page 53

pynsqauthd

Page 54

It’s made of PEOPLE!

https://github.com/jehiah/nsqauth-contrib

Page 55
Page 56
Page 57
Page 58

pynsq client

Page 59

It’s still made of PEOPLE!

https://github.com/bitly/pynsq

Page 60

/<service>
• settings.py
• <service>_api.py
• queuereader_<service>.py
• README.md

Queuereaders are part of the streaming architecture.

Page 61

/<service>/queuereader_<service>.py, 1 of 4

    if __name__ == "__main__":
        tornado.options.parse_command_line()
        logatron_client.setup()

        Reader(
            topic=settings.get('nsqd_output_topic'),
            channel='queuereader_spam_metrics',
            validate_method=validate_message,
            message_handler=count_spam_actions,
            lookupd_http_addresses=settings.get('nsq_lookupd')
        )
        run()

Page 63

/<service>/queuereader_<service>.py, 2 of 4

    def validate_message(message):
        # Accept '+' messages that carry an 'l' field.
        if message.get('o') == '+' and message.get('l'):
            return True
        # Accept '-' messages only when both 'l' and 'bl' are present.
        if message.get('o') == '-' and message.get('l') \
                and message.get('bl'):
            return True
        return False
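
To see which messages the validator lets through, the sketch below restates the slide's function and runs it on a few sample dictionaries. The slides do not explain what the 'o', 'l' and 'bl' fields mean, so the sample values are placeholders.

```python
def validate_message(message):
    """Same checks as on the slide, restated so this block runs on its own."""
    if message.get('o') == '+' and message.get('l'):
        return True
    if message.get('o') == '-' and message.get('l') and message.get('bl'):
        return True
    return False


# Placeholder values; only the presence of each field matters here.
samples = [
    {'o': '+', 'l': 'x'},             # '+' with 'l': accepted
    {'o': '-', 'l': 'x'},             # '-' without 'bl': rejected
    {'o': '-', 'l': 'x', 'bl': 'y'},  # '-' with 'l' and 'bl': accepted
    {},                               # empty message: rejected
]

verdicts = [validate_message(m) for m in samples]
print(verdicts)
```

Rejecting malformed messages before the handler runs keeps the handler free of defensive checks.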

Page 65

/<service>/queuereader_<service>.py, 3 of 4

    def count_spam_actions(message, nsq_msg):
        # Pick the statsd key from the message's 'o' and 'l' fields.
        key_section = statsd_keys[message['o']]
        key = key_section.get(message['l'], key_section['default'])
        statsd.incr(key)

        # Manual removals get a second counter keyed on the 'bl' field.
        if key == 'remove_by_manual':
            key_section = statsd_keys['-manual']
            key = key_section.get(message['bl'], key_section['default'])
            statsd.incr(key)

        # Tell nsqd the message has been handled.
        return nsq_msg.finish()
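
The handler depends on a `statsd_keys` table and a `statsd` client that the slides never show. The sketch below fills those in with made-up stand-ins (every key name except 'remove_by_manual' is hypothetical, as are `FakeStatsd` and `FakeMessage`) so the control flow can be run and inspected.

```python
from collections import Counter

# Hypothetical key table; the slides do not show the real one.
statsd_keys = {
    '+': {'auto': 'add_by_auto', 'default': 'add_by_other'},
    '-': {'manual': 'remove_by_manual', 'default': 'remove_by_other'},
    '-manual': {'support': 'remove_by_manual_support',
                'default': 'remove_by_manual_other'},
}


class FakeStatsd:
    """Stand-in for a statsd client: counts increments in memory."""

    def __init__(self):
        self.counts = Counter()

    def incr(self, key):
        self.counts[key] += 1


class FakeMessage:
    """Stand-in for the NSQ message object passed to the handler."""

    def __init__(self):
        self.finished = False

    def finish(self):
        self.finished = True
        return True


statsd = FakeStatsd()


def count_spam_actions(message, nsq_msg):
    """Same logic as on the slide, using the stand-ins above."""
    key_section = statsd_keys[message['o']]
    key = key_section.get(message['l'], key_section['default'])
    statsd.incr(key)
    if key == 'remove_by_manual':
        key_section = statsd_keys['-manual']
        key = key_section.get(message['bl'], key_section['default'])
        statsd.incr(key)
    return nsq_msg.finish()


first = FakeMessage()
count_spam_actions({'o': '-', 'l': 'manual', 'bl': 'support'}, first)
count_spam_actions({'o': '+', 'l': 'unknown'}, FakeMessage())
print(dict(statsd.counts), first.finished)
```

A manual removal increments two counters, the coarse one and the finer-grained one, while any unrecognized 'l' value falls back to the section's 'default' key.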

Page 67

/<service>/queuereader_<service>.py, 4 of 4

    if __name__ == "__main__":
        tornado.options.parse_command_line()
        logatron_client.setup()

        Reader(
            topic=settings.get('nsqd_output_topic'),
            channel='queuereader_spam_metrics',
            validate_method=validate_message,
            message_handler=count_spam_actions,
            lookupd_http_addresses=settings.get('nsq_lookupd')
        )
        run()

Page 69
Page 70

utilities

Page 71

Parts is parts! (part 3, redux)

• nsq_tail
• nsq_to_file
• to_nsq
• nsq_to_nsq
• nsq_stat

Page 72

Features & Guarantees
(aka Trade-Offs)

• Distributed, no SPOF
• Horizontally scalable
• TLS
• statsd integration
• Easy to deploy
• Cluster administration

Page 73

Messages NOT Durable

Page 74

Delivered at least once
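
At-least-once delivery means a handler can see the same message twice, so handlers are usually written to be idempotent. One common approach, sketched here as an assumption rather than as Bitly's method, is to remember which message IDs have already been applied. The `id` and `link` fields and the in-memory set are illustrative; a real service would need shared, persistent storage for the IDs.

```python
processed_ids = set()  # illustrative; a real service needs shared storage
click_counts = {}


def handle(message):
    """Apply a message once, even if it is delivered more than once."""
    if message["id"] in processed_ids:
        return False  # duplicate delivery: nothing to do
    processed_ids.add(message["id"])
    link = message["link"]
    click_counts[link] = click_counts.get(link, 0) + 1
    return True


deliveries = [
    {"id": "m1", "link": "a"},
    {"id": "m2", "link": "b"},
    {"id": "m1", "link": "a"},  # redelivered after a timeout, for example
]

applied = [handle(m) for m in deliveries]
print(applied, click_counts)
```

The duplicate is acknowledged but not counted again, so the totals stay correct no matter how many times a message is redelivered.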

Page 75

Un-ordered Delivery

Page 76

Eventually-Consistent Discovery

Page 77

A Thousand Points of Light (well, 58)

Page 78

A Thousand Points of Light (well, 58)

Page 79

Whoa…

8.2 billion decodes per month

Page 80

Streaming Architecture

• Easy to build new services
• Easy to scale individual components horizontally
• Durable in the face of single component failure
• Distributed

Page 81

Things to Think About

• Monitoring, monitoring, monitoring
• Failure modes: how can things fail? How does your application as a whole handle the failure of individual components?
• Measurement: metrics show the range
• Timeouts: connection timeouts, DNS timeouts. A slow network is the same as a failed service.
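
The point about timeouts can be illustrated with a small wrapper that treats "too slow" exactly like "failed". This is a generic sketch using the standard library; `call_with_timeout` is a hypothetical helper, and real network clients usually accept a timeout argument directly, which is preferable when available.

```python
import concurrent.futures
import time


def call_with_timeout(func, timeout_seconds, *args):
    """Return ("ok", result), or ("failed", None) if func errors or is too slow."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(func, *args)
    try:
        return ("ok", future.result(timeout=timeout_seconds))
    except concurrent.futures.TimeoutError:
        return ("failed", None)  # a slow dependency counts as a failed one
    except Exception:
        return ("failed", None)  # the call itself raised
    finally:
        pool.shutdown(wait=False)  # do not block on the slow call


def fast_service():
    return 42


def slow_service():
    time.sleep(0.3)  # simulates a slow network
    return "late"


fast_result = call_with_timeout(fast_service, 0.2)
slow_result = call_with_timeout(slow_service, 0.05)
print(fast_result, slow_result)
```

From the caller's point of view the slow service and a crashed service look the same, which is why every outbound call needs an explicit timeout and a plan for what happens when it fires.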

Page 82

NSQ

http://nsq.io

Page 83

Duke of URL

https://github.com/bitly/nsq
https://github.com/bitly/go-nsq
https://github.com/bitly/pynsq
https://github.com/jehiah/nsqauth-contrib

Page 84

Photo Credits

Web Scale - http://www.mongodb-is-web-scale.com
Waterfall - https://www.flickr.com/photos/desatur8/14949285342
Tornado - https://www.flickr.com/photos/indigente/798304
John de Lancie - https://www.flickr.com/photos/cayusa/1394930005
Ben Whishaw - https://www.flickr.com/photos/rossendalewadey/6032496676
Command Key - https://www.flickr.com/photos/klash/3175479797
iPhone6 Event - https://www.flickr.com/photos/notionscapital/15067798867
Wait for iPhone - https://www.flickr.com/photos/josh_gray/662814907
NSQ Logo - http://nsq.io

All other photos by T. Peter Herndon

Page 85

Questions?

Page 86

Peter Herndon
[email protected]
@tpherndon