Real-Time Performance at Massive Scale Francois Orsini Machine Zone, Inc.

Post on 18-Jan-2016

Real-Time Performance at Massive Scale

Francois Orsini
Machine Zone, Inc.

Massive Scalability

• What does Massive Scalability mean?

• What is the Upper bound?

• How to reach Scalability Nirvana?

• Brewer CAP Theorem

Real-Time @ Massive Scale

Real-Time Translation @ Massive Scale

Global Challenges

• Worldwide Reach
• Single World / N Regions
• No Language Barrier
• 5M-40M+ Daily Active Users
• 100-500k Concurrent Users
• N client platforms
  – iOS
  – Android
  – Windows
• 24 x 7 non-stop

Global

Real-Time

• Real-Time Chats
  – Including real-time translation
• Real-Time Events
  – Massive Parallel broadcasting
  – Auto-subscribe model
• Real-Time Push Notifications
• High Availability
  – Hot stand-by
  – 99.99% (“four nines”): under ~53 minutes of downtime per year
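The "four nines" figure above is simple arithmetic on the availability target. A minimal sketch of that calculation, assuming a 365-day year; the function name is illustrative and not from the talk:

```python
# Downtime budget implied by an availability target.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability: float) -> float:
    """Return the maximum yearly downtime, in minutes, for a target such as 0.9999."""
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    for label, target in [("three nines", 0.999),
                          ("four nines", 0.9999),
                          ("five nines", 0.99999)]:
        print(f"{label}: {downtime_minutes_per_year(target):.2f} min/year")
```

For 99.99% this gives roughly 52.6 minutes per year, which matches the "under ~53 minutes" on the slide.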

Real-Time

Massive Scaling

• Big Scale Architecture
  – Scaling Up + Out
  – TCO what?
• Fully Distributed
• Fully Redundant
• Cluster Native
  – Automatic Failover
• Cloud Aware

Social 2.0

• Defining Social 2.0
  – No language barrier
  – Real-time
    • Messaging
    • Translation
  – Global scale and reach
• Social World Changer
  – Not just social gaming

What’s the Secret Sauce?

• Know your subject
  – Understand what you are building
  – New is good but bench / stress test it!
• Use the right tools for the Job!
  – Do not hesitate to vary
  – Pick proven software (not just the latest cool one)
• Open Source
  – Fork as needed
  – Rewrite early as necessary

What’s the Secret Sauce? (Cont.)

• Profile everything
  – Start doing it early
  – Care about performance
• No single point of Failure
  – Horizontally and vertically
  – Yeah right == Wrong

Real-Time Ingredient

• Erlang
  – Hidden jewel
  – Proven, created in 1986, open sourced in 1998
  – Supports distributed, fault-tolerant, soft real-time non-stop applications
  – Great level of concurrency (no locks)
  – Hot-swapping of Code
  – Scales up & out
• Functional programming WTH?
  – Paradigm shift but makes a lot of sense
  – Hire people with the right skills
  – Get trained (don’t learn on your own)

Erlang Components

• Real-time Chat
  – Large chat rooms
  – Private 1-1 chat
  – Full chat history support
  – Language transformation & translation
• Real-time Events
• Real-time Timers
• Real-time Push Notifications
• Real-time Message Queuing & Delivery

Data Storage Ingredient

• MySQL (Percona)
  – Very reliable and predictable
  – Proven (used by the biggest applications out there)
  – Not the perfect storage solution but goes a long way
  – Scale out solution (“logical shards” / partitions)
  – Cluster (auto-failover)
  – InnoDB is reliable
  – Support for direct data access (e.g. HandlerSocket)
  – SQL == Most expressive declarative language

Data Storage (Cont.)

• Master-Master topology
• Scale Up and Out
  – Use 5.5+
• How to reduce replication lag
  – Partition tables (5.1+)
  – HandlerSocket plug-in (750K QPS), Memcached plug-in
  – Facebook Faker (http://dom.as/2012/09/04/faker/)
• Percona Toolkit, “user_stat”
• What!? No NOSQL!?
  – Fear not … The next slide is near
  – Know what you will gain and lose
  – WITHSQL + NOSQL

Data Shards Storage Topology

[Diagram labels: Shards Map; Shards; Primary; Secondary]
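The diagram labels suggest a lookup from a shards map to a shard, and from there to a primary or secondary node. The slides do not say how keys are assigned, so the sketch below is an assumption: it hashes a key to a fixed number of "logical shards" and uses a small in-memory map to find the physical pair. All host names, the shard count, and the function names are hypothetical.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalShard:
    primary: str
    secondary: str


# Hypothetical: many logical shards, so data can be moved between
# physical hosts by editing the map instead of rehashing every key.
NUM_LOGICAL_SHARDS = 1024

# Hypothetical shards map: ranges of logical shard ids -> physical pair.
SHARD_MAP = [
    (range(0, 512), PhysicalShard("db-a-primary", "db-a-secondary")),
    (range(512, 1024), PhysicalShard("db-b-primary", "db-b-secondary")),
]


def logical_shard(key: str) -> int:
    """Map a key to a stable logical shard id using CRC32."""
    return zlib.crc32(key.encode("utf-8")) % NUM_LOGICAL_SHARDS


def locate(key: str, primary_up: bool = True) -> str:
    """Return the host that should serve this key; fall back to the secondary."""
    shard_id = logical_shard(key)
    for id_range, physical in SHARD_MAP:
        if shard_id in id_range:
            return physical.primary if primary_up else physical.secondary
    raise LookupError(f"no physical shard covers logical shard {shard_id}")


if __name__ == "__main__":
    print(locate("player:42"))
    print(locate("player:42", primary_up=False))
```

Keeping the logical shard count much larger than the number of hosts is one common way to rebalance later: a range in the map is reassigned without changing how keys hash.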

NOSQL Ingredient

• Redis
  – Proven, fast, powerful, yet simple
  – Scale out with Sharding (Cluster is Beta)
  – Great for object caching
  – 100k+ requests per second
  – Extremely reliable
  – Fast, Fast, Fast!
• Works great coupled with HAProxy / Heartbeat
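"Scale out with Sharding" implies the client (or middle tier) decides which Redis instance holds a given key. The talk does not name a scheme; the sketch below assumes consistent hashing, a common choice because adding a node moves only a fraction of the keys. It makes no Redis calls: the node addresses are hypothetical strings and the class only picks a node for a key.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring that maps cache keys to node names."""

    def __init__(self, nodes, replicas=100):
        # Each node gets `replicas` virtual points to even out the spread.
        points = []
        for node in nodes:
            for i in range(replicas):
                points.append((self._hash(f"{node}#{i}"), node))
        points.sort()
        self._ring = points
        self._hashes = [h for h, _ in points]

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def node_for(self, key: str) -> str:
        """Return the first node clockwise from the key's hash position."""
        idx = bisect.bisect(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]


# Hypothetical shard addresses, for illustration only.
NODES = ["redis-1:6379", "redis-2:6379", "redis-3:6379"]
ring = HashRing(NODES)

if __name__ == "__main__":
    for key in ("session:1001", "player:42", "guild:7"):
        print(key, "->", ring.node_for(key))
```

In a real deployment the returned address would select a connection to that instance; high availability for each shard (the HAProxy / Heartbeat pairing on the slide) is a separate concern from key placement.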

REDIS HA TOPOLOGY

[Diagram labels: Middle-Tier / WAS; HA; Shards; Master; Slave]

Some Numbers

• 3.5M soft real-time messages to a single channel per second
  – Benchmarked with actual connected endpoints
• 500k concurrent and active users
  – Benchmarked with actual connected player bots
• > 100k writes per second
• > 250K reads per second
• Hundreds of real-time battles in a single map chunk
• > 1k phrase translations per second minimum

Alternatives & Misc Products

• NOSQL
  – Cassandra
  – RIAK (Erlang)
  – Couchbase
• MariaDB (MySQL Fork + Percona XtraDB)
• PostgreSQL
• Other Commercial RDBMS
• Continuent Tungsten

Summary

• Know what you’re up to!
  – Global scale, volumes
  – Research!
• Use proven approaches
• Don’t be afraid to innovate
• Bench, bench and bench!
  – Don’t wait until a few weeks before WW launch
• Profile the code
• Have a plan B