#geodesummit - design tradeoffs in distributed systems


Upload: pivotalopensourcehub

Post on 20-Mar-2017


Design Tradeoffs In Distributed Systems

How Southwest Uses Apache Geode

Brian Dunlap @brianwdunlap

Technical Lead - Aircraft Systems

March 8th, 2016

APACHE GEODE AT SOUTHWEST

OPS SUITE CARGO CREW SOUTHWEST.COM

Optimize decisions with integrated schedule information.

Grow over the next 10 years.

Show a real-time view of our operational day… to 1,000s of web users.

Scale across data centers and ensure consistency.

Support our most critical operational systems.

Southwest’s Network Operations Control integrates decision makers.

[Figure: map of Southwest destination cities]

NOC: CREW, PASSENGER, MAINTENANCE, FLIGHT, GATE, CARGO, AIRCRAFT, FACILITY

OPS SUITE CONSUMES 10M JMS MESSAGES DAILY

RECOVERY OPTIMIZATION USES OVER 1,000,000 SCHEDULES

4,000 FLIGHTS

700 AIRCRAFT

500K PASSENGERS / DAY

a response time measured in seconds

RECOVERY OPTIMIZATION DRIVES

Luv!

Thinking about boundaries

TEAMS: SOFTWARE FOCUS, ORG FOCUS

DOMAINS: CORE DOMAIN, SUPPORTING DOMAIN

GEODE: NODES ACROSS AZs, GC, WAN

PATTERNS: PARALLEL PROCESSING, ASYNC BEHAVIOR

Domain tradeoffs


What do you own? What do you need? How long can you keep it?

Domain tradeoffs

Books by: @ericevans0 @VaughnVernon

Get Organized! Domain-Driven Design (DDD)

CORE DOMAIN

SUPPORTING DOMAIN

UBIQUITOUS LANGUAGE

AGGREGATES

DOMAIN EVENTS

What do you own? (core) <invest>

What do you need? (supporting) <simplify>

How long can you keep it? <intentional>

Crew Maint Pax Cargo Flight Gate

Existing domain silos…

OVER 15 YEARS OF COMPLEXITY

Crew Maint Pax Cargo Flight Gate

100% 100%

CORE DOMAINS SUPPORTING DOMAINS

INCREMENTAL STEPS

SELECTED INTEGRATION

What do you own? (core) <focus>

What do you need? (supporting) <simplify>

How long can you keep it? <intentional>

Adding is very easy. Watch out for data that’s around for too long.

Do all of these data need to be in-memory?

Data at rest for a long time? (>365 days)

GEODE REGION

SIZES

Determine if each subdomain should use Geode.

Don’t make an automatic decision.
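The review the slides ask for can be made routine with a small report. The sketch below is only an illustration: `RegionStats`, the region names, and the sizes are hypothetical, and the single criterion used is the slide's ">365 days at rest" threshold. It lists the regions worth questioning, largest first, so the conversation starts with the data that costs the most memory.

```python
from dataclasses import dataclass
from typing import List

LONG_REST_DAYS = 365  # threshold taken from the slide


@dataclass
class RegionStats:
    """Hypothetical per-region summary; not a Geode API type."""
    name: str
    size_mb: int
    oldest_entry_days: int


def review_candidates(regions: List[RegionStats]) -> List[str]:
    """Names of regions holding data older than the threshold, largest first."""
    flagged = [r for r in regions if r.oldest_entry_days > LONG_REST_DAYS]
    flagged.sort(key=lambda r: r.size_mb, reverse=True)
    return [r.name for r in flagged]


# Illustrative numbers only; none of these come from the talk.
sample_regions = [
    RegionStats("flightEvents", size_mb=800, oldest_entry_days=3),
    RegionStats("scheduleHistory", size_mb=12_000, oldest_entry_days=900),
    RegionStats("auditTrail", size_mb=4_000, oldest_entry_days=400),
]
candidates = review_candidates(sample_regions)
```

A flagged region is a prompt for a decision, not a verdict: the data may still belong in memory, or it may need a different home.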

Domain tradeoffs

Maybe it needs an entirely different home?

Domain tradeoffs

Pattern tradeoffs


How far? How fast?

Pattern tradeoffs

The chart of scalability!

OLD -> NEW

NORMALIZED JOINS -> REGIONS FOR READS, REGIONS FOR AGGREGATES

BLOCKING THREADS -> ASYNC - AKKA / ACTORS

ACTIVE / PASSIVE -> ACTIVE / ACTIVE

MUTABLE STATE -> IMMUTABILITY / EVENT SOURCING

DATA CONVERGENCE

CRUD -> CQRS / DDD, EVENT DRIVEN

ServiceManagerHandlerImpl

We’re learning!


We write immutable domain events into event regions.

Clients receive events using Geode CQs.

Clients checkpoint their position into separate regions.

Event regions expire messages.

checkpointing
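The checkpointing flow can be sketched without Geode at all. The model below is a minimal in-memory illustration, assuming events carry an increasing sequence number. `EventRegion`, `consume`, the event names, and the sequence-based `expire_through` are all hypothetical stand-ins; in Geode the events would arrive through a CQ and expiry would be configured on the region rather than called by hand.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)  # frozen: an event cannot be changed after it is written
class DomainEvent:
    seq: int
    kind: str
    payload: str


class EventRegion:
    """Hypothetical stand-in for an event region: append-only, with expiry."""

    def __init__(self) -> None:
        self._events: Dict[int, DomainEvent] = {}
        self._next_seq = 1

    def append(self, kind: str, payload: str) -> DomainEvent:
        event = DomainEvent(self._next_seq, kind, payload)
        self._events[event.seq] = event
        self._next_seq += 1
        return event

    def read_after(self, seq: int) -> List[DomainEvent]:
        """Surviving events with a sequence number greater than seq, in order."""
        return [self._events[s] for s in sorted(self._events) if s > seq]

    def expire_through(self, seq: int) -> None:
        """Drop events up to and including seq (a simplified model of expiry)."""
        for s in [s for s in self._events if s <= seq]:
            del self._events[s]


def consume(client_id: str, events: EventRegion, checkpoints: Dict[str, int]) -> List[str]:
    """Handle unseen events and record the client's position after each one."""
    handled = []
    for event in events.read_after(checkpoints.get(client_id, 0)):
        handled.append(event.kind)
        checkpoints[client_id] = event.seq  # kept apart from the event store
    return handled


events = EventRegion()
checkpoints: Dict[str, int] = {}  # stands in for the separate checkpoint region
events.append("FlightDelayed", "flight-A")
events.append("GateChanged", "flight-A")
first_pass = consume("view-builder", events, checkpoints)
events.append("FlightCancelled", "flight-B")
second_pass = consume("view-builder", events, checkpoints)
events.expire_through(checkpoints["view-builder"] - 1)
```

Because the position lives outside the event store, a restarted client resumes from its checkpoint instead of replaying everything, and expiry only has to keep events that some client has not yet passed.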

Akka Cluster manages Actor Singletons which coordinate parallel processing based on a logical groupId.

Backpressure is implemented through a competing consumer pattern. Take a look at Akka Streams!

All Geode replicate regions use distributed-ack scope. We don’t want to converge (some write wins).

coordination (*important concept)
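The idea of coordinating parallel work by a logical groupId can be shown with plain functions. This is a sketch under assumptions, not the Akka Cluster setup from the talk: `worker_for`, `route`, and the group names are hypothetical, and a stable hash stands in for whatever assignment the Actor Singletons make. The property it demonstrates is that every message for one group lands on one worker in arrival order, while different groups can proceed in parallel.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple

Message = Tuple[str, str]  # (group_id, payload)


def worker_for(group_id: str, worker_count: int) -> int:
    """Map a logical groupId to a stable worker index."""
    # crc32 gives the same value on every run, unlike the built-in hash() for str.
    return zlib.crc32(group_id.encode("utf-8")) % worker_count


def route(messages: List[Message], worker_count: int) -> Dict[int, List[Message]]:
    """Assign messages to per-worker queues, preserving arrival order."""
    queues: Dict[int, List[Message]] = defaultdict(list)
    for group_id, payload in messages:
        queues[worker_for(group_id, worker_count)].append((group_id, payload))
    return dict(queues)


# Hypothetical group ids and payloads, for illustration only.
messages = [
    ("aircraft-1", "a"),
    ("aircraft-2", "b"),
    ("aircraft-1", "c"),
    ("aircraft-3", "d"),
    ("aircraft-1", "e"),
]
queues = route(messages, worker_count=4)
```

Two groups may share a worker, which is fine: the guarantee is per-group ordering, not one worker per group.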

• JMS adapter
• Command adapters
• Command handlers - to CQ clients
• View model builders - to CQ clients
• JMS publishers

data flow

PUSH or PULL

How do we scale expensive read I/O?

Contain expensive reads

With CQRS view model builders, perform heavy state enriching “select *” once.

Push read updates vs. polling (Geode CQs)

Conflate triggering view model rebuild events
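Conflating rebuild triggers means that a burst of events for the same view model results in one rebuild rather than one per event. A minimal sketch, assuming triggers are identified by a string key; `ConflatingTrigger` and the keys are hypothetical and no Geode API is involved:

```python
from collections import OrderedDict
from typing import Callable, List


class ConflatingTrigger:
    """Collapse repeated rebuild triggers for a key into one pending rebuild."""

    def __init__(self) -> None:
        self._pending: "OrderedDict[str, int]" = OrderedDict()

    def offer(self, key: str) -> None:
        # Count absorbed triggers; the key keeps its first-seen position.
        self._pending[key] = self._pending.get(key, 0) + 1

    def drain(self, rebuild: Callable[[str], None]) -> List[str]:
        """Run one rebuild per distinct pending key, in first-seen order."""
        keys = list(self._pending)
        self._pending.clear()
        for key in keys:
            rebuild(key)
        return keys


rebuilt: List[str] = []
trigger = ConflatingTrigger()
for key in ["flight-A", "flight-B", "flight-A", "flight-A", "flight-B"]:
    trigger.offer(key)
drained = trigger.drain(rebuilt.append)  # five triggers, two expensive rebuilds
```

The expensive read then runs once per drain for each affected view model, and the refreshed result can be pushed to clients instead of each client polling for it.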

Be careful with timeouts!

Be careful with alerts!

Be careful with joins!

Be careful with large values!

Be careful with old habits!

safety tips

Teams


Distributed systems are created by distributed teams.

Communication coordination is a thing.

• Integrate Geode security with a directory
• Tune JVM size and GC
• Deploy and upgrade environments
• Size and configure VMs
• Support production events
• Enable WAN Gateway Sender / Receivers
• Load snapshots between environments
• Automate starting and stopping clusters
• Teach distributed concepts - like CAP

How do we share new distributed system responsibilities?

DBAs UNIX DEVs

Middleware Release Management

Offshore Support New Geode Team

DevOps

EARLIER IS BETTER

Learn to luv conversation tension.

When there’s tension, you’re on the right track!

opssuite-all schedule-core.jar

Use separate repos to help with boundaries.

Align teams with repo ownership.

Minimize jar dependencies across teams.

EMBRACE a 100X MENTALITY!

What does 100x mean? (msg/sec)

Normal rate: 50

Busy rate: 500

Recover rate: 5,000

HOW FAST CAN WE RECOVER?
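The three rates make the recovery question easy to work through. The recover rate is 100x the normal rate (5,000 vs. 50 msg/sec). The sketch below is a back-of-the-envelope calculation, assuming a hypothetical one-hour outage during a busy period and assuming messages keep arriving while the backlog drains; only the three rates come from the slide.

```python
def backlog_after_outage(arrival_rate: float, outage_seconds: float) -> float:
    """Messages that pile up while nothing is being processed."""
    return arrival_rate * outage_seconds


def drain_seconds(backlog: float, arrival_rate: float, processing_rate: float) -> float:
    """Time to clear a backlog while new messages keep arriving."""
    if processing_rate <= arrival_rate:
        raise ValueError("backlog never drains unless processing outpaces arrivals")
    return backlog / (processing_rate - arrival_rate)


NORMAL, BUSY, RECOVER = 50, 500, 5_000  # msg/sec, from the slide
OUTAGE_SECONDS = 3_600                  # assumed outage length, not from the talk

busy_backlog = backlog_after_outage(BUSY, OUTAGE_SECONDS)        # 1,800,000 messages
recovery_time = drain_seconds(busy_backlog, BUSY, RECOVER)       # 400.0 seconds

try:
    drain_seconds(busy_backlog, BUSY, BUSY)  # sized only for the busy rate
    sized_for_busy_only = "drains"
except ValueError:
    sized_for_busy_only = "never drains"
```

Under these assumptions, a system that can sustain the recover rate clears an hour of busy-period backlog in under seven minutes, while one sized only for the busy rate never catches up.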

Create great learning resources

Watch out for old habits!

Geode


Prefer less-shared disk I/O. (local to a VM rack, or dedicated)

Prefer larger + fewer Geode nodes. (4 larger nodes vs. 8 smaller ones)

Take advantage of availability zones (AZs).

CONVERSATION LEADERSHIP

ACROSS TEAMS

SHARED or SHARED LESS

What infrastructure supports Geode?

Know your memory (and GC) limits.

Watch out for slow heap growth that triggers continuous GC.

-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=60
-Xloggc:/your/path/node-name.GC.log
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintTenuringDistribution
-XX:+PrintGCCause
-XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=20
-XX:GCLogFileSize=5M

Check out GCViewer for GC log analysis.
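Two of the numbers in these settings are worth working through. `CMSInitiatingOccupancyFraction=60` asks the collector to start a concurrent cycle once the old generation is about 60% occupied, so a live data set that slowly grows past that point keeps the collector running back to back. The arithmetic below is a rough sketch: the 20 GiB old generation and 13 GiB live set are hypothetical, and the JVM may also apply its own heuristics on top of the configured fraction.

```python
GIB = 1024 ** 3


def cms_trigger_bytes(old_gen_bytes: int, occupancy_fraction: int) -> int:
    """Old-generation occupancy at which a concurrent cycle is initiated."""
    return old_gen_bytes * occupancy_fraction // 100


def headroom_bytes(old_gen_bytes: int, live_bytes: int, occupancy_fraction: int) -> int:
    """Room the live set has before crossing the threshold (negative: already over)."""
    return cms_trigger_bytes(old_gen_bytes, occupancy_fraction) - live_bytes


old_gen = 20 * GIB    # assumed old-generation size, for illustration
live_set = 13 * GIB   # assumed long-lived data, for illustration

trigger_at = cms_trigger_bytes(old_gen, 60)
headroom = headroom_bytes(old_gen, live_set, 60)
continuous_gc_risk = headroom < 0

# Log rotation from the flags: 20 files of 5 MB each.
max_gc_log_mb = 20 * 5
```

With these assumed sizes the live set already sits 1 GiB above the 12 GiB trigger point, which is the pattern to look for in the GC logs. The rotation settings cap those logs at roughly 100 MB per node.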

Essential tool for real-time decision optimization testing!

Helpful for QA performance and functional testing.

Wonderful Geode feature!

WAN Gateway

Optimization binary consumes PDX via C++ Native Client

Moving > 200 MB per optimization request

Be careful with refactoring PDX data types!

C++ Native Client

Questions


Join the Apache Geode Community!

• Check out http://geode.incubator.apache.org

• Subscribe: [email protected]

• Download: http://geode.incubator.apache.org/releases/