Writing Scalable Software in Java: From multi-core to grid-computing

Upload: ruben-badaro

Post on 29-Jan-2018


TRANSCRIPT

Page 1: Writing Scalable Software in Java

Writing Scalable Software in Java: From multi-core to grid-computing

Page 2: Writing Scalable Software in Java

Me

• Ruben Badaró

• Dev Expert at Changingworlds/Amdocs

• PT.JUG Leader

• http://www.zonaj.org

Page 3: Writing Scalable Software in Java

What this talk is not about

• Sales pitch

• Cloud Computing

• Service Oriented Architectures

• Java EE

• How to write multi-threaded code

Page 4: Writing Scalable Software in Java

Summary

• Define Performance and Scalability

• Vertical Scalability - scaling up

• Horizontal Scalability - scaling out

• Q&A

Page 5: Writing Scalable Software in Java

Performance != Scalability

Page 6: Writing Scalable Software in Java

Performance

Amount of useful work accomplished by a computer system compared to the time and resources used

Page 7: Writing Scalable Software in Java

Scalability

Capability of a system to increase the amount of useful work as resources and load are added to the system

Page 8: Writing Scalable Software in Java

Scalability

• A system that performs fast with 10 users might not do so with 1000 - it doesn’t scale

• Designing for scalability often costs some raw performance

Page 9: Writing Scalable Software in Java

Linear Scalability

[Chart: Throughput plotted against Resources]

Page 10: Writing Scalable Software in Java

Reality is sub-linear

[Chart: Throughput plotted against Resources]

Page 11: Writing Scalable Software in Java

Amdahl’s Law
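The slide shows only the title, so the following is a minimal sketch of the standard statement of Amdahl's Law, speedup = 1 / ((1 - p) + p / n), where p is the fraction of work that can run in parallel and n is the number of cores. The class name and the 90% figure are assumptions chosen for illustration, not values from the talk.

```java
// Sketch of Amdahl's Law: speedup = 1 / ((1 - p) + p / n).
// AmdahlSketch and the sample value of p are hypothetical.
class AmdahlSketch {

    // parallelFraction: share of the work that can be parallelized (0.0 to 1.0)
    // cores: number of processors applied to the parallel part
    static double speedup(double parallelFraction, int cores) {
        double serialFraction = 1.0 - parallelFraction;
        return 1.0 / (serialFraction + parallelFraction / cores);
    }

    public static void main(String[] args) {
        double p = 0.90; // assumed: 90% of the work parallelizes
        for (int cores : new int[] {1, 2, 8, 32, 1024}) {
            System.out.printf("%4d cores -> speedup %.2f%n", cores, speedup(p, cores));
        }
        // With p = 0.90 the speedup stays below 1 / (1 - p) = 10,
        // no matter how many cores are added.
    }
}
```

The serial fraction sets the ceiling, which is why throughput curves flatten as resources grow.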

Page 12: Writing Scalable Software in Java

Scalability is about parallelizing

• Parallel decomposition allows division of work

• Parallelizing might mean more work

• There’s almost always a part of serial computation

Page 13: Writing Scalable Software in Java

Vertical Scalability

Page 14: Writing Scalable Software in Java

Vertical Scalability: Somewhat hard

Page 15: Writing Scalable Software in Java

Vertical Scalability: Scale Up

• Bigger, meaner machines

- More cores (and more powerful)

- More memory

- Faster local storage

• Limited

- Technical constraints

- Cost - big machines get exponentially expensive

Page 16: Writing Scalable Software in Java

Shared State

• Need to use those cores

• Java - shared-state concurrency

- Mutable state protected with locks

- Hard to get right

- Most developers don’t have experience writing multithreaded code

Page 17: Writing Scalable Software in Java

This is what they look like

public static synchronized SomeObject getInstance() {
    return instance;
}

public SomeObject doConcurrentThingy() {
    synchronized (this) {
        //...
    }
    return ..;
}

Page 18: Writing Scalable Software in Java

Single vs Multi-threaded

• Single-threaded

- No scheduling cost

- No synchronization cost

• Multi-threaded

- Context Switching (high cost)

- Memory Synchronization (memory barriers)

- Blocking

Page 19: Writing Scalable Software in Java

Lock Contention: Little’s Law

The average number of customers in a stable system is equal to their average arrival rate multiplied by their average time in the system
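In symbols, with L the average number of customers in the system, λ the average arrival rate and W the average time each customer spends in the system, the law can be written as below. The lock figures in the worked line are assumed numbers, picked only to show the arithmetic.

```latex
L = \lambda W

% Assumed example: threads request a lock 1000 times per second,
% and each request spends 2 ms waiting for plus holding the lock.
L = 1000\,\mathrm{s^{-1}} \times 0.002\,\mathrm{s} = 2
```

Under those assumptions, two threads on average are queued at or holding the lock. Halving either the request rate or the time spent per request halves that number.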

Page 20: Writing Scalable Software in Java

Reducing Contention

• Reduce lock duration

• Reduce frequency with which locks are requested (striping)

• Replace exclusive locks with other mechanisms

- Concurrent Collections

- ReadWriteLocks

- Atomic Variables

- Immutable Objects
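As a rough illustration of the first bullet, reducing lock duration, the sketch below does the slow work before taking the lock and holds the lock only for the shared update. `ShortLockSketch` and `expensiveTransform` are hypothetical names, and the transform is a stand-in for any work that touches no shared state.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: keep the critical section as small as possible.
class ShortLockSketch {
    private final Object lock = new Object();
    private final List<String> results = new ArrayList<>();

    // Hypothetical stand-in for slow work that needs no shared state.
    private static String expensiveTransform(String input) {
        return input.trim().toUpperCase();
    }

    void process(String input) {
        String value = expensiveTransform(input); // runs without holding the lock
        synchronized (lock) {                     // lock held only for the shared update
            results.add(value);
        }
    }

    int size() {
        synchronized (lock) {
            return results.size();
        }
    }
}
```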

Page 21: Writing Scalable Software in Java

Concurrent Collections

• Use lock striping

• Include atomic putIfAbsent() and replace() methods

• ConcurrentHashMap has 16 separate locks by default (Java 5 to 7; Java 8 and later lock individual hash bins instead)

• Don’t reinvent the wheel
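A small sketch of how putIfAbsent() and replace() can be combined into a lock-free counter on a ConcurrentHashMap. The class and method names are invented for illustration; only the `java.util.concurrent` calls come from the JDK.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch: counting page hits without an explicit lock.
class HitCounterSketch {
    private final ConcurrentMap<String, Integer> hits = new ConcurrentHashMap<>();

    void record(String page) {
        while (true) {
            // Atomically insert 1 if the key is missing; returns the old value otherwise.
            Integer current = hits.putIfAbsent(page, 1);
            if (current == null) {
                return; // this call recorded the first hit
            }
            // Atomically swap current -> current + 1; fails if another thread got in first.
            if (hits.replace(page, current, current + 1)) {
                return;
            }
            // Lost the race: loop and retry with the fresh value.
        }
    }

    int count(String page) {
        return hits.getOrDefault(page, 0);
    }
}
```

On Java 8 and later, `hits.merge(page, 1, Integer::sum)` expresses the same update in a single call.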

Page 22: Writing Scalable Software in Java

ReadWriteLocks

• Pair of locks

• Read lock can be held by multiple threads if there are no writers

• Write lock is exclusive

• Good improvement if the object has far fewer writers than readers
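A minimal sketch of the read/write pair using `ReentrantReadWriteLock` from the JDK. The settings map and the class name are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch: many concurrent readers, occasional exclusive writer.
class ReadMostlyConfigSketch {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<String, String> settings = new HashMap<>();

    String get(String key) {
        lock.readLock().lock();      // shared: other readers may hold it too
        try {
            return settings.get(key);
        } finally {
            lock.readLock().unlock();
        }
    }

    void put(String key, String value) {
        lock.writeLock().lock();     // exclusive: blocks readers and other writers
        try {
            settings.put(key, value);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```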

Page 23: Writing Scalable Software in Java

Atomic Variables

• Allow check-then-update operations to be performed atomically

• Without locks - use low-level CPU instructions (compare-and-swap)

• It’s volatile on steroids (visibility + atomicity)
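The check-then-update pattern can be sketched with `AtomicInteger.compareAndSet`, here tracking a running maximum without any lock. `MaxTrackerSketch` is a made-up example class.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: lock-free "update if larger" using a compare-and-set retry loop.
class MaxTrackerSketch {
    private final AtomicInteger max = new AtomicInteger(Integer.MIN_VALUE);

    void observe(int value) {
        int current;
        do {
            current = max.get();            // check
            if (value <= current) {
                return;                     // nothing to update
            }
            // update: succeeds only if max still equals current, else retry
        } while (!max.compareAndSet(current, value));
    }

    int max() {
        return max.get();
    }
}
```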

Page 24: Writing Scalable Software in Java

Immutable Objects

• Immutability makes concurrency simple - thread-safety guaranteed

• An immutable object:

- Class is final

- Fields are final and private

- Constructor constructs the object completely

- No state changing methods

- Copy internal mutable objects when receiving or returning
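The checklist above can be sketched as a small class. `OrderSketch` and its fields are hypothetical; the point is the final class, final private fields, complete construction, no mutators, and defensive copies in both directions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Date;
import java.util.List;

// Sketch of an immutable value object following the checklist.
final class OrderSketch {
    private final String id;
    private final Date placedAt;       // Date is mutable, so it is copied in and out
    private final List<String> items;

    OrderSketch(String id, Date placedAt, List<String> items) {
        this.id = id;
        this.placedAt = new Date(placedAt.getTime());                      // copy on receive
        this.items = Collections.unmodifiableList(new ArrayList<>(items)); // private copy
    }

    String id() {
        return id;
    }

    Date placedAt() {
        return new Date(placedAt.getTime());                               // copy on return
    }

    List<String> items() {
        return items; // read-only view over a list no caller can reach
    }

    // "Changing" state means building a new object; this one stays untouched.
    OrderSketch withItem(String item) {
        List<String> copy = new ArrayList<>(items);
        copy.add(item);
        return new OrderSketch(id, placedAt, copy);
    }
}
```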

Page 25: Writing Scalable Software in Java

JVM issues

• Caching is useful - storing stuff in memory

• Larger JVM heap size means longer garbage collection times

• Not acceptable to have long pauses

• Solutions

- Cap the maximum heap size at 2GB/4GB

- Multiple JVMs per machine

- Better garbage collectors: G1 might help

Page 26: Writing Scalable Software in Java

Scaling Up: Other Approaches

• Change the paradigm

- Actors (Erlang and Scala)

- Dataflow programming (GParallelizer)

- Software Transactional Memory (Pastrami)

- Functional languages, such as Clojure

Page 27: Writing Scalable Software in Java

Scaling Up: Other Approaches

• Dedicated JVM-friendly hardware

- Azul Systems is amazing

- Hundreds of cores

- Enormous heap sizes with negligible gc pauses

- Hardware Transactional Memory (HTM) included

- Built-in lock elision mechanism

Page 28: Writing Scalable Software in Java

Horizontal Scalability

Page 29: Writing Scalable Software in Java

Horizontal Scalability: The hard part

Page 30: Writing Scalable Software in Java

Horizontal Scalability: Scale Out

• Big machines are expensive - 1 x 32 core normally much more expensive than 4 x 8 core

• Increase throughput by adding more machines

• Distributed Systems research revisited - not new

Page 31: Writing Scalable Software in Java

Requirements

• Scalability

• Availability

• Reliability

• Performance

Page 32: Writing Scalable Software in Java

Typical Server Architecture

Page 33: Writing Scalable Software in Java

... # of users increases

Page 34: Writing Scalable Software in Java

... and increases

Page 35: Writing Scalable Software in Java

... too much load

Page 36: Writing Scalable Software in Java

... and we lose availability

Page 37: Writing Scalable Software in Java

... so we add servers

Page 38: Writing Scalable Software in Java

... and a load balancer

Page 39: Writing Scalable Software in Java

... and another one rides the bus

Page 40: Writing Scalable Software in Java

... we create a DB cluster

Page 41: Writing Scalable Software in Java

... and we cache wherever we can

[Diagram labels: Cache, Cache]

Page 42: Writing Scalable Software in Java

Challenges

• How do we route requests to servers?

• How do we distribute data between servers?

• How do we handle failures?

• How do we keep our cache consistent?

• How do we handle load peaks?

Page 43: Writing Scalable Software in Java

Technique #1: Partitioning

[Diagram: Users partitioned by name range: A...E, F...J, K...O, P...T, U...Z]

Page 44: Writing Scalable Software in Java

Technique #1: Partitioning

• Each server handles a subset of data

• Improves scalability by parallelizing

• Requires predictable routing

• Introduces problems with locality

• Move work to where the data is!
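A sketch of predictable routing for the A...E / F...J / K...O / P...T / U...Z split shown in the diagram. The server names and the class are assumptions; a real deployment would more likely route by a hash of the key, or by consistent hashing, so that partitions stay balanced.

```java
// Sketch: route a user to a partition by the first letter of the name.
class RangePartitionerSketch {
    // Hypothetical server names, one per name range.
    private static final String[] SERVERS = {
        "server-0", // A...E
        "server-1", // F...J
        "server-2", // K...O
        "server-3", // P...T
        "server-4"  // U...Z
    };

    static String serverFor(String userName) {
        if (userName == null || userName.isEmpty()) {
            throw new IllegalArgumentException("user name must not be empty");
        }
        char first = Character.toUpperCase(userName.charAt(0));
        if (first < 'A' || first > 'Z') {
            throw new IllegalArgumentException("expected a name starting with A-Z: " + userName);
        }
        // Five letters per bucket; the last bucket also takes Z (six letters).
        int bucket = Math.min((first - 'A') / 5, SERVERS.length - 1);
        return SERVERS[bucket];
    }
}
```

Because the mapping is a pure function of the key, every front-end server computes the same answer without coordination, which is what "predictable routing" requires.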

Page 45: Writing Scalable Software in Java

Technique #2: Replication

[Diagram: Active and Backup servers]

Page 46: Writing Scalable Software in Java

Technique #2: Replication

• Keep copies of data/state in multiple servers

• Used for fail-over - increases availability

• Requires extra hardware that sits idle as cold standby

• Overhead of replicating might reduce performance

Page 47: Writing Scalable Software in Java

Technique #3: Messaging

Page 48: Writing Scalable Software in Java

Technique #3: Messaging

• Use message passing, queues and pub/sub models - JMS

• Improves reliability easily

• Helps deal with peaks

- The queue keeps filling

- If it gets too big, extra requests are rejected
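The peak-handling behaviour can be sketched in-process with a bounded `ArrayBlockingQueue`, whose `offer` returns false instead of blocking when the queue is full. This is an assumption-laden stand-in for a real JMS broker, and `BoundedInboxSketch` is a made-up name.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch: a bounded queue absorbs bursts and rejects work beyond its capacity.
class BoundedInboxSketch {
    private final BlockingQueue<String> queue;

    BoundedInboxSketch(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Returns false when the queue is full, so the caller can reject the request.
    boolean submit(String request) {
        return queue.offer(request);
    }

    // Returns the oldest queued request, or null if none is waiting.
    String next() {
        return queue.poll();
    }
}
```

Producers keep submitting at their own pace while consumers drain the queue at theirs; only when the backlog exceeds the capacity are extra requests turned away.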

Page 49: Writing Scalable Software in Java

Solution #1: De-normalize DB

• Faster queries

• Additional work to generate tables

• Less space efficiency

• Harder to maintain consistency

Page 50: Writing Scalable Software in Java

Solution #2: Non-SQL Database

• Why not remove the relational part altogether?

• Bad for complex queries

• Berkeley DB is a prime example

Page 51: Writing Scalable Software in Java

Solution #3: Distributed Key/Value Stores

• Highly scalable - used in the largest websites in the world, based on Amazon’s Dynamo and Google’s BigTable

• Mostly open source

• Partitioned

• Replicated

• Versioned

• No SPOF

• Voldemort (LinkedIn), Cassandra (Facebook) and HBase are written in Java

Page 52: Writing Scalable Software in Java

Solution #4: MapReduce

[Animation across pages 52 to 61: Map... (Divide Work, then Compute), followed by Reduce... (Return and aggregate)]

Page 62: Writing Scalable Software in Java

Solution #4: MapReduce

• Google’s programming model to split work, process it and reduce to an answer

• Used for offline processing of large amounts of data

• Hadoop is very widely used! Other options such as GridGain exist
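The split, process and reduce steps can be sketched on a single JVM with a word count over parallel streams. This only illustrates the shape of the model and is not Hadoop's API; the class and method names are invented.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: word count in map/reduce style on one machine.
class WordCountSketch {

    // Map step: turn one chunk of text into partial word counts.
    static Map<String, Integer> map(String chunk) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : chunk.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Reduce step: merge two partial results into a new map (inputs are not modified).
    static Map<String, Integer> reduce(Map<String, Integer> left, Map<String, Integer> right) {
        Map<String, Integer> merged = new HashMap<>(left);
        right.forEach((word, n) -> merged.merge(word, n, Integer::sum));
        return merged;
    }

    // Divide work (one chunk per task), compute in parallel, then aggregate.
    static Map<String, Integer> count(List<String> chunks) {
        return chunks.parallelStream()
                .map(WordCountSketch::map)
                .reduce(new HashMap<>(), WordCountSketch::reduce);
    }
}
```

In a cluster setting the chunks would live on different machines and the map step would run where each chunk is stored, which ties back to the data locality point.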

Page 63: Writing Scalable Software in Java

Solution #5: Data Grid

• Data (and computations)

• In-memory - low response times

• Database back-end (SQL or not)

• Partitioned - operations on data executed in specific partition

• Replicated - handles failover automatically

• Transactional

Page 64: Writing Scalable Software in Java

Solution #5: Data Grid

• It’s a distributed cache + computational engine

• Can be used as a cache with JPA and the like

• Oracle Coherence is very good.

• Others: Terracotta, GridGain, GemFire, GigaSpaces, Velocity (Microsoft) and WebSphere eXtreme Scale (IBM)

Page 65: Writing Scalable Software in Java

Retrospective

• You need to scale up and out

• Write code thinking of hundreds of cores

• Relational might not be the way to go

• Cache whenever you can

• Be aware of data locality

Page 66: Writing Scalable Software in Java

Q & A

Thanks for listening!

Ruben Badaró

http://www.zonaj.org