scalability designprinciples-v2-130718023602-phpapp02 (1)

Scalable Web Design – Principles and Patterns

Speaker: Sachin Prakash Sancheti, Principal Architect – Cloud (Windows Azure)


Context


Server Busy!! Your request cannot be processed. Please try again after some time.

"I have been trying to book a ticket for an hour now."

Please wait!

Any real-life examples?


Survey

What is Scalability?

• It is NOT

– Only Performance

– High Availability

– Business Continuity Planning

• It IS

– Traffic, User Growth

– Dataset, Database Size Growth


What is Scalability?

• Scalability – “Scalability is a measure of the number of users a system can effectively support at the same time without degrading the defined performance”

– Has limits – E.g. “With two load-balanced servers, it should support 1000 concurrent users with an average response time of 3 seconds”

• “Performance is what an individual user experiences; Scalability is how many users get to experience it TOGETHER”


What is the Concern?

• Scalability is a business concern

– Google observed that a 500-millisecond delay in page response caused a 20% decrease in traffic

– Amazon.com observed that a 100-millisecond delay caused a 1% decrease in retail revenue

– Remember “Performance is what an individual user experiences; Scalability is how many users get to experience it TOGETHER”


Handling Scalability – Degraded Application

• Degraded Application – Doing nothing and losing business


Handling Scalability - Throttling

• Throttling – Temporarily stop accepting new requests in order to better serve existing or important users
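
A minimal sketch of request throttling, assuming a token-bucket limiter (one common technique, not necessarily the one the speaker had in mind). The names `TokenBucket`, `handle`, and the `priority` flag are hypothetical and only illustrate how new requests can be turned away while important users are still served.

```python
import time


class TokenBucket:
    """Admit at most `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.clock = clock
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill tokens for the time elapsed since the last call.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def handle(request, bucket):
    # Important users bypass the throttle; everyone else needs a token.
    if request.get("priority") or bucket.allow():
        return 200, "OK"
    return 503, "Server busy, please try again later"
```

Rejecting early with a clear "try again later" response keeps the capacity that remains available for requests already in progress.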


Handling Scalability – Adding Resources

• Adding Resources

– Scaling up – Vertical Scaling

• Get Bigger

• Widening the roads

– Scaling out – Horizontal Scaling

• Get More

• Routing the traffic (Partitioning)


Typical Web Application Resources

• Web Server, Application Server (Middle Tier) and Database Tier

[Diagram: Web Server, Application Server, Database Server]

Scaling Solutions

• Vertical Scaling OR Scaling Up – Increasing resource power

– Remember widening the roads!!

• Horizontal Scaling OR Scaling Out – Adding additional machines/nodes

– Remember routing the traffic


Vertical Vs. Horizontal Scaling

Vertical Scaling                                     | Horizontal Scaling
Higher Capital Investment                            | On Demand Investment
Utilization concerns                                 | Utilization can be optimized
Relatively quicker and works with the current design | Relatively more time consuming and needs redesigning
Limiting Scale                                       | Internet Scale
Not Cloud Native Design                              | Cloud Native Design

Web/Application Server Scalability


Scaling Out Web Server – Load Balancing

[Diagram: three load-balanced Web Servers]

• Design for Fault Tolerance

– Intent: Enable the system to continue its intended operation, possibly at a reduced level, rather than failing completely, when some part of the system fails

– Drivers: Degraded service is better than no service at all. Compare cost effectiveness

– Solution:

• Load Balancing

• Monitoring, Self Healing, Restart
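
The load-balancing and self-healing ideas above can be sketched as a round-robin balancer that skips servers marked unhealthy. This is an illustration only: `LoadBalancer`, `mark_down`, and `mark_up` are hypothetical names, and a real deployment would rely on the load balancer's own health probes.

```python
import itertools


class LoadBalancer:
    """Round-robin over servers, skipping any marked unhealthy."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._cycle = itertools.cycle(self.servers)

    def mark_down(self, server):
        self.healthy.discard(server)

    def mark_up(self, server):
        # Called by monitoring once a restarted server passes its health check.
        if server in self.servers:
            self.healthy.add(server)

    def next_server(self):
        # Try each server at most once per call.
        for _ in range(len(self.servers)):
            server = next(self._cycle)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")
```

With one of three servers down, the remaining two keep serving traffic, which is the "reduced level rather than failing completely" behavior described in the intent.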

Pattern - Bi-directional Scaling

• Design for Scaling Out (Bidirectional)

– Intent: Build the deployment from commodity hardware working together for economies of scale. Optimization is easier with scaling out and in than with scaling up and down. Driven by elasticity

– Driver: Optimized utilization, cost saving

– Solution:

• Stateless Application Design

• Nothing is shared except Database

• Scaling every tier is possible – Web/Service/Database etc.
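
As a rough sketch of bidirectional scaling, the function below computes how many identical stateless instances a given load needs, so the same rule scales out when load rises and in when it falls. The function name, parameters, and the default bounds are assumptions made for illustration.

```python
import math


def desired_instances(current_load, capacity_per_instance,
                      min_instances=1, max_instances=20):
    """Return how many identical stateless instances the current load needs."""
    needed = math.ceil(current_load / capacity_per_instance)
    # Clamp so the deployment never scales in to zero or out without bound.
    return max(min_instances, min(max_instances, needed))
```

For example, if each instance is assumed to handle 500 concurrent users, 1200 users call for three instances, and the count drops back as the load does.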

Scaling Out / Horizontally: Adding/Removing Boxes

Design Principle - Stateless Design

• Stateless design increases scalability

– Don’t store anything locally on the Web Server

• Session State

– Local Sessions – Avoid – Not Scalable

• Load Balancer sticky sessions can create hot-spot load

– Central Session – Good – Distributed Cache, Database

– Client Session – Better – Client Cookie

– No Session – Awesome
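
The "Client Session" option can be sketched as a cookie that carries the session data plus an HMAC signature, so any web server can validate it without local state. This is a minimal sketch using only the Python standard library; `SECRET_KEY`, `encode_session`, and `decode_session` are hypothetical names, and the cookie is signed but not encrypted.

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical; load from configuration in practice


def encode_session(data):
    """Serialize session data into a cookie value signed with HMAC-SHA256."""
    payload = base64.urlsafe_b64encode(json.dumps(data, sort_keys=True).encode())
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + signature


def decode_session(cookie):
    """Return the session data, or None if the cookie is malformed or tampered with."""
    payload, sep, signature = cookie.rpartition(".")
    if not sep:
        return None
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))
```

Because every server shares only the key, a request can land on any box behind the load balancer, which avoids the sticky-session hot spots noted above.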


Design Principle – Loosely Coupled

• Components and layers should be loosely coupled to be able to scale each layer separately

[Diagram: Web Servers, Application Servers, Database Server]

Caching in Scalability

• Caching helps avoid the need to scale

• In-memory distributed cache offers an excellent solution to data storage bottlenecks

• Distributed caching clusters can keep growing horizontally, just like the application servers. This reduces pressure on data storage so that it is no longer a scalability bottleneck.


Design Pattern - Cache Aside Pattern

• Prefer Cache to Database for Reading

– Intent: Increase read throughput and reduce the database bottleneck

– Drivers: Distributed caches are faster and are shared across web/application servers

– Solution:

• Update both the cache and the database to keep them synchronized

• Read from Cache

• Decorator Design Pattern
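
A minimal sketch of the cache-aside steps above, written as a decorator class around a repository. Both classes are hypothetical, and plain dicts stand in for the database and the distributed cache so the example stays self-contained.

```python
class UserRepository:
    """Stand-in for the database; a dict keeps the sketch self-contained."""

    def __init__(self):
        self.rows = {}
        self.reads = 0

    def get(self, user_id):
        self.reads += 1
        return self.rows.get(user_id)

    def save(self, user_id, user):
        self.rows[user_id] = user


class CachedUserRepository:
    """Decorator: same interface as UserRepository, with a cache in front."""

    def __init__(self, inner, cache):
        self.inner = inner
        self.cache = cache  # a dict here; a distributed cache client in practice

    def get(self, user_id):
        if user_id in self.cache:
            return self.cache[user_id]      # cache hit
        user = self.inner.get(user_id)      # cache miss: fall back to the database
        if user is not None:
            self.cache[user_id] = user
        return user

    def save(self, user_id, user):
        self.inner.save(user_id, user)      # write to the database first
        self.cache[user_id] = user          # then update the cache to keep both in sync
```

Callers use `CachedUserRepository` exactly as they would the plain repository, so caching can be added or removed without touching application code.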

[Diagram: Distributed Cache with Write and Read paths]

Design Pattern - Cache Read-through/Write-through (RT/WT)

• Prefer Cache to Database

– Intent: Increase read throughput and reduce the database bottleneck. Use the cache for both reads and writes

– Drivers: Distributed caches are faster and are shared across web/application servers

– Solution:

• The application treats the cache as the main data store, reading data from it and writing data to it.

• The cache is responsible for reading this data from and writing it to the database (possibly asynchronously), thereby relieving the application of this responsibility
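
A minimal sketch of read-through/write-through, assuming the cache is given two callbacks for talking to the database. `ReadWriteThroughCache`, `load`, and `store` are hypothetical names. The write here is synchronous; deferring it to a background task is the variant usually called write-behind.

```python
class ReadWriteThroughCache:
    """The application talks only to this cache; the cache talks to the database."""

    def __init__(self, load, store):
        self._load = load      # called on a miss to read from the database
        self._store = store    # called on every write to persist to the database
        self._data = {}

    def get(self, key):
        if key not in self._data:
            self._data[key] = self._load(key)   # read-through
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._store(key, value)                 # write-through (synchronous here)
```

Unlike cache-aside, the application never touches the database directly, so any change made behind the cache's back stays invisible until the entry is refreshed.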


Database Scalability


CAP Theorem

• CAP theorem, also known as Brewer's theorem, states that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees: Consistency, Availability and Partition tolerance.

• Consistency: All clients always have the same view of the data

• Availability: Each client can always read and write

• Partition Tolerance: The system works well despite physical network partitions


CAP Theorem – Database Placements


Database Scaling – Replication - Read Mostly Pattern

• Intent: Increase database scalability by separating write and read operations

– Generally, most applications have around 80% reads and 20% writes

• Drivers: Separate read and write responsibilities, high-availability benefits

• Solution:

– Read Write Separation

– Master Slave Pattern
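
Read/write separation can be sketched as a small router that sends writes to the master and rotates reads across replicas. `ReadWriteRouter` and the connection names are hypothetical; real replicas also lag behind the master, which this sketch ignores.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the master and spread reads across read replicas."""

    def __init__(self, master, replicas):
        self.master = master
        # With no replicas, reads fall back to the master.
        self._readers = itertools.cycle(replicas or [master])

    def for_write(self):
        return self.master

    def for_read(self):
        return next(self._readers)
```

With an 80/20 read/write mix, adding replicas raises read capacity without touching the write path.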


Database Scaling – Read Write Separation

[Diagram: one database handling Reads and Writes, another handling Reads]

Design Pattern – Partitioning / Sharding

• Design for Database Sharding

– Intent: Growing data size can lead to throttling. Database scale and performance are more important than reliability (see the CAP Theorem)

– Drivers: Scaling the database layer, increasing database throughput

– Solution:

• Database Sharding / Horizontal Partitioning

• Database Federation

Database Sharding Example

Shard Resolver: Shard = User ID % 4

Shard 0: 25%

Shard 1: 25%

Shard 2: 25%

Shard 3: 25%

Example: User ID = 3 resolves to Shard 3
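
The shard resolver from the example can be written in a few lines. The function name `resolve_shard` and the constant `SHARD_COUNT` are hypothetical; the modulo rule and the count of four shards come from the slide.

```python
SHARD_COUNT = 4


def resolve_shard(user_id, shard_count=SHARD_COUNT):
    """Map a user ID to a shard index with the modulo rule from the example."""
    return user_id % shard_count
```

Sequential user IDs spread evenly, about 25% per shard. Note that changing the shard count remaps most existing IDs, so a plain modulo rule suits a fixed number of shards.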

Design Principles – Eventually Consistent

• BASE, the opposite of ACID

– Intent: Real internet-scale model. Postpone consistency.

• Basically Available, Soft state, Eventual consistency

– Solution:

• Queue-based processing model

• Change in behavior

– "Order Placed Successfully" becomes "Order Received Successfully"
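
The change in behavior above can be sketched with an in-process queue: the request handler only records and enqueues the order, and a worker completes it later. All names are hypothetical, and `queue.Queue` stands in for a durable message queue.

```python
import queue

orders = queue.Queue()
order_status = {}


def place_order(order_id, items):
    """Accept the order immediately; processing happens later."""
    order_status[order_id] = "RECEIVED"
    orders.put((order_id, items))
    return "Order received successfully"


def process_pending_orders():
    """Worker step: drain the queue and complete each order."""
    while not orders.empty():
        order_id, items = orders.get()
        # Payment, inventory and so on would run here.
        order_status[order_id] = "PLACED"
        orders.task_done()
```

Between the two calls the order is visible only as "RECEIVED"; that window is the postponed consistency the principle accepts.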


Design Principles – Asynchronous Processing

• Blocking is the bane of scalability

– Intent:

• Avoid blocking calls, reduce contention

– Solution:

• Queue-based processing model

• Fire-and-forget calls

• 1000 users each blocked for 5 seconds = 5000 user-seconds of waiting
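
One way to quantify the cost of blocking is Little's Law. As a sketch, assume the server can hold at most 1000 requests in flight and each one blocks for 5 seconds; the 0.5-second figure is a hypothetical value for the same request after slow work has been moved to a queue.

```latex
% Little's Law: L = \lambda W
% L = requests in flight, \lambda = throughput, W = time each request holds a slot
\[
\lambda_{\max} = \frac{L}{W} = \frac{1000}{5\ \text{s}} = 200\ \text{requests/s}
\qquad\text{vs.}\qquad
\frac{1000}{0.5\ \text{s}} = 2000\ \text{requests/s}
\]
```

Under these assumptions, cutting the blocking time by a factor of ten raises the sustainable throughput by the same factor without adding hardware.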


Design Principles – Parallel Design

• Design for Parallel and Reliable Work

– Intent: Increasing resources should result in a proportional increase in performance. Dependent services might not be available. Blocking is the bane of scalability

– Drivers: Higher reliability, proportional distribution

– Solution:

• Concern Independent Scaling

• Reliability through Queue

• Queue driven worker tasks - more messages more workers faster work
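
A minimal sketch of queue-driven workers, assuming threads as the workers and an in-process `queue.Queue` as the message queue. `run_workers` is a hypothetical helper; raising `worker_count` is the "more workers" knob from the last bullet.

```python
import queue
import threading


def run_workers(messages, handler, worker_count):
    """Process all messages with `worker_count` competing consumer threads."""
    work = queue.Queue()
    for message in messages:
        work.put(message)
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                message = work.get_nowait()
            except queue.Empty:
                return  # queue drained, worker exits
            outcome = handler(message)
            with lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Results arrive in completion order, not submission order. In CPython, threads mainly help with I/O-bound handlers; separate processes or machines play the worker role for CPU-bound work.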


Queue Based Pattern


Queue - Load Leveling, Load Balancing, Loose Coupling

Image source: http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-01-45-09/8712.3.gif

Design Principles – Queue Based Pattern

• Idempotent

– Design the operation to be idempotent; that is, if it's carried out more than once, it's as if it was carried out just once

– Implement the receiver in such a way that it can receive a message multiple times safely, either through a filter that removes already received messages or by adjustment of message semantics
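
The filter approach can be sketched as a receiver that remembers the message IDs it has already applied. `IdempotentReceiver` and its fields are hypothetical, and a real system would keep the seen IDs in shared, durable storage instead of in memory.

```python
class IdempotentReceiver:
    """Apply each message at most once, even if the queue delivers it again."""

    def __init__(self):
        self.seen_ids = set()
        self.balance = 0

    def receive(self, message_id, amount):
        if message_id in self.seen_ids:
            return False  # duplicate delivery: filtered out
        self.seen_ids.add(message_id)
        self.balance += amount
        return True
```

Delivering the same message twice leaves the balance unchanged after the first delivery, so the sender can retry safely.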


Design Principles – Capacity Planning

• Everything has a limit: Compose a Scale

– Intent: Design around provider SLAs and capacity

– Solution:

• Know the limits, measure the scalability and increase the scale

• E.g. Storage supports up to 10000 transactions/sec

– Add storage for higher scale

• E.g. Queue supports 5000 messages per second

– Add additional Queues (Partitioning) for additional scale
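
The capacity arithmetic above can be sketched as two helpers: one computes how many partitions a target rate needs given a per-partition limit, the other spreads messages across them. Both names are hypothetical, and the integer message key is an assumption.

```python
import math


def partitions_needed(required_rate, limit_per_partition):
    """Number of queues (or storage accounts) needed to sustain required_rate."""
    return math.ceil(required_rate / limit_per_partition)


def pick_queue(message_key, queue_count):
    """Spread messages across the queues; assumes an integer key."""
    return message_key % queue_count
```

For instance, a target of 12000 messages per second against a limit of 5000 per queue calls for three queues.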


Design Pattern – Multi Site Deployment Pattern

[Diagram: two sites, Asia and United States, each with Web Servers, Application Servers and a Database Server, with Sync between the sites]

Routing

• Performance Based

• Round Robin

• Failover

Summary


Scalability Principles

• Stateless

• Parallelization

• Asynchronous

• Partitioning

• Idempotent

• Fault Tolerance

Vertical Vs. Horizontal Scaling

Vertical Scaling    | Horizontal Scaling
ACID                | BASE
Focus on Commit     | Availability First
Pessimistic Locking | Optimistic Locking
Transactional       | Shared Nothing
Favor Consistency   | Maximum Scalability

Most distributed systems realize both

Thank You !


Some of the images were found through Google search; due credit goes to their sources. The author does not claim any creation or originality of the contents. They are used only for learning purposes.
