Service Primitives for Internet Scale Applications Amr Awadallah, Armando Fox, Ben Ling Computer Systems Lab Stanford University



Service Primitives for Internet Scale Applications

Amr Awadallah, Armando Fox, Ben Ling

Computer Systems Lab, Stanford University

2 Service Primitives for Interactive Internet-Scale Applications. Amr Awadallah, Armando Fox, Ben Ling. ROC, Sep 2002.

Interactive Internet-Scale Application?

[Figure: architecture diagram. Labels: millions of users; Global LB; several data centers, each with a Local LB, Presentation Servers (PS) + $, an LB, Application Servers (AS) + $, fail over, and State; one State Replica.]

Motivation

A general framework to describe interactive Internet-scale applications (IIAs) and characterize the functional properties that can be traded away to improve the following operational metrics:

Throughput (how many user requests per second?)

Interactivity (latency: how fast do user requests finish?)

Availability (% of time user perceives service as up), including fast recovery to improve availability

TCO (Total Cost of Ownership)

In particular, enumerate architectural primitives that expose partial degradation of functional properties and illustrate how they can be built with “commodity” HW.

Recall ACID

Atomicity: For a transaction involving two or more discrete pieces of information, either all pieces changed are committed or none.

Consistency: A transaction creates a new valid state obeying all user integrity constraints.

Isolation: Changes from non-committed transactions remain hidden from all other concurrent transactions (isolation levels: Serializable, Repeatable Read, Read Committed, Read Uncommitted)

Durability: Committed data survives beyond system restarts and storage failures.

ACID is too much for Internet scale

Yahoo UDB: tens of thousands of reads/sec, up to 10k writes/sec

Geoplexing is used for both disaster recovery and scalability, but eager replication (strong consistency) across replicas scales poorly:

If total DB size grows with the number of nodes, the deadlock rate increases at the same rate as the number of nodes

If DB size grows sublinearly, the deadlock rate increases as the cube of the number of nodes

Even if we could use transactional DB’s and eager replication, cost would be too high

The New Properties

Durability (State): Hard, Soft, Stateless

Consistency: Strong, Eventual, Weak, NonC

Completeness: Full, Incomplete Reads (Incomp-R), Lossy Writes (Lossy-W)

Visibility: User, Entity, World
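The four axes above can be written down as a small data structure. The following is an illustrative sketch, not something taken from the talk: the class names and the `StateRequirements` record are assumptions, the `NonC` consistency level is left out because the slides do not define it, and the example classification of a poll counter is a guess apart from its lossy writes, which the slides mention.

```python
from dataclasses import dataclass
from enum import Enum


class Durability(Enum):
    HARD = "hard"
    SOFT = "soft"
    STATELESS = "stateless"


class Consistency(Enum):
    STRONG = "strong"
    EVENTUAL = "eventual"
    WEAK = "weak"


class Completeness(Enum):
    FULL = "full"
    INCOMPLETE_READS = "incomp-r"
    LOSSY_WRITES = "lossy-w"


class Visibility(Enum):
    USER = "user"
    ENTITY = "entity"
    WORLD = "world"


@dataclass(frozen=True)
class StateRequirements:
    """Hypothetical record: what one piece of application state needs."""
    durability: Durability
    consistency: Consistency
    completeness: Completeness
    visibility: Visibility


# Assumed classification of an online-poll counter. Only LOSSY_WRITES is
# grounded in the slides; the other three choices are illustrative.
POLL_COUNTER = StateRequirements(
    durability=Durability.HARD,
    consistency=Consistency.EVENTUAL,
    completeness=Completeness.LOSSY_WRITES,
    visibility=Visibility.WORLD,
)
```

Describing each piece of state this way makes explicit which properties a service is willing to degrade for it.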

Durability (Hard, Soft, Stateless)

Hard: This is permanent state in the original sense of the D in ACID.

Soft: This is temporary storage in the RAM sense, i.e., if power fails, the data is lost. It is cheaper, and acceptable if the user can rebuild the state quickly.

Stateless: No need to store state on behalf of the user.

Consistency (Strong, Eventual, Weak)

Eventual: after a write, there is some time t after which all reads see the new value (e.g., caching).

Strong: in addition, before time t, no reads see the new value (single-copy ACID consistency)

Weak: This is weak consistency in the TACT sense - captures ordering inaccuracies, or persistent staleness.
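The caching example of eventual consistency can be sketched as a read-through cache with a time-to-live. This is a minimal illustration under stated assumptions: `TTLCache`, its parameters, and the injectable `clock` are hypothetical names, and a plain dict stands in for the authoritative store. Any entry cached before a write expires at most `ttl` seconds after that write, so `ttl` plays the role of the time t in the definition.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Read-through cache whose entries expire after `ttl` seconds."""

    def __init__(self, backing: Dict[str, str], ttl: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._backing = backing  # authoritative store (a dict in this sketch)
        self._ttl = ttl
        self._clock = clock      # injectable so the behavior can be tested
        self._entries: Dict[str, Tuple[str, float]] = {}  # key -> (value, fetched_at)

    def read(self, key: str) -> str:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[1] < self._ttl:
            return hit[0]  # served from cache, possibly stale
        value = self._backing[key]  # expired or missing: refetch
        self._entries[key] = (value, now)
        return value


# Usage with a fake clock: the write becomes visible once the entry expires.
now = [0.0]
store = {"motd": "old"}
cache = TTLCache(store, ttl=5.0, clock=lambda: now[0])
first = cache.read("motd")    # caches "old" at time 0
store["motd"] = "new"         # write goes to the backing store only
now[0] = 1.0
stale = cache.read("motd")    # still "old": within the TTL
now[0] = 5.0
fresh = cache.read("motd")    # "new": the entry has expired
```

Under strong consistency the stale read at time 1 would not be allowed; here it is the price paid for serving reads locally.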

Completeness (Full, Incomp, Lossy)

Complete: all updates either succeed, or fail synchronously. All queries return 100% accurate data.

Incomplete Queries: Aggregated lossy reads over partitioned state, or state sampling. The best example is Inktomi’s distributed search, where it is acceptable for some partitions to return no results under load.

Lossy Updates: It is acceptable for some committed writes to be lost. Examples: lossy counters and online polls.
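An incomplete query over partitioned state might look like the sketch below. It is an assumption-laden illustration, not Inktomi's design: `incomplete_query`, the `Partition` callable type, and the `deadline_s` parameter are hypothetical names. Every partition is queried in parallel, and whatever has arrived by the deadline is returned together with the fraction of partitions that answered.

```python
from concurrent.futures import ThreadPoolExecutor, wait
from typing import Callable, List, Sequence, Tuple

# Hypothetical partition interface: takes a query, returns matching ids.
Partition = Callable[[str], List[str]]


def incomplete_query(partitions: Sequence[Partition], query: str,
                     deadline_s: float) -> Tuple[List[str], float]:
    """Query all partitions in parallel; keep what arrives by the deadline.

    Returns (results, coverage), where coverage is the fraction of
    partitions that answered successfully in time.
    """
    if not partitions:
        return [], 1.0
    pool = ThreadPoolExecutor(max_workers=len(partitions))
    try:
        futures = [pool.submit(p, query) for p in partitions]
        done, _not_done = wait(futures, timeout=deadline_s)
        results: List[str] = []
        answered = 0
        for fut in futures:  # partition order, for stable output
            if fut in done and fut.exception() is None:
                results.extend(fut.result())
                answered += 1
        return results, answered / len(partitions)
    finally:
        # Do not block on stragglers; their answers are simply dropped.
        pool.shutdown(wait=False)
```

Returning the coverage alongside the results lets the caller decide whether a partial answer is good enough to show.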

Visibility (World, Entity, User)

World: The state and changes to it are visible to all the world, e.g. listing a product on eBay.

Entity: State is only visible to a group of users, or within a specific subset of the data (e.g., eBay Jewelry).

User: The state and changes to it are only visible to the user interacting with it, e.g. the MyYahoo user profile. This could be simpler to implement using ReadMyWrites techniques.

Architectural Primitives

Primitives | Trades | Gains

Caching, Replication | Eventual Consistency | Interactiveness, Availability, Throughput

Partitioning | Entity Visibility | Interactiveness, Graceful Degradation

Lossy/Sampled Aggregation | Weak Consistency | Interactiveness, Graceful Degradation

Examples of Primitives

LossyUpdate(key,newVal)

LossyAccumulator(key, updateOp) - for commutative ops

LossyAggregate(searchKeys) - lossy search of an index

LossyUpdate implementation

LossyUpdate

Steve Gribble’s DHT: atomic ops, single-copy consistency; during failure recovery, reads are slower and writes are refused

If update occurs while updated partition is recovering => fail

Otherwise, update is persistent

When is this useful?
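The LossyUpdate behavior described above can be sketched as follows. This is a toy model under stated assumptions, not the API of Gribble's DHT: `LossyStore`, `Partition`, and the `recovering` flag are hypothetical, and in-memory dicts stand in for persistent partitions. An update aimed at a recovering partition fails immediately; any other update is applied.

```python
import zlib
from typing import Dict, List, Optional


class Partition:
    """One partition of the store; a dict stands in for persistent storage."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.recovering = False  # set while the partition is being rebuilt


class LossyStore:
    def __init__(self, num_partitions: int) -> None:
        self._partitions: List[Partition] = [
            Partition() for _ in range(num_partitions)
        ]

    def partition_for(self, key: str) -> Partition:
        # crc32 is stable across runs, unlike the built-in hash() for str.
        index = zlib.crc32(key.encode("utf-8")) % len(self._partitions)
        return self._partitions[index]

    def lossy_update(self, key: str, new_val: str) -> bool:
        """Apply the update, or refuse it if the partition is recovering."""
        part = self.partition_for(key)
        if part.recovering:
            return False  # refused; the caller may simply drop the update
        part.data[key] = new_val
        return True

    def read(self, key: str) -> Optional[str]:
        return self.partition_for(key).data.get(key)
```

The caller gets a synchronous success or failure and is free to ignore a failure, which is what makes the update lossy rather than blocking.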

LossyAccumulator (for hit counter, online poll, etc.)

Every period T, in-memory sub-accumulators from worker nodes are swept to the persistent copy

At the same time, the current value of the master accumulator is read by each worker node, to serve reads locally

Worker nodes don’t back up the in-memory copy => fast restart

Can bound loss rate of accumulator and inconsistency in read
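A minimal sketch of the LossyAccumulator scheme, assuming integer addition as the commutative operation. The names `MasterAccumulator`, `Worker`, `sweep`, and `restart` are hypothetical, and a plain attribute stands in for the persistent copy. A worker that restarts loses only the updates it accepted since the last sweep, and a local read lags the master by at most one period T.

```python
from typing import List


class MasterAccumulator:
    """Stands in for the persistent copy of the accumulator."""

    def __init__(self) -> None:
        self.value = 0


class Worker:
    def __init__(self) -> None:
        self.pending = 0       # in-memory sub-accumulator, never backed up
        self.cached_total = 0  # last master value seen, used to serve reads

    def add(self, delta: int) -> None:
        self.pending += delta

    def read(self) -> int:
        return self.cached_total  # may be stale by up to one period T

    def restart(self, master: MasterAccumulator) -> None:
        # Updates accepted since the last sweep are lost; nothing to replay.
        self.pending = 0
        self.cached_total = master.value


def sweep(master: MasterAccumulator, workers: List[Worker]) -> None:
    """Run once per period T."""
    for w in workers:
        master.value += w.pending  # addition commutes, so order is irrelevant
        w.pending = 0
    for w in workers:
        w.cached_total = master.value  # refresh the local read copy
```

For example, if one worker adds 3 and another adds 2 before a sweep, both read 5 afterwards; if the first then adds 4 and restarts before the next sweep, those 4 are lost and the total stays at 5 until new updates arrive.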

What is given up

Strict consistency of read copies of the accumulator

Precision of accumulator value (lost updates)

What is gained: fast recovery for each node, continuous operation despite transient per-node failures