acms: the akamai configuration management system a. sherman, p. h. lisiecki, a. berkheimer, and j....
TRANSCRIPT
![Page 1: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/1.jpg)
ACMS: The Akamai Configuration Management SystemA. Sherman, P. H. Lisiecki, A. Berkheimer, and J.
Wein
Presented byParya Moinzadeh
![Page 2: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/2.jpg)
The Akamai Platform
•Over 15,000 servers•Deployed in 1200+ different ISP networks•In 60+ countries
![Page 3: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/3.jpg)
Motivation
•Customers need to maintain close control over the manner in which their web content is served
•Customers need to configure different options that determine how their content is served by the CDN
•Need for frequent updates or “reconfigurations”
![Page 4: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/4.jpg)
Akamai Configuration Management System (ACMS)•Supports configuration propagation
management▫Accepts and disseminates distributed
submissions of configuration information•Availability•Reliability•Asynchrony•Consistency•Persistent storage
![Page 5: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/5.jpg)
Problem
•The widely dispersed set of end clients•At any point in time some servers may be
down or have connectivity problems•Configuration changes are generated
from widely dispersed places•Strong consistency requirements
![Page 6: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/6.jpg)
Assumptions• The configuration les will vary in size from a few
hundred bytes up to 100MB• Most updates must be distributed to every Akamai node• There is no particular arrival pattern of submissions• The Akamai CDN will continue to grow• Submissions could originate from a number of distinct
applications running at distinct locations on the Akamai CDN
• Each submission of a configuration file foo completely overwrites the earlier submitted version of foo
• For each configuration file there is either a single writer or multiple idempotent (non-competing) writers
![Page 7: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/7.jpg)
Requirements
•High Fault-Tolerance and Availability•Efficiency and Scalability•Persistent Fault-Tolerant Storage•Correctness•Acceptance Guarantee•Security
![Page 8: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/8.jpg)
ApproachFront- end
Back-endArchitecture
Requirements
Update management
Delivery
Small set of storage points
The entire Akamai CDN
![Page 9: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/9.jpg)
Architecture
Edge Servers
Storage Points
SPSP
SPSP
SPSP SPSP
SPSP
Publishers
Accepting SP
![Page 10: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/10.jpg)
Quorum-based Replication
•A quorum is defined as a majority of the ACMS SPs
•Any update submission should be replicated and agreed upon by the quorum
•A majority of operational and connected SPs should be maintained
•Every future majority overlaps with the earlier majority that agreed on a file
![Page 11: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/11.jpg)
Acceptance Algorithm
Acceptance Algorithm
Replication
Agreement
the Accepting SP copies the update to at least a quorum of the SPs
Vector exchange protocol
![Page 12: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/12.jpg)
Acceptance Algorithm Continued
SP
SP
SP
SP S
PSP
SP
SP
SP
SP
UID for a configuration
file foo:“foo.A.1234”
A publisher contacts an accepting SP
The Accepting SP first creates a temporary file with a unique filename (UID)
The accepting SP sends this file to a number of SPs
If replication succeeds the accepting SP initiates an agreement algorithm called Vector Exchange
Upon success the accepting SP “accepts” and all SPs upload the new file
Agreement
algorithm ???
![Page 13: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/13.jpg)
Acceptance Algorithm Continued
•The VE vector is just a bit vector with a bit corresponding to each Storage Point
•A 1-bit indicates that the corresponding Storage Point knows of a given update
•When a majority of bits are set to 1, we say that an agreement occurs and it is safe for any SP to upload this latest update
![Page 14: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/14.jpg)
Acceptance Algorithm Continued• The Accepting SP sets its own bit to 1
and the rest to 0,• broadcasts the vector along with the
UID of the update to the other SPs• Any SP that sees the vector sets its corresponding bit to 1,
• re-broadcasts the modified vector to the rest of the SPs
•Each SP learns of the agreement independently when it sees a quorum of bits set
•When the Accepting SP that initiated the VE instance learns of the agreement it accepts the submission of the publishing application
•When any SP learns of the agreement it uploads the file
![Page 15: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/15.jpg)
Acceptance Algorithm Continued
SP
SP
SP
SP
SP
SP
SP
SP
SP
SP
“A” initiates and broadcasts a vector:
A:1 B:0 C:0 D:0 E:0
“C” sets its own bit and re-broadcasts:
A:1 B:0 C:1 D:0 E:0
“D” sets its bit and rebroadcastsA:1 B:0 C:1 D:1 E:0
Any SP learns of the “agreement” when it sees a majority of bits set.
AB
CD
E
![Page 16: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/16.jpg)
Recovery
•The recovery protocol is called Index Merging
•SPs continuously run the background recovery protocol with one another
•The downloadable configuration files are represented on the SPs in the form of an index tree
•The SPs “merge” their index trees to pick up any missed updates from one another
•The Download Points also need to sync up state
![Page 17: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/17.jpg)
The Index Tree
•Snapshot is a hierarchical index structure that describes latest versions of all accepted files
•Each SP updates its own snapshot when it learns of a quorum agreement
•For full recovery each SP needs only to merge in a snapshot from majority-1 other SPs
•Snapshots are also used by the edge servers to detect changes
![Page 18: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/18.jpg)
Data Delivery
•Processes on edge servers subscribe to specific configurations via their local Receiver process
•A Receiver checks for updates to the subscription tree by making HTTP IMS requests recursively
• If the updates match any subscriptions the Receivers download the files
![Page 19: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/19.jpg)
Evaluation• Workload of the ACMS front-end over a 48 hour
period in the middle of a work week• 14,276 total file submissions on the system • Five operating Storage Points
Size range Avg file sz Distribution Avg time(s)
0K-1K 290 40% 0.61
1K-10K 3K 26% 0.63
10K-100K 22K 23% 0.72
100K-1M 167K 7% 2.23
1M-10M 1.8M 1% 13.63
10M-100M 51M 3% 199.87
The period from the time an Accepting SP is first
contacted by a publishing application, until it replies
with “Accept”
![Page 20: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/20.jpg)
Propagation Time Distribution• A random sampling
of 250 Akamai nodes
• The average propagation time is approximately 55 seconds
![Page 21: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/21.jpg)
Propagation times for various size files
• Mean and 95th-percentile delivery time for each submission
• 99.95% of updates arrived within three minutes
• The remaining 0.05% were delayed due to temporary network connectivity issues
The average time for each file to propagate to 95% of its recipients
The average propagation time
![Page 22: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/22.jpg)
Discussion• Push-based vs. pull-based update
• The effect of increasing the number of SP on the efficiency
• The effect of having less number on nodes in the quorum
• The effect of having a variable sized quorum
• Consistency vs. availability trade-offs in quorum selection
• How is a unique and synchronized ordering of all update versions of a given configuration file maintained? Can it be optimized?
• Is VE expensive? Can it be optimized?
• Can we optimize the index tree structure?
• The trade-off of having great cacheability…
![Page 23: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/23.jpg)
CS525 – Dynamo vs. Bigtable
Keun Soo Yim
April 21, 2009
• Dynamo: Amazon’s Highly Available Key-value Store G. DeCandia et al. (Amazon), SOSP 2007.
• Bigtable: A Distributed Storage System for Structured Data F. Chang et al. (Google), OSDI 2006.
![Page 24: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/24.jpg)
Scalable Distributed Storage Sys.
•RDBMS and NFS▫High throughput and Scalability▫High-availability vs. Consistency▫Relational data processing model and security▫Cost-effectiveness
![Page 25: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/25.jpg)
Amazon Dynamo – Consistent Hashing
•Node: Random Value Position in the ring•Data: Key Position in the ring
Interface• Get(Key)• Put(Key, Data,
[Context])
![Page 26: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/26.jpg)
Virtual Node for Load Balancing
• If a node fails,the load is evenly dispersed across the rest.
• If a node joins,its virtual nodes accept a roughly equivalent amount of load from the rest.
• How does it handle heterogeneity in nodes? # virtual nodes for a node is decided based on
the node capacity
Physical Node
![Page 27: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/27.jpg)
Replication for High Availability
•Each data item is replicated at N hosts.•Key is stored in N-1 clockwise successors.•N is a per-instance configurable parameter
![Page 28: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/28.jpg)
Quorum for Consistency
• W + R > N▫ W: 2▫ R: 2
Consistency Insurance
Consistency Insurance
Write
ReadWrite
Read
• Slow Write• Write: 3• Read: 1
• Ambiguous& Slow Read(cache)
Write
Read
Consistency Insurance
![Page 29: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/29.jpg)
Vector Clock for Eventual Consistency
•Vector clock: a list of (node, cnt)
•Client is asked for reconciliation.
•Why reconciliation isnot likely to happen?
![Page 30: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/30.jpg)
Latency of 99.9 Percentile
Service Level Agreements (SLA)
![Page 31: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/31.jpg)
Load Balancing
![Page 32: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/32.jpg)
Google’s Bigtable
• Key: <Row, Column, Timestamp>▫ Rows are ordered lexicographically
▫ Column = family:optional_qualifier
• API: lookup, insert, and delete▫ No support for relational DBMS model
![Page 33: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/33.jpg)
Tablet•Tablet (size: 100-200MB)
▫a set of adjacent rows edu.illinois.cs/i.html, edu.illinois.csl/i, edu.illinois.ece/i.html
▫unit of distribution and load balancing▫ Each tablet lives at only one table server▫ Tablet server splits tablets that get too big
Index
64K block
64K block
64K block
SSTable
Index
64K block
64K block
64K block
SSTable
Tablet Start:aardvark
End:apple
Immutable, sorted file of key-value pairs
![Page 34: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/34.jpg)
System Organization
Scheduler Master
(Google WorkQueue)
Lock Service
(Chubby, OSDI’06)
GFS
Master
GFS
Chunk Server
Scheduler
Slave
Linux
<Tablet Server 1>
GFS
Chunk Server
Scheduler
Slave
Linux
Tablet Server N
BigTable
Server
BigTable
Master
<Masters>
BigTableClient
<Clients>
• Master for load balancing and fault tolerance• Metadata: Use Chubby to monitor health of tablet
servers, restart failed servers [OSDI’06]• Data: GFS replicates data. [SOSP’03]
![Page 35: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/35.jpg)
Finding a Tablet
• In most cases, clients directly communicate with the Tablet server
![Page 36: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/36.jpg)
Editing a Table
•Mutations are logged, then applied to an in-memory version
•Logfile stored in GFS
![Page 37: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/37.jpg)
Table
• Multiple tablets make up the table• SSTables can be shared• Tablets do not overlap, SSTables can overlap
SSTable SSTable SSTable SSTable
Tablet
aardvark apple
Tablet
apple_two_E boat
![Page 38: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/38.jpg)
Scalability
![Page 39: ACMS: The Akamai Configuration Management System A. Sherman, P. H. Lisiecki, A. Berkheimer, and J. Wein Presented by Parya Moinzadeh](https://reader035.vdocuments.mx/reader035/viewer/2022062516/56649dbd5503460f94aafc01/html5/thumbnails/39.jpg)
Discussion Points• What’s the difference between these two and
NFS/DBMS in terms of interface?• Non-hierarchical name space
Key vs. <Row,Col,Timestamp>• Dynamo vs. Bigtable?
• Partitioning: Hashing without master vs. Alphabet ordered key with master
• Consistency: Quorum/Versioning vs. Chubby
• Fault tolerance: Replication vs. GFS• Load Balancing: Virtual Node vs. Tablet