improving docker registry design based on production workload analysis · 2019-12-18 · improving...
![Page 1: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/1.jpg)
Improving Docker Registry Design based on Production Workload Analysis
Ali Anwar, Mohamed Mohamed, Vasily Tarasov, Michael Littley, Lukas Rupprecht, Yue Cheng,
Nannan Zhao, Dimitrios Skourtis, Amit S. Warke, Heiko Ludwig, Dean Hildebrand, and Ali R. Butt
![Page 2: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/2.jpg)
The average company QUINTUPLES its Docker usage within 9 MONTHS
Source: Datadog
Containers will be a $2.7B market by 2020*
§ Containers accelerate software development and distribution.
§ In 2017 alone, Docker adoption went up by 40%.
§ Container use in enterprise and cloud infrastructure is expected to grow much faster.
*http://bit.ly/2uryjDI
![Page 3: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/3.jpg)
Docker usage patterns remain a mystery
§ How are Docker containers used and managed?
§ How can we streamline Docker workflows?
§ How do we facilitate Docker performance analysis?
![Page 4: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/4.jpg)
Our contribution: Characterization and optimization of Docker workflow
§ Conduct a large-scale analysis of a real-world Docker workload from the geo-distributed IBM container service
§ Provide insights and develop heuristics to increase Docker performance
§ Develop an open source Docker workflow analysis tool*
*https://dssl.cs.vt.edu/drtp/
![Page 5: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/5.jpg)
Background: Docker container image
§ Container images are divided into layers.
§ The metadata file is called the manifest (JSON).
§ Users create repositories to store images.
§ Images in a repository can have different tags (versions).
[Figure: a container image consists of a JSON manifest and several layers; example repositories Redis, CentOS, and myOS; example tags v2.6 and latest]
![Page 6: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/6.jpg)
Background: Docker container image
§ An image is named by the triple <user, repository, tag>.
![Page 7: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/7.jpg)
Background: Docker container registry
§ Docker container images are stored online in a Docker registry.
§ Push image (docker push): 1. HEAD layers 2. POST/PUT layer 3. PUT manifest
§ Pull image (docker pull): 1. GET manifest 2. GET layers
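The two request sequences above can be written down as plain data. This is a minimal sketch, assuming the URL layout of the Docker Registry HTTP API V2 (`/v2/<name>/blobs/...` and `/v2/<name>/manifests/...`). The helper names `push_requests` and `pull_requests`, the `<upload-id>` placeholder, and the example repository and digests are hypothetical. No network traffic is generated.

```python
def push_requests(name, tag, layer_digests, already_in_registry=()):
    """Return the (method, path) sequence a push would issue, in order."""
    requests = []
    for digest in layer_digests:
        # 1. HEAD: check whether the registry already stores this layer.
        requests.append(("HEAD", f"/v2/{name}/blobs/{digest}"))
        if digest in already_in_registry:
            continue  # layer exists, so no upload is needed
        # 2. POST opens an upload, PUT completes it.
        #    "<upload-id>" stands in for the location the registry returns.
        requests.append(("POST", f"/v2/{name}/blobs/uploads/"))
        requests.append(("PUT", f"/v2/{name}/blobs/uploads/<upload-id>?digest={digest}"))
    # 3. PUT the manifest once all layers are present.
    requests.append(("PUT", f"/v2/{name}/manifests/{tag}"))
    return requests


def pull_requests(name, tag, layer_digests, cached_locally=()):
    """Return the (method, path) sequence a pull would issue, in order."""
    # 1. GET the manifest, which lists the layers of the image.
    requests = [("GET", f"/v2/{name}/manifests/{tag}")]
    # 2. GET only the layers that are not already on the client.
    requests += [("GET", f"/v2/{name}/blobs/{d}")
                 for d in layer_digests if d not in cached_locally]
    return requests


if __name__ == "__main__":
    layers = ["sha256:aaa", "sha256:bbb"]  # hypothetical digests
    for method, path in pull_requests("user/repo", "latest", layers):
        print(method, path)
```

A pull of an image whose layers are all cached locally produces only the GET manifest request.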
![Page 8: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/8.jpg)
Background: Docker container registry
A significant amount of container startup time is spent pulling the image
![Page 9: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/9.jpg)
The IBM Cloud Docker registry traces
§ Capture a diverse set of customers: individuals, small & medium businesses, government institutions
§ Cover five geographical locations and seven availability zones
§ Span 75 days and 38M requests that account for more than 181 TB of data transferred
![Page 10: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/10.jpg)
IBM Docker registry service
Five geographical locations constitute seven Availability Zones (AZs):
§ Production: 1. Dallas (dal) 2. London (lon) 3. Frankfurt (fra) 4. Sydney (syd)
§ IBM Internal: 5. Staging (stg)
§ Testing*: 6. Prestaging (prs) 7. Development (dev)
*The registry setup is identical, except prs and dev are only half the size of the other AZs.
[Figure: IBM Cloud Registry architecture, showing Nginx, three Registry instances, Broadcaster, Stats counter, and Object store]
![Page 11: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/11.jpg)
Tracing methodology
§ Combined traces by matching the incoming HTTP request identifier across the components
§ Removed redundant fields and anonymized the traces
§ Collected data from Registry, Nginx, and Broadcaster
§ Studied requests: GET, PUT, HEAD, PATCH, POST
[Figure: tracing points at Nginx, the Registry instances, and the Broadcaster]
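The combine-and-anonymize steps listed above can be sketched as follows. This is an illustration only: the join key `id` and the field `http.request.remoteaddr` appear in the log sample, but the function names, the `debug` field, and the use of a truncated SHA-256 hash for anonymization are assumptions, not the procedure used for the published traces.

```python
import hashlib
from collections import defaultdict


def merge_by_request_id(*component_logs):
    """Join records from several components on the shared request identifier."""
    merged = defaultdict(dict)
    for log in component_logs:
        for record in log:
            merged[record["id"]].update(record)
    return dict(merged)


def anonymize(record, hash_fields, drop_fields=()):
    """Drop redundant fields and replace sensitive values with a short hash."""
    out = {}
    for key, value in record.items():
        if key in drop_fields:
            continue
        if key in hash_fields:
            # Assumption: a truncated SHA-256 stands in for the real scheme.
            value = hashlib.sha256(str(value).encode()).hexdigest()[:12]
        out[key] = value
    return out


if __name__ == "__main__":
    nginx_log = [{"id": "r1", "http.request.remoteaddr": "10.0.0.1"}]
    registry_log = [{"id": "r1", "http.request.method": "GET", "debug": "x"}]
    merged = merge_by_request_id(nginx_log, registry_log)
    print(anonymize(merged["r1"],
                    hash_fields={"http.request.remoteaddr"},
                    drop_fields={"debug"}))
```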
![Page 12: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/12.jpg)
{"host":" ","http.request.duration":0.879271282,"http.request.method":"GET","http.request.remoteaddr":" ","http.request.uri":"v2/ / /blobs/ ","http.request.useragent":"docker/17.04.0-cego/go1.7.5..)","http.response.status":200,"http.response.written":1518,"id":" ","timestamp":"2017-07-01T01:39:37.098Z"}
Anonymized log sample
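A record in this format can be parsed with the standard library. The sketch below assumes that a URI containing `blobs` refers to a layer and one containing `manifests` refers to a manifest, following the Docker Registry HTTP API V2 layout. The function name `parse_record` and the filled-in field values are hypothetical, since the published sample has those values blanked out.

```python
import json
from datetime import datetime


def parse_record(line):
    """Extract method, object kind, response size, and timestamp from one log line."""
    rec = json.loads(line)
    parts = rec["http.request.uri"].strip("/").split("/")
    if "blobs" in parts:
        kind = "layer"
    elif "manifests" in parts:
        kind = "manifest"
    else:
        kind = "other"
    return {
        "method": rec["http.request.method"],
        "kind": kind,
        "bytes": rec["http.response.written"],
        "duration": rec["http.request.duration"],
        "time": datetime.strptime(rec["timestamp"], "%Y-%m-%dT%H:%M:%S.%fZ"),
    }


if __name__ == "__main__":
    # Hypothetical values fill the fields that are blank in the anonymized sample.
    sample = json.dumps({
        "host": "h1",
        "http.request.duration": 0.879271282,
        "http.request.method": "GET",
        "http.request.remoteaddr": "a1",
        "http.request.uri": "v2/u1/r1/blobs/d1",
        "http.response.status": 200,
        "http.response.written": 1518,
        "id": "x1",
        "timestamp": "2017-07-01T01:39:37.098Z",
    })
    print(parse_record(sample))
```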
![Page 13: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/13.jpg)
Q1: What is the distribution of request types?
[Figure: share of pull vs. push requests (0%–100%) per availability zone: dal, lon, fra, syd, stg, prs, dev]
80%–95% of requests are reads (pulls)
Production: dal, lon, fra, syd; IBM internal: stg; Testing: prs, dev
![Page 14: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/14.jpg)
Q1: What is the distribution of request types?
[Figure: share of GET, POST, HEAD, PUT, and PATCH requests (0%–100%) per availability zone: dal, lon, fra, syd, stg, prs, dev]
60% of the requests are GET and 10%–22% are HEAD requests
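A per-method breakdown like the one above can be computed from parsed trace records with a counter. This is a minimal sketch: the function name `method_shares` is hypothetical, and the toy records are made up for illustration rather than taken from the traces.

```python
from collections import Counter


def method_shares(records):
    """Return the fraction of requests per HTTP method."""
    counts = Counter(r["http.request.method"] for r in records)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {method: count / total for method, count in counts.items()}


if __name__ == "__main__":
    # Toy input: 6 GET, 2 HEAD, 1 PUT, 1 POST.
    toy = ([{"http.request.method": "GET"}] * 6
           + [{"http.request.method": "HEAD"}] * 2
           + [{"http.request.method": "PUT"}]
           + [{"http.request.method": "POST"}])
    for method, share in sorted(method_shares(toy).items()):
        print(f"{method}: {share:.0%}")
```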
![Page 15: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/15.jpg)
Q2: What is the manifest size distribution?
Typical manifest size is around 1 KB
![Page 16: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/16.jpg)
Q3: What is the layer size distribution?
65% of the layers are smaller than 1 MB and around 80% are smaller than 10 MB
![Page 17: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/17.jpg)
Q3: What is the layer size distribution?
There is a significant opportunity for caching the layers
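The size statistics quoted for Q3 are points on a cumulative distribution. The sketch below shows how such points could be read off a list of layer sizes. The function name, the decimal definition of MB, and the toy sizes are assumptions for illustration.

```python
MB = 10**6  # assumption: decimal megabytes


def fraction_smaller_than(sizes, threshold):
    """Return the fraction of layers whose size is strictly below `threshold` bytes."""
    if not sizes:
        return 0.0
    return sum(1 for s in sizes if s < threshold) / len(sizes)


if __name__ == "__main__":
    toy_sizes = [100, 500_000, 5 * MB, 50 * MB]  # made-up layer sizes in bytes
    print(fraction_smaller_than(toy_sizes, 1 * MB))
    print(fraction_smaller_than(toy_sizes, 10 * MB))
```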
![Page 18: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/18.jpg)
Q4: Is there spatial locality?
1% of the most accessed layers account for 42% and 59% of all requests in dal and syd, respectively
![Page 19: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/19.jpg)
Q4: Is there spatial locality?
[Figure: % of requests (0%–20%) vs. popularity rank (1–10) per availability zone: dal, lon, fra, syd, stg, prs, dev]
The popularity rate drops rapidly as we move from the most popular to the tenth most popular layer
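The skew described for Q4 can be measured by ranking layers by access count and summing the requests that go to the top ranks. This is a minimal sketch with a hypothetical function name and a made-up access list. It uses integer percentages so that the number of top-ranked layers is computed without floating-point rounding.

```python
from collections import Counter


def top_share(layer_accesses, top_percent=1):
    """Return the fraction of all requests that hit the top `top_percent`% of layers."""
    if not layer_accesses:
        return 0.0
    counts = Counter(layer_accesses)
    k = max(1, len(counts) * top_percent // 100)  # number of top-ranked layers
    top_requests = sum(count for _, count in counts.most_common(k))
    return top_requests / len(layer_accesses)


if __name__ == "__main__":
    # Toy input: 100 distinct layers, one of them requested 101 times.
    accesses = ["hot"] * 101 + [f"layer{i}" for i in range(99)]
    print(top_share(accesses, top_percent=1))
```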
![Page 20: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/20.jpg)
Q5: Can future requests be predicted?
GET manifest requests are not followed by any subsequent GET layer request
![Page 21: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/21.jpg)
Q5: Can future requests be predicted?
Significant increase in subsequent GET layer requests within a session
![Page 22: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/22.jpg)
Q5: Can future requests be predicted?
Strong correlation between requests → GET layer requests can be predicted → opportunity for layer prefetching
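One way to quantify this correlation is to count how many GET manifest requests are followed by a GET layer request from the same client to the same repository within a time window. The sketch below assumes that definition of a session. The function name, tuple layout, and toy trace are hypothetical, and the quadratic scan is only suitable for small inputs.

```python
def followed_fraction(requests, window):
    """Fraction of GET manifest requests followed by a GET layer within `window` seconds.

    `requests` is a list of (time, client, repo, kind) tuples describing GET
    requests, where kind is "manifest" or "layer".
    """
    manifests = [r for r in requests if r[3] == "manifest"]
    if not manifests:
        return 0.0
    followed = 0
    for t, client, repo, _ in manifests:
        if any(kind == "layer" and c == client and rp == repo and t < t2 <= t + window
               for t2, c, rp, kind in requests):
            followed += 1
    return followed / len(manifests)


if __name__ == "__main__":
    toy = [
        (0, "c1", "repo", "manifest"),
        (5, "c1", "repo", "layer"),
        (10, "c2", "repo", "manifest"),
        (500, "c2", "repo", "layer"),
    ]
    print(followed_fraction(toy, window=60))
    print(followed_fraction(toy, window=1000))
```

Widening the window in the toy trace raises the followed fraction, which mirrors the effect of grouping requests into sessions.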
![Page 23: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/23.jpg)
Enabling further analysis: Trace re-player
[Figure: a Master reads the Trace and distributes requests to Clients 1–3 by round robin or by hashing the client remote address; the clients send them to the Registry]
Performance analysis mode
§ Study throughput and latency
§ Understand effect of CPU, memory, storage, network
Offline analysis mode
§ Simulate prefetching and caching policies
§ Explore cache efficacy
Additional analysis
§ Analyze request arrival rate at user-defined granularity
§ Study effect of deduplication on registry size
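The two distribution strategies named for the re-player, round robin and hashing on the client remote address, can be sketched as below. This is not the re-player's code: the function names are hypothetical and CRC32 is an assumed hash, chosen because it is stable across Python runs.

```python
import zlib


def dispatch_round_robin(trace, n_clients):
    """Assign request i to client i mod n_clients."""
    queues = [[] for _ in range(n_clients)]
    for i, request in enumerate(trace):
        queues[i % n_clients].append(request)
    return queues


def dispatch_by_hash(trace, n_clients):
    """Assign all requests from the same remote address to the same client."""
    queues = [[] for _ in range(n_clients)]
    for request in trace:
        addr = request["http.request.remoteaddr"]
        queues[zlib.crc32(addr.encode()) % n_clients].append(request)
    return queues


if __name__ == "__main__":
    trace = [{"http.request.remoteaddr": a} for a in "abacba"]  # toy addresses
    print([len(q) for q in dispatch_round_robin(trace, 3)])
    print([len(q) for q in dispatch_by_hash(trace, 3)])
```

Round robin balances load evenly, while hashing keeps each original client's requests together on one replay client.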
![Page 24: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/24.jpg)
Effect of backend storage technologies
Experimental setup:
§ Registry on a 32-core machine with 64 GB RAM and 512 GB SSD
§ Swift object store on 10 similar nodes
§ Trace re-player on 6 additional nodes
![Page 25: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/25.jpg)
Effect of backend storage technologies
Fast backend storage/cache for the registry can significantly improve the overall performance
![Page 26: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/26.jpg)
Effect of a two-level Main Memory+SSD cache
Experimental setup:
§ Small layers (<100 MB) are stored in main memory
§ Replacement policy for both cache levels is LRU
§ Studied cache sizes:
RAM: 2%, 4%, 6%, 8%, and 10% of the data ingress
SSD: 10x, 15x, 20x the size of the RAM cache
§ Layers are content addressable → cache invalidation is not a problem
![Page 27: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/27.jpg)
Two-level cache: Main memory+SSD
[Figure: hit ratio (0–1) vs. RAM cache size as a share of data ingress (2%–10%) for Dallas; series: LRU:mem, LRU:mem+SSD(10x), LRU:mem+SSD(15x), LRU:mem+SSD(20x)]
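A two-level LRU cache of this kind can be simulated offline over a sequence of layer requests. The sketch below is an illustration under stated assumptions, not the authors' simulator: layers below the size threshold enter memory, layers evicted from memory are demoted to the SSD level, larger layers go straight to SSD, and SSD hits are not promoted. The names `LRU` and `simulate` and the toy trace are hypothetical.

```python
from collections import OrderedDict


class LRU:
    """Byte-capacity LRU cache keyed by layer digest."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.items = OrderedDict()  # digest -> size, least recently used first

    def get(self, key):
        if key in self.items:
            self.items.move_to_end(key)  # mark as most recently used
            return True
        return False

    def put(self, key, size):
        """Insert a layer and return the (digest, size) pairs evicted to make room."""
        evicted = []
        if size > self.capacity or key in self.items:
            return evicted
        while self.used + size > self.capacity:
            old_key, old_size = self.items.popitem(last=False)
            self.used -= old_size
            evicted.append((old_key, old_size))
        self.items[key] = size
        self.used += size
        return evicted


def simulate(trace, mem_bytes, ssd_bytes, small=100 * 2**20):
    """Return the hit ratio of a memory+SSD cache over (digest, size) requests."""
    mem, ssd = LRU(mem_bytes), LRU(ssd_bytes)
    hits = 0
    for digest, size in trace:
        if mem.get(digest) or ssd.get(digest):
            hits += 1
        elif size < small and size <= mem.capacity:
            for old in mem.put(digest, size):
                ssd.put(*old)  # assumption: memory evictions are demoted to SSD
        else:
            ssd.put(digest, size)
    return hits / len(trace) if trace else 0.0


if __name__ == "__main__":
    toy = [("a", 1), ("b", 1), ("a", 1), ("c", 1), ("b", 1)]
    print(simulate(toy, mem_bytes=2, ssd_bytes=0))
    print(simulate(toy, mem_bytes=2, ssd_bytes=10))
```

Because layers are content addressable, a cached digest never goes stale, so the sketch needs no invalidation logic.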
![Page 28: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/28.jpg)
Benefit of layer prefetching
[Figure: request timeline PUT layer, GET manifest, GET layer, annotated with the thresholds LMthresh and MLthresh]
[Figure: hits/prefetch (0–12) vs. LMthresh (1h, 12h, 1d); series: MLthresh of 1 hour, 12 hours, and 1 day]
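The hits/prefetch metric can be illustrated with a small offline simulation. The sketch below assumes one reading of the thresholds: a GET manifest triggers a prefetch of every layer that was PUT to the same repository within the last LMthresh seconds, and a prefetched layer counts as a hit for each GET layer that arrives within MLthresh seconds of that manifest request. The function name, event layout, and toy events are hypothetical.

```python
def hits_per_prefetch(events, lm_thresh, ml_thresh):
    """Return cache hits per prefetched layer over time-ordered events.

    `events` holds (time, op, repo, layer) tuples, where op is "PUT layer",
    "GET manifest", or "GET layer", and layer is None for manifest requests.
    """
    recent_puts = {}  # repo -> {layer: time of the latest PUT}
    expiry = {}       # (repo, layer) -> time until which the prefetched copy is kept
    prefetches = hits = 0
    for t, op, repo, layer in events:
        if op == "PUT layer":
            recent_puts.setdefault(repo, {})[layer] = t
        elif op == "GET manifest":
            for put_layer, put_time in recent_puts.get(repo, {}).items():
                recently_pushed = t - put_time <= lm_thresh
                still_cached = expiry.get((repo, put_layer), float("-inf")) >= t
                if recently_pushed and not still_cached:
                    prefetches += 1
                    expiry[(repo, put_layer)] = t + ml_thresh
        elif op == "GET layer":
            if expiry.get((repo, layer), float("-inf")) >= t:
                hits += 1
    return hits / prefetches if prefetches else 0.0


if __name__ == "__main__":
    toy = [
        (0, "PUT layer", "repo", "L1"),
        (10, "GET manifest", "repo", None),
        (20, "GET layer", "repo", "L1"),
        (30, "GET layer", "repo", "L1"),
        (5000, "GET layer", "repo", "L1"),
    ]
    print(hits_per_prefetch(toy, lm_thresh=60, ml_thresh=100))
```

One prefetch can serve several later GET layer requests, which is why the ratio can exceed 1.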
![Page 29: Improving Docker Registry Design based on Production Workload Analysis · 2019-12-18 · Improving Docker Registry Design based on Production Workload Analysis Ali Anwar, Mohamed](https://reader030.vdocuments.mx/reader030/viewer/2022041019/5ecd6181eaac6c5f67389d27/html5/thumbnails/29.jpg)
Summary
§ We perform a quantitative characterization of a production Docker registry deployment
§ Registry workload is read intensive
§ Layer sizes are small
§ Strong correlation exists between layer requests
§ We propose effective caching and prefetching strategies for container layers
§ We enable further Docker investigation and optimization by making our traces and the trace re-player tool open source*
*https://dssl.cs.vt.edu/drtp/