TRANSCRIPT
Elastic Cloud Caches for Accelerating Service-Oriented Computations
Gagan Agrawal, Ohio State University, Columbus, OH
David Chiu, Washington State University, Vancouver, WA
2
Cloud Computing
• Pay-As-You-Go Computing
‣ Running 1 machine for 10 hours = running 10 machines for 1 hour
• Elasticity
‣ Cloud applications can stretch and contract their resource requirements
• “Infinite resources”
3
Outline
‣Accelerating Data Intensive Services Using the Cloud
•Motivating Application
•Design of an Elastic Cache
‣Performance Evaluation
•Up-Scaling (cache expansion)
•Down-Scaling (cache contraction)
‣Future Work & Conclusion
4
Motivating Application
Data Sources
5
Computing & Storage Resources
Geoinformatics Cyber Infrastructure: Lake Erie
6
Shared/Proprietary Web Services
= Web Service
Geoinformatics Cyber Infrastructure: Lake Erie
7
. . .
Service Interaction with Cyber Infrastructure
Service Infrastructure
8
Service Interaction with Cyber Infrastructure
. . .
invoke
results
Service Infrastructure
9
Problem: Query Intensive Circumstances
. . .
. . .
. . .
Service Infrastructure
10
Outline
‣Accelerating Data Intensive Services Using the Cloud
•Motivating Application
•Design of an Elastic Cache
‣Performance Evaluation
•Up-Scaling (cache expansion)
•Down-Scaling (cache contraction)
‣Future Work & Conclusion
11
Designing an Elastic Cache
Compute Cloud
. . .
Service Infrastructure
A
B
12
Designing an Elastic Cache
. . .
Service Infrastructure
A
B
Cache
Requests
Inserts
Misses
node = (k mod 2)
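The flow on this slide (requests check the cache first; a miss invokes the service and inserts the result) is a cache-aside lookup over two cache nodes partitioned by `(k mod 2)`. A minimal sketch, in which `invoke_service` is a placeholder assumption for the actual Web service call:

```python
# Cache-aside lookup over two cache nodes, partitioned by (k mod 2).
# `invoke_service` stands in for the expensive service call (assumption).

caches = [{}, {}]  # one in-memory dict per cache node (A and B)

def invoke_service(k):
    return k * k  # placeholder for the real service computation

def lookup(k):
    node = caches[k % 2]        # pick the cache node: node = (k mod 2)
    if k in node:               # hit: return the cached result
        return node[k]
    result = invoke_service(k)  # miss: invoke the service...
    node[k] = result            # ...and insert the result for next time
    return result
```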
13
Eventual Overloading
. . .
Service Infrastructure
A
B
Cache
Requests
Inserts
Misses
node = (k mod 2)
14
Scaling up to Meet Demand
. . .
Service Infrastructure
A
B
Cache
Requests
Compute Cloud
C
node = (k mod 2)
15
Issues with Naive Hashing
. . .
Service Infrastructure
A
B
Cache
Requests
node = (k mod 3)
C
How to incorporate node C with the least amount of “disruption”?
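The disruption from naive modular hashing can be seen by counting how many keys change nodes when the hash goes from `k mod 2` to `k mod 3`; every moved key is a cache entry that is suddenly on the wrong node. A small illustrative count:

```python
# When the node count changes from 2 to 3, most keys are remapped
# under naive modular hashing, invalidating their cached locations.

keys = range(100)
moved = sum(1 for k in keys if k % 2 != k % 3)
print(moved)  # prints 66: two-thirds of the keys change nodes
```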
16
. . .
A
B
75
25
8
Hash Intervals (buckets)
Distributed Hashtables (DHT)
17
. . .
A
B
75
25
8
invoke: service(35)
h(k) = (k mod 100)
h(35) = (35 mod 100) = 35
Which proxy has the page?
Distributed Hashtables (DHT)
A
B
75
25
8
50
C
Only records hashing into (25,50] need to be moved from A to C!
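The interval-based lookup on the ring can be sketched as follows; the hash space of 100 and the node positions (A at 75, B at 25, C at 50) come from the figure, while the `owner` helper is an illustrative assumption:

```python
import bisect

# Nodes sit on a ring over hash space [0, 100); a key belongs to the
# first node at or after h(k), wrapping around. Adding C at 50 only
# takes over the interval (25, 50], which previously belonged to A.

def owner(ring, k):
    h = k % 100                        # h(k) = (k mod 100)
    points = sorted(ring)
    i = bisect.bisect_left(points, h)  # first node position >= h
    pos = points[i % len(points)]      # wrap around the ring
    return ring[pos]

ring = {75: "A", 25: "B"}
print(owner(ring, 35))                 # prints A: A at 75 owns (25, 75]
ring[50] = "C"                         # add node C at position 50
print(owner(ring, 35))                 # prints C: C now owns (25, 50]
```

Only keys hashing into (25, 50] change owner; everything else stays put, which is the "least disruption" property the DHT provides.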
DHT to Minimize Hash Disruption when Scaling
19
That’s Not Completely Elastic
‣What about relaxing the number of nodes to help save Cloud costs?
‣First, we need an eviction scheme
20
Exponential Decay Eviction
‣At eviction time:
•A decay value is calculated for each data record in the evicted slice
•The value is higher:
- if the record was accessed more recently
- if the record was accessed frequently
•If the value is lower than some threshold, the record is evicted
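One way to realize such a score is an exponentially decayed access count: each past access contributes weight that shrinks with its age, so recent and frequent accesses dominate. A minimal sketch; the decay rate and threshold values are illustrative assumptions, not the paper's parameters:

```python
import math

# Exponential-decay score for a record: recent and frequent accesses
# contribute the most weight. DECAY and THRESHOLD are assumptions.

DECAY = 0.01      # per-second decay rate (illustrative)
THRESHOLD = 0.5   # evict records scoring below this (illustrative)

def score(access_times, now):
    # Sum of exp(-DECAY * age) over all recorded access timestamps.
    return sum(math.exp(-DECAY * (now - t)) for t in access_times)

def should_evict(access_times, now):
    return score(access_times, now) < THRESHOLD
```

A record touched twice in the last few seconds scores near 2 and survives, while a record last touched 1000 seconds ago scores near 0 and is evicted.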
21
Outline
‣Accelerating Data Intensive Services Using the Cloud
•Motivating Application
•Design of an Elastic Cache
‣Performance Evaluation
•Up-Scaling (cache expansion)
•Down-Scaling (cache contraction)
‣Future Work & Conclusion
22
Experimental Configuration
• Application
‣ Shoreline Extraction
‣ Takes 23 seconds to complete without the benefit of the cache
‣ Executed on a miss
‣Amazon EC2 Cloud
•Each Cloud node:
- Small Instances (single core, 1.2 GHz, 1.7 GB RAM, 32-bit)
- Ubuntu Linux
•Caches start out cold
•Data stored in memory only
23
Experimental Configuration
‣Our approach exploits an elastic Cloud environment
‣We compare GBA against statically allocated Cloud environments:
•2 fixed nodes (static-2)
•4 fixed nodes (static-4)
•8 fixed nodes (static-8)
•Cache overflow --> LRU eviction
24
Relative Speedup
Querying Rate: 255 invocations/sec
25
Cache Expansion/Migration Times
Querying Rate: 255 invocations/sec
26
Experimental Configuration
‣Amazon EC2 Cloud
•Each Cloud node:
- Small Instance (single core, 1.2 GHz, 1.7 GB RAM, 32-bit)
•Caches start out cold
•Data stored in memory
•When 2 nodes fall below 30% capacity, merge them
‣Sliding Window Configuration:
•Time Slice: 1 sec
•Size: 100 Time Slices
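The sliding-window bookkeeping above (1-second time slices, 100 slices per window) and the merge rule can be sketched with a bounded deque of per-slice access counts; the 30% figure is from the slide, while the class and helper names are illustrative:

```python
from collections import deque

# Sliding window of per-second access counts: 100 slices of 1 s each.
WINDOW_SLICES = 100

class SlidingWindow:
    def __init__(self):
        self.slices = deque(maxlen=WINDOW_SLICES)  # oldest slice drops off

    def end_slice(self, count):
        # Close the current 1-second slice with its access count.
        self.slices.append(count)

    def total(self):
        # Accesses observed within the last 100 seconds.
        return sum(self.slices)

def should_merge(used_a, used_b, capacity):
    # Merge two cache nodes when both fall below 30% of capacity.
    return used_a < 0.3 * capacity and used_b < 0.3 * capacity
```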
27
Data Eviction: 50/255/50 queries per sec
Sliding Window Size = 100 sec
50 q/sec 255 q/sec 50 q/sec
28
Cache Contraction: 50/255/50 queries per sec
29
Cache Contraction: 50/255/50 queries per sec
30
Experimental Summary
‣Caching Web service results significantly reduces mean execution times for our application
‣Cloud node allocation is a large overhead, but its cost is amortized over average execution times
‣On average, our approach uses fewer nodes (and thus less cost) than statically allocated schemes
31
Outline
‣Accelerating Data Intensive Services Using the Cloud
•Motivating Application
•Design of an Elastic Cache
‣Performance Evaluation
•Up-Scaling (cache expansion)
•Down-Scaling (cache contraction)
‣Future Work & Conclusion
32
Conclusion
‣We introduced some challenges in the Cloud:
•Controlling Cost
•Real-time system management (downscaling, upscaling)
‣We saw how the Cloud’s elasticity could be harnessed to accelerate service-oriented processes
33
Future/Current Work
34
Thank you
‣Questions and Comments?
•David Chiu - [email protected]
•Gagan Agrawal - [email protected]
In memory of Prof. Yuri Breitbart (1940–2010)