information models: creating and preserving value in ...€¦ · •chaojiezhang, varun gupta, and...

21
Information Models: Creating and Preserving Value in Volatile Resources Chaojie Zhang, Varun Gupta, Andrew A. Chien University of Chicago June 25, 2019 ROSS Workshop 1 Chien - ROSS 2019

Upload: others

Post on 04-Aug-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Information Models: Creating and Preserving Value in ...€¦ · •ChaojieZhang, Varun Gupta, and Andrew A. Chien, Information Models: Creating and Preserving Value in Volatile Cloud

Information Models: Creating and Preserving Value in Volatile Resources

Chaojie Zhang, Varun Gupta, Andrew A. ChienUniversity of Chicago

June 25, 2019ROSS Workshop

1Chien - ROSS 2019

Page 2: Information Models: Creating and Preserving Value in ...€¦ · •ChaojieZhang, Varun Gupta, and Andrew A. Chien, Information Models: Creating and Preserving Value in Volatile Cloud

Excess Resources in the Cloud

• IaaS demand expanding• Demand fluctuates • Capacities must meet peak

demand•à excess resources• Excess offered as volatile

resources

Chien - ROSS 20192

Excess Resources

Foreground load

Reso

urce

s0

N

Page 3: Information Models: Creating and Preserving Value in ...€¦ · •ChaojieZhang, Varun Gupta, and Andrew A. Chien, Information Models: Creating and Preserving Value in Volatile Cloud

Cloud Operator

What are Volatile Resources?

• Unreliable, can be unilaterally revoked• Examples• Google Preemptible VMs• AWS Spot Instances

• Consequences• Wasted work• Delayed critical path

Chien - ROSS 20193

RequestsReliable Resource User

VolatileResource User Revocation

Requests

Page 4: Information Models: Creating and Preserving Value in ...€¦ · •ChaojieZhang, Varun Gupta, and Andrew A. Chien, Information Models: Creating and Preserving Value in Volatile Cloud

Volatile Resource Availability

Arming Users with Information

Chien - ROSS 20194

Volatile Resource AvailabilityInformation Models (summary)

Page 5: Information Models: Creating and Preserving Value in ...€¦ · •ChaojieZhang, Varun Gupta, and Andrew A. Chien, Information Models: Creating and Preserving Value in Volatile Cloud

Maximizing Value of Volatile Resources

• What information model do users need to maximize their value of volatile resources• Assume if user value maximized à cloud providers can sell for more

money

Chien - ROSS 20195

Optimized Use

Information Models

Volatile ResourceUser

Volatile Resource Availability

Page 6: Information Models: Creating and Preserving Value in ...€¦ · •ChaojieZhang, Varun Gupta, and Andrew A. Chien, Information Models: Creating and Preserving Value in Volatile Cloud

Main Contributions

• Show a specific information model that dramatically increases users’ ability to achieve value (small)• Cloud providers can provide information models without

compromising internal resource management flexibility • Results are robust over 608 AWS Spot Instance pools• 4 regions, millions of CPUs

Chien - ROSS 20196

Page 7: Information Models: Creating and Preserving Value in ...€¦ · •ChaojieZhang, Varun Gupta, and Andrew A. Chien, Information Models: Creating and Preserving Value in Volatile Cloud

Information Models

• What information enables users to target volatile resources to extract most value?• Interval duration PDF's

1. MTTR 2. 10pctile 3. 90pctile FullChien - ROSS 2019

7

Page 8: Information Models: Creating and Preserving Value in ...€¦ · •ChaojieZhang, Varun Gupta, and Andrew A. Chien, Information Models: Creating and Preserving Value in Volatile Cloud

Evaluation of Information Models

• Resource Dynamics: 3-month 608 AWS Spot Instance pools• 5 minute intervals, 15 million data points

• User behaviors• Match computations to resource (duration ~ time to revocation) • Maximize the expectation of value of job duration on the intervals

• Utility function• Step function (batch and workflow tasks)

• Metrics:• Total User Value

Chien - ROSS 20198

Page 9: Information Models: Creating and Preserving Value in ...€¦ · •ChaojieZhang, Varun Gupta, and Andrew A. Chien, Information Models: Creating and Preserving Value in Volatile Cloud

Evaluation: Total Value vs. Information Models

• Comparing three information models• 90pctile gives best results

• 30% value increase

Chien - ROSS 20199

Page 10: Information Models: Creating and Preserving Value in ...€¦ · •ChaojieZhang, Varun Gupta, and Andrew A. Chien, Information Models: Creating and Preserving Value in Volatile Cloud

Evaluation: Total Value of Information Models

• Comparing three information models, and Full is a reference• 90pctile gives best results

• 30% value increase• Limited information models can achieve most of the benefit of Full, 90%• Results are robust over vast majority of 608 instance pools

Chien - ROSS 201910

Page 11: Information Models: Creating and Preserving Value in ...€¦ · •ChaojieZhang, Varun Gupta, and Andrew A. Chien, Information Models: Creating and Preserving Value in Volatile Cloud

Evaluation: Robustness of Info Model BenefitMean of 608 pools

Chien - ROSS 201911

• But, cloud providers use a range of volatile resource management (VRM, revocation) policies?• Information Model benefit and ordering is robust across• A range of VRMs• All 608 instance pools

Page 12: Information Models: Creating and Preserving Value in ...€¦ · •ChaojieZhang, Varun Gupta, and Andrew A. Chien, Information Models: Creating and Preserving Value in Volatile Cloud

Information Models: Summary

• It’s hard for users to maximize value with no information, and cloud providers afraid of sharing too much• With just limited information (mean + 90th percentile) dramatically

increase user value

• However, cloud providers worry that information model will constrain resource management

Chien - ROSS 201913

Page 13: Information Models: Creating and Preserving Value in ...€¦ · •ChaojieZhang, Varun Gupta, and Andrew A. Chien, Information Models: Creating and Preserving Value in Volatile Cloud

Original foreground loadIncreased frequencyOriginal foreground loadIncreased Magnitude

Challenge: Statistical Guarantees and Resource Management “Freedom”• So, if we gave out an information model (statistical guarantee) :

Does it constrain resource management?• Changed foreground load à Changed statistics

Chien - ROSS 201914

Original foreground load

Volatile resource availability

Page 14: Information Models: Creating and Preserving Value in ...€¦ · •ChaojieZhang, Varun Gupta, and Andrew A. Chien, Information Models: Creating and Preserving Value in Volatile Cloud

What about a Change in Magnitude?

• Consider drastic reduction in volatile resources (1->1/K)• K = 1, 2, 3• How does this affect 90pctile? • 2-week sliding window

• Magnitude change has no impact on 90pctile statistical guarantees à No constraint!

Chien - ROSS 201915

Page 15: Information Models: Creating and Preserving Value in ...€¦ · •ChaojieZhang, Varun Gupta, and Andrew A. Chien, Information Models: Creating and Preserving Value in Volatile Cloud

What about a Change in Frequency?

• Increase volatile resource variation frequency by contracting time base (1->1/F)• F = 1, 2, 3• How does this affect 90pctile?• 2-week sliding window

• Frequency change reduces 90pctile dramatically• Violates the guarantee!

Chien - ROSS 201916

Page 16: Information Models: Creating and Preserving Value in ...€¦ · •ChaojieZhang, Varun Gupta, and Andrew A. Chien, Information Models: Creating and Preserving Value in Volatile Cloud

Can We Preserve the Guarantee?

• Idea: Guarantee-Preserving Resource Management• Maintain 90pctile guarantee under frequency

change

• Offline Static Algorithm• Reshape the distribution by withholding each

interval for X minutes• kills short intervals, shortens long intervals

• What is the best X?• Find smallest X that preserves guarantee

Chien - ROSS 201917

X minutes

Page 17: Information Models: Creating and Preserving Value in ...€¦ · •ChaojieZhang, Varun Gupta, and Andrew A. Chien, Information Models: Creating and Preserving Value in Volatile Cloud

Online Dynamic Algorithms

• Idea: AIMD, Online Targeting• Doubles the 90pctile – preserves

the guarantee and reduces job failures• Info Model => Good user value• Preserving RM => Providers’

flexibilities

Chien - ROSS 201918

Page 18: Information Models: Creating and Preserving Value in ...€¦ · •ChaojieZhang, Varun Gupta, and Andrew A. Chien, Information Models: Creating and Preserving Value in Volatile Cloud

Classifying 608 Instance Types

• 3 Classes of Instance Types• Stable, Transition, Unstable

• 400 Stable• The 90pctile is consistent

• 177 Transition • 90pctile guarantee is

matched most of the time

• 31 Unstable • 90pctile unstable, low,

unusable

Chien - ROSS 201920

Stable Transition

Page 19: Information Models: Creating and Preserving Value in ...€¦ · •ChaojieZhang, Varun Gupta, and Andrew A. Chien, Information Models: Creating and Preserving Value in Volatile Cloud

Evaluation: Preserving 90pctile Guarantees

• Guarantee Preserving Algorithms • Effective for Stable pools• Helpful for Transition pools

Viol

atio

n Pe

rcen

tage

(tim

e)

Chien - ROSS 201921

Page 20: Information Models: Creating and Preserving Value in ...€¦ · •ChaojieZhang, Varun Gupta, and Andrew A. Chien, Information Models: Creating and Preserving Value in Volatile Cloud

Related Work

• Volatile Resource Characterization• Characterization of price [Javadi 2011, Tang 2012, Wolski 2017], revocation

behavior [Chohan 2010]• Engineering Reliable Resources• Checkpointing [Khatua 2013], replication [Voorsluys 2012, Xu 2016 ],

migration [Yi 2013, Jung 2013]• Construct an “economy class” of nearly reliable resources [Carvalho 2014]

• Value of Information• Transient guarantee [Shastri 2016]

• Guarantee Preserving Algorithms• None

Chien - ROSS 201922

Page 21: Information Models: Creating and Preserving Value in ...€¦ · •ChaojieZhang, Varun Gupta, and Andrew A. Chien, Information Models: Creating and Preserving Value in Volatile Cloud

Summary & Future Work

• Small information model à large increase in user value• 90pctile info model: two numbers• 30% average increase, up to 2X• 90% of the benefit of full disclosure

• Guarantee preserving algorithms can preserve guarantees and maintain cloud provider’s flexibility• Results robust over 608 AWS Spot Instance pools

• For more information: http://zccloud.cs.uchicago.edu/ and• Chaojie Zhang, Varun Gupta, and Andrew A. Chien, Information Models: Creating

and Preserving Value in Volatile Cloud Resources , in the IEEE International Conference on Cloud Engineering (IC2E), June 2019, Prague, Czechoslovakia.

Chien - ROSS 201923