Page 1: Network Weather Service Sathish Vadhiyar Sources / Credits: NWS web site: http://nws.cs.ucsb.edu NWS papers

Network Weather Service

Sathish Vadhiyar

Sources / Credits:

• NWS web site: http://nws.cs.ucsb.edu

• NWS papers

Page 2:

Introduction

• "NWS provides accurate forecasts of dynamically changing performance characteristics from a distributed set of metacomputing resources"
• What will be the future load (not the current load) when a program is executed?
• Producing short-term performance forecasts based on historical performance measurements
• The forecasts can be used by dynamic scheduling agents

Page 3:

Introduction

• Resource allocation and scheduling decisions must be based on predictions of resource performance during a timeframe
• NWS takes periodic measurements of performance and, using numerical models, forecasts resource performance

Page 4:

NWS Goals

• Components
  • Persistent state
  • Name server
  • Sensors
    • Passive (CPU availability)
    • Active (Network measurements)
  • Forecaster

Page 5:

Architecture

Page 6:

Architecture

Page 7:

Performance measurements

• Using sensors
• CPU sensors
  • Measures CPU availability
  • Uses
    • uptime
    • vmstat
    • Active probes
• Network sensors
  • Measures latency and bandwidth
• Each host maintains
  • Current data
  • One-step-ahead predictions
  • Time series of data

Page 8:

Network Measurements

Page 9:

Issues with Network Sensors

• Appropriate transfer size for measuring throughput
• Collision of network probes
• Solutions
  • Tokens and hierarchical trees with cliques

Page 10:

Available CPU measurement

Page 11:

Available CPU measurement

• The formulae shown do not take job priorities into account
• Hence, an active probe is run periodically to adjust the estimates
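
The formulae these slides refer to were not preserved in the transcript. The sketch below only illustrates the idea of correcting a passive estimate with an active probe. It assumes the simple load-average estimate 1 / (load + 1); the function names and the additive bias correction are hypothetical and are not taken from the NWS code.

```python
def passive_cpu_availability(load_average: float) -> float:
    """Fraction of the CPU a new process might expect, from the load average.

    Assumption: `load_average` runnable processes share the CPU equally,
    so a newcomer would get 1 / (load_average + 1).
    """
    if load_average < 0:
        raise ValueError("load average cannot be negative")
    return 1.0 / (load_average + 1.0)


def probe_bias(probe_availability: float, passive_estimate: float) -> float:
    """Difference between what an active probe obtained and the passive estimate."""
    return probe_availability - passive_estimate


def adjusted_availability(load_average: float, bias: float) -> float:
    """Passive estimate shifted by the latest probe bias, clamped to [0, 1]."""
    estimate = passive_cpu_availability(load_average) + bias
    return max(0.0, min(1.0, estimate))


# A low-priority background job keeps the load average at 1.0, so the passive
# estimate is 0.5, yet a short active probe actually obtained 90% of the CPU.
bias = probe_bias(0.9, passive_cpu_availability(1.0))
print(adjusted_availability(1.0, bias))
print(adjusted_availability(3.0, bias))
```

The bias is recomputed each time the (more expensive) active probe runs, and is applied to the cheap passive readings in between.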

Page 12:

Predictions

• To generate a forecast, the forecaster requests persistent state data
• When a forecast is requested, the forecaster makes predictions for existing measurements using different forecast models
• Dynamic choice of forecast models based on the best Mean Absolute Error, Mean Square Prediction Error, Mean Percentage Prediction Error
• Forecasts requested by:
  • InitForecaster()
  • RequestForecasts()
• Forecasting methods
  • Mean-based
  • Median-based
  • Autoregressive
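
A minimal sketch of the dynamic model choice, assuming three illustrative predictors and selection by lowest Mean Absolute Error. The predictor set and the function name are hypothetical; the sketch does not use InitForecaster() or RequestForecasts(), whose signatures are not given in the slides.

```python
from statistics import mean, median

# Hypothetical predictors: each maps the history so far to a one-step forecast.
PREDICTORS = {
    "last_value": lambda h: h[-1],
    "running_mean": lambda h: mean(h),
    "median_window_5": lambda h: median(h[-5:]),
}


def forecast_with_best_model(series):
    """Replay `series`, score every predictor by MAE, forecast with the winner."""
    if len(series) < 2:
        raise ValueError("need at least two measurements to score predictors")
    abs_error = {name: 0.0 for name in PREDICTORS}
    for t in range(1, len(series)):
        history = series[:t]  # measurements known before time t
        for name, predict in PREDICTORS.items():
            abs_error[name] += abs(series[t] - predict(history))
    steps = len(series) - 1
    mae = {name: total / steps for name, total in abs_error.items()}
    best = min(mae, key=mae.get)
    return best, PREDICTORS[best](series), mae


# Bandwidth-like series with one outlier: the median predictor recovers fastest.
best, forecast, mae = forecast_with_best_model([10, 10, 10, 50, 10, 10, 10])
print(best, forecast)
```

Because the errors are recomputed as measurements arrive, the chosen model can change over time, which is the point of selecting the model dynamically.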

Page 13:

Forecasting Methods

Notations:

Prediction Accuracy:

Mean Absolute Error (MAE) is the average of the above

Prediction Method:

Page 14:

Forecasting Methods – Mean-based

1.

2.

3.

Page 15:

Forecasting Methods – Mean-based

4.

5.

Page 16:

Forecasting Methods – Median-based

1.

2.

3.

Page 17:

Autoregression

1.

a_i is found such that it minimizes the overall error.

r_{i,j} is the autocorrelation function for the series of N measurements.
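
The equation on this slide was not preserved. As a hedged illustration, the sketch below assumes the standard Yule-Walker formulation, in which the coefficients a_i solve a linear system built from the autocovariances of the series. All function names are hypothetical, and the solver is a plain Gaussian elimination rather than whatever NWS itself uses.

```python
def autocovariance(x, lag):
    """Biased sample autocovariance of x at the given lag."""
    n = len(x)
    m = sum(x) / n
    return sum((x[t] - m) * (x[t - lag] - m) for t in range(lag, n)) / n


def solve(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting.

    Assumes the system is non-singular (for example, the series is not constant).
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ar(x, order):
    """Yule-Walker estimate of the AR coefficients a_1 .. a_order."""
    r = [autocovariance(x, k) for k in range(order + 1)]
    matrix = [[r[abs(i - j)] for j in range(order)] for i in range(order)]
    return solve(matrix, r[1:])


def ar_forecast(x, coeffs):
    """One-step-ahead forecast from the most recent len(coeffs) measurements."""
    m = sum(x) / len(x)
    return m + sum(a * (x[-1 - i] - m) for i, a in enumerate(coeffs))


# An alternating series is strongly negatively correlated at lag 1.
series = [1.0, -1.0] * 50
coeffs = fit_ar(series, order=1)
print(coeffs, ar_forecast(series, coeffs))
```

For this series the fitted coefficient is close to -1, so the forecast flips the sign of the last measurement, which matches the alternating pattern.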

Page 18:

Forecasting Methodology

Page 19:

Forecast Results

Page 20:

Forecasting Complexity vs Accuracy

• Semi Non-parametric Time Series Analysis (SNP) – an accurate but complicated model

• Model fit using iterative search

• Calculation of conditional expected value using conditional probability density

Page 21:

Sensor Control

• Each sensor connects to the other sensors and performs measurements: O(N^2)
• To reduce the time complexity, sensors are organized in a hierarchy of groups called cliques
• To avoid collisions, tokens are used
• Adaptive control using adaptive token timeouts
• Adaptive time-out discovery and distributed leader election protocol
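
A small worked example of why the hierarchy helps. It counts host pairs that must be probed in a flat layout versus a two-level clique layout. The two-level layout (one representative per clique joining a top-level clique) is an assumption made for illustration, and the function names are hypothetical.

```python
from math import comb


def flat_pairs(n_hosts):
    """Host pairs to probe when every sensor probes every other sensor."""
    return comb(n_hosts, 2)


def clique_pairs(clique_sizes):
    """Host pairs to probe in a two-level hierarchy of cliques.

    Assumption: hosts probe only within their own clique, and one
    representative per clique also takes part in a top-level clique.
    """
    within = sum(comb(size, 2) for size in clique_sizes)
    between = comb(len(clique_sizes), 2)
    return within + between


print(flat_pairs(100))           # 4950
print(clique_pairs([10] * 10))   # 495
```

With 100 hosts, ten cliques of ten reduce the number of probed pairs by a factor of ten, at the cost of estimating inter-clique performance through the representatives.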

Page 22:

Synchronizing network probes

• Consistent periodicity and mutual exclusion
• Token
  • List of hosts to probe
  • Periodicity of probe
  • Parameters to the probe
  • Sequence number
• Leader initiates the token
• A host, after receiving a token:
  • Conducts probes with the other hosts in the token
  • Passes the token to the next host
• Token passed back to the leader
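
The token circuit can be sketched as a single-process simulation. The field names, the `run_probe` callback, and the choice to increment the sequence number once per circuit are assumptions for illustration; real NWS sensors pass the token over the network.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    hosts: list                 # hosts to probe, in circulation order; hosts[0] is the leader
    period_s: float             # desired periodicity of the probes
    probe_params: dict = field(default_factory=dict)
    sequence: int = 0           # assumed to increase once per completed circuit


def circulate(token, run_probe):
    """Simulate one circuit: each holder probes the others, then passes the token on.

    Only the current holder probes, so no two hosts probe at the same time.
    """
    log = []
    for holder in token.hosts:
        for peer in token.hosts:
            if peer != holder:
                run_probe(holder, peer, token.probe_params)
                log.append((holder, peer))
    token.sequence += 1         # token is back at the leader
    return log


token = Token(hosts=["a", "b", "c"], period_s=240.0, probe_params={"size_kb": 64})
probes = circulate(token, run_probe=lambda src, dst, params: None)
print(len(probes), token.sequence)
```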

Page 23:

Contd…

• Leader notes the token circuit time and calculates the next token initiation time as (desired periodicity – token circuit time)
• To avoid long delays in token circulation and to have fault tolerance:
  • Each host maintains a timer
  • When the timer times out, the host declares itself the leader and initiates a new token
  • When a host encounters two tokens, the old token is destroyed
• Calculation of time-outs
  • Each host records the token circuit time and the variance of the time
  • Uses NWS forecasting models to predict the next token arrival time
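
A minimal sketch of the two timing calculations. The leader's delay follows the formula on the slide. The time-out rule (mean plus k standard deviations of past circuit times) is only a stand-in for the NWS forecasting models, and both function names and the default k are hypothetical.

```python
from statistics import mean, pstdev


def next_initiation_delay(period_s, circuit_time_s):
    """Seconds the leader waits before starting the next circuit."""
    return max(0.0, period_s - circuit_time_s)


def token_timeout(circuit_times_s, k=2.0):
    """Hypothetical time-out: mean circuit time plus k standard deviations.

    Stand-in for the forecast of the next token arrival time.
    """
    return mean(circuit_times_s) + k * pstdev(circuit_times_s)


print(next_initiation_delay(240.0, 12.0))   # 228.0
print(token_timeout([10.0, 12.0, 14.0]))
```

A host whose timer exceeds the computed time-out would declare itself leader and initiate a new token, as described above.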

Page 24:

New Protocol

• Compromise between periodicity and mutual exclusion
• NWS administrator specifies periodicity, and an upper range of desired periodicity
  • If network conditions are stable and if tokens are received within the upper range, then mutual exclusion is guaranteed
  • If not, hosts time out and start conducting probes with possible collisions
• Thus the protocol switches between good and bad phases

Page 25:

Illustration

Page 26:

Comparison of 2 protocols – Experimental setup

• 4 machines – 2 in Lyon, France and 2 in Tennessee, USA
• 240 second periodicity
• 5 second range

Page 27:

Comparison - Periodicity

Page 28:

Comparison – Mutual exclusion

Page 29:

Use of NWS: Scheduling a Jacobi application

The problem: Appropriate partitioning strategy to balance processor efficiencies and communication overheads, i.e. deriving partitions to obtain resource performance

Page 30:

Deriving Partitions for Jacobi

Notations

Per-processor execution time

The goal

Page 31:

Deriving Partitions for Jacobi

Communication time

Solution: system of linear equations by Gaussian Elimination
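
The equations on these slides were not preserved. The sketch below assumes a simple model in which processor i takes T_i = A_i * P_i + C_i, where A_i is the number of grid points assigned to it, P_i its time per point, and C_i its communication time. Requiring all T_i to be equal and the A_i to sum to the grid size gives a linear system, solved here by Gaussian elimination. The model, symbols, and function names are assumptions for illustration.

```python
def gaussian_solve(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def jacobi_partitions(total_points, per_point_time, comm_time):
    """Areas A_i that equalize A_i * P_i + C_i across processors (assumed model)."""
    p = len(per_point_time)
    a = [[0.0] * p for _ in range(p)]
    b = [0.0] * p
    for i in range(p - 1):
        # A_i * P_i + C_i = A_{i+1} * P_{i+1} + C_{i+1}
        a[i][i] = per_point_time[i]
        a[i][i + 1] = -per_point_time[i + 1]
        b[i] = comm_time[i + 1] - comm_time[i]
    a[p - 1] = [1.0] * p          # all areas together cover the whole grid
    b[p - 1] = float(total_points)
    return gaussian_solve(a, b)


# Two processors, the second twice as slow per point, equal communication cost.
areas = jacobi_partitions(1000, per_point_time=[1.0, 2.0], comm_time=[0.5, 0.5])
print(areas)
```

In a scheduler, the per-point and communication times would be derived from forecasts of CPU availability and network performance rather than fixed constants.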

Page 32:

NWS in Jacobi

Page 33:

Resource Selection and Scheduling

Page 34:

Resource Selection and Scheduling

Page 35:

References

• Implementing a Performance Forecasting System for Metacomputing: The Network Weather Service. Rich Wolski, Neil Spring, Chris Peterson, in Proceedings of SC97, November, 1997.
• Dynamically Forecasting Network Performance Using the Network Weather Service. Rich Wolski, in Journal of Cluster Computing, Volume 1, pp. 119-132, January, 1998.
• The Network Weather Service: A Distributed Resource Performance Forecasting Service for Metacomputing. Rich Wolski, Neil Spring, and Jim Hayes, Journal of Future Generation Computing Systems, Volume 15, Numbers 5-6, pp. 757-768, October, 1999.

Page 36:

References

• Synchronizing Network Probes to avoid Measurement Intrusiveness with the Network Weather Service, B. Gaidioz, R. Wolski, and B. Tourancheau, Proceedings of 9th IEEE High-performance Distributed Computing Conference, August, 2000, pp. 147-154.
• Experiences with Predicting Resource Performance On-line in Computational Grid Settings, Rich Wolski, ACM SIGMETRICS Performance Evaluation Review, Volume 30, Number 4, pp 41--49, March, 2003.

Page 37:

Forecasting Methods Summary

Page 38:

Prediction Accuracy