Improving System Availability in Distributed Environments
Sam Malek [email protected]
with Marija Mikic-Rakic [email protected]
Nels Beckman [email protected]
Nenad Medvidovic [email protected]
Motivation
How good is this deployment architecture? What are its properties?
How should it be modified to ensure higher availability?
Effect of Deployment on Availability
Bad deployment: low availability. Better deployment: higher availability.
Redeployment
• Redeployment to maximize availability
  – Frequency and volume of interactions, reliability and capacity of network links
• Hard to determine a good deployment in large-scale distributed systems
  – In the small example above, there are 3^10 = 59049 possible deployments
[Figure: two alternative deployments of components 1 through 10 on Host 1, Host 2, and Host 3]
Availability Definition
\[
\text{Availability} = \frac{\#\ \text{successful component interactions}}{\#\ \text{total component interactions}}
\]
The degree to which the system is operational and accessible when required for use
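As a minimal sketch of how this ratio might be estimated for a given deployment, the Python below weights each pair of interacting components by its interaction frequency and by the reliability of the link between their hosts. The names (`availability`, `freq`, `rel`, `deployment`) and the assumption that co-located components always interact successfully are illustrative and do not come from the slides.

```python
def availability(deployment, freq, rel):
    """Estimate availability as expected successful / total interactions.

    deployment: dict mapping component -> host
    freq: dict mapping (comp_a, comp_b) -> interaction frequency
    rel: dict mapping frozenset({host_a, host_b}) -> link reliability in [0, 1]
    Assumption: interactions between co-located components always succeed.
    """
    total = sum(freq.values())
    if total == 0:
        return 1.0  # assumption: with no interactions, nothing can fail
    successful = 0.0
    for (a, b), f in freq.items():
        ha, hb = deployment[a], deployment[b]
        # Missing links are treated as unreliable (reliability 0).
        link = 1.0 if ha == hb else rel.get(frozenset((ha, hb)), 0.0)
        successful += f * link
    return successful / total


# Hypothetical three-component, two-host system.
freq = {("c1", "c2"): 8, ("c2", "c3"): 2}
rel = {frozenset(("h1", "h2")): 0.5}
bad = {"c1": "h1", "c2": "h2", "c3": "h2"}   # chatty pair split across hosts
good = {"c1": "h1", "c2": "h1", "c3": "h2"}  # chatty pair co-located
print(availability(bad, freq, rel))
print(availability(good, freq, rel))
```

In this toy case, co-locating the frequently interacting pair raises the estimate from 0.6 to 0.9, which is the effect the "bad" and "better" deployments illustrate.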
System Model Parameters
• Software component properties
  – Memory requirements
  – Frequency of interaction
  – Size of the exchanged data
• Hardware host properties
  – Memory capacity
  – Network reliability
  – Network bandwidth
• Constraints
  – Location
  – Co-location
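One way to picture the memory and constraint parameters is as a small data structure with a validity check. This is a sketch under stated assumptions: the class `SystemModel`, its field names, and the reading of co-location as "must share a host" or "must not share a host" are hypothetical choices for illustration, not DeSi's actual model. Interaction frequency, data size, and network properties are left out to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class SystemModel:
    comp_mem: dict                                 # component -> memory required
    host_mem: dict                                 # host -> memory capacity
    location: dict = field(default_factory=dict)   # component -> set of allowed hosts
    colocate: list = field(default_factory=list)   # pairs that must share a host
    separate: list = field(default_factory=list)   # pairs that must not share a host


def is_valid(model, deployment):
    """Return True if the deployment (component -> host) meets all constraints."""
    used = {h: 0 for h in model.host_mem}
    for comp, host in deployment.items():
        used[host] += model.comp_mem[comp]
        allowed = model.location.get(comp)
        if allowed is not None and host not in allowed:
            return False  # location constraint violated
    if any(used[h] > model.host_mem[h] for h in used):
        return False      # memory capacity exceeded
    if any(deployment[a] != deployment[b] for a, b in model.colocate):
        return False      # components that must be together are apart
    if any(deployment[a] == deployment[b] for a, b in model.separate):
        return False      # components that must be apart are together
    return True


model = SystemModel(
    comp_mem={"c1": 2, "c2": 3},
    host_mem={"h1": 4, "h2": 4},
    location={"c1": {"h1"}},
)
print(is_valid(model, {"c1": "h1", "c2": "h1"}))  # exceeds h1's memory
print(is_valid(model, {"c1": "h1", "c2": "h2"}))  # satisfies every constraint
```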
Problem Definition
• Find a system deployment architecture such that:
• It adheres to the system model parameters and constraints
• It has the greatest availability
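The problem statement can be read as a constrained search. The sketch below simply tries every assignment of n components to k hosts, which means k^n candidates (3^10 = 59049 for ten components on three hosts), and keeps the valid one with the highest score. The function `exact_search` and the toy `score` and `fits` helpers are hypothetical stand-ins for an availability estimate and the model constraints. This shows the brute-force idea only and should not be read as the authors' exact algorithm.

```python
from itertools import product


def exact_search(components, hosts, score, is_valid=lambda d: True):
    """Try all k**n assignments and return the best valid one with its score."""
    best, best_score = None, -1.0
    for assignment in product(hosts, repeat=len(components)):
        deployment = dict(zip(components, assignment))
        if not is_valid(deployment):
            continue
        s = score(deployment)
        if s > best_score:
            best, best_score = deployment, s
    return best, best_score


# Toy inputs (assumptions for illustration only).
freq = {("c1", "c2"): 8, ("c2", "c3"): 2}


def score(d):
    # Toy objective: cross-host interactions succeed half the time.
    ok = sum(f * (1.0 if d[a] == d[b] else 0.5) for (a, b), f in freq.items())
    return ok / sum(freq.values())


def fits(d):
    # Toy memory constraint: at most two components per host.
    return all(list(d.values()).count(h) <= 2 for h in ("h1", "h2"))


best, best_score = exact_search(["c1", "c2", "c3"], ["h1", "h2"], score, fits)
print(best, best_score)
print(3 ** 10)  # size of the search space for 10 components on 3 hosts
```

Because the loop grows as k^n, this approach is only practical for very small systems.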
Problem Break Down
1) Lack of knowledge about runtime system parameters
  – System model parameters not known at the time of initial deployment
  – System model parameters change at runtime
    • Reliability of links, frequencies of interaction, etc.
  – Prism-MW monitoring support
2) Exponentially complex problem
  – n components and k hosts = k^n possible deployments
  – DeSi’s polynomial time approximating algorithms
3) Solution analysis
  – Comparison of different solutions and algorithms
  – Centralized vs. decentralized, performance vs. complexity, etc.
  – DeSi’s visualization and comparison utilities
4) Effecting the selected solution
  – Redeploying components
  – Requires an automated solution
  – Prism-MW deployment support
Approach
[Figure: loop between Prism-MW and DeSi with four steps: 1) Monitor, 2) Monitoring Data, 3) Analyze, 4) Redeployment Data]
Prism-MW
– An architectural middleware that enables efficient implementation, deployment, and execution of distributed systems in terms of their architectural elements: components, connectors, configurations, etc.
– Support for monitoring
– Support for redeployment
[Figure: Distributed System, showing numbered application components spread across hosts, with Admin components and a Deployer]
Simplified Class Diagram of Prism-MW
[Figure: class diagram showing Architecture, Scaffold, Brick, Connector, Component, Deployer, Admin, IMonitor, IAdmin, IScaffold, Serializable, Event, ExtensibleComponent, DistributionConnector, EvtFrequencyMonitor, and NetworkReliabilityMonitor]
Prism-MW’s Role
Supports:
• Step 1 by monitoring events in the system and calculating the system parameters
• Step 4 by providing an API for the redeployment of components and meta-level components to automate the tasks
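To make "calculating the system parameters" concrete, the sketch below derives interaction frequencies and link reliabilities from a log of observed events. It is a generic illustration and does not use Prism-MW's API: the function `estimate_parameters` and the event tuple layout are assumptions made for this example.

```python
from collections import Counter


def estimate_parameters(events):
    """Derive model parameters from monitored events.

    events: iterable of (src_comp, dst_comp, src_host, dst_host, delivered).
    Returns (freq, rel): interaction counts per component pair, and the
    fraction of delivered events per inter-host link.
    """
    freq = Counter()
    sent = Counter()
    ok = Counter()
    for src, dst, src_host, dst_host, delivered in events:
        freq[tuple(sorted((src, dst)))] += 1
        if src_host != dst_host:
            link = frozenset((src_host, dst_host))
            sent[link] += 1
            ok[link] += int(delivered)
    rel = {link: ok[link] / sent[link] for link in sent}
    return dict(freq), rel


# Hypothetical monitoring log.
events = [
    ("c1", "c2", "h1", "h2", True),
    ("c2", "c1", "h2", "h1", False),
    ("c1", "c3", "h1", "h1", True),
]
freq, rel = estimate_parameters(events)
print(freq)
print(rel)
```

A real monitor would typically compute these values over a sliding time window so that the estimates track runtime changes.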
Maximizing Availability
• A family of centralized algorithms
  – Exact: exponential
  – Stochastic: quadratic
  – Adaptive greedy: cubic
• A family of decentralized algorithms
  – DecAp (auction-based): cubic
• A set of clustering techniques
  – Reduce complexity
  – Improve performance
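To show the general flavor of a stochastic approach, the sketch below samples random deployments and keeps the best valid one it sees. It is a deliberately simplified random-sampling search and should not be taken as the stochastic algorithm these slides refer to. The names `stochastic_search`, `score`, and `fits` are hypothetical.

```python
import random


def stochastic_search(components, hosts, score, is_valid, iterations=200, seed=0):
    """Sample random deployments and keep the best valid one seen."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    best, best_score = None, -1.0
    for _ in range(iterations):
        deployment = {c: rng.choice(hosts) for c in components}
        if not is_valid(deployment):
            continue
        s = score(deployment)
        if s > best_score:
            best, best_score = deployment, s
    return best, best_score


# Toy inputs (assumptions for illustration only).
freq = {("c1", "c2"): 8, ("c2", "c3"): 2}


def score(d):
    # Toy objective: cross-host interactions succeed half the time.
    ok = sum(f * (1.0 if d[a] == d[b] else 0.5) for (a, b), f in freq.items())
    return ok / sum(freq.values())


def fits(d):
    # Toy memory constraint: at most two components per host.
    return all(list(d.values()).count(h) <= 2 for h in ("h1", "h2"))


best, best_score = stochastic_search(["c1", "c2", "c3"], ["h1", "h2"], score, fits)
print(best, best_score)
```

The cost is controlled by `iterations` instead of by k^n, at the price of no guarantee that the optimum is found.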
Algorithms’ Results
[Figure: chart of availability (axis from 0 to 0.9) achieved by the algorithms for four system configurations: 10 comps, 4 hosts, 100% connected; 50 comps, 15 hosts, 80% connected; 100 comps, 25 hosts, 40% connected; 250 comps, 50 hosts, 80% connected]
Assessing the Algorithms
• Efficiency
  – Execution time vs. precision
• Applicability
  – Centralized vs. decentralized
• Effect of system characteristics
• Impact of individual parameter changes
• Addition of new system parameters
• Application to new system properties
• Requires “what if” scenario exploration
In comes DeSi!
DeSi’s Architecture
[Figure: DeSi architecture with DeSi Model (SystemData, AlgoResultData, GraphViewData), DeSi View (TableView, GraphView), DeSi Controller (Generator, AlgorithmContainer, Modifier, MiddlewareAdapter with Monitor and Effector), and the Middleware Platform; the legend distinguishes data flow from control flow]
• Key properties:
  – Tailorability
  – Scalability
  – Efficiency
  – Explorability
DeSi’s View (1)
DeSi’s View (2)
DeSi’s View (3)
DeSi’s View (4)
DeSi’s View (5)
DeSi’s Role
Supports:
• Step 3 by providing several redeployment algorithms and various visualization utilities
• Steps 2 and 4 by providing the appropriate middleware adapter
Conclusion
• Suite of automated tools and techniques for improving the availability of a distributed system
• Currently extending the tools to model, analyze, and improve other non-functional aspects of a distributed system: security, latency, etc.
Questions?