IBM Research
© 2003 IBM Corporation
On Demand and Autonomic Computing
Steve R. White
Senior Manager, Autonomic Computing
Thomas J. Watson Research Laboratory
© 2003 IBM Corporation | On Demand and Autonomic Computing | August 1, 2003
Outline
Background and motivation
Research in autonomic components and systems
Autonomic computing architecture
Research in structured autonomic systems
On Demand Era
Responsive in real-time
Variable cost structures
Focused on what’s core and differentiating
Resilient around the world, around the clock
Integrated, Open, Virtualized, Autonomic
Complex heterogeneous infrastructures are a reality!
[Figure: infrastructure diagram showing Directory and Security Services; Existing Applications and Data; Business Data; Data Server; Web Application Server; Storage Area Network; BPs and External Services; Web Server; DNS Server; Data]
Dozens of systems and applications
Hundreds of components
Thousands of tuning parameters
Motivation
Administration of individual systems is increasingly difficult
– 100s of configuration, tuning parameters for databases, Web application servers, storage, …
Heterogeneous systems are becoming increasingly connected
– Integration becoming ever more difficult
Architects can't intricately plan interactions among components
– Increasingly dynamic; more frequently with unanticipated components
More of the burden must be assumed at run time
– But human system administrators can't assume the burden
– 6:1 cost ratio between storage administration and storage
– 40% outages due to operator error
We need self-managing computing systems
– Behavior specified by system administrators via high-level policies
– System and its components figure out how to carry out policies
Autonomic Self-Management
Increase Responsiveness
– Adapt to dynamically changing environments
Business Resiliency
– Discover, diagnose, act to prevent disruptions
Operational Efficiency
– Tune resources, balance workloads to best use IT resources
Secure Information & Resources
– Anticipate, detect, identify, deter attacks
Evolving to Autonomic Computing
Manual → Autonomic
Basic (Level 1)
– Characteristics: Multiple sources of system generated data
– Skills: Extensive, highly skilled IT staff
– Benefits: Basic requirements met
Managed (Level 2)
– Characteristics: Data & actions consolidated through mgt tools
– Skills: IT staff analyzes & takes actions
– Benefits: Greater system awareness; improved productivity
Predictive (Level 3)
– Characteristics: Sys monitors, correlates & recommends actions
– Skills: IT staff approves & initiates actions
– Benefits: Less need for deep skills; faster/better decision making
Adaptive (Level 4)
– Characteristics: Sys monitors, correlates & takes action
– Skills: IT staff manages performance against SLAs
– Benefits: Human/system interaction; IT agility & resiliency
Autonomic (Level 5)
– Characteristics: Components dynamically respond to bus policies
– Skills: IT staff focuses on enabling business needs
– Benefits: Business policy drives IT mgt; business agility and resiliency
Human Interaction with Autonomic Systems
P. Maglio, Almaden
Basic questions
– What do middleware administrators do?
– How can we better support the problems and practices they have?
Learn answers to these questions via ethnographic studies
Use insights to design new ways to interact with complex computing systems
"… but we thought that was the return port!"
"We had it wrong. Our assumption of how it worked was incorrect."
"We start with looking at the proxy server log files, then the web server log files, then the application server admin log files, then the application log files."
Dynamic Surge Protection
J. Hellerstein, Watson
Systems can go from steady state …
Few minutes later…
to overloaded without warning
[Figure: Internet traffic reaching a system, before and after a surge]
Surge Protection Demo
Monitor & remove servers
[Figure: plots of Response Time; # Active Servers and # Requested Servers; Actual BOPS and Predicted BOPS]
Enterprise Workload Management
D. Dillenberger, Watson
[Figure: Internet; Appliance Servers; Web Application Servers; Data and Transaction Servers; Internet/Extranet; Business Partners]
Large, distributed, heterogeneous system
Achieves end-to-end performance via adaptive algorithms
Administrator defines policy
– Desired response times for various classes of users, apps
eWLM managers on each resource cooperate to adaptively tune parameters
– OS, network, storage, virtual server knobs
– JVM heap size, # garbage collection threads
– Workload balancing, routing parameters
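Adaptive tuning toward a response-time goal can be pictured as a feedback loop. The sketch below is only an illustration, not eWLM's algorithm: `tune_knob`, the integral-style update rule, and the toy model `toy_response_time` (response time inversely proportional to a single knob such as a thread count) are all assumptions made for the example.

```python
def tune_knob(knob: float, measured_rt: float, target_rt: float,
              gain: float = 0.5, lo: float = 1.0, hi: float = 64.0) -> float:
    """One step of a hypothetical integral-style controller.

    Raises the knob when the measured response time exceeds the target,
    lowers it otherwise, and clamps the result to [lo, hi].
    """
    error = measured_rt - target_rt
    return min(hi, max(lo, knob + gain * error))


def toy_response_time(knob: float) -> float:
    """Toy plant model (an assumption): more of the resource means faster responses."""
    return 40.0 / knob


target = 5.0   # desired response time taken from the administrator's policy
knob = 4.0     # initial setting of the tunable parameter
for _ in range(50):
    knob = tune_knob(knob, toy_response_time(knob), target)
final_rt = toy_response_time(knob)
```

With this toy model the loop settles near the knob value at which the modeled response time equals the target. A real manager would have to coordinate many such knobs across resources, which this sketch does not attempt.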
Policies and Autonomic Computing
D. Verma and D. Kandlur, Watson
Policy: Set of guidelines or directives provided to autonomic element to influence its behavior.
Key Challenge:
– Move away from low level controls
– Move towards high level directives (policies) over autonomic decisions
Developing scenarios, standards and technologies to support policies for autonomic computing
[Figure: two autonomic elements, each showing Monitor (M), Analyze (A), Plan (P), Execute (E), Knowledge (K), sensors (S) and effectors (E)]
1. External policies are delivered through effectors.
2. Policies are stored as knowledge.
3. Analyze: analyzes system operation w.r.t. policies; creates reports as dictated by policy.
4. Plan: assigns tasks based on policies; assigns resources based on policies; enables sensors; adds/modifies/deletes policies.
5. Enabled/disabled based on policies.
6. Enabled/disabled based on external policies.
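The flow of a policy through an element can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `Policy` record, the `Element` class, and the `add_server` task name are hypothetical and do not come from any policy standard or IBM toolkit.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Policy:
    """Hypothetical high-level directive: a condition on metrics plus a task name."""
    name: str
    condition: Callable[[Dict[str, float]], bool]
    task: str


@dataclass
class Element:
    """Toy autonomic element that keeps delivered policies as knowledge."""
    knowledge: List[Policy] = field(default_factory=list)

    def deliver_policy(self, policy: Policy) -> None:
        # An external policy arrives and is stored as knowledge.
        self.knowledge.append(policy)

    def analyze(self, metrics: Dict[str, float]) -> List[Policy]:
        # Compare observed operation against every stored policy.
        return [p for p in self.knowledge if p.condition(metrics)]

    def plan(self, metrics: Dict[str, float]) -> List[str]:
        # Assign tasks for the policies that currently apply.
        return [p.task for p in self.analyze(metrics)]


element = Element()
element.deliver_policy(
    Policy("gold-response-time", lambda m: m["response_time_ms"] > 500, "add_server")
)
tasks_when_slow = element.plan({"response_time_ms": 800.0})
tasks_when_fast = element.plan({"response_time_ms": 120.0})
```

The administrator states only the goal (a response-time threshold); which task to run, and when, is left to the element.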
Utility Functions and Autonomic Computing
W. Walsh, Watson
Utility functions can guide autonomic decision making
Self-optimization: natural way to express optimization criteria
– Declarative: preferable to implicitly hard-coded in special purpose algorithms
Derivable from business objectives (e.g. optimize total profits)
– Can translate to computing metrics at different levels
Exploring applications in eWLM, eUtility, SLEDS
[Figure: utility function V(RT) plotted against response time RT]
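As a rough illustration of how a utility function V(RT) can drive a decision, the sketch below splits a fixed pool of servers between two workloads so that total utility is maximized. The sigmoid shape of `utility`, the toy model `response_time`, and the numbers used are assumptions chosen for the example, not taken from the talk.

```python
import math
from typing import Tuple


def utility(rt: float, target: float = 1.0, steepness: float = 5.0) -> float:
    """Hypothetical V(RT): near 1 well below the target response time, near 0 well above."""
    return 1.0 / (1.0 + math.exp(steepness * (rt - target)))


def response_time(servers: int, load: float) -> float:
    """Toy performance model (an assumption): response time falls as servers are added."""
    return load / servers


def best_split(total_servers: int, load_a: float, load_b: float) -> Tuple[int, int]:
    """Exhaustively pick the split that maximizes the summed utility of two workloads."""
    def total_utility(n_a: int) -> float:
        n_b = total_servers - n_a
        return utility(response_time(n_a, load_a)) + utility(response_time(n_b, load_b))

    # Each workload keeps at least one server.
    n_a = max(range(1, total_servers), key=total_utility)
    return n_a, total_servers - n_a


split = best_split(total_servers=10, load_a=6.0, load_b=2.0)
```

The optimization criterion lives entirely in `utility`; changing the business objective means changing that one function rather than the allocation code.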
Autonomic Computing Architecture
The Autonomic Element
AE is the fundamental abstraction
– Defines an important boundary
An AE contains
– Exactly one autonomic manager
– Zero or more managed element(s)
    – Could be basic resource like database, storage system, server, software app
    – Higher level elements may have no managed element; they manage other autonomic elements via messages
AE is responsible for
– Providing/consuming computational services
– Interacting with other autonomic elements
– Managing own behavior in accordance with policies
[Figure: An Autonomic Element. An Autonomic Manager (Monitor, Analyze, Plan, Execute, Knowledge) connected through sensors (S) and effectors (E) to a Managed Element]
E.g. Database, storage, server, software app, workload mgr, sentinel, arbiter, OGSA infrastructure elements
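The structure of an autonomic element can be sketched as one manager object wrapped around an optional managed element. The code below is a minimal illustration, not the architecture's API: the class names, the queue-length metric, and the `add_worker` action are hypothetical.

```python
from typing import Any, Dict, List, Optional


class ManagedElement:
    """Hypothetical managed resource exposing one sensor and one effector."""

    def __init__(self, queue_length: int, workers: int) -> None:
        self.queue_length = queue_length
        self.workers = workers

    def sense(self) -> Dict[str, int]:
        return {"queue_length": self.queue_length, "workers": self.workers}

    def effect(self, action: str) -> None:
        if action == "add_worker":
            self.workers += 1


class AutonomicManager:
    """Runs one monitor-analyze-plan-execute pass over shared knowledge."""

    def __init__(self, managed: Optional[ManagedElement], max_queue_per_worker: int) -> None:
        # None models a higher-level element that has no managed element.
        self.managed = managed
        self.knowledge: Dict[str, Any] = {
            "max_queue_per_worker": max_queue_per_worker,
            "history": [],
        }

    def run_once(self) -> List[str]:
        if self.managed is None:
            return []
        reading = self.managed.sense()                       # Monitor
        self.knowledge["history"].append(reading)
        per_worker = reading["queue_length"] / reading["workers"]
        overloaded = per_worker > self.knowledge["max_queue_per_worker"]  # Analyze
        plan = ["add_worker"] if overloaded else []           # Plan
        for action in plan:                                   # Execute
            self.managed.effect(action)
        return plan


db = ManagedElement(queue_length=30, workers=2)
manager = AutonomicManager(db, max_queue_per_worker=10)
actions = [manager.run_once() for _ in range(3)]
```

Here the manager adds one worker on the first pass and then finds nothing further to do, since the queue per worker no longer exceeds the limit held in its knowledge.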
Autonomic Computing Architecture
Element interactions
Based on OGSA; extensions as necessary
– Service-oriented architecture
– Messages defined by WSDL: portTypes, operations
– Services defined by constellations of portTypes
AC architecture defines:
– Required messages
– Optional but standard messages
For advanced interactions: conversation support
– “Choreography” defines structure of multi-step interactions
– Runtime enforces conversational protocols for app logic.
– Underlies robust interactions
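One way to picture runtime enforcement of a conversational protocol is a small state machine that checks each incoming message before application logic sees it. The sketch below is illustrative only: the states and message names (`RequestResource`, `OfferResource`, and so on) are hypothetical and are not defined by OGSA, WSDL, or the AC architecture.

```python
from typing import Dict, Tuple


class ProtocolError(Exception):
    """Raised when a message is not allowed in the current conversation state."""


# Hypothetical choreography: (current state, message) -> next state.
CHOREOGRAPHY: Dict[Tuple[str, str], str] = {
    ("start", "RequestResource"): "requested",
    ("requested", "OfferResource"): "offered",
    ("offered", "AcceptOffer"): "done",
    ("offered", "RejectOffer"): "start",
}


class Conversation:
    """Tracks one multi-step interaction and rejects out-of-order messages."""

    def __init__(self) -> None:
        self.state = "start"

    def receive(self, message: str) -> str:
        key = (self.state, message)
        if key not in CHOREOGRAPHY:
            raise ProtocolError(f"{message!r} not allowed in state {self.state!r}")
        self.state = CHOREOGRAPHY[key]
        return self.state


conv = Conversation()
conv.receive("RequestResource")
conv.receive("OfferResource")
try:
    conv.receive("RequestResource")  # out of order: an offer is already pending
    rejected = False
except ProtocolError:
    rejected = True
```

Because the check sits in the runtime, application code only ever handles messages that are valid for the current step of the interaction.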
Autonomic Manager Toolset
W. Arnold et al., Watson
Facilitates autonomic manager construction
– In accordance with AC architecture
Catcher for generic AM technologies
– OGSA messaging
– Policy tools
– Monitoring technologies
– AI tools for knowledge representation, reasoning
– Math libraries for modeling, analysis, planning
– Feedback control
V1.0 now available on alphaWorks
– Part of the Exploratory Technology Toolkit
– www.alphaworks.ibm.com
[Figure: An Autonomic Element. An Autonomic Manager (Monitor, Analyze, Plan, Execute, Knowledge) connected through sensors (S) and effectors (E) to a Managed Element]
Autonomic Computing Systems
A small-scale system prototype
[Figure: Policy Repository, Database, Storage, OGSA Registry, and User Interface; messages shown: Register, Register]
Autonomic Computing Systems
A small-scale system prototype
[Figure: Policy Repository, Database, Storage, OGSA Registry, and User Interface; messages shown: FindServiceData(PolicyRepository); FetchPolicy, Subscribe(Policy); ReportPolicy]
Autonomic Computing Systems
A small-scale system prototype
[Figure: Policy Repository, Database, Storage, OGSA Registry, and User Interface; items shown: Service Class Definition; Alert Policy; messages shown: ReportPolicy, SetPolicy; Publish(Policy)]
Autonomic Computing Systems
A small-scale system prototype
[Figure: Policy Repository, Database, Storage, OGSA Registry, and User Interface; items shown: Alert Policies, Svc Class Defs; messages shown: CreateTableSpace; FindServiceData(Storage); QueryResponse(List(Storage)); AddResource(LV, Parms); DeliverResource(LV Name)]
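The registration, lookup, and policy-distribution steps of the prototype can be mimicked with in-memory objects. This is a sketch under assumptions: the method names echo the message labels in the figures, but the classes, signatures, and the example policy text are hypothetical, and the real prototype exchanges OGSA messages rather than Python calls.

```python
from typing import Any, Callable, Dict, List


class Registry:
    """In-memory stand-in for the OGSA registry: maps a service type to providers."""

    def __init__(self) -> None:
        self._services: Dict[str, List[Any]] = {}

    def register(self, service_type: str, provider: Any) -> None:
        self._services.setdefault(service_type, []).append(provider)

    def find_service_data(self, service_type: str) -> List[Any]:
        return list(self._services.get(service_type, []))


class PolicyRepository:
    """Stores policies and publishes every change to its subscribers."""

    def __init__(self) -> None:
        self._policies: Dict[str, str] = {}
        self._subscribers: List[Callable[[str, str], None]] = []

    def fetch_policy(self, name: str) -> str:
        return self._policies.get(name, "")

    def subscribe(self, callback: Callable[[str, str], None]) -> None:
        self._subscribers.append(callback)

    def set_policy(self, name: str, value: str) -> None:
        self._policies[name] = value
        for notify in self._subscribers:  # plays the role of Publish(Policy)
            notify(name, value)


class Database:
    """Toy element that registers itself, finds the repository, and subscribes."""

    def __init__(self, registry: Registry) -> None:
        self.policies: Dict[str, str] = {}
        registry.register("Database", self)
        repository = registry.find_service_data("PolicyRepository")[0]
        repository.subscribe(self.on_policy)

    def on_policy(self, name: str, value: str) -> None:
        self.policies[name] = value


registry = Registry()
repo = PolicyRepository()
registry.register("PolicyRepository", repo)
db = Database(registry)
repo.set_policy("alert", "tablespace more than 90% full")
```

The database never holds a hard-coded reference to the repository; it discovers it through the registry, which is what lets elements be composed flexibly.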
Autonomic Computing Systems
Flexibly composed from autonomic elements
[Figure: Large Autonomic System. Labeled elements: Resource Arbiter; Workload Manager (two); Application Environment 1 and Application Environment 2, each with an Application Manager; Network, Server, Storage, and Database resources; Predictor; Registry; Resource Managers (e.g. Storage, DB, Servers); Policy Repository; Sentinel; eUtility Manager]
Workshops
First Workshop on Algorithms and Architectures for Self-Managing Systems (at FCRC ’03)
– June 11, 2003 in San Diego, CA
5th Annual International Conference on Active Middleware Services: Autonomic Computing Workshop
– June 25, 2003 in Seattle, WA
IJCAI-03 AI and Autonomic Computing: Developing a Research Agenda for Self Managing Computer Systems
– August 10, 2003 in Acapulco, Mexico
First International Workshop Autonomic Computing Systems at 14th International Conference on Database and Expert Systems Applications (DEXA'2003)
– 1-5 September, 2003 in Prague, Czech Republic
14th IFIP/IEEE International Workshop on Distributed Systems: Operations & Management (DSOM-03)
– October 20-22, 2003 in Heidelberg, Germany
References
The Vision of Autonomic Computing, IEEE Computer, January 2003
– http://computer.org/computer/homepage/0103/Kephart/
IBM Systems Journal special issue on Autonomic Computing
– http://www.research.ibm.com/journal/sj42-1.html
Interesting Research Problems
Architecture
– What is the right architecture?
– Should we be working on architecture at all?
Policies
– Can we really run large IT systems by specifying high-level policies?
Centralized vs. Decentralized Control
– Will decentralized control play an important role?
Human Interaction
– How will humans interact with large autonomic systems?
– How can we express the behavior of a large, dynamic system to humans?
Systems With a Billion Components
– Are they even possible?