Division of Labor: Tools for Growing and Scaling Grids
Tim Freeman, Kate Keahey,
Ian Foster, Abhishek Rana, Frank Wuerthwein, Borja Sotomayor
12/05/06 ICSOC ‘06
Division of Labor
How can we implement division of labor in Grid computing?
The greatest improvements in the productive powers of labour, and the greater part of the skill, dexterity, and judgment with which it is anywhere directed, or applied, seem to have been the effects of the division of labour.
(Adam Smith)
requirements for an abstraction
tools to implement an abstraction
Overview
- Problem Definition: The Edge Service Use Case
- Workspace Service: overview of the workspace service; extensions to the workspace service
- Implementation and Evaluation: CPU enforcement; network enforcement
- Status of the Edge Services Project
- Conclusions
Providers and Consumers
Resource provider:
- Has a limited number of resources
- Has to balance the software needs of multiple users
- Has to provide a limited execution environment for security reasons

Resource consumers:
- Want the resources when they need them, and as much as they need
- Want to use specific software packages
- Want as much control as possible over resources
The Edge Service Use Case
[Diagram: a grid Site hosting an Edge Services Framework (ESF) for the CDF, CMS, ATLAS, and guest VOs, with Storage Elements (SE) and Compute Elements (CE)]
- GT4 Workspace Service & VMM
- Dynamically deployed ES wafers for each VO
- Wafer images stored in the SE
- Compute nodes and storage nodes
Edge Services: Challenges
- VO-specific Edge Services: each VO has very specific configuration requirements
- Resource management: the VOs would like to provide quality of service to their users; the resource needs of the VOs change dynamically
- Dynamic, policy-based deployment and management of Edge Services: updates, ephemeral edge services, infrastructure testing, short-term usage
Division of Labor Dimensions
- Environment and configuration isolation: critical from the provider's point of view if the VOs are to be allowed some independence
- Resource usage and accounting: application-independent; management along different resource aspects; dynamically renegotiable/adaptable
GT4 Workspace Service
The GT4 Virtual Workspace Service (VWS) allows an authorized client to deploy and manage workspaces on demand.
- GT4 WSRF front-end
- Leverages multiple GT services
- Currently implements workspaces as VMs; uses the Xen VMM, but others could also be used
- Current release: 1.2.1 (December 2006), http://workspace.globus.org
Workspace Service Usage Scenario
[Diagram: the VWS service node managing a pool of VMM nodes and a designated image node, all inside the Trusted Computing Base (TCB)]
The workspace service has a WSRF frontend that allows users to deploy and manage virtual workspaces.
The VWS manages a set of nodes inside the TCB (typically a cluster); this is called the node pool.
Each node must have a VMM (Xen) installed, along with the workspace backend (software that manages individual nodes).
VM images are staged to a designated image node inside the TCB.
Deploying Workspaces
[Diagram: the VWS service stages a workspace (workspace metadata + resource allocation) from the image node to a pool node]
Adapter-based implementation model:
- Transport adapters: default scp, then gridftp
- Control adapters: default ssh (deprecated: PBS, SLURM)
- VW deployment adapter: Xen (previous versions: VMware)
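To make the adapter-based model concrete, here is a minimal Python sketch of how pluggable transport adapters might sit behind a common interface, with the service falling back from scp to gridftp. All class and method names are illustrative assumptions, not the actual VWS codebase, and the adapters only build command strings rather than executing them.

```python
# Hypothetical sketch of the adapter-based implementation model: the service
# core talks to interchangeable transport adapters and tries them in order.
from abc import ABC, abstractmethod


class TransportAdapter(ABC):
    """Moves a VM image from the image node to a pool node."""
    @abstractmethod
    def stage(self, image: str, node: str) -> str: ...


class ScpTransport(TransportAdapter):
    def stage(self, image: str, node: str) -> str:
        # In reality this would shell out to scp; here we just report the plan.
        return f"scp {image} {node}:/var/workspace/images/"


class GridFtpTransport(TransportAdapter):
    def stage(self, image: str, node: str) -> str:
        return f"globus-url-copy file://{image} gsiftp://{node}/workspace/images/"


class WorkspaceService:
    """Tries adapters in order (e.g. scp first, then gridftp)."""
    def __init__(self, transports: list[TransportAdapter]):
        self.transports = transports

    def deploy(self, image: str, node: str) -> str:
        for t in self.transports:
            try:
                return t.stage(image, node)
            except OSError:
                continue  # fall through to the next adapter
        raise RuntimeError("no transport adapter succeeded")


svc = WorkspaceService([ScpTransport(), GridFtpTransport()])
print(svc.deploy("atlas-es.img", "pool03"))
```

The same shape would apply to the control (ssh) and deployment (Xen) adapters: each backend change is confined to one adapter class.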
Interacting with Workspaces
[Diagram: clients querying the VWS service and interacting directly with workspaces running on the pool nodes inside the TCB]
The workspace service publishes information on each workspace as standard WSRF Resource Properties. Users can query those properties to find out information about their workspace (e.g., what IP the workspace was bound to).
Users can interact directly with their workspaces the same way they would with a physical machine.
Deployment Request Arguments
A workspace is composed of:
- VM image
- Workspace metadata: an XML document that includes deployment-independent information (VMM and kernel requirements, NICs + IP configuration, VM image location); need not change between deployments
- Resource allocation: specifies availability, memory, CPU%, and disk; changes during or between deployments
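The split between the two arguments can be sketched as plain data. This is not the real VWS XML schema; the field names and values below are illustrative assumptions that mirror the items listed above.

```python
# Illustrative sketch of a deployment request: deployment-independent
# workspace metadata, plus a resource allocation that may change
# during or between deployments. Field names are made up for illustration.
workspace_metadata = {
    "vmm": {"type": "Xen", "version": "3"},           # VMM requirement
    "kernel": "2.6-xenU",                             # kernel requirement
    "nics": [{"name": "eth0", "ip_config": "DHCP"}],  # NICs + IP configuration
    "image_location": "file:///images/atlas-es.img",  # VM image location
}

resource_allocation = {
    "availability": {"duration_s": 3600},  # when/how long the workspace runs
    "memory_mb": 896,
    "cpu_percent": 45,
    "disk_mb": 2048,
}

request = {"metadata": workspace_metadata, "allocation": resource_allocation}
print(request["allocation"]["cpu_percent"])  # → 45
```

Renegotiation (discussed later in the talk) only touches `resource_allocation`; the metadata stays fixed across deployments.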
Workspace Service Interfaces
[Diagram: the Workspace Factory Service's Create() operation authorizes a request carrying workspace metadata/image and a resource allocation, and instantiates a Workspace Resource Instance that clients can inspect, manage, and receive notifications from]
Workspace Factory Service: handles creation of workspaces; also publishes information on what types of workspaces it can support.
Workspace Service: handles management of each created workspace (start, stop, pause, migrate, inspecting VW state, ...).
Resource Properties publish the assigned resource allocation, how the VW was bound to metadata (e.g., IP address), duration, and state.
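The factory/instance split above can be sketched in a few lines of Python. This is a hypothetical illustration of the pattern, not the actual WSRF interface: class names, states, and the `properties` dict are assumptions.

```python
# Minimal sketch of the factory/instance pattern: a factory publishes what it
# supports and creates resources; each resource exposes queryable properties.
import itertools

_ids = itertools.count(1)


class WorkspaceResource:
    """One created workspace: manageable, with queryable Resource Properties."""
    def __init__(self, metadata: dict, allocation: dict):
        self.id = next(_ids)
        self.state = "Unpropagated"
        # Resource Properties: allocation, metadata bindings (e.g. IP), state.
        self.properties = {"allocation": allocation,
                           "ip": metadata.get("ip"),
                           "state": self.state}

    def start(self):
        self.state = self.properties["state"] = "Running"

    def stop(self):
        self.state = self.properties["state"] = "Shutdown"


class WorkspaceFactory:
    """Create() instantiates workspace resources after checking support."""
    supported = {"vmm": ["Xen"]}  # published: what workspaces it can support

    def create(self, metadata: dict, allocation: dict) -> WorkspaceResource:
        if metadata.get("vmm") not in self.supported["vmm"]:
            raise ValueError("unsupported VMM")
        return WorkspaceResource(metadata, allocation)


ws = WorkspaceFactory().create({"vmm": "Xen", "ip": "10.0.0.7"},
                               {"memory_mb": 896, "cpu_percent": 45})
ws.start()
print(ws.properties["state"])  # → Running
```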
Extensions to Resource Allocation
Edge Services Today
[Diagram: VO1 and VO2 clients share a single Compute Element (CE), implemented as GT GRAM; measured throughputs: 7.83 and 8 jobs per minute]
Job throughput is low as both VOs are equally impacted by the high VO1 traffic.
Both VOs share the same resource.
Allocating Resources for Edge Services
[Diagram: each VO's GRAM now runs in its own workspace deployed by the Workspace Service; measured throughputs: 4.18 and 22.36 jobs per minute]
Job throughput for VO2 is high as it is unimpacted by the high VO1 traffic.
Resource allocation for each VO workspace: MEM: 896 MB; CPU %: 45%; CPU arch: AMD Athlon. Dom0 CPU %: 10%.
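One way a CPU% allocation like the one above could be enforced is via a per-domain cap in the Xen credit scheduler; this is an assumption for illustration, since the slides do not name the exact enforcement mechanism. The sketch only builds the `xm sched-credit` command strings rather than invoking xm.

```python
# Sketch: translate a workspace's CPU% allocation into a Xen credit-scheduler
# cap (percent of one physical CPU). Domain names are hypothetical.
def credit_cap_cmd(domain: str, cpu_percent: int) -> str:
    # In the credit scheduler a cap of 0 means "uncapped", so a zero
    # allocation is rejected rather than silently lifting the limit.
    if not 0 < cpu_percent <= 100:
        raise ValueError("cpu_percent must be in (0, 100]")
    return f"xm sched-credit -d {domain} -c {cpu_percent}"


# 45% for each VO's GRAM workspace and 10% for dom0, as in the slide above.
for dom, pct in [("vo1-gram", 45), ("vo2-gram", 45), ("Domain-0", 10)]:
    print(credit_cap_cmd(dom, pct))
```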
Tracking Requests over Time
[Figure: "Comparison of Request Throughput over Time". X-axis: Time (in 30 second buckets); Y-axis: completed jobs, 0-30; series: VO1Client, VO2Client.]
- Histogram of request throughput
- Resource usage is enforced on an "as needed" basis
Increasing Load on VO1
[Figure: VO2 throughput under changing VO1 load conditions. X-axis: Time (in 30 second buckets); Y-axis: jobs completed, 0-16; series: 1mill-VO2, 2mill-VO2, 3mill-VO2.]
- Histogram of request throughput
- The load on VO1 increases 2x and 3x
- Request throughput for VO2 is unimpacted
Network Resource Allocation
- Processing network traffic requires CPU; in Xen, for both dom0 and guest domains
- CPU allocation tradeoffs: scheduling frequency
- The mechanism is general, save for direct drivers
[Diagram: network traffic flows through dom0 to the guest domains (domUs)]
Network Resource Allocation
Network allocation implementation:
- CPU allocations based on a parameter sweep, close to maximum bandwidth
- Linux network shaping tools
Negotiating network resource allocations:
- Policy: accepting only CPU allocations that match the bandwidth
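The negotiation policy above can be sketched as a lookup against parameter-sweep results: accept a requested bandwidth only if the offered CPU allocation covers the minimum CPU% measured to sustain it. The sweep numbers below are invented for illustration, loosely patterned on the 6%/4.1 MB/s and 14%/8.2 MB/s pairings shown in the following slides.

```python
# Sketch of the "CPU must match bandwidth" negotiation policy.
SWEEP = [  # (incoming MB/s, minimum CPU % measured to sustain it) — made up
    (4.1, 6),
    (8.2, 14),
    (16.4, 30),
]


def min_cpu_for_bandwidth(mbps: float) -> int:
    """Smallest swept CPU% whose measured bandwidth covers the request."""
    for bw, cpu in SWEEP:
        if mbps <= bw:
            return cpu
    raise ValueError("requested bandwidth exceeds the swept maximum")


def accept(request_mbps: float, offered_cpu_percent: int) -> bool:
    # Policy: accept only CPU allocations that match the bandwidth.
    return offered_cpu_percent >= min_cpu_for_bandwidth(request_mbps)


print(accept(8.2, 14))  # → True
print(accept(8.2, 6))   # → False
```

The bandwidth itself would then be enforced separately with Linux shaping tools, as the slide notes.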
Storage Element (SE) Edge Service
[Diagram: VO1 and VO2 GridFTP servers, each in its own workspace deployed by the Workspace Service]
Resource allocation for each GridFTP workspace: MEM: 128 MB; CPU %: 6%; CPU arch: AMD Athlon; NIC incoming: 4.1 MB/s. Dom0 CPU %: 22%.
Negotiating Bandwidth
Renegotiating CPU and Bandwidth
[Diagram: one GridFTP workspace's allocation is renegotiated from MEM: 128 MB, CPU %: 6%, NIC incoming: 4.1 MB/s to MEM: 128 MB, CPU %: 14%, NIC incoming: 8.2 MB/s; the other workspace's allocation is unchanged; Dom0 CPU %: 22%]
Renegotiating CPU and Bandwidth
Renegotiating CPU
[Diagram: the CPU allocation of one GridFTP workspace is renegotiated from 14% to 34% at an unchanged NIC incoming rate of 8.2 MB/s; the other workspace remains at CPU %: 6%, NIC incoming: 4.1 MB/s; MEM: 128 MB each; Dom0 CPU %: 22%]
Renegotiating CPU
Edge Services: Status
- OSG activity: www.opensciencegrid.org/esf
- Edge Services in use (database caches): ATLAS: mysql-gsi db built by the DASH project; CMS: frontier database
- Base image library: SDSC: SL3.0.3, FC4, CentOS4.1; FNAL: SL3.0.3, SL4, LTS 3, LTS 4
- Sites: production at SDSC; also testing at FNAL, UC, and ANL
Related Work
- Edge Service efforts: VO boxes (EGEE); APAC, static Edge Services; Grid-Ireland, static Edge Services
- OGF efforts: WS-Agreement, JSDL
- Managed services QoS with Xen: Padma Apparao, Intel (VTDC paper); Rob Gardner & team, HP (credit-based scheduler)
- Grid computing and virtualization: work at the University of Florida, Purdue, Northwestern, Duke, and others
Conclusions
- VM-based workspaces are a promising tool to implement "division of labor"
- Renegotiation is an important resource management tool: protocols; enforcement methods (dynamic reallocation, migration, etc.); aggregate resource allocations
- Different resource aspects influence each other
- More work on managing VM resources is needed