Division of Labor: Tools for Growing and Scaling Grids
Tim Freeman, Kate Keahey,
Ian Foster, Abhishek Rana, Frank Wuerthwein, Borja Sotomayor
12/05/06 ICSOC ‘06
Division of Labor
How can we implement division of labor in Grid computing?
The greatest improvements in the productive powers of labour, and the greater part of the skill, dexterity, and judgment with which it is anywhere directed, or applied, seem to have been the effects of the division of labour.
(Adam Smith)
requirements for an abstraction
tools to implement an abstraction
Overview
- Problem Definition: The Edge Service Use Case
- Workspace Service: overview of the workspace service; extensions to the workspace service
- Implementation and Evaluation: CPU enforcement; network enforcement
- Status of the Edge Services Project
- Conclusions
Providers and Consumers
Resource provider:
- Has a limited number of resources
- Has to balance the software needs of multiple users
- Has to provide a limited execution environment for security reasons

Resource consumers:
- Want the resources when they need them, and as much as they need
- Want to use specific software packages
- Want as much control as possible over resources
The Edge Service Use Case
[Diagram: a grid Site hosting an Edge Services Framework (ESF) for the CDF, CMS, ATLAS, and guest VOs, with Storage Elements (SE) and Compute Elements (CE)]
- GT4 Workspace Service & VMM
- Dynamically deployed ES wafers for each VO
- Wafer images stored in the SE
- Compute nodes and storage nodes
Edge Services: Challenges
- VO-specific Edge Services: each VO has very specific configuration requirements
- Resource management: the VOs would like to provide quality of service to their users; the resource needs of the VOs change dynamically
- Dynamic, policy-based deployment and management of Edge Services: updates, ephemeral edge services, infrastructure testing, short-term usage
Division of Labor Dimensions
- Environment and configuration isolation: critical from the provider's point of view if the VOs are to be allowed some independence
- Resource usage and accounting: application-independent; management along different resource aspects; dynamically renegotiable/adaptable
GT4 Workspace Service
The GT4 Virtual Workspace Service (VWS) allows an authorized client to deploy and manage workspaces on demand.
- GT4 WSRF front-end
- Leverages multiple GT services
- Currently implements workspaces as VMs; uses the Xen VMM, but others could also be used
- Current release: 1.2.1 (December 2006), http://workspace.globus.org
Workspace Service Usage Scenario
[Diagram: the VWS service node managing a pool of VMM nodes and a designated image node, all inside the Trusted Computing Base (TCB)]
The workspace service has a WSRF frontend that allows users to deploy and manage virtual workspaces.
The VWS manages a set of nodes inside the TCB (typically a cluster); this is called the node pool.
Each node must have a VMM (Xen) installed, along with the workspace backend (software that manages individual nodes).
VM images are staged to a designated image node inside the TCB.
Deploying Workspaces
[Diagram: the VWS service stages a workspace (workspace metadata + resource allocation) from the image node to a pool node]
Adapter-based implementation model:
- Transport adapters: default scp, then gridftp
- Control adapters: default ssh (deprecated: PBS, SLURM)
- VW deployment adapter: Xen (previous versions: VMware)
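To make the adapter-based model concrete, here is a minimal Python sketch of how pluggable transport adapters might sit behind a common interface, with the service falling back from scp to gridftp. All class and method names are illustrative assumptions, not the actual VWS codebase, and the adapters only build command strings rather than executing them.

```python
# Hypothetical sketch of the adapter-based implementation model: the service
# core talks to interchangeable transport adapters and tries them in order.
from abc import ABC, abstractmethod


class TransportAdapter(ABC):
    """Moves a VM image from the image node to a pool node."""
    @abstractmethod
    def stage(self, image: str, node: str) -> str: ...


class ScpTransport(TransportAdapter):
    def stage(self, image: str, node: str) -> str:
        # In reality this would shell out to scp; here we just report the plan.
        return f"scp {image} {node}:/var/workspace/images/"


class GridFtpTransport(TransportAdapter):
    def stage(self, image: str, node: str) -> str:
        return f"globus-url-copy file://{image} gsiftp://{node}/workspace/images/"


class WorkspaceService:
    """Tries adapters in order (e.g. scp first, then gridftp)."""
    def __init__(self, transports: list[TransportAdapter]):
        self.transports = transports

    def deploy(self, image: str, node: str) -> str:
        for t in self.transports:
            try:
                return t.stage(image, node)
            except OSError:
                continue  # fall through to the next adapter
        raise RuntimeError("no transport adapter succeeded")


svc = WorkspaceService([ScpTransport(), GridFtpTransport()])
print(svc.deploy("atlas-es.img", "pool03"))
```

The same shape would apply to the control (ssh) and deployment (Xen) adapters: each backend change is confined to one adapter class.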
Interacting with Workspaces
[Diagram: clients querying the VWS service and interacting directly with workspaces running on the pool nodes inside the TCB]
The workspace service publishes information on each workspace as standard WSRF Resource Properties. Users can query those properties to find out information about their workspace (e.g., what IP the workspace was bound to).
Users can interact directly with their workspaces the same way they would with a physical machine.
Deployment Request Arguments
A workspace is composed of:
- VM image
- Workspace metadata: an XML document that includes deployment-independent information (VMM and kernel requirements, NICs + IP configuration, VM image location); need not change between deployments
- Resource allocation: specifies availability, memory, CPU%, and disk; changes during or between deployments
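The split between the two arguments can be sketched as plain data. This is not the real VWS XML schema; the field names and values below are illustrative assumptions that mirror the items listed above.

```python
# Illustrative sketch of a deployment request: deployment-independent
# workspace metadata, plus a resource allocation that may change
# during or between deployments. Field names are made up for illustration.
workspace_metadata = {
    "vmm": {"type": "Xen", "version": "3"},           # VMM requirement
    "kernel": "2.6-xenU",                             # kernel requirement
    "nics": [{"name": "eth0", "ip_config": "DHCP"}],  # NICs + IP configuration
    "image_location": "file:///images/atlas-es.img",  # VM image location
}

resource_allocation = {
    "availability": {"duration_s": 3600},  # when/how long the workspace runs
    "memory_mb": 896,
    "cpu_percent": 45,
    "disk_mb": 2048,
}

request = {"metadata": workspace_metadata, "allocation": resource_allocation}
print(request["allocation"]["cpu_percent"])  # → 45
```

Renegotiation (discussed later in the talk) only touches `resource_allocation`; the metadata stays fixed across deployments.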
Workspace Service Interfaces
[Diagram: the Workspace Factory Service's Create() operation authorizes a request carrying workspace metadata/image and a resource allocation, and instantiates a Workspace Resource Instance that clients can inspect, manage, and receive notifications from]
Workspace Factory Service: handles creation of workspaces; also publishes information on what types of workspaces it can support.
Workspace Service: handles management of each created workspace (start, stop, pause, migrate, inspecting VW state, ...).
Resource Properties publish the assigned resource allocation, how the VW was bound to metadata (e.g., IP address), duration, and state.
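The factory/instance split above can be sketched in a few lines of Python. This is a hypothetical illustration of the pattern, not the actual WSRF interface: class names, states, and the `properties` dict are assumptions.

```python
# Minimal sketch of the factory/instance pattern: a factory publishes what it
# supports and creates resources; each resource exposes queryable properties.
import itertools

_ids = itertools.count(1)


class WorkspaceResource:
    """One created workspace: manageable, with queryable Resource Properties."""
    def __init__(self, metadata: dict, allocation: dict):
        self.id = next(_ids)
        self.state = "Unpropagated"
        # Resource Properties: allocation, metadata bindings (e.g. IP), state.
        self.properties = {"allocation": allocation,
                           "ip": metadata.get("ip"),
                           "state": self.state}

    def start(self):
        self.state = self.properties["state"] = "Running"

    def stop(self):
        self.state = self.properties["state"] = "Shutdown"


class WorkspaceFactory:
    """Create() instantiates workspace resources after checking support."""
    supported = {"vmm": ["Xen"]}  # published: what workspaces it can support

    def create(self, metadata: dict, allocation: dict) -> WorkspaceResource:
        if metadata.get("vmm") not in self.supported["vmm"]:
            raise ValueError("unsupported VMM")
        return WorkspaceResource(metadata, allocation)


ws = WorkspaceFactory().create({"vmm": "Xen", "ip": "10.0.0.7"},
                               {"memory_mb": 896, "cpu_percent": 45})
ws.start()
print(ws.properties["state"])  # → Running
```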
Extensions to Resource Allocation
Edge Services Today
[Diagram: VO1 and VO2 clients share a single Compute Element (CE), implemented as GT GRAM; measured throughputs: 7.83 and 8 jobs per minute]
Job throughput is low as both VOs are equally impacted by the high VO1 traffic.
Both VOs share the same resource.
Allocating Resources for Edge Services
[Diagram: each VO's GRAM now runs in its own workspace deployed by the Workspace Service; measured throughputs: 4.18 and 22.36 jobs per minute]
Job throughput for VO2 is high as it is unimpacted by the high VO1 traffic.
Resource allocation for each VO workspace: MEM: 896 MB; CPU %: 45%; CPU arch: AMD Athlon. Dom0 CPU %: 10%.
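One way a CPU% allocation like the one above could be enforced is via a per-domain cap in the Xen credit scheduler; this is an assumption for illustration, since the slides do not name the exact enforcement mechanism. The sketch only builds the `xm sched-credit` command strings rather than invoking xm.

```python
# Sketch: translate a workspace's CPU% allocation into a Xen credit-scheduler
# cap (percent of one physical CPU). Domain names are hypothetical.
def credit_cap_cmd(domain: str, cpu_percent: int) -> str:
    # In the credit scheduler a cap of 0 means "uncapped", so a zero
    # allocation is rejected rather than silently lifting the limit.
    if not 0 < cpu_percent <= 100:
        raise ValueError("cpu_percent must be in (0, 100]")
    return f"xm sched-credit -d {domain} -c {cpu_percent}"


# 45% for each VO's GRAM workspace and 10% for dom0, as in the slide above.
for dom, pct in [("vo1-gram", 45), ("vo2-gram", 45), ("Domain-0", 10)]:
    print(credit_cap_cmd(dom, pct))
```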
Tracking Requests over Time
[Figure: "Comparison of Request Throughput over Time". X-axis: Time (in 30 second buckets); Y-axis: completed jobs, 0-30; series: VO1Client, VO2Client.]
- Histogram of request throughput
- Resource usage is enforced on an "as needed" basis
Increasing Load on VO1
[Figure: VO2 throughput under changing VO1 load conditions. X-axis: Time (in 30 second buckets); Y-axis: jobs completed, 0-16; series: 1mill-VO2, 2mill-VO2, 3mill-VO2.]
- Histogram of request throughput
- The load on VO1 increases 2x and 3x
- Request throughput for VO2 is unimpacted
Network Resource Allocation
- Processing network traffic requires CPU; in Xen, for both dom0 and guest domains
- CPU allocation tradeoffs: scheduling frequency
- The mechanism is general, save for direct drivers
[Diagram: network traffic flows through dom0 to the guest domains (domUs)]
Network Resource Allocation
Network allocation implementation:
- CPU allocations based on a parameter sweep, close to maximum bandwidth
- Linux network shaping tools
Negotiating network resource allocations:
- Policy: accepting only CPU allocations that match the bandwidth
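The negotiation policy above can be sketched as a lookup against parameter-sweep results: accept a requested bandwidth only if the offered CPU allocation covers the minimum CPU% measured to sustain it. The sweep numbers below are invented for illustration, loosely patterned on the 6%/4.1 MB/s and 14%/8.2 MB/s pairings shown in the following slides.

```python
# Sketch of the "CPU must match bandwidth" negotiation policy.
SWEEP = [  # (incoming MB/s, minimum CPU % measured to sustain it) — made up
    (4.1, 6),
    (8.2, 14),
    (16.4, 30),
]


def min_cpu_for_bandwidth(mbps: float) -> int:
    """Smallest swept CPU% whose measured bandwidth covers the request."""
    for bw, cpu in SWEEP:
        if mbps <= bw:
            return cpu
    raise ValueError("requested bandwidth exceeds the swept maximum")


def accept(request_mbps: float, offered_cpu_percent: int) -> bool:
    # Policy: accept only CPU allocations that match the bandwidth.
    return offered_cpu_percent >= min_cpu_for_bandwidth(request_mbps)


print(accept(8.2, 14))  # → True
print(accept(8.2, 6))   # → False
```

The bandwidth itself would then be enforced separately with Linux shaping tools, as the slide notes.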
Storage Element (SE) Edge Service
[Diagram: VO1 and VO2 GridFTP servers, each in its own workspace deployed by the Workspace Service]
Resource allocation for each GridFTP workspace: MEM: 128 MB; CPU %: 6%; CPU arch: AMD Athlon; NIC incoming: 4.1 MB/s. Dom0 CPU %: 22%.
Negotiating Bandwidth
Renegotiating CPU and Bandwidth
[Diagram: one GridFTP workspace's allocation is renegotiated from MEM: 128 MB, CPU %: 6%, NIC incoming: 4.1 MB/s to MEM: 128 MB, CPU %: 14%, NIC incoming: 8.2 MB/s; the other workspace's allocation is unchanged; Dom0 CPU %: 22%]
Renegotiating CPU and Bandwidth
Renegotiating CPU
[Diagram: the CPU allocation of one GridFTP workspace is renegotiated from 14% to 34% at an unchanged NIC incoming rate of 8.2 MB/s; the other workspace remains at CPU %: 6%, NIC incoming: 4.1 MB/s; MEM: 128 MB each; Dom0 CPU %: 22%]
Renegotiating CPU
Edge Services: Status
- OSG activity: www.opensciencegrid.org/esf
- Edge Services in use (database caches): ATLAS: mysql-gsi db built by the DASH project; CMS: frontier database
- Base image library: SDSC: SL3.0.3, FC4, CentOS4.1; FNAL: SL3.0.3, SL4, LTS 3, LTS 4
- Sites: production at SDSC; also testing at FNAL, UC, and ANL
Related Work
- Edge Service efforts: VO boxes (EGEE); APAC, static Edge Services; Grid-Ireland, static Edge Services
- OGF efforts: WS-Agreement, JSDL
- Managed services QoS with Xen: Padma Apparao, Intel (VTDC paper); Rob Gardner & team, HP (credit-based scheduler)
- Grid computing and virtualization: work at the University of Florida, Purdue, Northwestern, Duke, and others
Conclusions
- VM-based workspaces are a promising tool to implement "division of labor"
- Renegotiation is an important resource management tool: protocols; enforcement methods (dynamic reallocation, migration, etc.); aggregate resource allocations
- Different resource aspects influence each other
- More work on managing VM resources is needed