BQS Update

Architecture
Scheduling
Job Types

New user needs
More users & machines, scalability issues
Need for more sophisticated monitoring and control over the system

[email protected] HEPIX Edinburgh 26/5/04

Architecture Update

[Diagram] Components: Client, BQS Scheduler, MySQL DB, DB Agents, Workers.
Arrow labels: submit, query, spawn, report, results.

Scheduling Update

Scheduler
Resources
Quasi Interactive Jobs

Scheduler

More Control for Operation and Administration:

Weight of Past Resource Usage and Group Objectives
Max Job Duration
Small Jobs Bias
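The slide names the tuning knobs but not how they are combined. As an illustration only, the sketch below assumes a fair-share style score in which a group below its objective is favoured, short jobs get a bonus, and requests longer than the maximum duration are refused. The names `SchedulerConfig` and `priority`, the default values and the formula itself are hypothetical and are not taken from BQS.

```python
from dataclasses import dataclass


@dataclass
class SchedulerConfig:
    # All three knobs are assumptions modelled on the slide's bullet points.
    usage_weight: float = 1.0          # weight of past resource usage
    small_job_bias: float = 0.5        # bonus given to short jobs
    max_job_duration: float = 86400.0  # seconds; longer requests are refused


def priority(cfg, group_objective, group_past_usage, requested_duration):
    """Return a score (higher runs first), or None if the job is too long.

    group_objective and group_past_usage are fractions of the farm (0..1).
    """
    if requested_duration > cfg.max_job_duration:
        return None
    # A group that has used less than its objective gets a positive term.
    fair_share = cfg.usage_weight * (group_objective - group_past_usage)
    # The shorter the job, the larger its share of the small-job bonus.
    shortness = 1.0 - requested_duration / cfg.max_job_duration
    return fair_share + cfg.small_job_bias * shortness
```

With such a score, an operator could shift the balance between fairness and throughput by changing `usage_weight` and `small_job_bias` without touching the rest of the scheduler.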

Logical Resources

Beyond Traditional Resources
  E.g. Disk, Time, Memory

Logical Resources
  Name
  Max Available
  Restricted Flag

Admin Defined Resources
  E.g. HPSS
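A logical resource as described here can be pictured as a named counter with a ceiling. The sketch below is an assumption about the semantics, not BQS code: `Max Available` is read as the number of units that may be in use at once, and `Restricted Flag` as "only explicitly allowed users may request it". The class name `LogicalResource` and its methods are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LogicalResource:
    name: str
    max_available: int        # assumed: units usable at the same time
    restricted: bool = False  # assumed: True means permission is required
    in_use: int = 0

    def try_acquire(self, amount, user_is_allowed=True):
        """Reserve `amount` units; return False if the request cannot be met."""
        if self.restricted and not user_is_allowed:
            return False
        if self.in_use + amount > self.max_available:
            return False
        self.in_use += amount
        return True

    def release(self, amount):
        """Give back units when a job ends, never dropping below zero."""
        self.in_use = max(0, self.in_use - amount)
```

For example, an admin-defined `hpss` resource with `max_available=2` would let two jobs through and hold the third until one of them calls `release(1)`.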

Logical U_Resource

Created & managed by users:

Choose the name: u_XXX
Receive the privilege bqs.u_XXXadmin
Set Max Available and Restricted Flag
Grant/deny the bqs.u_XXXusage privilege

Privilege Management

A General Service, APIs, Commands To: Grant, Deny, Check & List Privileges

Given to Users, Groups and Machines

E.g. in BQS, applid: bqs.admin, bqs.oper, bqs.spawn_forbidden
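The four operations listed above (grant, deny, check, list) can be sketched as a small in-memory registry. This is a hypothetical illustration: the class `PrivilegeService`, its method names and the reading of "deny" as "withdraw a grant" are assumptions, and the real service's API is not shown in the slides.

```python
class PrivilegeService:
    """Toy privilege registry keyed by (privilege, subject kind, subject name)."""

    def __init__(self):
        self._grants = set()

    def grant(self, privilege, kind, name):
        # kind is one of "user", "group" or "machine" (assumed vocabulary).
        self._grants.add((privilege, kind, name))

    def deny(self, privilege, kind, name):
        # Assumed meaning: remove a previously granted privilege.
        self._grants.discard((privilege, kind, name))

    def check(self, privilege, user=None, groups=(), machine=None):
        """True if the user, one of the groups, or the machine holds it."""
        subjects = [("user", user), ("machine", machine)]
        subjects += [("group", g) for g in groups]
        return any(
            (privilege, kind, name) in self._grants
            for kind, name in subjects
            if name is not None
        )

    def list_privileges(self, kind, name):
        """Return the sorted privileges held directly by one subject."""
        return sorted(p for p, k, n in self._grants if (k, n) == (kind, name))
```

Under these assumptions, granting `bqs.oper` to a group makes `check("bqs.oper", user="alice", groups=["ops"])` succeed for any member of that group without a per-user entry.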

New Job Types

Parallel Jobs
Arborescent Jobs
GRID Jobs

Parallel Jobs

2 new submit options: proc, ptype

proc: Number of WorkPoints
ptype: PVM, MPICH, LAM-MPI
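The two options can be thought of as a small validated record attached to a job request. The sketch below is hypothetical: the function `check_parallel_options` is not a BQS command or API, and the slides do not show the actual submit syntax. Only the option names and the three `ptype` values come from the slide.

```python
ALLOWED_PTYPES = {"PVM", "MPICH", "LAM-MPI"}  # values listed on the slide


def check_parallel_options(proc, ptype):
    """Validate the two parallel-job options and return them as a dict."""
    if not isinstance(proc, int) or proc < 1:
        raise ValueError("proc must be a positive number of WorkPoints")
    if ptype not in ALLOWED_PTYPES:
        raise ValueError(f"ptype must be one of {sorted(ALLOWED_PTYPES)}")
    return {"proc": proc, "ptype": ptype}
```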

Parallel Jobs

[Diagram] Components: Client, BQS Master, MySQL DB, BQS DB Agent, DB Agent, Workers.
Arrow labels: submit …, query, spawn parallel job, spawn task, report, global report, results.

Arborescent Job

SNOVAE: many related small tasks need a short global response time

Schedule and spawn as one Job to reduce BQS latency

Runs on a number of WorkPoints

User must describe task dependencies
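Once the user has described which task depends on which, running the whole tree as one job amounts to releasing tasks in dependency order, a few at a time. The sketch below uses Python's standard `graphlib.TopologicalSorter` (Python 3.9+) to group tasks into waves no larger than the number of WorkPoints. The function `plan_waves` and the task names are hypothetical, and this is not how BQS itself orders tasks.

```python
from graphlib import TopologicalSorter


def plan_waves(deps, workpoints):
    """Group tasks into waves of at most `workpoints` tasks.

    `deps` maps each task to the set of tasks it depends on. A task only
    appears in a wave after all of its dependencies are in earlier waves.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        # Tasks whose dependencies are all done; sorted for stable output.
        ready = sorted(sorter.get_ready())
        for i in range(0, len(ready), workpoints):
            waves.append(ready[i:i + workpoints])
        sorter.done(*ready)
    return waves


# Hypothetical task tree: two calibrations after a fetch, then a fit, then a plot.
deps = {
    "calib_a": {"fetch"},
    "calib_b": {"fetch"},
    "fit": {"calib_a", "calib_b"},
    "plot": {"fit"},
}
print(plan_waves(deps, workpoints=2))
# [['fetch'], ['calib_a', 'calib_b'], ['fit'], ['plot']]
```

The waves are a simplification: a real scheduler would start a task as soon as its own dependencies finish instead of waiting for the whole wave.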

GRID Jobs

Real and Generic Accounts

AFS Tokens and Certificates

Specific RH 7.3 + LCG Soft Full Production Farm
(Currently a Specific “lcg” Logical Test Farm for Validation)

Other Projects

BIO: Quasi Interactive Jobs
Installation and documentation for LCG