Zellescher Weg 12

Trefftz-Building – HRSK/151

Phone +49 - 351 - 463 39871

Guido Juckeland ([email protected])

Center for Information Services and High Performance Computing (ZIH)

Introduction to High Performance Computing at ZIH

The LSF Batch System

Slide 2 - Guido Juckeland

Agenda

What is a batch system?

Batch queues on the Altix/Deimos

Host groups on Deimos

Starting, stopping, and monitoring batch jobs

Batch scripts

Slide 3 - Guido Juckeland

What is a batch system?

Slide 4 - Guido Juckeland

Concept of a batch system

[Diagram: Login-Host (compile, file transfers) -> submission of the batch job -> Master-Host -> execution of the batch job -> Compute-Hosts]

Slide 5 - Guido Juckeland

Deimos Batchsystem

[Diagram: several Login-Hosts (compile, file transfer) -> submission of the batch job -> Master-Host -> execution of the batch job -> compute nodes p1d001, p1d002, p1d003, ..., p2s256]

Slide 6 - Guido Juckeland

Altix Batchsystem

[Diagram: Login-Host (compile, file transfer) -> submission of the batch job -> Master-Host -> execution of the batch job -> partitions Jupiter, Saturn, Uranus, Mars]

Slide 7 - Guido Juckeland

What is a job?

A piece of work (e.g. a script, command, or application call) that was handed over to the control of the batch system (using "bsub")

The batch system determines:

– Execution time of the job (when?)

– Placement of the job on the hosts (where?)

– If needed, the interruption of the job

The batch system also does the accounting for jobs with respect to the available project CPU time.

Special case: Interactive Jobs

Slide 8 - Guido Juckeland

A job's life cycle

Slide 9 - Guido Juckeland

What is a host?

Deimos/Phobos:

– Login node

– Compute node

Altix:

– All 4 partitions

A host contains a number of job slots (number of sequential jobs that could be placed on a host):

– Deimos: 1 node -> 2-8 job slots (since 2-8 CPUs)

– Phobos: 1 node -> 2 job slots (since 2 CPUs)

– Altix: Mars -> 346 job slots (since 346 available CPUs); Jupiter, Saturn, Uranus -> 506 job slots (since 506 available CPUs)

Slide 10 - Guido Juckeland

What is a queue?

Queue = Line of things/people waiting for some event

Batch queue = Queue for batch jobs

Usually different queues have different limits with respect to max. run time or max. avail job slots

One batch queue for the whole system

– Advantage: Easy to administer

– Disadvantage: User has to specify all job parameters (CPU time, memory usage,…)

Multiple batch queues for the whole system (Altix, PC farm)

– Advantage: Easy to use for the user

– Disadvantage: More difficult to manage

Slide 11 - Guido Juckeland

Batch queues on the Altix/Deimos

Slide 12 - Guido Juckeland

Batch queues on Mars

Name          Users  CPUs        Time limit (default / max)  Hosts
interactive   All    1 - 32      12 h / 12 h                 All
ilr           ILR    1 - 768     12 h / 24 h                 All
small         All    1 - 63      12 h / 5 d                  All
intermediate  All    64 - 255    12 h / 5 d                  All
large         All    256 - 1866  12 h / 24 h                 All

Slide 13 - Guido Juckeland

Batch queues on Deimos

Name          Users           CPUs       Time limit (default / max)  Hosts
interactive   All             1 - 32     12 h / 12 h                 All
small         All             1 - 8      12 h / 5 d                  All
rtc           Selected users  1 - 2      200 h                       All but fat_quads
intermediate  All             9 - 127    12 h / 5 d                  All
large         All             128 - 256  12 h / 24 h                 All
nightexpress  MPI_CBG         1          4 h                         All
gauss         Gauss users     1 - 8      120 h                       fat_quads
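
The CPU ranges of the general-purpose Deimos queues can be turned into a small lookup. This is only a sketch: the function name pick_deimos_queue is made up for illustration, it covers only the queues small, intermediate, and large from the table, and it prints the bsub call instead of submitting anything.

```shell
#!/bin/sh
# Map a requested CPU count to a general-purpose Deimos queue,
# following the CPU ranges in the queue table.
# pick_deimos_queue is a hypothetical helper, not part of LSF.
pick_deimos_queue() {
    n=$1
    if [ "$n" -ge 1 ] && [ "$n" -le 8 ]; then
        echo small
    elif [ "$n" -ge 9 ] && [ "$n" -le 127 ]; then
        echo intermediate
    elif [ "$n" -ge 128 ] && [ "$n" -le 256 ]; then
        echo large
    else
        echo "no general-purpose queue for $n CPUs" >&2
        return 1
    fi
}

# Dry run: print the submission command for a 64-slot MPI job.
echo "bsub -n 64 -q $(pick_deimos_queue 64) -a openmpi mpirun.lsf ./a.out"
```

Special queues such as rtc, nightexpress, and gauss are restricted to certain users and are therefore left out of the lookup.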

Slide 14 - Guido Juckeland

Host groups on Deimos

Slide 15 - Guido Juckeland

Host groups on Deimos

p1_hosts - Phase 1 nodes

p2_hosts - Phase 2 nodes

single_hosts - Single CPU nodes

dual_hosts - Dual CPU nodes

quad_hosts - 16 GByte Quad CPU nodes

fat_quads - 32 GByte Quad CPU nodes

single{1|2}_hosts - Phase 1/2 Single CPU nodes

dual{1|2}_hosts - Phase 1/2 Dual CPU nodes

quad{1|2}_hosts - Phase 1/2 16 GByte Quad nodes

express_hosts - Nodes for the nightexpress queue

Slide 16 - Guido Juckeland

Starting, stopping, and monitoring batch jobs

Slide 17 - Guido Juckeland

Starting a batch job (1)

Command: bsub

Call with: bsub -n <n> [parameter] <command> [Command parameters]

Parameters:

– -n <n>: Number of job slots to use (CPUs)

– -q <queue name>: Selects a specific queue for the job

– -W <time>: Max. runtime of the job (format: H:MM)

– -e <file>: Redirects all error output (stderr) to "file"

– -o <file>: Redirects all standard output (stdout) to "file"

– -M <memory in KByte>: Max. amount of main memory needed by the job

– -m <host(group)>: Specifies a certain host (group) for the job execution

– -x: The job uses the execution host exclusively

– -Is: Interactive job (bsub -Is bash -l)
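
How several of these parameters combine on one command line can be sketched as follows. The helper build_bsub and the file names job.out and job.err are assumptions chosen for illustration; the function only prints the command, so it can be tried without access to LSF.

```shell
#!/bin/sh
# Assemble a bsub command line from the options listed on this slide.
# Nothing is submitted; the function only prints the command.
# build_bsub, job.out and job.err are hypothetical names.
build_bsub() {
    slots=$1      # -n: number of job slots
    queue=$2      # -q: queue name
    walltime=$3   # -W: maximum run time, format H:MM
    shift 3       # the remaining arguments are the command to run
    echo "bsub -n $slots -q $queue -W $walltime -o job.out -e job.err $*"
}

build_bsub 4 small 2:00 ./a.out
# prints: bsub -n 4 -q small -W 2:00 -o job.out -e job.err ./a.out
```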

Slide 18 - Guido Juckeland

Starting a batch job (2)

– -w done(<Job-ID>): Starts the job only after the job with ID <Job-ID> has finished successfully

Example:

juckel@deimos101:~> bsub pwd

Job <647938> is submitted to default queue <express>.

Sequential program:

– Altix/Phobos/Deimos/SX-6: bsub ./a.out

MPI-parallel program:

– Altix: bsub -n <n> pamrun ./a.out

– Deimos: bsub -n <n> -a openmpi mpirun.lsf ./a.out

OpenMP-parallel / Multithreaded program:

– Altix: bsub -R "span[hosts=1]" -n <n> ./a.out

– Deimos: bsub -R "span[hosts=1]" -n {1-8} ./a.out
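
A job dependency with -w needs the ID of the first job. One possible way to obtain it is to parse the confirmation line that bsub prints. The sketch below uses the sample line from this slide as a stand-in for real bsub output; job_id_from_bsub and ./postprocess.sh are hypothetical names.

```shell
#!/bin/sh
# Extract the numeric job ID from the confirmation line that bsub prints.
# job_id_from_bsub is a hypothetical helper, not part of LSF.
job_id_from_bsub() {
    sed -n 's/^Job <\([0-9][0-9]*\)> is submitted.*/\1/p'
}

# Stand-in for real bsub output, copied from the example on this slide.
# On the cluster this would be: first_id=$(bsub ./a.out | job_id_from_bsub)
first_id=$(echo 'Job <647938> is submitted to default queue <express>.' | job_id_from_bsub)

# Dry run: print the dependent submission instead of executing it.
# ./postprocess.sh is a hypothetical follow-up script.
echo "bsub -w \"done($first_id)\" ./postprocess.sh"
# prints: bsub -w "done(647938)" ./postprocess.sh
```

The quotes around done(...) keep the shell from interpreting the parentheses.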

Slide 19 - Guido Juckeland

Modifying a batch job

Command: bmod

Attention! Usually works only with pending jobs (PEND)

Call with: bmod [Parameters] <Job-ID>

Parameters:

– -n <n>: Number of job slots to use (CPUs)

– -q <queue name>: Selects a specific queue for the job

– -W <time>: Max. runtime of the job (format: H:MM)

– -e <file>: Redirects all error output (stderr) to "file"

– -o <file>: Redirects all standard output (stdout) to "file"

– -M <memory in KByte>: Max. amount of main memory needed by the job

– -m <host(group)>: Specifies a certain host (group) for the job execution

– -x: The job uses the execution host exclusively

Slide 20 - Guido Juckeland

Cancelling a batch job

Command: bkill

Call with: bkill <Job-ID>

Special cases:

– bkill 0 cancels all jobs of a user

– bkill sends only a SIGKILL to the application -> if the process executed by the job does not respond to that signal, LSF cannot abort the job

Slide 21 - Guido Juckeland

Modification of the order of pending jobs

Default: Execution order = Order of arrival

Commands: btop/bbot

Call with: btop/bbot <Job-ID>

Example:

juckel@deimos101:~> bjobs
JOBID   USER    STAT  QUEUE    FROM_HOST  EXEC_HOST  JOB_NAME  SUBMIT_TIME
647942  juckel  PEND  express  host                  pwd       Jun 15 10:33
647943  juckel  PEND  express  host                  pwd       Jun 15 10:33
647944  juckel  PEND  express  host                  pwd       Jun 15 10:33

juckel@deimos101:~> bbot 647942
Job <647942> has been moved to position 1 from bottom.

juckel@deimos101:~> bjobs
JOBID   USER    STAT  QUEUE    FROM_HOST  EXEC_HOST  JOB_NAME  SUBMIT_TIME
647943  juckel  PEND  express  host                  pwd       Jun 15 10:33
647944  juckel  PEND  express  host                  pwd       Jun 15 10:33
647942  juckel  PEND  express  host                  pwd       Jun 15 10:33

Slide 22 - Guido Juckeland

Monitoring a job status (1)

Command: bjobs

Call with: bjobs [Parameter] [Job-ID]

Parameters:

– without: Shows a list of all jobs of the user (or only the job with [Job-ID])

– -p [Job-ID]: Shows the reason why the job is pending

– -l [Job-ID]: Shows detailed information about the job with [Job-ID]

– -q <queuename>: Shows all of the user's jobs in the queue <queuename>

– -r: Shows only running jobs (status RUN)

Example:

juckel@deimos101:~> bjobs
JOBID   USER    STAT  QUEUE    FROM_HOST  EXEC_HOST  JOB_NAME  SUBMIT_TIME
647943  juckel  PEND  express  host                  pwd       Jun 15 10:33
647944  juckel  PEND  express  host                  pwd       Jun 15 10:33
647942  juckel  PEND  express  host                  pwd       Jun 15 10:33

Slide 23 - Guido Juckeland

Monitoring a job status (2)

Possible states of a job:

– PEND (pending): Job is waiting to be executed

– RUN (running): Job is currently being executed

– UNKN (unknown): Status of the job is unknown (usually happens when the job ran on a node that crashed)

– SUSP (suspended): Job was stopped
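
To get a quick overview of how many jobs are in each state, the bjobs listing can be summarized with awk. The sketch below feeds in the sample listing from the previous slide instead of calling bjobs; sample_bjobs and count_states are hypothetical helper names. It relies only on the state being the third column, which holds even when EXEC_HOST is empty.

```shell
#!/bin/sh
# Sample bjobs output, copied from the example on the previous slide.
sample_bjobs() {
    cat <<'EOF'
JOBID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
647943 juckel PEND express host pwd Jun 15 10:33
647944 juckel PEND express host pwd Jun 15 10:33
647942 juckel PEND express host pwd Jun 15 10:33
EOF
}

# Count jobs per state; the state is the third column, the first line is the header.
count_states() {
    awk 'NR > 1 { n[$3]++ } END { for (s in n) print s, n[s] }' | sort
}

sample_bjobs | count_states
# prints: PEND 3
```

On the cluster, the same summary would be obtained with: bjobs | count_states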

Slide 24 - Guido Juckeland

Information about a job after it has finished

Command: bhist

Call with: bhist -l <Job-ID>

Example:

juckel@deimos101:~> bhist -l 647942

Job <647942>, User <juckel>, Project <benchit>, Command <pwd>
Thu Jun 15 10:33:26: Submitted from host <host>, to Queue <express>, CWD <$HOME>;
Thu Jun 15 10:34:06: Job moved to position 1 relative to <bottom> by user or administrator <juckel>;
Thu Jun 15 10:39:19: Dispatched to <n31>;
Thu Jun 15 10:39:19: Starting (Pid 23143);
Thu Jun 15 10:39:19: Running with execution home </home/juckel>, Execution CWD </home/juckel>, Execution Pid <23143>;
Thu Jun 15 10:39:19: Done successfully. The CPU time used is 0.0 seconds;
Thu Jun 15 10:39:19: Post job process done successfully;

Summary of time in seconds spent in various states by Thu Jun 15 10:39:19
PEND  PSUSP  RUN  USUSP  SSUSP  UNKWN  TOTAL
353   0      0    0      0      0      353
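
The PEND value in the summary can be checked by hand from the timestamps in the listing: the job was submitted at 10:33:26 and dispatched at 10:39:19 on the same day. A minimal sketch of that arithmetic, with to_seconds as a hypothetical helper:

```shell
#!/bin/sh
# Convert a time of day in H:MM:SS format to seconds since midnight.
# awk treats "09" as decimal 9, so leading zeros are harmless.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

submitted=$(to_seconds 10:33:26)    # "Submitted from host" timestamp
dispatched=$(to_seconds 10:39:19)   # "Dispatched to" timestamp

echo "pending for $((dispatched - submitted)) seconds"
# prints: pending for 353 seconds
```

The result matches the 353 seconds reported in the PEND column.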

Slide 25 - Guido Juckeland

Displaying all jobs in the system (1)

Command: qstat -a

Example:

juckel@deimos101:~> qstat -a

...

small; type=BATCH; [ENABLED]; pri=20

10 run; 0 wait;

REQUEST NAME REQUEST ID USER STATE

1: CRANp_S2_PBE 639563 rbarthel RUNNING

2: ts_rekom_eta 643986 drees RUNNING

3: tricomplex_3 646038 drees RUNNING

4: BCPPp 647727 rbarthel RUNNING

5: iC_C20H9p 647930 rbarthel RUNNING

6: iA_C21OH10p 647931 rbarthel RUNNING

7: iC_C20H9 647934 rbarthel RUNNING

8: iA_C20H9 647935 rbarthel RUNNING

Slide 26 - Guido Juckeland

Displaying all jobs in the system (2)

Command: lsf_info (developed at ZIH)

Example:

juckel@deimos101:~> lsf_info

Running Jobs
-------------
JOBID   USER      PROJECT  QUEUE  PROC  START TIME    WALL TIME USED      UTILIZATION
647945  mlieber   ozon     large  64    15.Jun 10:43  0:01 of 0:15        44%
639563  rbarthel  nano1    small  1     13.Jun 05:15  2T/ 5h of 2T/23h    99%
643986  drees     akat     small  2     13.Jun 11:43  1T/23h of 3T/ 0h    43%
646038  drees     akat     small  2     14.Jun 16:35  18:09 of 3T/ 0h     11%
647727  rbarthel  nano1    small  1     14.Jun 21:07  13:37 of 2T/23h     97%
...

Pending Jobs
------------
JOBID   USER  PROJECT  QUEUE       SUBMITTED     PROC
601109  muel  zhr      stresstest  28.Mai 11:27  1
601110  muel  zhr      stresstest  28.Mai 11:27  1
...

Slide 27 - Guido Juckeland

System status

Command: nodestat (developed at ZIH)

Call with: nodestat [Hostgroup]

Example:

deimos101:~ # nodestat

--------------------------------------------------------------------------------

nodes available: 724/724 nodes damaged: 0

------------------------------------+-------------------------------------------

jobs running: 1576 | cores closed (exclusive jobs): 118

jobs wait: 909 | cores closed by ADMIN: 20

jobs suspend: 0 | cores working: 2300

jobs damaged: 0 |

------------------------------------+-------------------------------------------

normal working cores: 2576 cores free for jobs: 138

--------------------------------------------------------------------------------
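
In this sample output the core counts add up: working cores plus cores closed by exclusive jobs, cores closed by the admin, and free cores give the number of normal working cores. That these four categories partition the total is an assumption read off the numbers, not a documented property of nodestat. The check itself is a one-liner:

```shell
#!/bin/sh
# Core counts copied from the nodestat example above.
working=2300     # cores working
exclusive=118    # cores closed (exclusive jobs)
admin=20         # cores closed by ADMIN
free=138         # cores free for jobs

# Assumption: these four categories together make up the normal working cores.
echo "total: $((working + exclusive + admin + free))"
# prints: total: 2576
```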

Slide 28 - Guido Juckeland

Displaying the status of the batch queues (1)

Command: bqueues

Call with: bqueues [Parameter] [Queue name]

Parameters:

– without: Shows a summary of all queues or the specified queue

– -l [Queuename]: Shows detailed information about all queues or the specified queues

Example:

juckel@merkur:~> bqueues

QUEUE_NAME   PRIO  STATUS       MAX  JL/U  JL/P  JL/H  NJOBS  PEND  RUN  SUSP
interactive  30    Open:Active  60   60    1     -     0      0     0    0
large        25    Open:Active  124  124   1     -     1072   1072  0    0
gauss        20    Open:Active  64   64    1     32    24     0     24   0
small_long   20    Open:Active  120  120   1     -     4      0     4    0
small        20    Open:Active  120  120   1     120   84     0     84   0

...
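
A summary like this can be filtered to show only the queues that currently have waiting jobs. The sketch below works on the sample listing from this slide rather than on live bqueues output; sample_bqueues and queues_with_pending are hypothetical helper names, and PEND is taken to be the ninth column as in the header shown.

```shell
#!/bin/sh
# Sample bqueues output, copied from the example on this slide.
sample_bqueues() {
    cat <<'EOF'
QUEUE_NAME PRIO STATUS MAX JL/U JL/P JL/H NJOBS PEND RUN SUSP
interactive 30 Open:Active 60 60 1 - 0 0 0 0
large 25 Open:Active 124 124 1 - 1072 1072 0 0
gauss 20 Open:Active 64 64 1 32 24 0 24 0
small_long 20 Open:Active 120 120 1 - 4 0 4 0
small 20 Open:Active 120 120 1 120 84 0 84 0
EOF
}

# Print queue name and pending count for every queue with PEND > 0.
queues_with_pending() {
    awk 'NR > 1 && $9 > 0 { print $1, $9 }'
}

sample_bqueues | queues_with_pending
# prints: large 1072
```

On the cluster, the same filter would be applied with: bqueues | queues_with_pending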

Slide 29 - Guido Juckeland

Displaying the status of the batch queues (2)

Possible states of the queues:

– Active: Queue accepts and executes jobs

– Inactive: Queue accepts jobs but execution is delayed

– Closed: Queue does not accept jobs and jobs are not executed

Slide 30 - Guido Juckeland

Information about the compute hosts

Command: bhosts

Call with: bhosts (Altix), bhosts batch_hosts (Phobos, Deimos)

Example:

juckel@merkur:~> bhosts
HOST_NAME  STATUS  JL/U  MAX  NJOBS  RUN  SSUSP  USUSP  RSV
merkur     ok      -     124  42     42   0      0      0
venus      ok      -     124  106    106  0      0      0

Possible states of hosts:

– OK: Host accepts jobs

– Closed: Host does not accept jobs (host is either full or closed by the admin)

– Unknown: Host has crashed
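
The number of free job slots per host follows from the listing as MAX minus NJOBS. A small sketch of that calculation, run on the sample output from this slide; sample_bhosts and free_slots are hypothetical helper names, and only hosts in state ok are considered.

```shell
#!/bin/sh
# Sample bhosts output, copied from the example on this slide.
sample_bhosts() {
    cat <<'EOF'
HOST_NAME STATUS JL/U MAX NJOBS RUN SSUSP USUSP RSV
merkur ok - 124 42 42 0 0 0
venus ok - 124 106 106 0 0 0
EOF
}

# For every host in state "ok", print the host name and MAX - NJOBS.
free_slots() {
    awk 'NR > 1 && $2 == "ok" { print $1, $4 - $5 }'
}

sample_bhosts | free_slots
# prints:
# merkur 82
# venus 18
```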

Slide 31 - Guido Juckeland

Batch scripts

Slide 32 - Guido Juckeland

Layout

#!/bin/sh

#BSUB -n 4

#BSUB -q small

#BSUB -a openmpi

mpirun.lsf ./a.out
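
The same layout can carry further bsub parameters as #BSUB lines. The sketch below extends the script above with a run-time limit and output files; the values 2:00, job.out, and job.err are placeholders, and example_script is a hypothetical helper that only prints the script text, so nothing is submitted.

```shell
#!/bin/sh
# Print an example batch script. The #BSUB lines carry the same options
# that could otherwise be given on the bsub command line.
# 2:00, job.out and job.err are placeholder values.
example_script() {
    cat <<'EOF'
#!/bin/sh
#BSUB -n 4
#BSUB -q small
#BSUB -W 2:00
#BSUB -o job.out
#BSUB -e job.err
#BSUB -a openmpi

mpirun.lsf ./a.out
EOF
}

# List the options that bsub would read from the script.
example_script | grep '^#BSUB'
```

To use it, the script text would be saved to a file (for example with example_script > test.sh) and submitted with bsub < test.sh.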

Slide 33 - Guido Juckeland

Submission of a job with a batch script

Command: bsub

Call with: bsub [Parameters] < <batch script>

Example:

juckel@deimos101:~> bsub < test.sh

Job <647954> is submitted to queue <small>.

You need help later on?

• There are no stupid questions or requests!!

• Central drop off point: [email protected]

• Central information point: http://www.tu-dresden.de/zih/hpc

• Read our mail announcements: [email protected]

• Discuss your problem with other ZIH HPC users: [email protected]

Slide 34 - Guido Juckeland