
Page 1: Monitoring and Trouble Shooting on BioHPC

Updated for 2017-03-15
[web] portal.biohpc.swmed.edu
[email] biohpc-help@utsouthwestern.edu

Page 2: Why Monitoring & Troubleshooting

[Diagram labels: data, code]

Monitoring jobs running on the cluster
Understand how current HPC resources are used
Optimize usage to reach maximum capacity

Page 3: Why Monitoring & Troubleshooting

Try to understand whether the job is:
• CPU intensive
• Memory intensive
• I/O intensive
• A combination of the above

Try to figure out:
• Where the bottlenecks are
• How to boost computational efficiency
  - Complete more tasks during the available time window
  - Run an analysis with a larger data set in the same amount of time

Page 4: What to Monitor

CPU Usage: lscpu, pstree, top
Memory Usage: free, vmstat
I/O Usage: iostat
Network/Bandwidth: ifstat

First, start by profiling the application on an interactive node.
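A minimal sketch of starting an interactive SLURM session for profiling (the partition name and time limit below are assumptions; use the ones appropriate for your job):

% srun --partition=super --nodes=1 --time=2:00:00 --pty /bin/bash   # land on a compute node
% top                                                               # watch your application while it runs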

Page 5: CPU Usage

How do we achieve speedup on HPC? Increased frequencies and increased scalability.

lscpu: display information about the CPU architecture
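For example, to pull out the fields most relevant to sizing a job (exact field names can vary slightly between lscpu versions):

% lscpu
% lscpu | grep -E 'Model name|Socket|Core|Thread'   # model, sockets, cores per socket, threads per core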

Page 6: CPU Usage: command line tools

Job running on the compute node: astrocyte_cli test <workflow> align-bowtie-se.sh bowtie/1.0.04 samples

pstree: display a tree of processes

* You may also use the top and pstree commands to verify whether your job is running across multiple nodes
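A sketch of inspecting a running batch job from the login node (the node name is whatever squeue reports for your job):

% squeue -u $USER          # find the node(s) your job is running on
% ssh <node name>          # log in to that node
% pstree -p $USER          # show your processes, with PIDs, as a tree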

Page 7: CPU Usage: command line tools

top: display Linux tasks, provides a dynamic real-time view of a running system.
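For example:

% top -u $USER             # show only your own processes

Inside top, pressing 1 shows per-core utilization, H shows individual threads, and M sorts by memory usage.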

Page 8: Memory Usage: The Memory Hierarchy

[Figure: the memory hierarchy, from http://cse1.net/recaps/4-memory.html]

Page 9: Memory Usage: command line tools

free: displays the total amount of free and used physical and swap memory in the system, as well as the buffers used by the kernel

Mem (RAM): memory that can be used by currently running processes
Swap (virtual memory): used when physical memory (RAM) is full; constant swapping should be avoided
buffers: file system metadata
cached: pages holding the actual contents of files, kept for faster future access; not currently "used" memory
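For example (standard procps options):

% free -h                  # human-readable units
% free -g -s 10            # report in GB, repeating every 10 seconds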

Page 10: Memory Usage: command line tools

vmstat: (Virtual Memory Statistics) outputs instantaneous reports about your system's processes, memory, paging, block I/O, interrupts and CPU activity.
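For example, to sample every 5 seconds for one minute:

% vmstat 5 12

Watch the si/so columns for swapping and the wa column for time spent waiting on I/O.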

Page 11: Disk Usage & I/O

Parallel Filesystems on BioHPC

Advantages:
• scalability
• the capability to distribute large files across multiple nodes

Issues:
• inadequate I/O capability can severely degrade overall cluster performance

Page 12: Disk Usage & I/O: command line tools

iostat: generates reports that can be used to change system configuration to better balance the input/output load between physical disks.

%iowait is the percentage of time your processors are waiting on the disk
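For example, extended per-device statistics, three reports at 5-second intervals:

% iostat -x 5 3

%iowait appears in the avg-cpu section of each report.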

Page 13: Network/Bandwidth Usage

Key idea: minimize communication between nodes.

Page 14: Network/Bandwidth Usage: command line tools

ifstat: reports network bandwidth in a batch-style mode
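For example (the interface name is illustrative; running ifstat with no -i option reports all interfaces):

% ifstat 5                 # all interfaces, sampled every 5 seconds
% ifstat -i eth0 5 12      # one interface, 12 samples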

Page 15: All-in-One Tools

Too many tools?

All-in-one tools:
• Dstat
• Linux Collectl Profiler
• HPCTools

Page 16: Dstat: Versatile resource statistics tool

Dstat is a versatile replacement for vmstat, iostat, netstat and ifstat. http://dag.wiee.rs/home-made/dstat/
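For example (the plugin options on the second line require a reasonably recent dstat):

% dstat -cdnmy 5                  # CPU, disk, network, memory and system stats every 5 seconds
% dstat --top-cpu --top-mem 5     # also show the most expensive process by CPU and by memory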

Page 17: Linux Collectl Profiler

• Information from monitoring an application can help the user run it optimally

• Collectl is a tool that monitors a broad set of subsystems on a server while a user application is running on it

• It is helpful to know your application's usage of CPU, memory, disk, etc. to determine whether system resources are being stressed or over-utilized

• Many subsystems are available to monitor, in summary or detail; those of initial interest to a user running an application are:
  • CPU
  • Memory
  • Disk (Lustre)
  • InfiniBand
  • NFS usage
  • TCP summary

Page 18: collectl --showsubsys

Shows ALL subsystems that data can be collected for and plotted in Summary plots:

b - buddy info (memory fragmentation)
c - cpu
d - disk
f - nfs
i - inodes
j - interrupts by CPU
m - memory
n - network
s - sockets
t - tcp
x - interconnect (currently supported: OFED/Infiniband)
y - slabs

Page 19: collectl --showsubsys

Shows all subsystems whose collected data can be shown in Detailed plots:

C - individual CPUs, including interrupts if -sj or -sJ
D - individual Disks
E - environmental (fan, power, temp) [requires ipmitool]
F - nfs data
J - interrupts by CPU by interrupt number
M - memory numa/node
N - individual Networks
T - tcp details (lots of data!)
X - interconnect ports/rails (Infiniband/Quadrics)
Y - slabs/slubs
Z - processes
L - lustre

Page 20: Why Monitoring - Linux Collectl Profiler - Getting Lustre metrics

• In the script that you sbatch to run a job, execute collectl in the background:

#!/bin/bash
module add collectl/4.1.2
cd /project/biohpcadmin/s175049
mkdir test
collectl -scLmx -P -f /project/biohpcadmin/s175049/test &>/dev/null &
dd if=/dev/zero of=stripe4 bs=4M count=4096
kill %1

Data is collected for the subsystems listed in the -s option. Collectl data files are written to the user directory "test" above.

Page 21: Why Monitoring - Linux Colplot Visualizer

• View the data with Gnuplot, either while the job is running or after it has finished:

% colplot -dir /project/biohpcadmin/s175049/test -plot cpu,mem,inter,cltdet

% colplot -showplot

shows ALL the different arguments to -plot for displaying the plots you want.

• You may need to refine the timeline by specifying a specific timeframe to view:

% colplot -dir /project/biohpcadmin/s175049/test -plot cpu,mem,inter,cltdet -time 08:20-08:30

Page 22: Why Monitoring - Linux Collectl & Colplot

• Documentation with examples and tutorials:

collectl.sourceforge.net/Documentation.html

colplot.sourceforge.net/Documentation.html

• Collectl and colplot man pages:

linux.die.net/man/1/collectl

collectl-utils.sourceforge.net/coplot.html

Page 23: What's next

Page 24: Optimization: Use appropriate compiler options

Intel Math Kernel Library (MKL): a library of optimized math routines for science, engineering and financial applications.

• Basic Linear Algebra Subroutines (BLAS)
• LAPACK
• Fast Fourier Transform (FFT)
• Vector Math Library
• Built-in OpenMP multithreading (set OMP_NUM_THREADS > 1)

Modules with MKL on BioHPC:
• R/2.15.3-Intel
• R/3.3.2-gccmkl
• julia/0.4.6
• JAGS/4.2.0
• ...

Compile your own code against MKL using the -mkl compiler option (for detailed options see: https://software.intel.com/en-us/node/528512)
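A minimal sketch of compiling against MKL with the Intel compiler (the module name and source file are illustrative):

% module load intel                # illustrative module name; check 'module avail'
% icc -O2 -mkl myprog.c -o myprog  # link against MKL
% export OMP_NUM_THREADS=8         # let MKL's OpenMP threading use 8 threads
% ./myprog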

Page 25: Optimization: Load big data into memory to reduce I/O

[Figure: the same analysis on an 8 GB RAM node vs. a 256 GB RAM node; holding the data in memory on the large-memory node significantly reduces I/O]

Page 26: Optimization: Single-Instruction, Multiple-Data (SIMD)

Vector Processing Unit

Scalar loop:

for (i = 0; i < n; i++)
    A[i] = A[i] + B[i];

SIMD loop:

for (i = 0; i < n; i += 8)
    A[i : (i+8)] = A[i : (i+8)] + B[i : (i+8)];

* Each SIMD addition operates on 8 numbers at a time

"Intel® AVX data types allow packing of up to 32 elements in a register if bytes are used. The number of elements depends upon the element type: 8 single-precision floating point types or 4 double-precision floating point types."

https://software.intel.com/en-us/node/524040
https://en.wikipedia.org/wiki/SIMD

Another example is the GPU.
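Compilers can often generate such SIMD loops automatically; a sketch with a recent GCC (the source file name is illustrative):

% gcc -O3 -march=native -fopt-info-vec-optimized loop.c -o loop   # report which loops were vectorized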

Page 27: Optimization: GNU Parallel

A shell tool for executing jobs in parallel using one or more computers. It makes the best use of CPU resources with a balanced job load, keeping the CPUs active and thus saving time.

http://www.gnu.org/software/parallel/

If all jobs are independent of each other ...

• Predefine the job pool to match the total number of cores
• A new process is spawned as soon as one finishes
• module load parallel (a minimal usage sketch follows below)
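A minimal usage sketch (the gzip workload and file glob are illustrative; SLURM sets $SLURM_CPUS_ON_NODE inside a job):

% module load parallel
% parallel -j $SLURM_CPUS_ON_NODE gzip ::: *.txt   # one gzip per file, running as many at a time as there are cores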

Page 28: Optimization: Multithreading

If communication between jobs is needed ...

Libraries: pthread, OpenMP
Tools: phenix, bowtie2 (see the usage sketch below)

Shared memory

Advantages:
• user-friendly programming
• fast data sharing between tasks

Disadvantage:
• it is the programmer's responsibility to build the synchronization constructs that ensure "correct" access to shared memory
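A hedged sketch of running a multithreaded tool with one thread per allocated core (the index and sample file names are illustrative):

% export OMP_NUM_THREADS=$SLURM_CPUS_ON_NODE                                  # for OpenMP-based codes
% module load bowtie2
% bowtie2 -p $SLURM_CPUS_ON_NODE -x ref_index -U sample.fastq -S sample.sam   # bowtie2 takes its thread count via -p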

Page 29: Optimization: Shared Memory

Possible bottleneck:
• concurrent read: maybe
• concurrent write: no

http://www.delphicorner.f9.co.uk/articles/op4.htm

Modified from Figure 1 in https://developer.marklogic.com/blog/how-marklogic-supports-acid-transactions

Page 30: Optimization: Message Passing Interface

If communication between jobs is needed ... e.g., an MPI job running across multiple nodes

[Diagram: a master node exchanging messages with slave nodes 1, 2 and 3]

Page 31: Optimization: Message Passing Interface

Possible bottlenecks:
• communication cost
• unbalanced load

Decompose the dataset in a smart way to:
• minimize the overlaps (proportional to communication cost)
• balance the data between nodes

Example: METIS, a graph partitioning tool
http://glaros.dtc.umn.edu/gkhome/metis/metis/overview

What is the maximum speed-up you could achieve?

Page 32: Optimization: Multithreading & Message Passing

MPI + pthread

If you run a relion job across 2 nodes of the 256GB partition, you have 48 * 2 = 96 cores:

No. of MPI jobs    No. of threads    No. MPI * No. threads
2                  48                96
4                  24                96
8                  12                96
16                 6                 96

Q: Which combination has the shortest computation time?
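A hedged sbatch sketch for the first row of the table (2 MPI processes with 48 threads each); the module name is an assumption and the relion arguments other than --j are omitted:

#!/bin/bash
#SBATCH --partition=256GB
#SBATCH --nodes=2
#SBATCH --ntasks=2             # MPI processes
#SBATCH --cpus-per-task=48     # threads per MPI process
module load relion             # illustrative module name
mpirun -np 2 relion_refine_mpi --j 48 [other relion options]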

Page 33: Demo: Project Gutenberg "big data" reader

Data: 18792 books
Size: ≈ 10 GB
Type: plain text

Count the number of occurrences of the words: "dog", "cat", "boy", "girl"

Goal: complete as fast as possible by reducing bottlenecks and inefficiencies
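For reference, a rough serial shell equivalent of the task (Solution I style), assuming the books are .txt files under the current directory:

% find . -name '*.txt' -print0 | xargs -0 grep -h -o -i -w -e dog -e cat -e boy -e girl | sort | uniq -c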

Page 34: Demo: Project Gutenberg "big data" reader: Solution I (single-processor, many files)

[Diagram: file_00.txt, file_01.txt, file_02.txt, ... are each read from LUSTRE into node RAM, and CPU_00 counts the keywords, one file at a time]

Page 35: Demo: Project Gutenberg "big data" reader: Solution II (multi-processor, partition file set)

[Diagram: the file set is partitioned across processors: CPU_00 handles file_00-file_02, CPU_01 handles file_03-file_05, CPU_02 handles file_06-file_08, and CPU_03 handles file_09-file_11; each processor reads lines of text from LUSTRE into node RAM and counts the keywords]

Page 36: Demo: Project Gutenberg "big data" reader: Solution III (single-processor, one large file, chunked)

large_txt.bin (all text from all books in one large file)

[Diagram: file chunks are distributed from LUSTRE to RAM, then handed to CPU_00 in limited chunks (chunk_00, chunk_01, chunk_02, ...); CPU_00 counts the keywords in each chunk]

Page 37: Demo: Project Gutenberg "big data" reader: Solution IV (multiple-processors, one large file, chunked)

large_txt.bin (all text from all books in one large file)

[Diagram: all text is loaded from LUSTRE into node memory, the memory is partitioned into chunks across all processors, and CPU_00 through CPU_03 each count the keywords in their chunks]

Page 38: Demo: Project Gutenberg "big data" reader: Results

Solution I:    time python inefficient_reader.py                  7.2 min
Solution II:   time python multithreaded_inefficient_reader.py    2.0 min
Solution III:  time python efficient_reader.py                    3.5 min
Solution IV:   time python multithreaded_efficient_reader.py      0.7 min