HPC Resources at W&M and ...How To Use Them
February 5th, 2016
Eric J. Walter
● What is HPC?
● HPC ticket system
● What hardware is available?
● Linux Shell / Text editors
● How to find out about software that is available?
● Software Modules
● HPC file systems
● Using the batch system
What is High Performance Computing?
Combines aspects of the following:
● Advanced computer architecture: multi-processor machines / multi-core processors
● High-speed networks: Gigabit Ethernet / Infiniband
● Mass storage: short and medium term storage
● Numerical methods: understand how to run efficiently
● Parallel algorithms: utilize multiple cores / processors / nodes
Today’s desktops/laptops are getting more powerful. Simply running your application on HPC resources may not benefit you!
[Figure: Node / Processor / Core]
SciClone (host names end in .sciclone.wm.edu)
Typhoon (typhoon): 4 cores / node; 72 nodes -> 288 cores; 2 or 6 GB mem / core; Opteron Santa-Rosa; DDR Infiniband
Vortex (vortex): 12 cores / node; 36 nodes -> 432 cores; 2.7 or 10.7 GB mem / core; Opteron Seoul; FDR Infiniband
Hurricane (hurricane, whirlwind):
8 cores / node; 52 nodes -> 416 cores; 8 or 24 GB mem / core; Intel Westmere; QDR Infiniband
8 cores / node; 12 nodes -> 96 cores; 6 GB mem / core; Intel Westmere; QDR Infiniband
Other servers for filesystems / backup etc.: Twister, Gulfstream, Tornado, Gale
● SciClone has 1232 cores total
● 2 Intel & 2 Opteron clusters
● Typhoon uses SLES 10 – rest RHEL 6.2
● Total usage for May 2015 : 74%
● hurricane cluster has 2xM2075 GPUs (Fermi)
● change all passwords on Hurricane
Linux Shell Usage
http://www.hpc.wm.edu/SciCloneTutorials/LinuxUnix - page of Linux tutorials
Main things to learn about Linux/shell:
● learn to log in and out of machines
● manipulating files/folders
● configuring your environment
● compiling (?)
● running jobs
Questions to ask yourself:
● what software do I want loaded by default?
● do I need to compile software? Is it available?
● where can I store files and data?
● where should I run my jobs?
Linux Text Editors
Popular text editors: emacs or vi/vim
emacs: huge, bloated, not installed by default, but the champion!
vi/vim: tiny, always available, some users love it
nano: editor with training wheels, very easy to use, not very powerful
Emacs Tutorials
http://www.gnu.org/software/emacs/tour/
http://www.jesshamrick.com/2012/09/10/absolute-beginners-guide-to-emacs/
http://www2.lib.uchicago.edu/keith/tcl-course/emacs-tutorial.html
Vim Tutorials
http://www.vim.org/
http://vim.wikia.com/wiki/Tutorial
http://linuxconfig.org/vim-tutorial
Nano homepage
http://www.nano-editor.org/
Logging into HPC machines
Must use a Secure Shell (SSH) client:
● Linux / Mac – built-in (terminal)
● Windows – SSH Secure Shell Client / PuTTY
[ewalter@particle ~]$ ssh hurricane.sciclone.wm.edu
Password:
Last login: Tue Feb 2 13:57:59 2016 from particle.hpc.wm.edu
--------------------------------------------------------------------------------
 William and Mary Information Technology / SciClone Cluster
...
1 [hurricane]
Questions to ask yourself:
● Am I on or off campus? If you are off campus, log into stat.wm.edu first using your W&M username and password.
● Is my username the same as on my current machine? If it is different, use: ssh <username>@<host>.<domain>
● Do I need graphics? If yes, then log in with -X.
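The login questions above map directly onto the ssh command line. The following is a minimal sketch that only prints the command it would run; the function name sciclone_ssh_cmd and the username jdoe are hypothetical, and the host name follows the <host>.sciclone.wm.edu pattern used in this talk.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the ssh command for a SciClone login.
# usage: sciclone_ssh_cmd <username> <host> [x11]
sciclone_ssh_cmd() {
    opts=""
    # -X enables X11 forwarding, needed only if you want graphics
    if [ "${3:-}" = "x11" ]; then
        opts="-X "
    fi
    printf 'ssh %s%s@%s.sciclone.wm.edu\n' "$opts" "$1" "$2"
}

# Username differs from the one on the local machine, no graphics
sciclone_ssh_cmd jdoe hurricane
# Same login with X11 forwarding for graphical programs
sciclone_ssh_cmd jdoe hurricane x11
```

From off campus, the same command would be typed after first logging into stat.wm.edu.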
Software Modules
http://www.hpc.wm.edu/SciCloneTutorials/ModuleBasics - online module help
15 [hurricane] module avail
---------------------------------- /usr/local/Modules/modulefiles -----------------
acml/5.1.0/gcc            mpc/0.8.2(default)
acml/5.1.0/pgi            mpfr/2.4.2
acml-int64/5.1.0/gcc      mpfr/2.4.2a(default)
acml-int64/5.1.0/pgi      mpi4py/1.3.1/gcc
acml-mp/5.1.0/gcc         mummer/3.23
acml-mp/5.1.0/pgi         mvapich2-ib/1.2x1/pgi
acml-mp-int64/5.1.0/gcc   mvapich2-ib/1.9/gcc
acml-mp-int64/5.1.0/pgi   mvapich2-ib/1.9/gcc-4.8.4
admb/11.2b/gcc            mvapich2-ib/1.9/pgi
allpathslg/47017/gcc      mvapich2-ib/1.9a2/gcc
. . . . . . . .
16 [hurricane] module list
Currently Loaded Modulefiles:
 1) modules         3) torque/2.3.7    5) gcc/4.7.0             7) acml/5.1.0/gcc
 2) maui/3.2.6p21   4) isa/nehalem     6) mvapich2-ib/1.9/gcc
Best to change your startup modules...
Configuring your Environment
$PLATFORM variable:
11 [hurricane] echo $PLATFORM
rhel6-xeon
This means that startup on hurricane is controlled by .cshrc.rhel6-xeon.
The module load and module unload commands change modules for the current shell only. It is best to set your modules in the startup file.
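As a rough sketch of trying a module in the current shell before committing it to a startup file: the example is written in sh rather than the tcsh used on the cluster, the fallback value for PLATFORM and the .cshrc.<platform> naming pattern are assumptions generalized from the hurricane case, and mummer/3.23 is simply one entry from the module avail listing.

```shell
#!/bin/sh
# PLATFORM is set by the cluster login environment; the fallback value is
# only here so the sketch also runs on other machines.
platform=${PLATFORM:-rhel6-xeon}

# Assumption: the startup file is named .cshrc.<platform>, as it is for hurricane.
startup_file="$HOME/.cshrc.$platform"
echo "startup file for this platform: $startup_file"

if command -v module >/dev/null 2>&1; then
    module load mummer/3.23      # affects the current shell only
    module list                  # confirm what is loaded now
    module unload mummer/3.23    # undo the change
else
    echo "module command not found; run this on a cluster front end"
fi
```

Once a module proves useful, the corresponding module load line would go into the startup file so that every new shell gets it.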
Wiki
URL: www.hpc.wm.edu/SciClone/Home
Software
FACT: Software is requested and installed more quickly than documentation can be written.
Submit a ticket for any software requests. Please make sure you need the software.
HPC Filesystems / Backup
3 types of filesystems:
home – backed up nightly ; small ; used for input files, code, etc.
data – backed up weekly ; large files ; medium term storage
scratch – NOT backed up ; large output files ; short term storage
/sciclone/data20 is currently our fastest fileserver – gets a lot of use
Use local scratch if you can!
146 [hurricane] df -h
Filesystem                        Size  Used Avail Use% Mounted on
...
/dev/mapper/VolGroup30-LogVol31   917G  482G  390G  56% /sciclone/home00
tn00:/usr/local                   134G  113G   15G  89% /usr/local
tn00:/export                       46G   15G   30G  33% /import
mh00:/var/spool/mail              7.9G  4.3G  3.3G  57% /var/spool/mail
gfs00:/sciclone/home04            591G  429G  157G  74% /sciclone/home04
ty00:/sciclone/scr02              273G   81M  273G   1% /sciclone/scr02
tn00:/sciclone/scr10              7.9G  2.3G  5.3G  31% /sciclone/scr10
tn00:/sciclone/scr30               17T   13T  4.7T  72% /sciclone/scr30
gfs00:/sciclone/data10             16T   15T  1.9T  89% /sciclone/data10
gfs00:/sciclone/vims20             26T   19T  6.7T  74% /sciclone/vims20
tw00-i8:/sciclone/data20           73T   57T   16T  79% /sciclone/data20
/dev/md1                          8.1T  5.0T  2.8T  65% /sciclone/scr20
vx00:/sciclone/home10             2.7T  202M  2.6T   1% /sciclone/home10
vx00:/sciclone/scr00              318G  4.8G  297G   2% /sciclone/scr00
Using the Batch System
http://www.hpc.wm.edu/SciCloneTutorials/LaunchJobs
HPC uses Torque (PBS) to schedule and run jobs
Nodes are selected via the nodespec
qsub – submits the job to the batch system
27 [hurricane] qsub -I -l walltime=30:00 -l nodes=1:x5672:ppn=8
qsub: waiting for job 1552781.ty00.sciclone.wm.edu to start
qsub: job 1552781.ty00.sciclone.wm.edu ready
11 [wh01]
Here x5672 is the nodespec. An interactive job puts you on a node, ready to work.
Other nodespecs:
typhoon : c9
hurricane / whirlwind : x5672
vortex : c18x
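To illustrate how the nodespec changes from one subcluster to another, here is a small sketch that prints an interactive qsub command. The function name interactive_qsub_cmd is hypothetical; the nodespecs and cores per node come from the slides, while requesting every core of a single node (ppn equal to cores per node) is an assumption made for the example.

```shell
#!/bin/sh
# Hypothetical helper: print the interactive qsub command for a subcluster.
interactive_qsub_cmd() {
    case "$1" in
        typhoon)             spec=c9;    ppn=4  ;;
        hurricane|whirlwind) spec=x5672; ppn=8  ;;
        vortex)              spec=c18x;  ppn=12 ;;
        *) echo "unknown subcluster: $1" >&2; return 1 ;;
    esac
    # one whole node for 30 minutes, same walltime as the example above
    echo "qsub -I -l walltime=30:00 -l nodes=1:${spec}:ppn=${ppn}"
}

interactive_qsub_cmd vortex
```

The printed line can be pasted at a front-end prompt; smaller ppn values request only part of a node.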
Using the Batch System
You can also submit a batch job, which does not run interactively.
First you must write a batch script:
34 [hurricane] cat run
#!/bin/tcsh
#PBS -N test
#PBS -l nodes=1:x5672:ppn=8
#PBS -l walltime=0:10:00
#PBS -j oe

cd $PBS_O_WORKDIR

./a.out

#!/bin/tcsh          interpret the following in tcsh syntax
-N                   name of the job
-l                   job specifications (walltime ; nodespec)
-j                   combine stderr and stdout
cd $PBS_O_WORKDIR    cd to where I submitted the job
./a.out              run the job

35 [hurricane] qsub run
qsub, qdel, qstat – most widely used batch commands
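The batch workflow can be sketched end to end as follows. This example only writes the script from the slide into a temporary directory and prints the commands one would type next; nothing is submitted, and <jobid> stands for whatever identifier qsub returns.

```shell
#!/bin/sh
# Write the batch script to a file, then show the submit/monitor/cancel commands.
workdir=$(mktemp -d)
script="$workdir/run"

# The quoted 'EOF' keeps $PBS_O_WORKDIR literal, so it is expanded
# only when the job itself runs.
cat > "$script" <<'EOF'
#!/bin/tcsh
#PBS -N test
#PBS -l nodes=1:x5672:ppn=8
#PBS -l walltime=0:10:00
#PBS -j oe

cd $PBS_O_WORKDIR

./a.out
EOF

echo "wrote $script"
echo "submit:  qsub $script"
echo "monitor: qstat -u \$USER"
echo "cancel:  qdel <jobid>"
```

Because of -j oe, stdout and stderr of the job end up in a single output file.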
WE'RE HERE TO HELP (within reason)
Where to get help?
HPC webpage (soon to be old wiki page): http://www.hpc.wm.edu/SciClone/Home
HPC ticket system mail: [email protected]
Using the ticket system is useful since it is monitored by three of us
Compilers
All platforms have GNU and Portland C/C++/FORTRAN compilers.
Check the Tutorials - Subcluster Essentials / Using XX pages for the flags suggested for optimization.
It is important to know the ISA (instruction set architecture) of the target processor.
Available HPC hardware and software
http://www.hpc.wm.edu/SciClone/Home - Wiki homepage
http://www.hpc.wm.edu/SciClone/Software - Software list (out of date)
http://www.hpc.wm.edu/SciCloneTutorials/ModuleBasics - Software modules
http://www.hpc.wm.edu/SciCloneTutorials/Home - Tutorials page
http://www.hpc.wm.edu/SciCloneUserGuide/Home - User's guide
http://www.hpc.wm.edu/SciClone/Hardware - Hardware components
http://www.hpc.wm.edu/SciClone/presentations/ClusterOverview_permissions.pdf
Recent hardware overview:
Storm (host names end in .hpc.wm.edu)
ice01, ice02: 48 cores / 96 GB, 32 cores / 64 GB; Opteron Magny-Cours; QDR Infiniband
snow: 8 cores / node; 15 nodes -> 120 cores; 2 GB mem / core; Opteron Shanghai; DDR/QDR Infiniband
wind: 16 cores / node; 26 nodes -> 416 cores; 2 GB mem / core; Opteron Magny-Cours; QDR Infiniband
rain: 4 cores / node; 98 nodes -> 392 cores; 2-8 GB mem / core; Opteron Santa-Rosa; DDR Infiniband
hail: 8 cores / node; 9 nodes -> 72 cores; 2-8 GB mem / core; Opteron Shanghai; QDR Infiniband
gust: other servers for filesystems / backup etc.
● Storm has 1080 cores total
● All Opteron
● RHEL 6.2
● Total usage for May 2015 : 67%
● Has Lustre filesystem

Chesapeake (host names end in .hpc.vims.edu)
potomac: 12 cores / node; 30 nodes -> 360 cores; 2.7 GB mem / core; Opteron Seoul; QDR Infiniband
in1, in2
Chickahominy, York, Rappahannock: other servers for filesystems / backup etc.
● Chesapeake has 380 cores total
● All Opteron
● RHEL 6.2
● Total usage for May 2015 : 32%
● Still need to incorporate Indian nodes
Sharing Files & Folders : Permissions I
See https://www.nersc.gov/users/storage-and-file-systems/unix-file-permissions for more information.
Print working directory:
33 [hurricane] pwd
/sciclone/home04/ewalter
List directory:
34 [hurricane] ls -ld results
drwx------ 2 ewalter hpcf 512 Jun 16 14:37 results
List files in directory:
35 [hurricane] ls -l results
total 3
-rw------- 1 ewalter hpcf 194 Jun 16 14:37 ww.dat
-rw------- 1 ewalter hpcf 194 Jun 16 14:37 yy.dat
-rw------- 1 ewalter hpcf 194 Jun 16 14:37 zz.dat
In each listing line, a leading d marks a directory, and the next nine characters are the permissions for user, group, and other, in that order:
r - read
w - write
x - execute
ewalter is the associated user and hpcf is the associated group.
** directories need to be ‘x’ to be entered/passed through
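To see the same fields without reading them off an ls listing, the sketch below builds a private directory and file in a temporary location and prints the permission string, associated user, and associated group for each. It assumes GNU coreutils (stat -c), as found on Linux; the temporary paths are created only for the demonstration.

```shell
#!/bin/sh
# Create a private directory and file, then inspect their permission fields.
demo=$(mktemp -d)
mkdir "$demo/results"
touch "$demo/results/ww.dat"
chmod 700 "$demo/results"          # rwx for user, nothing for group/other
chmod 600 "$demo/results/ww.dat"   # rw- for user, nothing for group/other

# %A = permission string, %U = associated user, %G = associated group
dir_mode=$(stat -c '%A' "$demo/results")
file_mode=$(stat -c '%A' "$demo/results/ww.dat")
stat -c '%A %U %G %n' "$demo/results" "$demo/results/ww.dat"
```

The two permission strings match the private directory and files in the listings above.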
http://www.hpc.wm.edu/SciClone/presentations/Permissions.pdf
Sharing Files & Folders : Permissions II
I want to share my results directory and files with those in the seadas group.
Am I in the seadas group?
46 [hurricane] groups ewalter
ewalter : hpcf wmall hpcstaff www seadas vasp wm sysadmin hpcadmin hpsmh
Change the group for the directory and all files below:
47 [hurricane] chgrp -R seadas results/
49 [hurricane] ls -ld results/
drwx------ 2 ewalter seadas 512 Jun 16 14:37 results/
50 [hurricane] ls -l results/
total 3
-rw------- 1 ewalter seadas 194 Jun 16 14:37 ww.dat
-rw------- 1 ewalter seadas 194 Jun 16 14:37 yy.dat
-rw------- 1 ewalter seadas 194 Jun 16 14:37 zz.dat
Make files group readable, and folders group readable and executable:
51 [hurricane] chmod g+rX -R results/
52 [hurricane] ls -ld results/
drwxr-x--- 2 ewalter seadas 512 Jun 16 14:37 results/
53 [hurricane] ls -l results/
total 3
-rw-r----- 1 ewalter seadas 194 Jun 16 14:37 ww.dat
-rw-r----- 1 ewalter seadas 194 Jun 16 14:37 yy.dat
-rw-r----- 1 ewalter seadas 194 Jun 16 14:37 zz.dat
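The two-step recipe (chgrp -R, then chmod -R g+rX) can be tried safely in a scratch directory. In this sketch the user's primary group stands in for seadas so that it runs on any Linux machine; that substitution, the temporary paths, and the use of GNU stat -c are assumptions of the example.

```shell
#!/bin/sh
# Sharing recipe in a scratch directory; starts from private (umask 077) files.
umask 077
demo=$(mktemp -d)
mkdir "$demo/results"
touch "$demo/results/ww.dat"

share_group=$(id -gn)                    # stand-in; use seadas on the cluster
chgrp -R "$share_group" "$demo/results"  # step 1: hand the whole tree to the group
# step 2: group read everywhere; X adds execute only on folders
# (and on files that are already executable)
chmod -R g+rX "$demo/results"

dir_mode=$(stat -c '%A' "$demo/results")
file_mode=$(stat -c '%A' "$demo/results/ww.dat")
echo "$dir_mode results"
echo "$file_mode results/ww.dat"
```

The resulting modes should match listings 52 and 53: the folder becomes enterable by the group while the data files become readable but not executable.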
Sharing Files & Folders : Permissions III
What happens when a new file is created?
78 [hurricane] cd results/
80 [hurricane] touch newfile
81 [hurricane] ls -l
total 3
-rw------- 1 ewalter hpcf 0 Jun 16 15:13 newfile
-rw-r----- 1 ewalter seadas 194 Jun 16 14:37 ww.dat
-rw-r----- 1 ewalter seadas 194 Jun 16 14:37 yy.dat
-rw-r----- 1 ewalter seadas 194 Jun 16 14:37 zz.dat
This is bad since the new file isn’t in the seadas group!
Add “setgid” to the folder only:
88 [hurricane] chmod g+s results/
89 [hurricane] cd results/
91 [hurricane] touch newfile
92 [hurricane] ls -l
total 3
-rw------- 1 ewalter seadas 0 Jun 16 15:14 newfile
-rw-r----- 1 ewalter seadas 194 Jun 16 14:37 ww.dat
-rw-r----- 1 ewalter seadas 194 Jun 16 14:37 yy.dat
-rw-r----- 1 ewalter seadas 194 Jun 16 14:37 zz.dat
The file’s group is now inherited from the folder it is in.
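A quick way to confirm the setgid behaviour is sketched below: it sets the bit on a scratch folder, creates a file inside, and compares the two groups. The temporary paths are for demonstration only, and GNU stat -c on Linux is assumed.

```shell
#!/bin/sh
# setgid on a folder makes new files inherit the folder's group.
demo=$(mktemp -d)
mkdir "$demo/results"
chmod 750 "$demo/results"
chmod g+s "$demo/results"        # add setgid to the folder only

touch "$demo/results/newfile"

# the 's' in the group triplet of the mode string marks setgid
dir_mode=$(stat -c '%A' "$demo/results")
dir_group=$(stat -c '%G' "$demo/results")
file_group=$(stat -c '%G' "$demo/results/newfile")
echo "$dir_mode $dir_group results"
echo "newfile group: $file_group"
```

As listing 92 shows, setgid fixes only the group: the new file's permissions are still decided by the umask, so it stays unreadable to the group until the umask or the file mode is changed.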
Sharing Files & Folders : Permissions IV
The user’s umask controls what permissions files and folders are given when created:
93 [hurricane] umask
77
HPC default umask is 077
Can be changed in your startup file (.cshrc etc.)
135 [hurricane] touch file077
136 [hurricane] ls -l file077
-rw------- 1 ewalter seadas 0 Jun 16 15:24 file077
137 [hurricane] umask 022
138 [hurricane] touch file022
139 [hurricane] ls -l file0*
-rw-r--r-- 1 ewalter seadas 0 Jun 16 15:24 file022
-rw------- 1 ewalter seadas 0 Jun 16 15:24 file077
140 [hurricane] umask 027
141 [hurricane] touch file027
142 [hurricane] ls -l file0*
-rw-r--r-- 1 ewalter seadas 0 Jun 16 15:24 file022
-rw-r----- 1 ewalter seadas 0 Jun 16 15:24 file027
-rw------- 1 ewalter seadas 0 Jun 16 15:24 file077
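The pattern in these listings follows a simple rule: a new file starts from mode 666 and the umask bits are removed. The sketch below checks that rule for the three umask values used above; it works in a temporary directory and assumes GNU stat -c.

```shell
#!/bin/sh
# Compare the mode a new file actually gets with 666 minus the umask bits.
demo=$(mktemp -d)
results=""
for mask in 077 022 027; do
    # the command substitution is a subshell, so the umask change stays local
    actual=$(umask "$mask"; touch "$demo/file$mask"; stat -c '%a' "$demo/file$mask")
    # the leading 0 makes the shell read both numbers as octal
    expected=$(printf '%o' $(( 0666 & ~$mask )))
    echo "umask $mask -> file mode $actual (expected $expected)"
    results="$results $mask:$actual:$expected"
done
```

Folders follow the same rule starting from 777, which is why umask 027 yields drwxr-x--- for new directories.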
Common Permissions Tasks I
Change the permissions of a directory: chmod
24 [hurricane] ls -ld VASP
drwx------ 4 ewalter hpcf 512 Apr 4 2014 VASP
25 [hurricane] chmod go+rX VASP
26 [hurricane] ls -ld VASP
drwxr-xr-x 4 ewalter hpcf 512 Apr 4 2014 VASP
27 [hurricane] chmod o-rX VASP
28 [hurricane] ls -ld VASP
drwxr-x--- 4 ewalter hpcf 512 Apr 4 2014 VASP
29 [hurricane] chmod g-rX VASP
30 [hurricane] ls -ld VASP
drwx------ 4 ewalter hpcf 512 Apr 4 2014 VASP
Change the permissions of a directory and everything under it: chmod -R
32 [hurricane] ls -ld VASP
drwx------ 4 ewalter hpcf 512 Apr 4 2014 VASP
33 [hurricane] ls -l VASP
total 52457
-rw-r----- 1 ewalter hpcf 22932904 Apr 4 2014 potpaw_LDA.52.tar.gz
-rw-r----- 1 ewalter hpcf 25958479 Apr 4 2014 potpaw_PBE.52.tar.gz
drwxr-x--- 2 ewalter hpcf 15360 Aug 6 00:01 vasp.5.3
34 [hurricane] chmod -R g+rX VASP
35 [hurricane] ls -ld VASP
drwxr-x--- 4 ewalter hpcf 512 Apr 4 2014 VASP
36 [hurricane] ls -l VASP
total 52457
-rw-r----- 1 ewalter hpcf 22932904 Apr 4 2014 potpaw_LDA.52.tar.gz
-rw-r----- 1 ewalter hpcf 25958479 Apr 4 2014 potpaw_PBE.52.tar.gz
drwxr-x--- 2 ewalter hpcf 15360 Aug 6 00:01 vasp.5.3
Use this command if you want to allow group access to your home, scrXX, and dataXX directories
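The capital X in these chmod commands matters when working recursively. The following sketch contrasts g+rX with g+rx on a scratch copy of a similar directory layout; the file name potpaw.tar.gz is a hypothetical stand-in, and GNU stat -c is assumed.

```shell
#!/bin/sh
# Capital X vs lowercase x in a recursive chmod.
umask 077
demo=$(mktemp -d)
mkdir -p "$demo/VASP/vasp.5.3"
touch "$demo/VASP/potpaw.tar.gz"   # hypothetical data file

chmod -R g+rX "$demo/VASP"
mode_X=$(stat -c '%A' "$demo/VASP/potpaw.tar.gz")   # data file stays non-executable

chmod -R g+rx "$demo/VASP"
mode_x=$(stat -c '%A' "$demo/VASP/potpaw.tar.gz")   # data file is now group-executable

sub_mode=$(stat -c '%A' "$demo/VASP/vasp.5.3")      # folders get x either way
echo "g+rX: $mode_X   g+rx: $mode_x   subfolder: $sub_mode"
```

Using X keeps plain data files from being marked executable while still letting group members enter every folder.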
Common Permissions Tasks II
How do I change the initial permissions that files and folders are given when created? umask
HPC default umask is 077
You need to edit your .cshrc file in your home directory and add one of:
umask 077    files get “-rw-------”    folders get “drwx------”
umask 027    files get “-rw-r-----”    folders get “drwxr-x---”
umask 022    files get “-rw-r--r--”    folders get “drwxr-xr-x”
What groups am I in? groups
52 [hurricane] groups ewalter
ewalter : hpcf wmall hpcstaff www seadas vasp sysadmin wm hpcadmin wheel hpsmh
My primary (default) group is hpcf and the rest are secondary
Change a group associated with a file or directory: chgrp
54 [hurricane] ls -ld project
drwx------ 2 ewalter hpcf 512 Aug 18 20:47 project
55 [hurricane] chgrp hpcstaff project
56 [hurricane] ls -ld project
drwx------ 2 ewalter hpcstaff 512 Aug 18 20:47 project
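Since a regular user can only chgrp to a group they belong to, it can help to check membership first. This is a minimal sketch under that assumption; the function name in_group is hypothetical, and the user's primary group is used as the target so the example runs anywhere.

```shell
#!/bin/sh
# List your groups and only chgrp to a group you belong to.
primary=$(id -gn)     # primary (default) group
all=$(id -Gn)         # primary plus secondary groups, space separated
echo "primary: $primary"
echo "all:     $all"

# succeed if the current user is a member of group $1
in_group() {
    for g in $(id -Gn); do
        [ "$g" = "$1" ] && return 0
    done
    return 1
}

demo=$(mktemp -d)
mkdir "$demo/project"
if in_group "$primary"; then
    chgrp "$primary" "$demo/project"
else
    echo "not a member of $primary; ask for membership through the ticket system" >&2
fi
```

On the cluster, the target would be a shared group such as hpcstaff or seadas rather than the primary group.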