The Linux Command Line & Shell Scripting

Updated for 2017-11-18

[web] portal.biohpc.swmed.edu

[email] biohpc-help@utsouthwestern.edu

Study Resources : A Free Book

* Some of the material covered in today’s training is from this book (500+ pages)

Study Resources : Tutorial Website

Study Resources : Cheat Sheets

On the portal: Training -> Slides & Handouts

Study Resources : Man pages

Get a command’s help page: man <command>

Press q to exit the man page

Study Resources : Follow Along…

You can follow along using:

1. the Nucleus Web terminal on the BioHPC portal: https://portal.biohpc.swmed.edu/terminal/ssh/

2. PuTTY or SSH from your PC

3. Terminal from your MacBook

ssh <username>@nucleus.biohpc.swmed.edu

The Shell and Command Line Interfaces

The interaction between the user and the operating system is provided by a shell.

• Graphical User Interfaces & Command Line Interfaces

• By far the most common shell on Linux is bash (Bourne Again SHell)

• It has a lot of built-in commands

[Figure: an annotated shell prompt and command, labeling the user name, machine name, current directory, name of the command, options/switches, and arguments]

Linux Basics: The File System

Files on a Linux system are arranged in a hierarchical directory structure.

[Figure: directory tree rooted at /, with top-level directories apps, bin, home2, project, work, … and tmp; the lower levels are labeled {uid}, {department}, {PI_lab}, {uid or project_name} and shared]

Linux Command Line: Files and Directories

Relative path vs. absolute path (full path)

• Relative path: a path relative to the present working directory (pwd)

If my current directory is /home2/ydu, the relative path is: Training

pwd will return: /home2/ydu

If I change directory to /home2, the relative path becomes: ydu/Training

pwd will return: /home2

• Absolute path: specifies the location of a file or directory from the root directory (/)

/home2/ydu/Training

[Figure: directory tree / -> home2 -> ydu -> Training, with the current location marked "(I am here)"]
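The same idea can be tried safely in a scratch area. This is a minimal sketch: the `sandbox` directory is an assumption made for illustration (it mimics the /home2/ydu/Training layout under a temporary folder), so nothing in a real home directory is touched.

```shell
#!/bin/bash
# Sandbox that mirrors the slide's layout under a temporary directory.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/home2/ydu/Training"

cd "$sandbox/home2/ydu"
cd Training                        # relative path, resolved from .../home2/ydu
from_ydu=$(pwd)

cd "$sandbox/home2"
cd ydu/Training                    # relative path, resolved from .../home2
from_home2=$(pwd)

cd /
cd "$sandbox/home2/ydu/Training"   # absolute path, works from anywhere
from_root=$(pwd)

# All three routes end in the same directory.
printf '%s\n' "$from_ydu" "$from_home2" "$from_root"
```

The relative paths only work from the right starting directory; the absolute path does not care where you are.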

Linux Command Line: Files and Directories

Editors

vi / vim
Cryptic commands! Cheat sheet on the portal.
Quick tutorial: http://www.washington.edu/computing/unix/vi.html

Emacs
An extensible, customizable text editor
Quick tutorial: http://www.gnu.org/software/emacs/tour/

nano
Easier to use.
Quick tutorial: http://mintaka.sdsu.edu/reu/nano.html

Any text editor from your PC or Mac
Mount your directories as network drives: https://portal.biohpc.swmed.edu/content/guides/biohpc-cloud-storage/

Demo

Example 1: Manage Files/Folders
Example 2: Wildcards
Example 3: Disk Usage
Example 4: Create Shared Folder/Files
Example 5: Text File Manipulation with Linux Tools (view/find patterns/sort/replace)

• as important as knowing how to use Linux itself
• popular and convenient in BioData processing and analysis

Example 1: Manage Files/Folders

Command                                 What does it do?
pwd                                     Print Working Directory: where am I?
ls                                      List files in current directory
ls -lah                                 List files with detail (l = long format, a = all files including hidden, h = human readable file sizes)
cd Training                             Change directory to today’s training folder
mkdir example1                          Make a directory called example1
touch example1/newfile.txt              Create an empty new file inside the example1 folder
ls -lah example1                        List files inside the example1 directory
mkdir new_example                       Make the directory new_example
cp example1/newfile.txt new_example/    Copy newfile.txt from example1 into new_example
cp -r example1 newer_example            Copy example1 recursively (directory and everything inside) to a directory called newer_example
mv new_example old_example              Move/rename new_example directory to old_example
rmdir old_example                       Try to remove old_example (fails because the folder is not empty)
cd old_example                          Change directory into old_example
ls -alh                                 List files with detail
cd ..                                   Change directory to parent folder
rmdir old_example                       Delete an empty folder (rmdir only succeeds once the folder is empty)
rm -r newer_example                     Recursively delete newer_example and all files and folders inside it
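The difference between rmdir and rm -r is easiest to see in a throwaway folder. A minimal sketch, assuming a temporary `sandbox` directory (an illustrative name, not part of the training materials):

```shell
#!/bin/bash
sandbox=$(mktemp -d)
cd "$sandbox"

mkdir old_example
touch old_example/newfile.txt

# rmdir refuses to delete a directory that still has content.
if rmdir old_example 2>/dev/null; then
    first_try="removed"
else
    first_try="refused"
fi
echo "first rmdir: $first_try"

rm old_example/newfile.txt         # empty the directory first
rmdir old_example                  # now rmdir succeeds

mkdir -p newer_example/sub
touch newer_example/sub/file.txt
rm -r newer_example                # rm -r deletes the directory and everything inside
```

rm -r does not ask for confirmation and there is no recycle bin, so it is worth double-checking the path before pressing Enter.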

Example 2: Wildcards

*     Match any number of characters

ls *                          Any file
ls notes*                     Any file beginning with notes
ls *.txt                      Any file ending in .txt
ls *2015*                     Any file with 2015 somewhere in its name

?     Match a single character

ls data_00?.txt               Matches data_001.txt, data_002.txt, data_00A.txt etc.

[]    Match a set of characters (bracket expression)

ls data_00[0123456789].txt
ls data_00[0-9].txt           Matches data_000.txt – data_009.txt, not data_00A.txt
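The patterns can be checked against a handful of empty files. A minimal sketch: the file names below are made up for the demonstration and live in a temporary `sandbox` directory.

```shell
#!/bin/bash
sandbox=$(mktemp -d)
cd "$sandbox"
touch data_001.txt data_002.txt data_00A.txt data_010.txt notes_2015.txt

any_single=$(echo data_00?.txt)        # ? matches exactly one character
digits_only=$(echo data_00[0-9].txt)   # [0-9] matches exactly one digit
with_2015=$(echo *2015*)               # * matches any run of characters

echo "?      -> $any_single"
echo "[0-9]  -> $digits_only"
echo "*2015* -> $with_2015"
```

The shell expands the wildcard before the command runs, which is why echo is enough to preview what ls (or rm) would receive.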

Example 3: Disk Usage

quota: display disk usage and limits (How much space do I have?)

du: estimate file space usage (How much space am I taking up?)

du can report the total folder size or the detailed usage for each sub-folder.
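The slide's screenshots are not preserved, so the following is only a sketch of typical du invocations, not the exact commands shown in the training. The `sandbox/project` folder and its contents are made up for illustration.

```shell
#!/bin/bash
sandbox=$(mktemp -d)
mkdir -p "$sandbox/project/raw" "$sandbox/project/results"
head -c 200000 /dev/urandom > "$sandbox/project/raw/big.bin"
echo "summary" > "$sandbox/project/results/summary.txt"

# Total folder size: -s summarizes, -h prints human-readable units.
du -sh "$sandbox/project"

# Detailed usage for each sub-folder, one level deep.
du -h -d 1 "$sandbox/project"

# Same total in plain kilobytes, convenient inside scripts.
total_kb=$(du -sk "$sandbox/project" | cut -f1)
echo "total: ${total_kb} KB"
```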

Example 3: Disk Usage

Space usage: 0.07T (mysandybox) vs. 0.02T (mysandybox.tar.gz)
Number of files: 30613 (mysandybox)

Archiving files and folders reduces both space usage and the number of files.
We plan to limit the number of files for each user in the future.

Example 3: Disk Usage

Command                               What does it do?
gunzip mydata.txt.gz                  Extract a .gz gzipped file
tar xvf archive.tar                   Extract (x) everything from a tar archive file (f). Show verbose output (v).
cd ..                                 Change directory into parent directory
du -sh example3                       Estimate space usage of folder example3
cd example3                           Change directory into example3
tar cvzf new.tar.gz file1 folder1     Create (c) a compressed (z) tar.gz archive file (f), adding files and folders to it.
gzip mydata.txt                       Compress a file using gzip; it becomes a .gz file
cd ..                                 Change directory into parent directory
du -sh example3                       Estimate space usage of folder example3
tar zxvf example3/archive.tar.gz      Extract a compressed gzipped tar archive in one step
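A full round trip (archive, list, extract, compress, decompress) can be rehearsed on dummy data. A minimal sketch, assuming a temporary `sandbox` directory and made-up file names:

```shell
#!/bin/bash
sandbox=$(mktemp -d)
cd "$sandbox"
mkdir -p example3/folder1
printf 'line %s\n' 1 2 3 > example3/file1
cp example3/file1 example3/folder1/copy.txt

# Create (c) a gzip-compressed (z) archive file (f) from the folder.
tar czf example3.tar.gz example3

# List (t) the archive contents without extracting anything.
tar tzf example3.tar.gz

# Extract (x) into a separate directory chosen with -C.
mkdir restored
tar xzf example3.tar.gz -C restored

# gzip replaces the file with a .gz version; gunzip reverses it.
gzip example3/file1
gunzip example3/file1.gz
```

Listing with t before extracting is a cheap way to see where the files will land.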

Permissions

ls -l

drwxr-xr-x  4 dtrudgian biohpc_admin   58 Feb 16 15:13 all_training
drwxr-xr-x  7 dtrudgian biohpc_admin  140 Feb 12 10:36 Apps
drwxr-xr-x  2 dtrudgian biohpc_admin   26 Feb 16 15:12 cli_training
drwxr-xr-x  8 dtrudgian biohpc_admin 4.0K Feb 16 14:25 Cluster_Installs
drwxr-xr-x  3 dtrudgian biohpc_admin 4.0K Feb 16 11:49 Desktop
drwxr-xr-x  2 dtrudgian biohpc_admin   10 Feb 16 14:10 Documents
drwxr-xr-x  9 dtrudgian biohpc_admin  135 Feb 16 14:32 Downloads
-rw-r--r--  1 dtrudgian biohpc_admin  336 Feb 16 15:16 error.txt
drwxr-xr-x 10 dtrudgian biohpc_admin 4.0K Feb  9 12:45 Git
drwxr-xr-x 17 dtrudgian biohpc_admin 4.0K Feb 16 15:17 ownCloud
drwxr-xr-x  2 dtrudgian biohpc_admin   10 Feb 16 14:18 Pictures
drwxr-xr-x  5 dtrudgian biohpc_admin  102 Feb  4 11:19 portal_jobs

The first column shows the permissions, the third column the owner, and the fourth column the group.

File Permissions

Octal Permissions

r = 4
w = 2
x = 1

Add up the permissions you need for each class, e.g.
rx = 5
rw = 6
rwx = 7

-rw-r--r-- 1 dtrudgian biohpc_admin 336 Feb 16 15:16 error.txt

6 4 4
Owner can read+write
Group can read
Others can read
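The addition can be done by the shell itself and then checked against what the file system reports. A minimal sketch, assuming GNU coreutils on Linux (`stat -c` is the GNU form) and a scratch file in a temporary `sandbox` directory:

```shell
#!/bin/bash
sandbox=$(mktemp -d)
cd "$sandbox"
touch error.txt

r=4; w=2; x=1
full=$((r + w + x))    # rwx = 7
owner=$((r + w))       # rw- = 6
group=$((r))           # r-- = 4
others=$((r))          # r-- = 4
mode="${owner}${group}${others}"

chmod "$mode" error.txt
applied=$(stat -c '%a' error.txt)   # GNU stat: print the mode in octal
echo "requested $mode, file now has $applied"
ls -l error.txt
```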

Permissions

chmod u/g/a +/- r/w/x filename

Class:
u = user (owner)
g = group
a = all

+ Add permission
- Remove permission

r read
w write
x execute

chmod g+rw test.txt    Add read/write permissions for the group
chmod a+x script.sh    Add execute permission for everyone
chmod g-x script.sh    Remove execute permission for the group

chmod 700 script.sh    -rwx------
chmod 640 script.sh    -rw-r-----
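Symbolic changes are relative to whatever the file already has, which the following sketch makes visible step by step. The `sandbox` directory and the empty script.sh are placeholders for illustration.

```shell
#!/bin/bash
sandbox=$(mktemp -d)
cd "$sandbox"
touch script.sh

chmod 640 script.sh                            # start from -rw-r-----
before=$(ls -l script.sh | cut -c1-10)

chmod a+x script.sh                            # add execute for everyone
after_add=$(ls -l script.sh | cut -c1-10)

chmod g-x script.sh                            # remove execute for the group only
after_remove=$(ls -l script.sh | cut -c1-10)

printf '%s\n' "$before" "$after_add" "$after_remove"
```

Octal modes set all nine bits at once; the symbolic form changes only the bits it names.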

Example 4: Create Shared Folder

Command                                   What does it do?
cd /project/biohpcadmin/shared/           Change directory to your lab’s shared folder
ls -al                                    List files with detail to check permissions
mkdir shared_data                         Create shared folder
cd shared_data                            Change directory to the newly created shared_data folder
cp /project/biohpcadmin/ydu/ideas.txt .   Copy file to the shared folder; your group members can read and copy it
chmod g+w ideas.txt                       Grant write permission to allow group members to edit ideas.txt
cd ..                                     Change directory to parent folder
chmod g+w shared_data                     Grant write permission to allow group members to create new files inside the shared_data folder

• You can only apply the chmod command to files/folders owned by you.
• Send email to biohpc-help@utsouthwestern.edu if you need help setting up a shared folder

Example 5: Text File Manipulation

Command                             What does it do?
cat story.txt                       Show the content of story.txt in the terminal
less story.txt                      Show the content of story.txt in a pager that allows scrolling & searching
head -n 15 story.txt                Show the first 15 lines of story.txt
tail -n 5 story.txt                 Show the last 5 lines of story.txt
head -n 15 story.txt | tail -n 5    Show lines 11-15 of story.txt. We take the first 15 lines from story.txt using head, and extract the bottom of that selection by piping it through tail.
grep "elephant" story.txt           Find all lines in story.txt that contain “elephant” (case-sensitive).
grep -i "mouse" story.txt           Find all lines in story.txt that contain “mouse” (case-insensitive).
grep -c "at" story.txt              Count the number of lines in story.txt that contain “at”.
grep '^The' story.txt               Find lines beginning with “The”
grep 'water\.$' story.txt           Find lines ending with “water.”. Note that ‘.’ is a special character so we have to escape it.

grep searches for patterns using regular expressions - http://www.robelle.com/smugbook/regexpr.html
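The commands are easier to trust once the expected answer is known in advance. A minimal sketch on a tiny, hypothetical story.txt (the training's real file is not reproduced here) plus a numbered file for the head/tail pipe:

```shell
#!/bin/bash
sandbox=$(mktemp -d)
cd "$sandbox"

# Made-up sample data for illustration only.
seq 1 20 | sed 's/^/line /' > numbered.txt
cat > story.txt <<'EOF'
The elephant walked to the water.
A mouse sat on the mat.
the Mouse ran away
Nobody saw the Elephant drink water
EOF

lines_11_to_15=$(head -n 15 numbered.txt | tail -n 5)
case_sensitive=$(grep -c "elephant" story.txt)    # exact case only
case_insensitive=$(grep -ci "mouse" story.txt)    # any case
starts_with_the=$(grep -c '^The' story.txt)       # ^ anchors to the line start
ends_with_water=$(grep -c 'water\.$' story.txt)   # \. is a literal dot, $ is the line end

echo "$lines_11_to_15"
echo "elephant: $case_sensitive, mouse (-i): $case_insensitive"
echo "^The: $starts_with_the, water\\.\$: $ends_with_water"
```

Single quotes around the pattern keep the shell from touching characters such as $ and \ before grep sees them.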

Example 5: Text File Manipulation

Command                               What does it do?
sort gifts.txt                        Sort the lines in a file alphabetically
sort -r gifts.txt                     Sort the lines in a file in reverse alphabetical order
sort -nr numbers.txt                  Sort the lines in a file in reverse numerical order
sort gifts.txt > sorted_gifts.txt     Send the output of the sort command into sorted_gifts.txt, overwriting existing content.
sort gifts.txt >> sorted_gifts.txt    Send the output of the sort command into sorted_gifts.txt, appending to the end of the file.
sort gifts.txt 2> error.txt           Send the error output of the sort command into error.txt (no space between 2 and >)
sort gifts.txt 2> /dev/null           Discard the error output of the sort command
sort gifts.txt > output.txt 2>&1      Combine the error output into the standard output, and direct both into the file output.txt (the file redirection must come before 2>&1)
sed "s/dog/cat/g" story.txt           Change all instances of dog to cat, printing the results: s(ubstitute)/old/new/g(lobal)

sed can do a lot - http://www.grymoire.com/Unix/sed.html
awk can do even more - http://www.grymoire.com/Unix/Awk.html
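Redirection order is a common stumbling block, so here is a small sketch that shows where each stream ends up. The file names are made up, and ls on a missing file is used simply because it reliably produces both normal output and an error message.

```shell
#!/bin/bash
sandbox=$(mktemp -d)
cd "$sandbox"
printf 'pear\napple\nfig\n' > gifts.txt

sort gifts.txt > sorted_gifts.txt        # overwrite
sort -r gifts.txt >> sorted_gifts.txt    # append

# A missing input file makes sort write to standard error only.
sort missing.txt 2> error.txt

# Redirections apply left to right: send stdout to the file first,
# then point stderr at wherever stdout now goes.
ls gifts.txt missing.txt > both.txt 2>&1

# Reversed order: stderr is duplicated onto the original stdout (captured
# here by the command substitution) before stdout is moved to the file.
leaked=$(ls missing.txt 2>&1 > only_stdout.txt)
echo "escaped the file: $leaked"

sed 's/apple/banana/g' gifts.txt > swapped.txt   # s/old/new/g
```

sed prints the edited text and leaves gifts.txt unchanged, which is why its output is redirected to a new file.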

Bash Shell Scripting

In addition to its interactive mode, the bash shell is a programming language in itself.

• Interactive mode: one command at a time
• Scripting: run an entire script of commands

A little knowledge can make difficult things easy, and time-consuming things quick.

Bash Shell Scripting : Variables script_variables.sh

A variable holds information to be used later.

Set them using: name=value
Get their value using: $name

    #!/bin/bash              # Specify we are using the bash shell
    MY_NAME=dave             # Set the variable MY_NAME
    echo Hello $MY_NAME      # Run echo using the value in MY_NAME
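Quoting decides how a value is stored and expanded. A minimal sketch with made-up values; none of these variable names beyond MY_NAME come from the training script.

```shell
#!/bin/bash
MY_NAME="Dave Smith"             # quotes are needed when the value contains spaces
GREETING="Hello $MY_NAME"        # double quotes expand variables
LITERAL='Hello $MY_NAME'         # single quotes keep the text as-is
FILE_PREFIX="${MY_NAME}_notes"   # braces mark where the variable name ends

echo "$GREETING"
echo "$LITERAL"
echo "$FILE_PREFIX"
```

Note that there must be no spaces around the = sign: `MY_NAME = dave` would try to run a command called MY_NAME.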

Bash Shell Scripting : How to run

How to run a bash script

• Method 1: make the script an executable file

    chmod +x script_variables.sh
    ./script_variables.sh

• Method 2: use bash or sh

    bash script_variables.sh
    -or-
    sh script_variables.sh

Bash Shell Scripting : Input/Output script_IO.sh

We can assign the output of programs/commands into variables:

    #!/bin/bash
    NOW=$(date +%Y-%m-%d)      # Put the date (YYYY-MM-DD) into NOW
    DATA_DIR="data_$NOW"
    mkdir $DATA_DIR            # Create a directory with a name incorporating the date
    module load matlab
    # Run matlab with input hello.m, and send the output into our data directory
    matlab -nodisplay -nosplash < hello.m > $DATA_DIR/output.txt
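The same pattern works with any program, so it can be practiced without MATLAB or the module system. In this sketch `tr` is a stand-in for the analysis program and hello_input.txt is a made-up input file; both are assumptions for illustration.

```shell
#!/bin/bash
sandbox=$(mktemp -d)
cd "$sandbox"

NOW=$(date +%Y-%m-%d)                  # command substitution: capture date's output
DATA_DIR="data_$NOW"
mkdir -p "$DATA_DIR"                   # -p: no error if it already exists

# Stand-in for the analysis program: read from a file, write into the data directory.
printf 'hello\n' > hello_input.txt
tr 'a-z' 'A-Z' < hello_input.txt > "$DATA_DIR/output.txt"

FILE_COUNT=$(ls "$DATA_DIR" | wc -l)   # a whole pipeline can be captured too
echo "$DATA_DIR holds $FILE_COUNT file(s)"
```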

Bash Shell Scripting : If-else Statement script_if.sh

    #!/bin/bash
    echo "This script checks for the dummy file."
    echo "Checking..."
    if [ -f "dummy" ]; then        # -f is the "file exists" test
        echo "dummy exists."
    else
        echo "Could not find it!"
    fi

Read about the available tests here: http://tldp.org/LDP/Bash-Beginners-Guide/html/sect_07_01.html
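A few of the other tests can be combined with elif. A minimal sketch: the `check` function and the paths it is called with are hypothetical names used only for this example.

```shell
#!/bin/bash
sandbox=$(mktemp -d)
cd "$sandbox"
touch dummy
mkdir results

check() {
    # Report what kind of thing a path is, using three common tests.
    if [ -f "$1" ]; then           # -f: exists and is a regular file
        echo "file"
    elif [ -d "$1" ]; then         # -d: exists and is a directory
        echo "directory"
    elif [ ! -e "$1" ]; then       # ! -e: does not exist at all
        echo "missing"
    else
        echo "other"
    fi
}

check dummy
check results
check nothing_here
```

The spaces inside the square brackets are required, because `[` is itself a command.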

Bash Shell Scripting : Loops script_loops.sh

This script lists all of the files in the current directory using a for loop:

    #!/bin/bash
    for i in $( ls ); do       # Set variable i to each value in the output of the ls command
        echo "item: $i"        # Echo the current value of $i
    done

Read about the other loops here: http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO-7.html
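Looping over the output of ls splits names on spaces, which matters for files such as "two words.txt". The sketch below compares it with a wildcard loop; this glob variant is a commonly used alternative, not the form shown in the training script, and the file names are made up.

```shell
#!/bin/bash
sandbox=$(mktemp -d)
cd "$sandbox"
touch "plain.txt" "two words.txt"

count_ls=0
for i in $(ls); do            # word-splits: "two words.txt" becomes two items
    count_ls=$((count_ls + 1))
done

count_glob=0
for i in *; do                # glob: one item per file, spaces preserved
    echo "item: $i"
    count_glob=$((count_glob + 1))
done

echo "ls loop saw $count_ls items, glob loop saw $count_glob"
```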

Bash Shell Scripting : Pass Command Line Arguments script_arguments.sh

    #!/bin/bash
    echo "Script Name: $0"
    echo "Total Number of Arguments: $#"
    echo "1st Argument: $1"
    echo "2nd Argument: $2"
    echo "All Arguments are: $*"

$* – Stores all command line arguments
$# – Stores the count of command line arguments
$0 – Stores the name of the script itself
$1 – Stores the first command line argument
$2 – Stores the second command line argument
$3 – Stores the third command line argument
…
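Functions receive arguments through the same $1, $#, and $* variables, which makes them handy for experimenting without creating a script file. A minimal sketch; `show_args` and `count_words` are hypothetical helper names, and "$@" is a related variable that the slide does not cover.

```shell
#!/bin/bash
show_args() {
    echo "count: $#"
    echo "first: $1"
    for arg in "$@"; do       # "$@" keeps each argument intact
        echo "arg: $arg"
    done
}

count_words() {
    local n=0
    for word in $*; do        # unquoted $* re-splits arguments on spaces
        n=$((n + 1))
    done
    echo "$n"
}

show_args "sample A.tif" output
count_words "sample A.tif" output
```

When arguments may contain spaces (file names often do), quoting them on the command line and using "$@" inside the script keeps them in one piece.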

Bash Shell Scripting : Reorganize data to fit the software requirement

Images generated by Total Internal Reflection Fluorescence Microscope:

TEST_INPUT
    1_10  1_11  1_12
        img_000000000_TIRF_488_000.tif
        img_000000000_TIRF_561_000.tif

Input data structure required by DeBias (produced by folder_reorganize.sh):

TEST_OUTPUT
    1  2  3
        488.tif  561.tif

Bash Shell Scripting : Reorganize data to fit the software requirement

    #!/bin/bash
    INPUT_FOLDER=$1      # pass input folder name as 1st argument from terminal
    OUTPUT_FOLDER=$2     # pass output folder name as 2nd argument from terminal
    i=1                  # initialize iterator as 1

    for SUBFOLDER in $(ls $INPUT_FOLDER); do    # loop through all folders
        echo $SUBFOLDER                         # display current sub folder name
        IMG488_ORG=$INPUT_FOLDER/$SUBFOLDER/img_000000000_TIRF_488_000.tif    # define 488 image name
        IMG561_ORG=$INPUT_FOLDER/$SUBFOLDER/img_000000000_TIRF_561_000.tif    # define 561 image name
        IMG_OUTPUT=${OUTPUT_FOLDER}/$i          # define output folder name
        let "i=$i+1"                            # increment iterator
        mkdir -p "$IMG_OUTPUT"                  # create output folder for images (and parents if needed)
        if [ -f "$IMG488_ORG" ]; then           # if 488 image file exists,
            cp "$IMG488_ORG" "$IMG_OUTPUT/488.tif"    # copy image file to output folder and rename
        else                                    # if 488 image file does not exist,
            echo "$IMG488_ORG does not exist"   # display error message
        fi
        if [ -f "$IMG561_ORG" ]; then           # same check for the 561 image
            cp "$IMG561_ORG" "$IMG_OUTPUT/561.tif"
        else
            echo "$IMG561_ORG does not exist"
        fi
    done
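Before running a reorganizing script on real microscope data, it helps to rehearse on empty placeholder files. The sketch below is a hypothetical glob-based variant wrapped in a function called `reorganize` (not the original folder_reorganize.sh), together with a throwaway TEST_INPUT tree in which one image is deliberately missing.

```shell
#!/bin/bash
# Hypothetical variant of the reorganizing loop; names and layout are assumptions.
reorganize() {
    local input=$1 output=$2 i=1 sub channel src
    for sub in "$input"/*/; do
        [ -d "$sub" ] || continue              # skip if the glob matched nothing
        mkdir -p "$output/$i"
        for channel in 488 561; do
            src="${sub}img_000000000_TIRF_${channel}_000.tif"
            if [ -f "$src" ]; then
                cp "$src" "$output/$i/$channel.tif"
            else
                echo "$src does not exist" >&2
            fi
        done
        i=$((i + 1))
    done
}

# Build a tiny stand-in for TEST_INPUT with empty placeholder images.
demo=$(mktemp -d)
for sub in 1_10 1_11 1_12; do
    mkdir -p "$demo/TEST_INPUT/$sub"
    touch "$demo/TEST_INPUT/$sub/img_000000000_TIRF_488_000.tif" \
          "$demo/TEST_INPUT/$sub/img_000000000_TIRF_561_000.tif"
done
rm "$demo/TEST_INPUT/1_12/img_000000000_TIRF_561_000.tif"   # simulate a missing image

reorganize "$demo/TEST_INPUT" "$demo/TEST_OUTPUT"
ls -R "$demo/TEST_OUTPUT"
```

Copying (cp) rather than moving keeps the original data intact, so a mistake in the script costs only disk space.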