January 10, 2012
Getting Started with XSEDE
Andrew Grimshaw and Karolina Sarnowska-Upton
Audience
• End users and developers who want to
– Access and use NSF-funded XSEDE compute, data, and storage resources
– Create a secure shared resource environment with collaborators around the world
– Access and use compute, data, and storage resources available via XSEDE located at other institutions
•Advanced user support personnel who work with end users and developers
Goals
• At the end of this tutorial you will…
– Understand underlying system and resource model of XSEDE
– Be able to install and configure client-side tools
– Understand grid command shell and basics of GUI
– Be able to define and run jobs on XSEDE
– Be able to use Global Federated File System
– Be able to share data resources into XSEDE
Agenda
• XSEDE Architecture Overview and Context
• XSEDE Genesis II Client Installation
• Using the client interfaces
• Running a job with XSEDE
• Adding data resources
XSEDE Architecture Overview & Context
Initial XSEDE architecture: High-order bits
• Don’t disrupt the user community! Maintain existing TeraGrid services
• Focus on user-facing access layer
– For power users, “first, do no harm”
– For other users, expand use via interfaces, new hosted XSEDE User Access Services (XUAS), and Global Federated File System (GFFS)
• Promote standards and best practices to enhance interoperability, portability, and implementation choice
XSEDE provides capabilities
• Access and share data between campuses and centers
– Access data on center resources from the campus, campus resources from a center, or campus A resources from campus B
• Access and share compute resources from home, campus, or center
– to run a job directly on a particular resource
– to submit to one or more global queues
– to execute a workflow
XSEDE Architecture
[Figure: XSEDE layered architecture. Access Layer: applications, GUIs, portals and gateways, XUAS; APIs and CLIs; transparent access via the file system. Services and Web Services: XSEDE Enterprise Services (JSDL/BES, RNS/ByteIO, GridFTP, WSI-BSP, HPC-BP) and Community Provided Services (GRAM5, REST/RMI, Amazon EC2, application deployment). Infrastructure and Resources: core enterprise resources, e.g., RP resources; other resources, e.g., campus centers, Amazon, research group data.]
Implementations and Architecture
• The architecture defines the interfaces, communication, and interactions between software components
• The architecture defines how quality attributes are realized
– Security, reliability, availability, performance, ...
• Architecture components (that implement interfaces) may have more than one implementation
– Thus, we distinguish between the architecture and the implementation
Implementation Choices
• We have made initial choices of implementations we will use
– Process to evolve architecture & implementations
• Three major configuration items (software systems) providing implementations. They are (in alphabetical order)
– Genesis II: CLIs, APIs, GUI, GFFS, XES services
– Globus: XUAS (XD-Data), GridFTP
– UNICORE 6: GUI, XES (BES at the SPs)
• XES services run on Grid Interface Units
XSEDE is a System of Systems
XSEDE is a system of systems: Different organizations may be running different standards-compliant software stacks.
A Typical Service Provider Setup
[Figure: a typical service provider site. A connection to the internet feeds the site backbone, which links login nodes, Grid Interface Unit(s), supercomputers with local storage and data, a local scheduler (e.g., PBS), and a site-wide file system with archival storage.]
A Typical Campus Setup
[Figure: a typical campus. A connection to the internet feeds the campus backbone, which links a campus cluster, a researcher cluster, a researcher data set, a department file system, and Grid Interface Unit(s).]
Simple Grid Interface Unit
[Figure: a Grid Interface Unit hosts a web service container and connects to local distributed file systems, local disk, and local queuing systems.]
XSEDE Genesis II Client Installation
Agenda
• Install Genesis II grid client
Acquire Installer
• Installers are delivered with Increment 1 TRR materials.
• Select the installer for the appropriate operating system platform.
• Run the installer.
The Installation Process Questions
• OK to install?
– License follows Apache license agreement
The Installation Process Questions
• Installation directory path?
– Where code and configuration files will be placed
– Note: a directory to store container state will be created at ~/.genesisII-2.0
Grid Choice Question
• Shows supported grids; pick XSEDE for Increment 1 Deliverable.
Installer Progress...
Voila – Client Installation Complete
XSEDE Genesis II Client Usage
Agenda
• Prerequisites
– Client installed
• Access grid via:
– Cmd-line grid shell
– GUI client
– FUSE file system mount
• Learn access control basics
Using the Grid Client
• Multiple access methods
– Cmd-line grid shell
– GUI client-ui
– FUSE file system mount
• You will learn to:
– Log in
– Navigate namespace
– Use GUI
– Manage access control
– Set up FUSE mount
Login via the CLI
• Note: Everything we will talk about can also be done from the grid shell without using the GUI; it is just not as convenient
• Log in using your grid credentials
login
• Check grid credentials
whoami
Fire up the GUI
• Type “grid”
• At the command line type “client-ui”
• You should see something like this
• Let’s look around
– /queues
– /users
– /home
– /groups
/users versus /home
• /users is a directory of end user identities
– Used to log in and to add people to access control lists, e.g., chmod myfile +r /users/karolina
• /home shows home directories in GFFS of users … you can put files and directories there
– E.g., /home/grimshaw/data.txt
GUI Grid Client: Start-Up Basics
• Browse to /home
– Click on your directory icon
• Open GUI sub shell
– Select “Tools”, then “grid shell”
• Shell has tab completion, history, help, etc.
GUI Grid Client: Tearing off a Browser
• Create additional GUI browser of grid global namespace by:
– Clicking Tear icon and dragging to tear off browser
GUI Grid Client: View Access Control
• To view access control information: Browse to and highlight resource, then select Security tab
Exercise: Give read access to your neighbor
GUI Grid Client: Edit Access Control
• Select credential to be added
– Add specific user by browsing to user identity under /users
– Add everyone by selecting Everyone icon
– Add specific username/password token by filling in dialog box and selecting icon
• Drag and drop credential to add desired rwx permission
That’s it for the GUI for now
Let’s look at mapping the directory structure into the local file system using FUSE
FUSE Mounting the Grid: Overview
• Filesystem in Userspace (FUSE) is a loadable kernel module for Unix-like computer operating systems that lets non-privileged users create their own file systems without editing kernel code
• We use FUSE to provide access to grid resources directly from your Linux file system via a directory mount point
FUSE Mounting the Grid: Setup Basics
• Ensure you are logged into the grid
GenesisII/grid whoami
• Create empty Unix directory to use as mount point
mkdir XSEDE
• Mount grid at mount point
nohup GenesisII/grid fuse --mount local:XSEDE &
– Now you can access XSEDE via your file system
– Can add command to your Unix login dotfile to set up FUSE mount automatically on Unix login
Result
• XSEDE resources regardless of location can be accessed via the file system
– Files and directories can be accessed by programs and shell scripts as if they were local files
– Jobs can be started by copying job descriptions into directories
– One can see the jobs running or queued by doing an “ls”
– One can “cd” into a running job and directly access the working directory where the job is running
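Because the mounted grid looks like an ordinary directory tree, any program can read it with standard file APIs. The Python sketch below assumes the grid has been mounted on a local directory (XSEDE in the setup slide); the function name list_grid_dir is illustrative and not part of the Genesis II client.

```python
from pathlib import Path


def list_grid_dir(mount_point, grid_path="/"):
    """List a grid directory through the FUSE mount point.

    mount_point is the local directory the grid was mounted on
    (for example "XSEDE"); grid_path is a path in the grid namespace.
    """
    target = Path(mount_point) / grid_path.lstrip("/")
    # Directories get a trailing slash so they are easy to tell apart.
    return sorted(
        entry.name + ("/" if entry.is_dir() else "")
        for entry in target.iterdir()
    )
```

With the mount in place, list_grid_dir("XSEDE", "/home") would return the entries under /home, much as an “ls” on the mount point does.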
GUI Grid Client: Editing Files
• Edit files in default editor (from client-ui sub-shell or grid shell)
edit <filename>
• In Linux, EDITOR environment variable needs to be set before running grid client; e.g.:
export EDITOR=/usr/bin/vim
GUI Grid Client: Configuring Preferences
• Select Preferences under File menu to configure:
– Credential verbosity
– Shell fonts
– Default job history level
– XML display mode
Running a Job with XSEDE
Audience & Goals
• Audience
– End users and developers who want to
  • Access and use NSF-funded XSEDE compute resources
  • Create secure shared compute environment with collaborators around the world
  • Access and use compute resources available via XSEDE located at other institutions
– Advanced user support personnel who work with end users and developers
• Goals: at the end of this tutorial you will
– Be able to define and run jobs on XSEDE
Prerequisites
• Installed Genesis II client software
• Grid account with permission to run jobs
• Basic grid shell and client GUI understanding
XSEDE Activities (a.k.a. jobs)
• What are jobs in XSEDE?
• How are jobs executed?
• How are jobs specified?
• How to interact with jobs while they are running?
• Compute Grid Use module
– JSDL tool
– Grid queue
– Interacting with jobs
– Job state change notification
What are Jobs in XSEDE?
• A job is a unit of work that executes a program
– Really pretty generic: much like PBS or LSF job
– Program may be sequential, threaded, hybrid GPGPU program, or traditional parallel using MPI or OpenMP
– Programs can be command line programs or shell scripts that take zero or more parameters
• Jobs MAY specify files to be staged in before execution and out after execution
– This MAY include executables and libraries
• Jobs MAY specify file systems to mount, e.g., SCRATCH or GFFS (Global Federated File System)
• Jobs MAY specify resource requirements such as operating system, amount of memory, number of CPUs, or other matching criteria
• Jobs MAY be parameter sweep jobs with arbitrary number of dimensions
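To make the idea of a multi-dimensional parameter sweep concrete, here is a small Python sketch that expands a set of dimensions into one parameter assignment per job. It only illustrates the concept; it is not how JSDL expresses sweeps, and the parameter names are hypothetical.

```python
from itertools import product


def expand_sweep(dimensions):
    """Yield one parameter assignment per point in the sweep.

    dimensions maps a parameter name to the list of values it takes.
    """
    names = list(dimensions)
    for values in product(*(dimensions[name] for name in names)):
        yield dict(zip(names, values))


# Hypothetical two-dimensional sweep: 3 x 2 = 6 jobs.
sweep = {"temperature": [280, 300, 320], "seed": [1, 2]}
jobs = list(expand_sweep(sweep))
print(len(jobs))  # 6
print(jobs[0])    # {'temperature': 280, 'seed': 1}
```

Adding a third dimension multiplies the job count again, which is why sweeps are usually submitted to a queue rather than to a single BES.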
How are Jobs Executed?
• Jobs are executed by grid resources that implement the OGSA Basic Execution Services (BES) interface
– These are referred to as BESes
• Users submit jobs directly to BES or to a grid queue
BESes: Basic Execution Services
• BESes run jobs on particular compute resources
– Manage data staging for jobs
– Monitor job progress/completion
– Maintain job state
• “Compute resources” may be workstations, clusters, or supercomputers
• Each BES has a set of resource properties such as operating system, memory, number of cores, etc. that can be used to match jobs to BESes for execution
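The matching of jobs to BESes by resource properties can be sketched in a few lines of Python. This is a toy model under stated assumptions: the property names (os, memory_mb, cores) and the BES descriptions are hypothetical, and the real matching is done by the grid software.

```python
def bes_matches(requirements, properties):
    """Return True if a BES with these properties can run the job."""
    required_os = requirements.get("os")
    if required_os is not None and required_os != properties.get("os"):
        return False
    # Numeric requirements are treated as minimums.
    for key in ("memory_mb", "cores"):
        if requirements.get(key, 0) > properties.get(key, 0):
            return False
    return True


def candidate_beses(requirements, beses):
    """Names of all BESes whose properties satisfy the requirements."""
    return [name for name, props in beses.items()
            if bes_matches(requirements, props)]


# Hypothetical BES descriptions; real property names will differ.
beses = {
    "BES1": {"os": "LINUX", "memory_mb": 4096, "cores": 8},
    "BES2": {"os": "LINUX", "memory_mb": 1024, "cores": 2},
    "BES3": {"os": "WINDOWS", "memory_mb": 8192, "cores": 16},
}
job = {"os": "LINUX", "memory_mb": 2048, "cores": 4}
print(candidate_beses(job, beses))  # ['BES1']
```

A job that states no requirements matches every BES in this model, which mirrors the slide's point that requirements are optional.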
XCG Tutorial
Grid Queues
• Work much like any other queuing system
• Grid users submit jobs to grid queue
• Maintain:
– List of (BES) compute resources available for scheduling
– Description of capabilities of each compute resource
– List of jobs and statuses
• Match jobs to available compute resources
– Ask matching resources to run jobs
• Monitor job progress/completion
• Cmd-line and GUI tools to manage jobs in queue
– qsub, qstat, qkill, qcomplete, queue manager
Grid Queues – Cmd-line View
• Check queue/job status with:
qstat <queue-path>
Grid Queues – GUI Queue Manager
• Queue Manager presents information about jobs and resources currently managed by queue
• Click in the Max Slots column in the row for the desired resource, type in a number, and save.
Grid Queues – Job Execution
[Figure: job1 through job4 wait in the grid queue; the queue hands jobs to BES1, BES2, and BES3, and the BES executes the job.]
Job Execution – The Working Directory
[Figure: the user submits a job, or the queue schedules it, on BES1. Under the BES's activities directory, job1, job2, and job3 each have a working-dir. The BES creates a unique working dir for each job and stages data (e.g., my_job_data, runA, runB) to/from the job working dir as specified in JSDL.]
How are Jobs Specified?
• Jobs are specified using the Open Grid Forum standard Job Submission Description Language 1.0 (JSDL)
– XML-based language
– Widely adopted
– Not intended for human consumption
• Job information that is specified
– Identity
– Application description
– Resource requirements
– Data staging
JSDL Fragment
[Figure: annotated JSDL fragment with callouts for job name, resource requirement, and application description.]
Creating JSDL Files using the Grid Job Tool
• Manual Creation:
– Use editor to create XML file
– Difficult and error-prone due to XML’s eccentricities
– Easiest method: start with existing JSDL and modify (carefully)
• Using Grid Job Tool:
– GUI builder for JSDL files
– User describes job in GUI
– Description can be saved as GridJobTool “project” file
  • edit/re-use project to create new JSDL files
– Automatically generates XML from user provided description
– Started with grid command job-tool
How to Launch Job Tool from GUI Browser
• Select directory where you want JSDL project file located
OR
• Select execution container (BES or queue) where you want to execute job
How are Jobs Submitted for Execution?
• Recall: Jobs submitted to BES or grid-queue
• Jobs can be submitted via
– Grid shell run (to BES) or qsub (to queue) commands
– JSDL tool menu option from GUI grid shell
– Copying JSDL file to BES’s “submission-point” pseudo-directory
Using “run” to Execute Jobs
• Check command syntax
– help run
• EXAMPLE run command for gnomad
– run --jsdl=<jsdl-file> <path-to-bes>
Using a Grid Queue to Execute Jobs
• General purpose XSEDE grid queue location
/queues/grid-queue
• Submission syntax
qsub <queue-path> <JSDL-file>
OR
cp <JSDL-file> <queue-path>/submission-point
• Example submission
qsub /queues/grid-queue local:gnomad.xml
Job Submission Exercises
• GOAL: Run some simple jobs
– Create and execute hostname.jsdl
• Single job and parameter sweep
• Example files located at /examples
Interact with Jobs via Queue Manager
• You can stop, check status, examine job history, or reschedule a job
• You can interact with a job’s working directory if job is in a running state on a (Genesis II) BES
View Job Information in Queue Manager
• Status
– QUEUED: job waiting to be scheduled on BES resources
– REQUEUED: job failed execution at least once and has been automatically re-queued
– ERROR: job failed the maximum allowable execution attempts and will not be re-queued
– On <BES name>: job passed to <BES name> for execution
  • Note: Does not connote status within BES (job may be running, queued, staging data, etc.)
– FINISHED: job executed successfully
• Attempts
– Number of times queue has tried to schedule job for execution
– Some failures do not increment attempts
  • grid software failures
  • job preempted due to local BES policies
• Ticket
– Unique ID assigned by grid queue to job on submission
• Queue keeps status of active and completed jobs
– Jobs in final status (ERROR and FINISHED) need to be cleaned up by user
qcomplete <queue name> { --all | <job ticket>+ }
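The cleanup rule above can be sketched as a small Python helper that picks out jobs in a final status and assembles the matching qcomplete command line from the syntax shown on the slide. The ticket values and the helper names are hypothetical; the sketch only builds the argument list and does not run the grid client.

```python
# Statuses after which the queue will do no further work on a job.
FINAL_STATUSES = {"ERROR", "FINISHED"}


def tickets_to_clean(jobs):
    """Return tickets of jobs in a final status (jobs maps ticket -> status)."""
    return [ticket for ticket, status in jobs.items()
            if status in FINAL_STATUSES]


def qcomplete_command(queue, tickets=None):
    """Build: qcomplete <queue name> { --all | <job ticket>+ }."""
    if tickets is None:
        return ["qcomplete", queue, "--all"]
    if not tickets:
        raise ValueError("give at least one ticket, or None for --all")
    return ["qcomplete", queue, *tickets]


# Hypothetical queue contents.
jobs = {"T1": "QUEUED", "T2": "FINISHED", "T3": "ERROR", "T4": "On BES1"}
done = tickets_to_clean(jobs)
print(" ".join(qcomplete_command("/queues/grid-queue", done)))
# qcomplete /queues/grid-queue T2 T3
```

Jobs shown as QUEUED, REQUEUED, or On <BES name> are left alone because the queue is still responsible for them.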
Examine Job History in Queue Manager
• Right-clicking on job provides information about job’s history in different levels of detail
Scratch file system
• Persists on BES between runs
• Good for caching large or frequently used files
Interact with Job Working Directory
• When using Genesis II BES resources, job working directory is accessible via GFFS
• Working directory is located in queue where job was submitted at <queue-path>/jobs/mine/running
• For each running job, there is a directory with job ticket number with two entries:
– status
  • file containing state of job (e.g. queued, running)
– working-dir
  • session execution directory of running job
  • read/write/create/delete files here to interact with running job
• If job was submitted directly to BES, job directory is located at <bes-path>/activities
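For jobs submitted through a queue, the layout described above can be turned into paths with a few lines of Python. The ticket value in the example is made up, and the function name running_job_paths is illustrative only.

```python
import posixpath


def running_job_paths(queue_path, ticket):
    """GFFS paths of the two entries kept for a running job in a queue."""
    base = posixpath.join(queue_path, "jobs", "mine", "running", ticket)
    return {
        "status": posixpath.join(base, "status"),
        "working_dir": posixpath.join(base, "working-dir"),
    }


# Hypothetical ticket on the general purpose queue.
paths = running_job_paths("/queues/grid-queue", "1234")
print(paths["status"])
# /queues/grid-queue/jobs/mine/running/1234/status
print(paths["working_dir"])
# /queues/grid-queue/jobs/mine/running/1234/working-dir
```

Through a FUSE mount, the same paths can be prefixed with the local mount point and read with ordinary file tools.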
JSDL File Contents Explained
• Identifier Info
– Descriptive information about job, e.g. job name
<JobIdentification>
  <JobName>Adder</JobName>
</JobIdentification>
• Resource Requirements
– Describe resources job requires
  • Memory
  • OS
  • Architecture
  • Number of processors
  • Run time
<Resources>
  <OperatingSystem>
    <OperatingSystemType>
      <OperatingSystemName>LINUX</OperatingSystemName>
    </OperatingSystemType>
  </OperatingSystem>
</Resources>
JSDL File Contents Explained
• Application Description
– Describe execution
  • Executable name
  • Arguments
  • Routing for stdout and stderr
<Application>
  <POSIXApplication>
    <Executable>adder.sh</Executable>
    <Output>stdout</Output>
    <Argument>seven.dat</Argument>
    <Argument>fourty-two.dat</Argument>
    <Argument>sum.dat</Argument>
    <Argument>10</Argument>
  </POSIXApplication>
</Application>
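To see how the application description translates into a command line, the Python sketch below parses the fragment from the slide and joins the executable with its arguments. It assumes the simplified, namespace-free form shown on the slide; complete JSDL documents qualify these elements with XML namespaces, so the lookups would need adjusting there.

```python
import shlex
import xml.etree.ElementTree as ET

# The application fragment as shown on the slide (no namespaces).
FRAGMENT = """
<Application>
  <POSIXApplication>
    <Executable>adder.sh</Executable>
    <Output>stdout</Output>
    <Argument>seven.dat</Argument>
    <Argument>fourty-two.dat</Argument>
    <Argument>sum.dat</Argument>
    <Argument>10</Argument>
  </POSIXApplication>
</Application>
"""


def command_line(application_xml):
    """Return the command the description asks the BES to run."""
    root = ET.fromstring(application_xml.strip())
    posix = root.find("POSIXApplication")
    executable = posix.findtext("Executable")
    arguments = [arg.text for arg in posix.findall("Argument")]
    return shlex.join([executable, *arguments])


print(command_line(FRAGMENT))
# adder.sh seven.dat fourty-two.dat sum.dat 10
```

The Output element is not part of the command line; it names the file in the working directory that receives the program's standard output.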
JSDL File Contents Explained
• DataStaging
– Describe data to copy in/out
– Several transport options:
  • http
  • scp (secure copy)
  • RNS (grid directory structure)
  • Email (out only)
– Copy in (data staging source):
  • Source is URL of remote file to be copied in
  • FileName is name within job working directory where file will be copied to
– Copy out (data staging target):
  • Target is URL of remote file to be copied to
  • FileName is name within job working directory of file to be copied out
– Other file handling info
<DataStaging>
<FileName>adder.sh</FileName>
<CreationFlag>overwrite</CreationFlag>
<DeleteOnTermination>true</DeleteOnTermination>
<Source>
<URI>http://www.cs.virginia.edu/adder.sh</URI>
</Source>
</DataStaging>
<DataStaging>
<FileName>sum.dat</FileName>
<CreationFlag>overwrite</CreationFlag>
<DeleteOnTermination>true</DeleteOnTermination>
<Target>
<URI>rns:sum.dat</URI>
</Target>
</DataStaging>
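The pieces explained on the previous slides can be assembled programmatically. The Python sketch below builds a small JSDL document with the standard library; it is a minimal illustration rather than the output of the Grid Job Tool. The two namespace URIs are the JSDL 1.0 and JSDL POSIX namespaces as published by the Open Grid Forum, the helper names are hypothetical, and the element order is an assumption about what the schema expects, so validate the result before relying on it.

```python
import xml.etree.ElementTree as ET

JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"
ET.register_namespace("jsdl", JSDL)
ET.register_namespace("jsdl-posix", POSIX)


def _tag(namespace, name):
    """Qualified tag name in ElementTree's {namespace}name form."""
    return f"{{{namespace}}}{name}"


def build_jsdl(job_name, executable, arguments, stage_in, stage_out):
    """Return a JSDL document as a string.

    stage_in and stage_out map a file name in the job working
    directory to the remote URI it is copied from or to.
    """
    job = ET.Element(_tag(JSDL, "JobDefinition"))
    desc = ET.SubElement(job, _tag(JSDL, "JobDescription"))

    ident = ET.SubElement(desc, _tag(JSDL, "JobIdentification"))
    ET.SubElement(ident, _tag(JSDL, "JobName")).text = job_name

    app = ET.SubElement(desc, _tag(JSDL, "Application"))
    posix = ET.SubElement(app, _tag(POSIX, "POSIXApplication"))
    ET.SubElement(posix, _tag(POSIX, "Executable")).text = executable
    for argument in arguments:
        ET.SubElement(posix, _tag(POSIX, "Argument")).text = argument
    ET.SubElement(posix, _tag(POSIX, "Output")).text = "stdout"

    # Source entries are copied in; Target entries are copied out.
    for kind, entries in (("Source", stage_in), ("Target", stage_out)):
        for file_name, uri in entries.items():
            staging = ET.SubElement(desc, _tag(JSDL, "DataStaging"))
            ET.SubElement(staging, _tag(JSDL, "FileName")).text = file_name
            ET.SubElement(staging, _tag(JSDL, "CreationFlag")).text = "overwrite"
            ET.SubElement(
                staging, _tag(JSDL, "DeleteOnTermination")).text = "true"
            endpoint = ET.SubElement(staging, _tag(JSDL, kind))
            ET.SubElement(endpoint, _tag(JSDL, "URI")).text = uri

    return ET.tostring(job, encoding="unicode")


document = build_jsdl(
    "Adder",
    "adder.sh",
    ["seven.dat", "fourty-two.dat", "sum.dat", "10"],
    stage_in={"adder.sh": "http://www.cs.virginia.edu/adder.sh"},
    stage_out={"sum.dat": "rns:sum.dat"},
)
print(document)
```

Generating the XML this way avoids the hand-editing errors mentioned earlier, at the cost of having to keep the generator in step with whatever the target BES accepts.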
Example JSDL
[Figure: complete example JSDL document with callouts for job name, resource requirement, application description, and data staging requests.]
Adding DATA Resources into the Grid
Andrew Grimshaw
Ways to Add Data into the Grid
• Create files and directories
• Export file system directory
Creating Files in the Grid
• Creating a file (or directory) places its state on same grid container as its containing directory
• For example, all of the following commands place files and directories in the container where /home/bob resides
echo "hello" > /home/bob/newFile
mkdir /home/bob/testDir
cp local:testFile grid:/home/bob/testFile
Creating Directories on Specific Containers
• Files can be created on other containers by first creating a containing directory on the target container
• Directory placement location can be changed by explicitly specifying grid container to be used
– Path to service on target container is given to directory creation command (service is EnhancedRNSServicePortType)
mkdir --rns-service=<rns-service-path> <new-dir-path>
Exports: Mapping Data into the Grid
• Basic idea: create grid resource that securely proxies access to local files and directories via RNS and ByteIO web services
• We use an “export” service to proxy a local file system directory tree into grid
• To create “export”, create instance of LightWeightExportPortType
– Via the command line
– Via the GUI (for local hosts)
Exporting: Mapping a local directory structure into the global namespace
[Figure: the user runs the export command; the export service mounts the local directory into the global namespace and redirects calls from the grid export to the local file system. Diagram labels: Export Service, user, /home, myFiles, myExport.]
Export Creation Example: Cmdline
• Creating an export maps specified directory on container host into specified GFFS path
• To run export command, you need to know
– Location of files you want to export
– On which container you will create export resource (service is LightWeightExportPortType)
– Location in global namespace where you want to mount export
export --create <path-to-service> <local-path-to-files> <GFFS-path-for-export>
• Quitting export turns off export service (underlying files in local file system are left intact)
export --quit <GFFS-path-for-export>
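Conceptually, an export translates a path in the global namespace into a path under the exported local directory. The Python sketch below shows that translation, including a check that rejects paths that escape the export. It is an illustration of the idea only, not the Genesis II export service, and all paths in the example are hypothetical.

```python
import posixpath


def to_local_path(export_gffs_path, local_root, requested_gffs_path):
    """Map a GFFS path inside an export to the local file behind it."""
    export_root = posixpath.normpath(export_gffs_path)
    requested = posixpath.normpath(requested_gffs_path)
    inside = (requested == export_root
              or requested.startswith(export_root + "/"))
    if not inside:
        raise ValueError("path is outside the export")
    relative = posixpath.relpath(requested, export_root)
    return posixpath.normpath(posixpath.join(local_root, relative))


# Hypothetical export of /data/myFiles at /home/bob/myExport.
print(to_local_path("/home/bob/myExport", "/data/myFiles",
                    "/home/bob/myExport/runA/out.txt"))
# /data/myFiles/runA/out.txt
```

Normalizing the requested path first means that a request such as /home/bob/myExport/../secret is rejected instead of reaching files outside the exported tree.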
Export Creation Example: GUI
• Provide:
– Location of files you want to export
– Location in global namespace where you want to mount files
Export Security Settings Recommendation
• Give users extended access control to enable export creation
• Allow only admin users to create exports