Intel® Cluster Studio XE Introduction

Development Product Division Software and Service Group Student Workbook ©2012 Intel® Corporation

Intel® Cluster Studio XE (ICS) Introduction Lab Manual

Basic usage of ICS tools for Linux*

Development Product Division

Disclaimer

The information contained in this document is provided for informational purposes only and represents the current view of Intel Corporation ("Intel") and its contributors ("Contributors") on, as of the date of publication. Intel and the Contributors make no commitment to update the information contained in this document, and Intel reserves the right to make changes at any time, without notice.

DISCLAIMER. THIS DOCUMENT, IS PROVIDED "AS IS." NEITHER INTEL, NOR THE CONTRIBUTORS MAKE ANY REPRESENTATIONS OF ANY KIND WITH RESPECT TO PRODUCTS REFERENCED HEREIN, WHETHER SUCH PRODUCTS ARE THOSE OF INTEL, THE CONTRIBUTORS, OR THIRD PARTIES. INTEL, AND ITS CONTRIBUTORS EXPRESSLY DISCLAIM ANY AND ALL WARRANTIES, IMPLIED OR EXPRESS, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR ANY PARTICULAR PURPOSE, NON-INFRINGEMENT, AND ANY WARRANTY ARISING OUT OF THE INFORMATION CONTAINED HEREIN, INCLUDING WITHOUT LIMITATION, ANY PRODUCTS, SPECIFICATIONS, OR OTHER MATERIALS REFERENCED HEREIN. INTEL, AND ITS CONTRIBUTORS DO NOT WARRANT THAT THIS DOCUMENT IS FREE FROM ERRORS, OR THAT ANY PRODUCTS OR OTHER TECHNOLOGY DEVELOPED IN CONFORMANCE WITH THIS DOCUMENT WILL PERFORM IN THE INTENDED MANNER, OR WILL BE FREE FROM INFRINGEMENT OF THIRD PARTY PROPRIETARY RIGHTS, AND INTEL, AND ITS CONTRIBUTORS DISCLAIM ALL LIABILITY THEREFOR. INTEL, AND ITS CONTRIBUTORS DO NOT WARRANT THAT ANY PRODUCT REFERENCED HEREIN OR ANY PRODUCT OR TECHNOLOGY DEVELOPED IN RELIANCE UPON THIS DOCUMENT, IN WHOLE OR IN PART, WILL BE SUFFICIENT, ACCURATE, RELIABLE, COMPLETE, FREE FROM DEFECTS OR SAFE FOR ITS INTENDED PURPOSE, AND HEREBY DISCLAIM ALL LIABILITIES THEREFOR. ANY PERSON MAKING, USING OR SELLING SUCH PRODUCT OR TECHNOLOGY DOES SO AT HIS OR HER OWN RISK.

Licenses may be required. Intel, its contributors and others may have patents or pending patent applications, trademarks, copyrights or other intellectual proprietary rights covering subject matter contained or described in this document. No license, express, implied, by estoppels or otherwise, to any intellectual property rights of Intel or any other party is granted herein. It is your responsibility to seek licenses for such intellectual property rights from Intel and others where appropriate.

Limited License Grant. Intel hereby grants you a limited copyright license to copy this document for your use and internal distribution only. You may not distribute this document externally, in whole or in part, to any other person or entity.

LIMITED LIABILITY. IN NO EVENT SHALL INTEL, OR ITS CONTRIBUTORS HAVE ANY LIABILITY TO YOU OR TO ANY OTHER THIRD PARTY, FOR ANY LOST PROFITS, LOST DATA, LOSS OF USE OR COSTS OF PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES, OR FOR ANY DIRECT, INDIRECT, SPECIAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF YOUR USE OF THIS DOCUMENT OR RELIANCE UPON THE INFORMATION CONTAINED HEREIN, UNDER ANY CAUSE OF ACTION OR THEORY OF LIABILITY, AND IRRESPECTIVE OF WHETHER INTEL, OR ANY CONTRIBUTOR HAS ADVANCE NOTICE OF THE POSSIBILITY OF SUCH DAMAGES. THESE LIMITATIONS SHALL APPLY NOTWITHSTANDING THE FAILURE OF THE ESSENTIAL PURPOSE OF ANY LIMITED REMEDY.

Intel and Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

*Other names and brands may be claimed as the property of others.

Copyright © 2009, Intel Corporation. All Rights Reserved.

Table of Contents

Intel® Cluster Studio XE Introduction Lab Manual

Basic usage of contained tools for Linux*

Development Product Division

Disclaimer

Intel® Cluster Studio XE Introduction Lab Manual

Activity 1 – Compile, link and run simple MPI program

Activity 2 – Generate and view trace file of simple MPI program

Activity 3 – Use Message Checker on erroneous MPI program

Activity 4 – Use Inspector XE for finding memory issues

Activity 5 – Use Amplifier XE for a simple Hot Spot analysis

Activity 6 – Compile and run a Co Array Fortran Program

Intel® Cluster Studio XE Introduction Lab Manual

Time Required 60 minutes (Activities 1-5)

Note: Activity 6 is not yet ready

Objective

In this lab session, you will use Intel® Cluster Studio XE’s tools in a basic way on simple test programs.

After successfully completing this lab’s activities, you will be able to:

• Compile, link and run Intel® MPI programs

• Check and analyze Intel® MPI programs with ITAC

• Find memory issues with Inspector XE and perform a simple hotspot analysis with VTune™ Amplifier XE

• Compile, link and run a simple Co Array Fortran program (optional)

Before starting the activities, you may want to read ICS_Training/README.

Activity 1 – Build and run simple MPI program

Time Required 10 minutes

Objective

After successfully completing this activity, you will be able to:

• Compile and link a simple MPI program with Intel MPI

• Run the application on a cluster

We start with the simple test program, available as test.c (C), test.cc (C++) and test.f (Fortran). It prints a “Hello World” message from each rank. For compiling and linking, use the compile scripts mentioned in the lecture.

Run the program with mpirun, first on a single node and then, if available, on several nodes, using either a hostfile or the default node list provided by the batch system.

Build & Run the program

1. Copy the tar file ICS_Training.tar.gz to your target cluster. Unpack the tar file and cd into ICS_Training/01_Introduction/01_Build. The directory should be empty!

2. Make sure the Cluster Studio environment is set: check whether I_MPI_ROOT is set. If it is not set, look for the Cluster Studio directory (default: /opt/intel/…) and source the environment script:

$ source /opt/intel/ics/ictvars.sh

Then check again whether I_MPI_ROOT is set. Alternatively, you may use the environment script ICS_Training/env_script.

3. Copy the simple test program in your preferred programming language (test.cc and test.f are also available):

$ cp $I_MPI_ROOT/test/test.c .

4. Compile the test program using a compile script mentioned in the lecture. Name the resulting program test.x.

5. Run the program using mpirun on a single node. How many cores are available?

6. Run the program on at least 2 different nodes. Set I_MPI_DEBUG to 4 and check on which node and core each rank ran.
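
The environment check in step 2 and the hostfile for step 6 can be sketched as two small shell helpers. This is only a sketch under assumptions: the function names check_impi_env and write_hostfile, the file name hosts.txt and the node names node01 and node02 are hypothetical and must be replaced by whatever your cluster uses. The hostfile layout shown is the plain form with one host name per line.

```shell
# Report whether the Intel MPI environment is active in this shell.
# Returns 0 if I_MPI_ROOT is set and non-empty, 1 otherwise.
check_impi_env() {
    if [ -n "${I_MPI_ROOT:-}" ]; then
        echo "I_MPI_ROOT is set: ${I_MPI_ROOT}"
    else
        echo "I_MPI_ROOT is not set"
        return 1
    fi
}

# Write a hostfile with one host name per line.
# Usage: write_hostfile <file> <host> [<host> ...]
write_hostfile() {
    hostfile=$1
    shift
    printf '%s\n' "$@" > "$hostfile"
}

# Example with hypothetical node names (not executed here):
#   check_impi_env && write_hostfile hosts.txt node01 node02
```

With such a hostfile in place, step 6 amounts to something like "I_MPI_DEBUG=4 mpirun -n 4 -f hosts.txt ./test.x", where the rank count of 4 is an arbitrary choice.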

Review Questions

Question 1: How do the compile scripts map to the different languages? For example, which compile script is used for Fortran programs?

Question 2: What is the difference between mpicc and mpiicc?

Question 3: Does mpiifort always take the ifort compiler contained in the same package? Can you force mpiifort to take a different version? People who don’t like Fortran might answer the corresponding question for mpiicc.

Question 4: What strategy does Intel MPI use to fill up nodes? What is the default process pinning when there are fewer ranks than physical cores, e.g. when we have 2 nodes with 8 physical cores each and intend to use only 12 ranks for the 16 available cores?

Activity 2 – Generate and view ITAC trace file of simple MPI program

Time Required 20 minutes

Objective

After successfully completing this activity, you will be able to:

• Generate a trace file from programs using Intel MPI

• View the trace file on the Linux cluster or on your own Laptop

There are several ways of getting an ITAC trace file from programs running with Intel MPI. It is assumed that Intel MPI is used as a shared library. It is also possible to link Intel MPI statically, but this can cause problems, e.g. with non-matching glibc versions.

When trace generation is successful, a number of files prefixed by the program name will appear. This multi-file layout of the STF (Structured Trace File Format) was originally designed for performance reasons. In most cases it is convenient, and carries no penalty, to reduce the trace to a single file.

Copy the program and run it with tracing

1. Go to the 02_ITAC directory and copy the program test.x from Activity 1:

$ cd ICS_Training/01_Introduction/02_ITAC

$ cp ../01_Build/test.x .

2. Generate a trace by using the “-trace” flag on mpirun. Use a convenient number of ranks (at least 2).

3. Rename the program to test_1.x and set the LD_PRELOAD environment variable. Run the program with a different number of ranks but without using “-trace”.

4. Copy the source, e.g. test.c, to this directory and compile it using the “-trace” flag in the compile script. Name the program test_2.x and run it without using “-trace” on mpirun.

5. Reduce the trace to a single file by setting the environment variable:

$ export VT_LOGFILE_FORMAT=STFSINGLE

Rename test_2.x to test_3.x and run the program again.

Finally we should have 3 trace files. For test_3.x there should be just the single file test_3.x.single.stf. For test_1.x and test_2.x there should be several files each.
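
To confirm the outcome described above, the main trace files can be counted from the shell. The helper below is a sketch; the function name count_stf is hypothetical. It relies on the naming stated in this activity (one main file ending in .stf per run) and assumes that the companion files of a multi-file trace carry a further suffix after .stf, so that they are not counted.

```shell
# Count the main trace files (names ending in .stf) in a directory.
# Usage: count_stf [<dir>]   (defaults to the current directory)
count_stf() {
    dir=${1:-.}
    count=0
    for f in "$dir"/*.stf; do
        # Skip the unexpanded pattern when nothing matches.
        if [ -e "$f" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

After step 5, running count_stf in the 02_ITAC directory should report 3 if all three runs produced a trace.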

View the trace files

It is most convenient to use the ITAC GUI, traceanalyzer, from the ICS package, which will automatically be in your path. You must, of course, have permission to open an X window on the target computer. If you are not allowed to work directly on this computer, you may copy all generated trace files to another Linux* computer with traceanalyzer installed (for test_3.x you need only the single file test_3.x.single.stf).

You may also copy the trace file(s) to a computer (laptop) running Windows and use the Windows version of traceanalyzer. The install file, traceanalyzer.exe, is provided with the ICS package.

1. If an X-Window connection is running, start the GUI on the Linux* computer:

$ traceanalyzer test_1.x.stf

2. If traceanalyzer is installed on your laptop, copy test_3.x.single.stf to your laptop and double click on it.

3. The function profile opens first. Find out which rank takes the longest MPI time. Hint: view the tabs on top of the function profile view and try Load Balance.

4. Use the context menu (right mouse click) on the MPI time to get information about the individual MPI routines (Ungroup Group MPI).

5. The most important view of ITAC is the Event Timeline, which shows how the run develops over time, including the MPI time on individual ranks and the message lines between ranks. Open the Event Timeline by selecting Charts -> Event Timeline.

6. Can you get more information about the messages sent? Hint: try the context menu

Review Questions

Question 1: How do you sort timings in the function profile?

Question 2: Which time is displayed in the function profile?

Question 3: Can you compute the time per process?

Question 4: What kind of information do you get for each message from the context menu?

Question 5: Which MPI routine hides behind the small red bar before MPI_Send?

Activity 3 – Use Correctness Checker on erroneous MPI program

Time Required 10 minutes

Objective

Run erroneous program with message checker library (part of ITAC).

After successfully completing this activity, you will be able to:

• Run message checker to see if it finds MPI related errors

• Find the error location in the source code

Message checker can be used in the same way as the default ITAC library. You just have to use the flag “-check” instead of “-trace”. When using the LD_PRELOAD environment variable, you have to specify the checking library, libVTmc.so.

Without any additional settings, message checker gives you a short summary at the end of the run. It also shows warnings and errors during run time. We will learn more details later in the correctness and debugging session.

1. Enter the directory ICS_Training/01_Introduction/03_Correctness. This directory should contain a modified version of test.c with an error. Compile this file with mpiicc, using the additional “-g” flag to get more debug information. Run the program with the additional “-check” flag on mpirun.

2. Do the same by setting LD_PRELOAD to libVTmc.so, but without using the “-check” flag on mpirun.
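
When LD_PRELOAD is exported in the shell, it stays active for every later command, including any further runs from Activity 2. One way to avoid this is to set it for a single command only. A minimal sketch, assuming a POSIX shell; the helper name with_checker is hypothetical, while libVTmc.so is the checking library named above:

```shell
# Run one command with the ITAC checking library preloaded, without
# exporting LD_PRELOAD into the rest of the shell session.
# Usage: with_checker <command> [<args> ...]
with_checker() {
    LD_PRELOAD=libVTmc.so "$@"
}

# Example (assumes test.x was built with -g in this directory):
#   with_checker mpirun -n 2 ./test.x
```

The library must be found by the dynamic loader, which is normally the case once the Cluster Studio environment script has been sourced.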

Review Questions

Question 1: What is the root cause of the MPI error? Using “diff” is unsportsmanlike!

Question 2: Which MPI functions are involved?

Question 3: Can we exactly locate these functions?

Question 4: What happens when you forget to compile with “-g”?

Activity 4 – Use Inspector XE for finding memory issues

Time Required 10 minutes

Objective

Examine the application for memory issues.

After successfully completing this activity, you will be able to:

• Run Inspector XE with MPI programs

• Locate root cause of errors in source code

As in the previous activity, we work on a modified version of test.c containing an error, this time using Inspector XE. The error is not MPI related but a memory bug. Most of these bugs can already be found in a sequential run of the program. It is, however, useful to run Inspector XE on an MPI program because errors can occur in branches that are only executed in parallel mode.

Build and Run the program with Inspector XE

1. Enter the directory ICS_Training/01_Introduction/04_Inspector. This directory contains a modified version of test.c called test_memory.c.

2. Compile this program as in the previous activity, using the “-g” debug flag. Name the program test_memory.x.

3. Run the mi3 analysis on this program with Inspector XE. Look up the syntax in the Introduction.pdf lecture.

View results

4. The Inspector XE GUI is not included in the ICS package. Results may be viewed by using the command line interface (see the lecture for the syntax).

Review Questions

Question 1: What is the error in test_memory.c? Please ignore all findings that are not related to our module (test_memory.x).

Question 2: What happens when you compile without “-g”?

Activity 5 – Use VTune™ Amplifier XE for a simple Hot Spot analysis

Time Required 10 minutes

Objective

Perform a simple hot spot analysis on the program

After successfully completing this activity, you will be able to:

• Profile the application with VTune™ Amplifier XE

• Observe the call stack for a hotspot

VTune™ Amplifier XE is a powerful tool for profiling applications. We will demonstrate only the very basic usage for MPI programs. The information may help us to get an overview of program structure and performance, and it is helpful for further instrumentation with ITAC. More sophisticated usage of VTune™ Amplifier XE will be taught in other lectures. We will use another feature of Amplifier XE when analyzing the threaded part of a hybrid MPI/OpenMP application.

So far we have only used the default Intel MPI test program as a lab example. For this activity we change to the 2D Poisson example. This differential equation can, for example, model heat flow. We will study this example for the rest of the training because it represents features of a huge class of scientific applications but is easy to install and run.

The Poisson solver uses a square grid.

Build & Run the program

1. Change to ICS_Training/01_Introduction/05_Amplifier and copy the Poisson directory:

$ cp -r ../../Poisson-C .

$ cd Poisson-C

Open the Makefile and add “-g” to CFLAGS and LDFLAGS, then build:

$ make

If everything works correctly, the executable poisson.x will appear.

2. Run the hotspot analysis in a similar way as we did for Inspector XE. The exact command line is found in the Introduction lecture. Use the “hotspots” analysis type. Use the flag “-n 2000” for the poisson.x execution. This changes the grid dimension from 1000x1000 to 2000x2000.

View results

3. The VTune™ Amplifier XE GUI may not be available. We can, however, get a first impression by running the command line tool on the results.
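
The effect of “-n 2000” on the problem size is simple arithmetic: doubling the grid dimension quadruples the number of grid points. The sketch below works this out. The helper name grid_points is hypothetical, and the figure of 8 bytes per point (one double-precision value per grid point and array) is an assumption for illustration, not something taken from the Poisson source.

```shell
# Number of points in an n x n grid.
grid_points() {
    echo $(( $1 * $1 ))
}

for n in 1000 2000; do
    points=$(grid_points "$n")
    # Assumption: one 8-byte double per grid point and per array.
    echo "n=$n: $points points, $(( points * 8 )) bytes per array"
done
```

The larger run therefore touches four times as many points per sweep, which gives the hotspot analysis more run time to collect data.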

Activity 6 – Compile and run a Co Array Fortran Program (not yet available)

Time Required 10 minutes

Objective

Compile and run the CAF hello world

After successfully completing this activity, you will be able to:

• Compile a CAF program with ifort

• Modify the output with additional synchronization

This activity is optional because Co Array Fortran (CAF) may only interest Fortran programmers. CAF became interesting because it is now part of the Fortran 2008 Standard. These Fortran extensions provide a simple approach to parallel programming.

Build & Run the program