Intel® Cluster Checker 2.1 Test Modules Reference Guide

Intel® Cluster Checker 2.1

Contents

1 clock
  1.1 DESCRIPTION
  1.2 METHOD
  1.3 CONFIGURATION
    1.3.1 deviation
  1.4 EXAMPLE
  1.5 MODULE INFORMATION
  1.6 DEPENDENCIES

2 dgemm
  2.1 DESCRIPTION
  2.2 METHOD
  2.3 CONFIGURATION
    2.3.1 mflops
    2.3.2 deviation
    2.3.3 m, n, k
    2.3.4 memory
  2.4 EXAMPLE
  2.5 MODULE INFORMATION
  2.6 DEPENDENCIES

3 disk bandwidth
  3.1 DESCRIPTION
  3.2 METHOD
  3.3 CONFIGURATION
    3.3.1 bandwidth
    3.3.2 deviation
    3.3.3 options
    3.3.4 workdir
  3.4 EXAMPLE
  3.5 MODULE INFORMATION
  3.6 DEPENDENCIES

4 ethernet
  4.1 DESCRIPTION
  4.2 METHOD
  4.3 CONFIGURATION
    4.3.1 driver
    4.3.2 name
    4.3.3 options
    4.3.4 version
  4.4 EXAMPLE
  4.5 MODULE INFORMATION
  4.6 DEPENDENCIES
  4.7 NOTES

5 generic
  5.1 DESCRIPTION
  5.2 METHOD
  5.3 CONFIGURATION
    5.3.1 test
  5.4 EXAMPLE
  5.5 MODULE INFORMATION
  5.6 DEPENDENCIES

6 hardware
  6.1 DESCRIPTION
  6.2 METHOD
  6.3 CONFIGURATION
    6.3.1 exclude
    6.3.2 options
  6.4 EXAMPLE
  6.5 MODULE INFORMATION
  6.6 DEPENDENCIES

7 hpl
  7.1 DESCRIPTION
  7.2 METHOD
  7.3 CONFIGURATION
    7.3.1 fabric
    7.3.2 mpi-path
    7.3.3 process-number
    7.3.4 Ns
    7.3.5 Ps and Qs
    7.3.6 NBs
  7.4 EXAMPLE
  7.5 MODULE INFORMATION
  7.6 DEPENDENCIES

8 libraries
  8.1 DESCRIPTION
  8.2 METHOD
  8.3 CONFIGURATION
    8.3.1 exclude
  8.4 EXAMPLE
  8.5 MODULE INFORMATION
  8.6 DEPENDENCIES

9 micconf
  9.1 DESCRIPTION
  9.2 METHOD
  9.3 CONFIGURATION
    9.3.1 options
    9.3.2 mpss-path
    9.3.3 include
    9.3.4 exclude
    9.3.5 diff-c
    9.3.6 diff-mb
    9.3.7 diff-uv
  9.4 EXAMPLE
  9.5 MODULE INFORMATION
  9.6 DEPENDENCIES

10 micmpi
  10.1 DESCRIPTION
  10.2 METHOD
  10.3 CONFIGURATION
    10.3.1 device
    10.3.2 mpi-path
    10.3.3 process-number

11 micperf
  11.1 DESCRIPTION
  11.2 METHOD
  11.3 CONFIGURATION
    11.3.1 stream-deviation
    11.3.2 stream-bandwidth
  11.4 XML CONFIG EXAMPLE
  11.5 NODEFILE CONFIGURATION
  11.6 NODELIST EXAMPLE
  11.7 MODULE INFORMATION
  11.8 DEPENDENCIES

12 mount
  12.1 DESCRIPTION
  12.2 METHOD
    12.2.1 Compliance mode:
    12.2.2 Execution modes other than compliance:
  12.3 CONFIGURATION
    12.3.1 user
    12.3.2 sticky
  12.4 EXAMPLE
  12.5 MODULE INFORMATION
  12.6 DEPENDENCIES
  12.7 NOTES

13 mpi internode
  13.1 DESCRIPTION
  13.2 METHOD
  13.3 CONFIGURATION
    13.3.1 device
    13.3.2 mpi-path
    13.3.3 process-number
  13.4 EXAMPLE
  13.5 MODULE INFORMATION
  13.6 DEPENDENCIES
  13.7 NOTES

14 mpi local
  14.1 DESCRIPTION
  14.2 METHOD
    14.2.1 Compliance mode:
    14.2.2 Execution modes other than compliance:
  14.3 CONFIGURATION
    14.3.1 device
    14.3.2 mpi-path
    14.3.3 process-number
  14.4 EXAMPLE
  14.5 INFORMATION
  14.6 DEPENDENCIES
    14.6.1 Compliance mode:
    14.6.2 Other modes:
  14.7 NOTES

15 packages
  15.1 DESCRIPTION
  15.2 METHOD
  15.3 CONFIGURATION
    15.3.1 node
    15.3.2 head
    15.3.3 exclude
    15.3.4 include
    15.3.5 command
  15.4 EXAMPLE
  15.5 MODULE INFORMATION
  15.6 DEPENDENCIES

16 ping
  16.1 DESCRIPTION
  16.2 METHOD
  16.3 CONFIGURATION
    16.3.1 time
  16.4 EXAMPLE
  16.5 MODULE INFORMATION
  16.6 DEPENDENCIES

17 process
  17.1 DESCRIPTION
  17.2 METHOD
  17.3 CONFIGURATION
    17.3.1 elapsed time
    17.3.2 exclude
    17.3.3 exempt uids
    17.3.4 percent cpu
    17.3.5 percent memory
    17.3.6 zombie allowed elapsed time
  17.4 EXAMPLE
  17.5 MODULE INFORMATION
  17.6 DEPENDENCIES

18 remote login
  18.1 DESCRIPTION
  18.2 METHOD
  18.3 CONFIGURATION
    18.3.1 cmd
    18.3.2 time
    18.3.3 version
  18.4 EXAMPLE
  18.5 MODULE INFORMATION
  18.6 DEPENDENCIES

19 shells
  19.1 DESCRIPTION
  19.2 METHOD
  19.3 CONFIGURATION
    19.3.1 none
  19.4 MODULE INFORMATION
  19.5 DEPENDENCIES

20 storage
  20.1 DESCRIPTION
  20.2 METHOD
  20.3 CONFIGURATION
  20.4 MODULE INFORMATION
  20.5 DEPENDENCIES

21 stream
  21.1 DESCRIPTION
  21.2 METHOD
  21.3 CONFIGURATION
    21.3.1 bandwidth
    21.3.2 deviation
  21.4 EXAMPLE
  21.5 MODULE INFORMATION
  21.6 DEPENDENCIES

22 tools
  22.1 DESCRIPTION
  22.2 METHOD
  22.3 CONFIGURATION
    22.3.1 python-path
    22.3.2 python-version
    22.3.3 perl-path
    22.3.4 perl-version
    22.3.5 tclsh-path
    22.3.6 tclsh-version
  22.4 EXAMPLE
  22.5 MODULE INFORMATION
  22.6 DEPENDENCIES

Disclaimer and Legal Information

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.

A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF ITS PARTS.

Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information.

The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.

Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go to: http://www.intel.com/design/literature.htm

Intel, the Intel logo, the Intel Inside logo, Xeon, and Intel Xeon Phi are trademarks of Intel Corporation in the U.S. and/or other countries.

* Other names and brands may be claimed as the property of others.

© 2013 Intel Corporation. All rights reserved.


1 clock

Check the cluster clock synchronization

1.1 DESCRIPTION

The clock module checks that the system clocks on all compute nodes are reasonably synchronized.
Synchronized clocks are needed for the cluster to function properly, and component developers usually assume this synchronization. When logging distributed events, synchronized timekeeping also helps to order the events.

1.2 METHOD

The clock module calculates the difference between each node's time and the cluster median time and compares it to a threshold value. The date command is used to gather timing information.
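The comparison described above can be sketched in a few lines of Python. This is only an illustration, not the module's implementation: the function name clock_outliers, the node names, and the use of epoch seconds (for example as printed by `date +%s`) are assumptions made for the sketch.

```python
from statistics import median


def clock_outliers(node_times, deviation=300.0):
    """Return {node: offset} for nodes whose clock differs from the
    cluster median by more than `deviation` seconds."""
    cluster_median = median(node_times.values())
    return {
        node: t - cluster_median
        for node, t in node_times.items()
        if abs(t - cluster_median) > deviation
    }


# Hypothetical readings in epoch seconds, one per compute node.
times = {
    "node01": 1375862400.0,
    "node02": 1375862402.5,
    "node03": 1375862990.0,
}
print(clock_outliers(times))
```

Using the median rather than the mean keeps a single badly skewed clock from shifting the reference time that all other nodes are compared against.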

1.3 CONFIGURATION

1.3.1 deviation

The deviation tag specifies the maximum deviation (in seconds) of the clock on any node from the cluster median.
Allowed values: Decimal values greater than zero.
Default value: 300 seconds

1.4 EXAMPLE

<clock>
    <deviation>300</deviation>
</clock>

1.5 MODULE INFORMATION

• Level: 2

1.6 DEPENDENCIES

• Commands: date

• Test Modules: remote login

2 dgemm

Check the floating point performance of a node

2.1 DESCRIPTION

The dgemm module checks the floating point performance of each cluster compute node. The test module executes the DGEMM library routine from the Intel(R) Math Kernel Library to measure the floating point performance and its deviation over the cluster nodes.

2.2 METHOD

By default, a pre-built binary is used to calculate performance. If no thresholds are configured, the results are considered indeterminate. The deviation in reported performance is checked when there are three or more valid results from the compute nodes.

2.3 CONFIGURATION

2.3.1 mflops

The mflops tag specifies the minimum acceptable floating point performance in MFLOPS.
NOTE: If not configured, no comparison is made and the measured MFLOPS value is displayed.
Default value: none

2.3.2 deviation

The deviation tag specifies the number of standard deviations from the median that a result may differ before it is treated as an outlier. The allowed range is (median +/- deviation x stddev).
Default value: 3
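As a rough illustration of the range check, the following Python sketch flags results outside (median +/- deviation x stddev). It is not the module's code: the function name find_outliers and the sample values are invented, and the use of the population standard deviation is an assumption, since the guide does not say which estimator is used. The minimum of three valid results follows the METHOD section.

```python
from statistics import median, pstdev


def find_outliers(results, deviation=3.0):
    """Return {node: value} for values outside
    median +/- deviation * stddev. Needs at least three results."""
    if len(results) < 3:
        return {}
    values = list(results.values())
    mid = median(values)
    spread = pstdev(values)  # assumption: population standard deviation
    low, high = mid - deviation * spread, mid + deviation * spread
    return {node: v for node, v in results.items() if not low <= v <= high}


# Hypothetical MFLOPS results: nine healthy nodes and one slow node.
results = {f"node{i:02d}": 17000.0 for i in range(9)}
results["node09"] = 9000.0
print(find_outliers(results))
```

Note that with very few nodes a single slow node inflates the standard deviation enough to hide itself, which is one reason a minimum performance threshold such as mflops is useful alongside the deviation check.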

2.3.3 m, n, k

These tags specify the matrix dimensions used in DGEMM. The total memory required is (mn + mk + nk) x sizeof(double) bytes.
Default values: m = 6000, n = 6000, k = 120
NOTE: With the default values, the memory requirement is approximately 300 MB if sizeof(double) is 8 bytes.
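The memory formula can be evaluated directly. The short Python sketch below (the function name dgemm_bytes is made up for illustration) applies it to the default dimensions.

```python
def dgemm_bytes(m, n, k, sizeof_double=8):
    """Memory needed for the three DGEMM matrices:
    (m*n + m*k + n*k) * sizeof(double) bytes."""
    return (m * n + m * k + n * k) * sizeof_double


default_bytes = dgemm_bytes(6000, 6000, 120)
print(default_bytes)        # 299520000
print(default_bytes / 1e6)  # 299.52 (MB, decimal)
```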

2.3.4 memory

The memory tag specifies the percentage of memory to use. This tag automatically calculates m and n (by setting m = n) for a fixed value of k (k = 120), to best match the configured percentage of the total memory of the node. This tag has precedence over m, n and k. If memory is used, m, n and k won't be used.
Default value: none
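One way to picture the calculation: with m = n and k fixed, the memory formula becomes (m^2 + 2mk) x sizeof(double), which can be solved for m. The Python sketch below does this and rounds down. It is an assumption-laden illustration only: the guide does not state how the module reads total memory or how it rounds, and the function name dims_for_memory is hypothetical.

```python
import math


def dims_for_memory(total_bytes, percent, k=120, sizeof_double=8):
    """Largest m = n such that (m*m + 2*m*k) * sizeof_double fits in
    `percent` percent of `total_bytes`."""
    target_elems = total_bytes * percent / 100 / sizeof_double
    # Solve m^2 + 2*k*m - target_elems = 0 for the positive root.
    m = int(math.floor(-k + math.sqrt(k * k + target_elems)))
    return m, m, k


# Hypothetical node with 16 GiB of memory, using 50 percent of it.
m, n, k = dims_for_memory(16 * 2**30, 50)
print(m, n, k)
```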

2.4 EXAMPLE

<dgemm>
    <deviation>3</deviation>
    <k>120</k>
    <m>6000</m>
    <mflops>17000</mflops>
    <n>6000</n>
</dgemm>

2.5 MODULE INFORMATION

• Level: 3

2.6 DEPENDENCIES

• Commands: grep, awk

• Test Modules: remote login

3 disk bandwidth

Single-node Disk Bandwidth

3.1 DESCRIPTION

The disk bandwidth module checks the disk I/O bandwidth of each compute node and its deviation among cluster compute nodes. Deviation is checked only if there are three or more valid results from the compute nodes.
Having a uniform disk bandwidth allows distributed computations to run without the need for complex balancing schemes.

3.2 METHOD

The IOzone* filesystem benchmark is used to exercise I/O. The test module executes the benchmark in auto mode with 64 MB files using direct access. Only the read values will be checked.

3.3 CONFIGURATION

3.3.1 bandwidth

The bandwidth tag specifies the minimum acceptable disk bandwidth, in MB/s.
NOTE: If not configured, no comparison is made and the measured bandwidth is displayed.
Default value: none

3.3.2 deviation

The deviation tag specifies the number of standard deviations from the median that a result may differ before it is treated as an outlier. The allowed range is (median +/- deviation x stddev).
Default value: 3

3.3.3 options

The options tag specifies a string with the options to be used to execute the benchmark. Options are expected to be valid.
Default value: -az -i0 -i1 -y 512 -s 65536 -+n -+r -I

3.3.4 workdir

The workdir tag specifies the base path to use as the working directory instead of /tmp. The directory should exist and have proper permissions. If the directory is not local, the test will fail.
Default value: /tmp

3.4 EXAMPLE

<disk_bandwidth>
    <bandwidth>40</bandwidth>
    <deviation>3</deviation>
    <options>-az -i0 -i1 -y 512 -s 65536 -+n -+r -I</options>
    <workdir>/tmp</workdir>
</disk_bandwidth>

3.5 MODULE INFORMATION

• Level: 3

3.6 DEPENDENCIES

• Test Modules: remote login

4 ethernet

Ethernet driver uniformity and wellness

4.1 DESCRIPTION

The ethernet module verifies Intel(R) Ethernet network drivers, including their uniformity and wellness.

4.2 METHOD

The load status and the uniformity of the required drivers are checked across compute nodes. If version or options are explicitly defined, the module verifies that the loaded driver matches them.
For more details, see http://www.intel.com/design/network/drivers/.

4.3 CONFIGURATION

4.3.1 driver

The driver tag is a container that can be used to override detection or to check unsupported drivers.

4.3.2 name

The name tag specifies the name of the driver to be checked.

4.3.3 options

The options tag specifies the options that are required in the driver configuration.

4.3.4 version

The version tag specifies the string to be used as required driver version.

4.4 EXAMPLE

<ethernet>
    <driver>
        <name>ixgb</name>
        <options>TxDescriptors=256</options>
        <version>1.0.135-k2-NAPI</version>
    </driver>
</ethernet>

4.5 MODULE INFORMATION

• Level: 1

4.6 DEPENDENCIES

• Commands: cut, grep, lsmod, lspci, md5sum, modinfo, modprobe

• Test Modules: remote login

4.7 NOTES

The modprobe configuration files (/etc/modprobe.d or /etc/modprobe.conf) must be readable.

5 generic

Perform user specified check(s)

5.1 DESCRIPTION

The generic module runs one or more user-specified commands and either verifies that the output is uniform for all nodes or verifies that the output matches the user-specified reference output. Only "compute" type nodes are used, i.e., the "head" and "knc-compute" type nodes do not execute the generic module. The effective user ID should be non-root.
This module is an easy and convenient way to extend Intel(R) Cluster Checker.

5.2 METHOD

See Description.


5.3 CONFIGURATION

5.3.1 test

The test tag provides a container for the user-specified command.
Tags which may be used inside test: command, result

command: The command tag specifies the command to execute. This tag is mandatory.

result: The result tag specifies the "correct" output of the corresponding command. This tag is optional. If result is not specified, then the output is verified to be uniform for all nodes.
For both the command output and the result tag value, the module trims the leading and trailing whitespace. Otherwise, the comparison is whitespace sensitive.
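The comparison rules for command and result can be sketched as follows. This Python fragment is a simplified illustration, not the module's code: the function name generic_check and the sample outputs are invented, and the remote execution of the command on each node is left out.

```python
def generic_check(outputs, result=None):
    """Return True if the check passes.

    outputs: {node: raw command output}
    result:  optional reference output; if None, the outputs only
             have to be uniform across all nodes.
    Leading and trailing whitespace is trimmed on both sides;
    whitespace inside the text still matters.
    """
    trimmed = [out.strip() for out in outputs.values()]
    if result is not None:
        reference = result.strip()
        return all(out == reference for out in trimmed)
    return len(set(trimmed)) <= 1


# Hypothetical outputs of `stat -c %a /tmp` on two nodes.
print(generic_check({"node01": "1777\n", "node02": " 1777"}, "1777"))  # True
# Interior whitespace is significant, so these are not uniform.
print(generic_check({"node01": "a  b", "node02": "a b"}))  # False
```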

5.4 EXAMPLE

<generic>
    <test>
        <command>stat -c %a /tmp</command>
        <result>1777</result>
    </test>
    <test>
        <command>uname -r</command>
    </test>
</generic>

5.5 MODULE INFORMATION

• Level: 2

5.6 DEPENDENCIES

• Test Modules: remote login

6 hardware

Check hardware uniformity among cluster nodes

6.1 DESCRIPTION

The hardware module checks the uniformity of the hardware among the cluster compute nodes. The test module checks that specific attributes of some hardware devices have uniform values among compute nodes.

6.2 METHOD

The lshw utility is used to list the hardware device attributes on each node (see http://ezix.org/project/wiki/HardwareLiSter#Usage).
The items compared by default can be modified using the include and exclude configuration tags. When executing the test module as a privileged user, some extra items are shown, including the base board model and BIOS version. The module runs lshw and captures the xml output, then it creates a 'key -> value' list. Each element key is created by concatenating the lshw xml output attributes called "id", "type", "value" and the name of the final xml tag. There are exceptions for xml tags called "setting", "capability" and "resource", in which only the above mentioned attributes are used and the tag name is not appended.

Example 1: Given the following xml snippet from the output of lshw

<list>
<node id="computer" claimed="true" class="system" handle="">
  <node id="core" claimed="true" class="bus" handle=""></node>
  <node id="pci" claimed="true" class="bridge" handle="PCIBUS:0000:00">
    <node id="pci:0" claimed="true" class="bridge" handle="PCIBUS:0000:01">
      <description>PCI bridge</description>


The pci:0 description will be saved as:

# key #                        # value #
'core-pci-pci:0-description' -> 'PCI bridge'

The 'id' computer is removed because it is common to the entire output.

Example 2:

<node id="core" claimed="true" class="bus" handle="">
  <node id="pci" claimed="true" class="bridge" handle="PCIBUS:0000:00">
    <node id="pci:0" claimed="true" class="bridge" handle="PCIBUS:0000:01">
      <resource type="irq" value="48"/>

The pci:0 resource and type irq will be saved as:

# key #                   # value #
'core-pci-pci:0-irq-48' -> '48'

Example 3:

<node id="core" claimed="true" class="bus" handle="">
  <node id="pci" claimed="true" class="bridge" handle="PCIBUS:0000:00">
    <node id="pci:0" claimed="true" class="bridge" handle="PCIBUS:0000:01">
      <node id="network:0" disabled="true" ...>
        <setting id="driverversion" value="3.0.6-k"/>

The pci:0 and network:0 driverversion will be saved as:

# key #                                             # value #
'core-pci-pci:0-network:0-driverversion-3.0.6-k' -> '3.0.6-k'

There are cases where the key may not be unique; for those a number will be appended, e.g.:

# key #                  # value #
'pci-storage-ioport'   -> '3108(size=8)'
'pci-storage-ioport-0' -> '3114(size=4)'
'pci-storage-ioport-1' -> '3100(size=8)'

NOTE: The "value" attribute is also used in the value field, just for convenience.
NOTE: Memory size comparison is relaxed; values are allowed to differ by up to 5%.
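The key construction illustrated by Examples 1 to 3 can be approximated with the following Python sketch. It is not the module's code: the helper name flatten_lshw and the sample document are hypothetical, the sample nests the nodes so that the keys come out as in the examples, and only the element shapes shown above (node, description, resource, setting) are handled.

```python
import xml.etree.ElementTree as ET

# Tags whose key is built from attributes only (no tag name appended).
ATTRIBUTE_ONLY_TAGS = {"setting", "capability", "resource"}
KEY_ATTRIBUTES = ("id", "type", "value")

SAMPLE = """
<node id="computer" class="system">
  <node id="core" class="bus">
    <node id="pci" class="bridge">
      <node id="pci:0" class="bridge">
        <description>PCI bridge</description>
        <resource type="irq" value="48"/>
        <node id="network:0" disabled="true">
          <setting id="driverversion" value="3.0.6-k"/>
        </node>
      </node>
    </node>
  </node>
</node>
"""


def flatten_lshw(xml_text):
    """Build a 'key -> value' dict from lshw-style XML (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # The output may wrap the top-level node in <list>; skip that wrapper.
    top = root if root.tag == "node" else root.find("node")
    items = {}

    def add(key, value):
        # Append -0, -1, ... when the key is already taken.
        unique, counter = key, 0
        while unique in items:
            unique = f"{key}-{counter}"
            counter += 1
        items[unique] = value

    def walk(element, prefix):
        for child in element:
            attrs = [child.get(name) for name in KEY_ATTRIBUTES]
            parts = prefix + [a for a in attrs if a is not None]
            if child.tag == "node":
                walk(child, parts)
            elif child.tag in ATTRIBUTE_ONLY_TAGS:
                add("-".join(parts), child.get("value", ""))
            else:
                add("-".join(parts + [child.tag]), (child.text or "").strip())

    # Starting below the top node drops its id ("computer") from every key.
    walk(top, [])
    return items
```

Calling flatten_lshw(SAMPLE) yields the three keys from the examples, each mapped to the value shown there.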

6.3 CONFIGURATION

6.3.1 exclude

The exclude tag excludes items from the comparison. The string is interpreted as a POSIX matching regular expression. This configuration tag can be repeated multiple times. Note that to exactly match meta characters (^[.*(${\()+|?<>), they should be escaped.
NOTE: To exclude an item, the regular expression should match the key, e.g.:

# key #                  # value #
'pci-storage-ioport'   -> '3108(size=8)'
'pci-storage-ioport-0' -> '3114(size=4)'
'pci-storage-ioport-1' -> '3100(size=8)'

To exclude all storage items, the tag should be:

<exclude>.*storage.*</exclude>

Default values: none
NOTE: These tags need to be reviewed if mixed with the options tag.
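As a rough illustration of how exclusions act on the key list, the sketch below drops every key matched by any configured pattern. The function name apply_excludes is hypothetical. Two assumptions are made: Python's re syntax stands in for POSIX regular expressions (the two agree for simple patterns like the one above), and a pattern may match anywhere in the key.

```python
import re


def apply_excludes(items, patterns):
    """Return a copy of items without the keys matched by any pattern."""
    compiled = [re.compile(pattern) for pattern in patterns]
    return {
        key: value
        for key, value in items.items()
        # Assumption: a match anywhere in the key excludes the item.
        if not any(regex.search(key) for regex in compiled)
    }


ITEMS = {
    "pci-storage-ioport": "3108(size=8)",
    "pci-storage-ioport-0": "3114(size=4)",
    "pci-storage-ioport-1": "3100(size=8)",
    "core-pci-pci:0-irq-48": "48",
}
```

Here apply_excludes(ITEMS, [".*storage.*"]) keeps only the irq entry.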

6.3.2 options

The options tag appends extra options when running the hardware listing utility. It can be used to enable or disable tests, if required, during troubleshooting.
Allowed values:

• -class CLASS only show a certain class of hardware

• -C CLASS same as '-class CLASS'

• -c CLASS same as '-class CLASS'


• -disable TEST disable a test (like pci, isapnp, cpuid, etc.)

• -enable TEST enable a test (like pci, isapnp, cpuid, etc.)

• -sanitize sanitize output (remove sensitive information like serial numbers, etc.)

• -numeric output numeric IDs (for PCI, USB, etc.)

Default value: none ('-quiet -sanitize -xml' will always be appended)
NOTE: If the option -class or -C is used, it is possible that previously configured exclusions fail because the key has changed.

6.4 EXAMPLE

<hardware>
  <exclude>core-pci-pci:4-description</exclude>
  <exclude>core-pci-pci:0-irq-48</exclude>
  <exclude>core-pci-pci:0-network:0-driverversion-3.0.6-k</exclude>
  <options>-disable cpuinfo</options>
</hardware>

6.5 MODULE INFORMATION

• Level: 3

6.6 DEPENDENCIES

• Test Modules: remote_login

7 hpl

Run the Optimized HPL Benchmark

7.1 DESCRIPTION

The hpl module runs an optimized version of the High Performance Linpack* (HPL) benchmark. This binary is part of the Intel(R) Math Kernel Library package.

7.2 METHOD

The hpl module uses a pre-built binary to exercise all cluster compute nodes at once. It uses the total number of nodes in the cluster as an input parameter; the problem is scaled down so that it executes quickly but still yields representative benchmark performance.

7.3 CONFIGURATION

7.3.1 fabric

The fabric tag provides a container for specifying the network interconnect fabric to use for the benchmark. It must contain one tag specifying the device type to be used (device) and, optionally, a performance threshold against which a comparison is performed (tflops). The fabric block may be repeated to test multiple interconnects. If no fabric container is configured, the test module runs mpirun with no device and no performance threshold.
Default value: none
Tags which may be used inside fabric: device, tflops


device The device tag specifies which Intel(R) MPI Library device to use. It may be specified only once per fabric. Any extra options can be provided by using an "options" XML attribute. The options should be placed in order, with global modifiers first, as required by the library. This tag can only be used inside a fabric container. If no device is configured, then no device is passed to MPI and it gets reported as "default".
Allowed values: fabric | intra-node fabric : inter-node fabric
where

• fabric is one of {shm, dapl, tcp, tmi, ofa}

• intra-node fabric is one of {shm, dapl, tcp, tmi, ofa}

• inter-node fabric is one of {dapl, tcp, tmi, ofa}

Default value: none
NOTE: The test module uses the I_MPI_FABRICS style introduced in version 4.0.
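The allowed device values listed above can be checked with a small validator. This is a sketch for illustration only; the function name is_valid_device is hypothetical, and the fabric sets are taken directly from the list above.

```python
FABRICS = {"shm", "dapl", "tcp", "tmi", "ofa"}
# shm is not listed as an inter-node fabric.
INTER_NODE_FABRICS = FABRICS - {"shm"}


def is_valid_device(device):
    """Check 'fabric' or 'intra-node fabric:inter-node fabric' strings."""
    parts = device.strip().split(":")
    if len(parts) == 1:
        return parts[0] in FABRICS
    if len(parts) == 2:
        intra, inter = parts
        return intra in FABRICS and inter in INTER_NODE_FABRICS
    return False
```

For example, shm:tcp and dapl are accepted, while shm:shm is rejected because shm cannot be used between nodes.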

tflops The tflops tag specifies the minimum acceptable floating point performance, in teraflops. This tag should be used inside a fabric container for each device. If used outside a fabric container, it will be used to compare the "default" device.
NOTE: If not configured, no comparison will be made and the obtained TFLOPS will be displayed.

7.3.2 mpi-path

The mpi-path tag specifies the base path to the Intel(R) MPI Library installation directory. Setting this parameter automatically sets up the environment.
Allowed values: Any path.

7.3.3 process-number

The process-number tag specifies the number of MPI processes to start on each node.
Allowed values: Integer value.
Default value: Number of physical cores in the head node

7.3.4 Ns

The Ns tag specifies the size of the problem to be used in the calculation. It applies only to the fabric on which it was defined.
Default value: 8000 x sqrt ( number of nodes )

7.3.5 Ps and Qs

These tags are factors that define the division of the matrix, one for each dimension. They can be set by the user with the Ps and Qs tags. Both tags should be provided; otherwise they will be automatically calculated. Note that the product Ps x Qs must be equal to the total number of MPI processes (sum over all nodes). If no values are configured, the test module automatically calculates them according to the following rules.
Default value: Ps x Qs = Total # of MPI processes (# all nodes x # physical cores). Ps <= Qs. Ps as big as possible, complying with the former rules.
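The default rules for Ps and Qs amount to picking the largest divisor of the process count that does not exceed its square root. A minimal Python sketch, with the hypothetical function name default_grid, could look like this:

```python
import math


def default_grid(total_processes):
    """Return (Ps, Qs) with Ps * Qs == total, Ps <= Qs and Ps maximal."""
    if total_processes < 1:
        raise ValueError("total_processes must be a positive integer")
    # Search downward from floor(sqrt(total)); 1 always divides the total.
    for ps in range(math.isqrt(total_processes), 0, -1):
        if total_processes % ps == 0:
            return ps, total_processes // ps
```

For 8 processes this gives Ps = 2 and Qs = 4; for a prime count such as 7 it falls back to 1 x 7.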

7.3.6 NBs

The NBs tag specifies the size of the atomic blocks used in the DGEMM operation.
Default value: 168

7.4 EXAMPLE

<hpl>
  <NBs>148</NBs>
  <Ns>8000</Ns>
  <Ps>1</Ps>
  <Qs>4</Qs>
  <fabric>
    <device>shm:tcp</device>


    <tflops>2</tflops>
  </fabric>
  <fabric>
    <device options="-genv I_MPI_DEBUG 5">shm:dapl</device>
    <tflops>2.5</tflops>
  </fabric>
  <mpi-path>/opt/intel/mpi-rt/4.0</mpi-path>
  <process-number>2</process-number>
</hpl>

7.5 MODULE INFORMATION

• Level: 4

7.6 DEPENDENCIES

• Commands: bc, mktemp, mpirun, sed

• Test Modules: mount, remote_login, tools, mpi_internode

8 libraries

Check libraries wellness and compliance

8.1 DESCRIPTION

The libraries module checks that runtime libraries meet requirements both for wellness and compliance. Besides checking for a specific set of base, x11 and runtime libraries, which need to be provided on all nodes, it also checks that all 32-bit libraries have 64-bit counterparts. This module can be used to check compliance with Intel(R) Cluster Ready Specification versions 1.2 and 1.3.
Having a functional runtime ensures that binaries run without extra configuration steps. Minimum software runtime requirements ensure that functional clusters are built when following the specification. This set of libraries can also be used by component developers as a baseline for interoperability, support and validation.

8.2 METHOD

The libraries module retrieves a full list of available libraries and their versions from the dynamic linker cache, using the ldconfig command. It also looks for required runtime library files under /opt/intel. For the counterpart check, it confirms that all 32-bit libraries in the dynamic linker cache have a 64-bit counterpart. If the path to the runtime libraries is located in a shared filesystem, then the search is optimized and only a reference node is checked for compliance.
The required libraries and their versions are listed in the libs.csv file in a directory in the installation path of Intel(R) Cluster Checker.
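The counterpart check can be pictured with the Python sketch below, which parses text in the format printed by ldconfig -p. It is an illustration under assumptions, not the module's logic: the function name missing_64bit_counterparts and the sample text are hypothetical, and any cache entry without the x86-64 flag is treated as 32-bit.

```python
import re

# Matches lines such as "    libz.so.1 (libc6,x86-64) => /lib64/libz.so.1".
ENTRY = re.compile(r"^\s*(\S+) \(([^)]*)\) => (\S+)$")

SAMPLE = (
    "3 libs found in cache\n"
    "\tlibz.so.1 (libc6,x86-64) => /lib64/libz.so.1\n"
    "\tlibz.so.1 (libc6) => /lib/libz.so.1\n"
    "\tlibI810XvMC.so.1 (libc6) => /usr/lib/libI810XvMC.so.1\n"
)


def missing_64bit_counterparts(ldconfig_output):
    """List library names that appear only as 32-bit cache entries."""
    libs32, libs64 = set(), set()
    for line in ldconfig_output.splitlines():
        match = ENTRY.match(line)
        if not match:
            continue  # header line or unrecognized entry
        name, flags, _path = match.groups()
        # Assumption: entries lacking the x86-64 flag are 32-bit.
        if "x86-64" in flags.split(","):
            libs64.add(name)
        else:
            libs32.add(name)
    return sorted(libs32 - libs64)
```

With the sample text, only libI810XvMC.so.1 is reported, since libz.so.1 is present in both flavors.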

8.3 CONFIGURATION

The version of the Intel(R) Cluster Ready Specification to be checked can be set using the compliance and certification options. Please see the User's Guide for more information about specifying the Intel(R) Cluster Ready Specification version.

8.3.1 exclude

The exclude tag specifies the full name of a library to exclude from comparisons.
Default value: none

8.4 EXAMPLE

<libraries>
  <exclude>libI810XvMC.so.1</exclude>
</libraries>


8.5 MODULE INFORMATION

• Level: 2

8.6 DEPENDENCIES

• Commands: find, ldconfig

• Test Modules: remote_login

9 micconf

Checks Intel(R) Xeon Phi(TM) configuration health and uniformity

9.1 DESCRIPTION

The micconf module checks that the Intel(R) Xeon Phi(TM) configuration is healthy and uniform.
It first checks that the micinfo information is correct and uniform in compute nodes. Any error, undefined value or difference is reported. Then it checks the sanity of the coprocessor cards by running the miccheck diagnostic tool in every compute node.
The micinfo field names are normalized to make their handling easier. The micinfo health check makes sure no undefined values are shown. The micinfo uniformity check makes sure relevant fields are uniform.
Frequency, voltage and speed should be reported as non-zero. Temperature should be a value between 40 and 100. Only differences less than 128 MB, 100000 uV or 20 C are allowed.
The default behavior can be altered by custom configuration.

9.2 METHOD

The outputs of micinfo and miccheck are parsed and compared on compute nodes. By default, miccheck is run with its default configuration.

9.3 CONFIGURATION

9.3.1 options

The options tag can be used to include or exclude any test executed by miccheck. If no options are provided, miccheck will be run by default.
Default value: none
Allowed values: miccheck command line options

9.3.2 mpss-path

The mpss-path tag specifies the location of micinfo and miccheck commands.
Default value: /opt/intel/mic/bin

9.3.3 include

The include tag includes one or more micinfo fields for uniformity comparison.
Field names are simplified by using lowercase and removing spaces. For instance, the following field is shortened as device0-board-vendor-id.

Device No: 0, Device Name: L1OM Board Vendor ID : 8086

Default value:
On the host system:
host-system-info-driver-version host-system-info-mpss-version host-system-info-os-version
On each attached device:
board-device-id board-ecc-mode board-mic-board-stepping board-mic-processor-family board-mic-processor-family-ext board-mic-processor-model board-mic-processor-model-ext board-mic-processor-stepping-id board-mic-processor-type board-sku board-subsystem-id board-vendor-id
core-frequency core-total-no-of-active-cores core-voltage
gddr-gddr-density gddr-gddr-frequency gddr-gddr-size gddr-gddr-technology gddr-gddr-vendor gddr-gddr-voltage
thermal-die-temp thermal-fsc-strap thermal-smc-firmware-version
version-flash-version version-uos-version
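The field-name simplification can be sketched as follows. The function name normalize_field and its parameters are hypothetical, and the way a name is assembled from a scope (host or deviceN), a section and a field is inferred from the names listed above rather than taken from the module.

```python
def normalize_field(section, field, device=None):
    """Build a simplified micinfo field name (hypothetical helper).

    section -- micinfo section title, e.g. "Board"
    field   -- field label inside that section, e.g. "Vendor ID"
    device  -- device number, or None for host-level fields
    """
    scope = "host" if device is None else f"device{device}"
    parts = [scope, section, field]
    # Lowercase each part and join its words with hyphens.
    return "-".join("-".join(part.lower().split()) for part in parts)
```

Under these assumptions, normalize_field("Board", "Vendor ID", device=0) gives device0-board-vendor-id.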


9.3.4 exclude

The exclude tag excludes one or more micinfo fields from checks.
Default value: none

9.3.5 diff-c

The diff-c tag sets the allowed temperature deviation, in degrees Celsius.
Default value: 20

9.3.6 diff-mb

The diff-mb tag sets the allowed memory size deviation, in MB.
Default value: 128

9.3.7 diff-uv

The diff-uv tag sets the allowed voltage deviation, in uV.
Default value: 100000
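The tolerance tags can be read as upper bounds on how far a field may vary across nodes. The sketch below is an assumption-laden illustration: the helper names are hypothetical, the deviation is measured as the spread between the largest and smallest value, and the 40 to 100 temperature range from the description is treated as inclusive.

```python
# Defaults quoted from the tag descriptions above.
DEFAULT_LIMITS = {"diff-c": 20, "diff-mb": 128, "diff-uv": 100000}


def within_limit(values, limit):
    """True when the spread (max - min) of the values is below the limit."""
    return max(values) - min(values) < limit


def temperature_ok(celsius, low=40, high=100):
    """True when a temperature lies in the expected range (inclusive)."""
    return low <= celsius <= high
```

For instance, die temperatures of 55, 60 and 70 C pass the default diff-c limit (spread 15), while 50 and 70 C do not (spread 20 is not less than 20).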

9.4 EXAMPLE

<micconf>
  <diff-c>20</diff-c>
  <diff-mb>128</diff-mb>
  <diff-uv>100000</diff-uv>
  <exclude>device0-board-pcie-speed</exclude>
  <include>device0-thermal-fan-rpm</include>
  <options>-t install,load,driver,detect 0</options>
  <mpss-path>/opt/intel/mic/bin</mpss-path>
</micconf>

9.5 MODULE INFORMATION

• Level: 2

9.6 DEPENDENCIES

• Commands: miccheck, micinfo

• Test Modules: remote_login

10 micmpi

Intel(R) MPI Library internode check for Intel(R) Xeon Phi(TM) coprocessors

10.1 DESCRIPTION

The micmpi test module checks the basic functionality of the Intel(R) MPI Library Runtime Environment over the cluster list of Intel(R) Xeon Phi(TM) coprocessors by running an MPI Hello World program using one or more Intel(R) MPI Library devices.
To run MPI jobs on Intel(R) Xeon Phi(TM), version 4.1 or higher of the Intel(R) MPI Library is needed.

10.2 METHOD

The micmpi module runs an MPI Hello World program on one or more Intel(R) MPI Library devices. The test module copies the MPI Hello World program to the home folder of the user running the tool, assuming this folder is shared among all the nodes. By default, it exercises 4 MPI processes on each Intel(R) Xeon Phi(TM), leaving Intel(R) MPI Library to select the device to be used.
The test module uses the I_MPI_FABRICS style introduced in Intel(R) MPI Library version 4.0. It also assumes that the $HOME and MPI path are shared between the hosts and the Intel(R) Xeon Phi(TM).


10.3 CONFIGURATION

10.3.1 device

The device tag specifies which Intel(R) MPI Library device to use. It may be defined more than once. Any extra options can be provided by using the "options" XML attribute. The options should be provided in order, as required by the library. If no device is specified, the Intel(R) MPI Library will use the most appropriate fabric for communication between processes.
Allowed values: fabric | intra-node fabric : inter-node fabric
where

• fabric is one of {shm, dapl, tcp, tmi, ofa}

• intra-node fabric is one of {shm, dapl, tcp, tmi, ofa}

• inter-node fabric is one of {dapl, tcp, tmi, ofa}

Default value: none

10.3.2 mpi-path

The mpi-path tag specifies the base path to the Intel(R) MPI Library installation directory. Setting this parameter automatically sets up the environment.

10.3.3 process-number

The process-number tag specifies the number of MPI processes to start on each node.
Allowed values: Integer value.
Default value: 4

10.4 NODEFILE CONFIGURATION

When creating a nodefile for your cluster, to define an Intel(R) Xeon Phi(TM) node, you must assign the node type as follows: #type: knc-compute.

10.5 NODEFILE EXAMPLE

headnode #type: head
compute00
compute00-mic0 #type: knc-compute

10.6 CONFIG EXAMPLE

<cluster>
  <test>
    <micmpi>
      <device>shm:tcp</device>
      <device options="-genv I_MPI_DEBUG 5">shm:dapl</device>
      <mpi-path>/opt/intel/impi/4.1</mpi-path>
      <process-number>2</process-number>
    </micmpi>
  </test>
</cluster>

10.7 MODULE INFORMATION

• Level: 2

10.8 DEPENDENCIES

• Commands: mktemp, mpirun

• Test Modules: micperf


10.9 NOTES

This test module does not build the MPI Hello World binary using the MPI Library compiler wrappers (i.e., mpicc). It uses a precompiled binary.

11 micperf

Checks Intel(R) Xeon Phi(TM) native performance

11.1 DESCRIPTION

The micperf module runs the STREAM benchmark natively in all available coprocessors and compares the deviation between them.

11.2 METHOD

The micperf module runs the STREAM benchmark natively on all available coprocessors.

11.3 CONFIGURATION

11.3.1 stream-deviation

The stream-deviation tag specifies the number of allowed standard deviations from median, used to search for outlier values. The allowed range is (median +/- deviation x stddev).
Default value: 3

11.3.2 stream-bandwidth

The stream-bandwidth tag specifies the minimum acceptable Triad memory bandwidth, in MB/s.
Default value: none

11.4 XML CONFIG EXAMPLE

<cluster>
  <test>
    <micperf>
      <stream-bandwidth>51067</stream-bandwidth>
      <stream-deviation>3</stream-deviation>
    </micperf>
  </test>
</cluster>

11.5 NODEFILE CONFIGURATION

When creating a nodefile for your cluster, to define an Intel(R) Xeon Phi(TM) node, you must assign the node type as follows: #type: knc-compute.

11.6 NODELIST EXAMPLE

headnode #type: head
compute00
compute00-mic0 #type: knc-compute

11.7 MODULE INFORMATION

• Level: 3

11.8 DEPENDENCIES

• Test Modules: micconf


12 mount

Check that known directories are correctly mounted

12.1 DESCRIPTION

The mount module checks that well known directories are correctly mounted or meet certain requirements. The /proc filesystem and shared memory device (/dev/shm) should be mounted. The /home directory should be a shared, common directory accessible from any cluster compute node. The permissions on the temporary directory should be correct.

12.2 METHOD

12.2.1 Compliance mode:

Running in compliance mode ensures that the home directory of the user running the tool is under /home and its inode number is the same on all nodes. The stat command is used to gather the information. When the tool is executed by a privileged user, the home directory is tested for the first user in the /etc/passwd file with a home directory under /home.

12.2.2 Execution modes other than compliance:

Other execution modes also check the status of the /proc filesystem and /dev/shm using the /proc/mounts file information.
The mount module also checks that the permissions on the temporary directory are 1777 as reported by the stat command. If $TMPDIR is set, it will be used as the temporary directory; otherwise, /tmp will be used.

12.3 CONFIGURATION

12.3.1 user

If the user tag is provided, the test module will use the home directory of this user for the comparison across all nodes.

12.3.2 sticky

If the sticky tag is used, the mount module will also accept permissions 0777 on the /tmp directory as correct.
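The temporary directory check, including the effect of the sticky tag, can be sketched in Python as follows. The function names are hypothetical and the module itself relies on the stat command; this version reads the same permission bits through os.stat instead.

```python
import os
import stat


def tmp_mode_ok(mode, sticky_tag=False):
    """Check permission bits: 1777 is required, 0777 only with the sticky tag."""
    allowed = {0o1777}
    if sticky_tag:
        allowed.add(0o0777)
    # S_IMODE strips the file-type bits and keeps the permission bits.
    return stat.S_IMODE(mode) in allowed


def check_tmpdir(sticky_tag=False):
    """Apply the check to $TMPDIR if set, otherwise to /tmp."""
    path = os.environ.get("TMPDIR") or "/tmp"
    return tmp_mode_ok(os.stat(path).st_mode, sticky_tag)
```

A directory with mode 0777 fails tmp_mode_ok by default and passes only when sticky_tag is True.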

12.4 EXAMPLE

<mount>
  <sticky/>
  <user> icr </user>
</mount>

12.5 MODULE INFORMATION

• Level: 1

12.6 DEPENDENCIES

• Commands: awk, cat, test

• Test Modules: remote_login

12.7 NOTES

This module assumes that /proc is the mount point for procfs.
This module assumes that the temporary directory is named /tmp.
If this module is being run by a privileged user and /home is managed by automount, at least one user account should be created prior to running this test module.


13 mpi_internode

Check the functionality of the Intel(R) MPI Library Runtime Environment

13.1 DESCRIPTION

The mpi_internode module checks the basic functionality of the Intel(R) MPI Library Runtime Environment over the cluster compute nodes by running an MPI Hello World program using one or more Intel(R) MPI Library devices.

13.2 METHOD

The mpi_internode module runs an MPI Hello World program on one or more Intel(R) MPI Library devices. The test module copies the MPI Hello World program to the home folder of the user running the tool (assuming this folder is shared among all the nodes). By default, it exercises 4 MPI processes on each compute node, leaving the Intel(R) MPI Library to select the device to be used.
The test module uses the I_MPI_FABRICS style, introduced in version 4.0.

13.3 CONFIGURATION

13.3.1 device

The device tag specifies a string to denote which Intel(R) MPI Library device to use. The tag may be defined more than once. Any extra options can be provided by using an "options" XML attribute. The options should be provided in order, as required by the library. If no device is specified, the Intel(R) MPI Library will use the most appropriate fabric for communication between processes.
Allowed values: fabric | intra-node fabric : inter-node fabric
where

• fabric is one of {shm, dapl, tcp, tmi, ofa}

• intra-node fabric is one of {shm, dapl, tcp, tmi, ofa}

• inter-node fabric is one of {dapl, tcp, tmi, ofa}

Default value: none

13.3.2 mpi-path

The mpi-path tag specifies the base path to the Intel(R) MPI Library installation directory. Setting this tag automatically sets up the environment.

13.3.3 process-number

The process-number tag specifies the number of MPI processes to start on each node.
Allowed values: Integer value.
Default value: 4

13.4 EXAMPLE

<mpi_internode>
  <device>shm:tcp</device>
  <device options="-genv I_MPI_DEBUG 5">shm:dapl</device>
  <mpi-path>/opt/intel/impi/4.0.3</mpi-path>
  <process-number>2</process-number>
</mpi_internode>

13.5 MODULE INFORMATION

• Level: 2


13.6 DEPENDENCIES

• Commands: mktemp, mpirun

• Test Modules: mount, remote_login, tools, mpi_local

13.7 NOTES

This test module does not build the MPI Hello World binary using the MPI Library compiler wrappers (i.e., mpicc).

14 mpi_local

Check the functionality of the Intel(R) MPI Library Runtime Environment

14.1 DESCRIPTION

The mpi_local module checks the consistency of the MPI job startup commands among all the nodes and the basic functionality of the Intel(R) MPI Library Runtime Environment on each node by running a single-node MPI Hello World program on one or more Intel(R) MPI Library devices.

14.2 METHOD

14.2.1 Compliance mode:

If running in compliance mode, the test module checks that the paths to mpirun and mpiexec are consistent on all nodes by using the which command. If no paths are found, the test fails.
NOTE: The module resolves the paths to mpirun and mpiexec with the which command. It does not use <mpi-path>.

14.2.2 Execution modes other than compliance:

If running in a mode other than compliance, the module additionally checks the functionality of the MPI runtime environment, exercising (by default) 4 MPI processes over different network devices, using the shm and tcp I_MPI_FABRICS. Furthermore, if the /etc/dat.conf file or the DAT_OVERRIDE variable is present, it also locally exercises the DAPL fabric device.
The test module uses the I_MPI_FABRICS style introduced in version 4.0.

14.3 CONFIGURATION

14.3.1 device

The device tag specifies which Intel(R) MPI Library device to use. It may be defined more than once. Any extra options can be provided by using an "options" XML attribute. The options should be provided in order, as required by the library.
Allowed values: fabric | intra-node fabric : inter-node fabric
where

• fabric is one of {shm, dapl, tcp, tmi, ofa}

• intra-node fabric is one of {shm, dapl, tcp, tmi, ofa}

• inter-node fabric is one of {dapl, tcp, tmi, ofa}

Default value: none

14.3.2 mpi-path

The mpi-path tag specifies the base path to the Intel(R) MPI Library installation directory. Setting this tag automatically sets up the environment.
NOTE: This is not used to perform the uniformity checks.


14.3.3 process-number

The process-number tag specifies the number of MPI processes to start on each node.
Allowed values: Number greater than zero.
Default value: 4

14.4 EXAMPLE

<mpi_local>
  <device>shm:tcp</device>
  <device options="-genv I_MPI_DEBUG 5">shm:dapl</device>
  <mpi-path>/opt/intel/impi/4.0.3</mpi-path>
  <process-number>2</process-number>
</mpi_local>

14.5 MODULE INFORMATION

• Level: 2

14.6 DEPENDENCIES

14.6.1 Compliance mode:

• Commands: mpirun, mpiexec

• Test Modules: remote_login

14.6.2 Other modes:

• Commands: mktemp, stat, rm, mpirun, mpiexec

• Test Modules: mount, remote_login

14.7 NOTES

This test module does not build the MPI Hello World binary using the MPI Library compiler wrappers (i.e., mpicc).

15 packages

Check software packages presence and uniformity among nodes

15.1 DESCRIPTION

The packages module checks the uniformity and correctness of the installed packages. It can use a reference list of packages to check for correctness. If no reference list file is configured, the test module checks package uniformity among nodes.

15.2 METHOD

If a reference list of packages is configured, the test module will check that these packages are installed on the nodes; however, if no reference list is configured, it checks uniformity of installed packages among the nodes.
The reference list should contain one package per line, and lines starting with '#' will be interpreted as comments.
The test module obtains the list of installed packages on each node using the following commands:

• rpm --query --all # for Red Hat based systems

• dpkg -l | awk '{ print $2,$3 }' # for Debian based systems


• pacman -Q # for Arch based systems

• ls /var/log/packages/ # for Slackware based systems

• ls -d /var/db/pkg/*//* # for Gentoo based systems
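The reference list format described above (one package per line, '#' for comments) is easy to process. The following Python sketch is illustrative only: both function names are hypothetical, and skipping blank lines is an assumption not stated in the text.

```python
def parse_reference_list(text):
    """Return the package names from a reference list file's contents."""
    packages = []
    for line in text.splitlines():
        line = line.strip()
        # Lines starting with '#' are comments; blank lines are skipped
        # (the latter is an assumption).
        if not line or line.startswith("#"):
            continue
        packages.append(line)
    return packages


def missing_packages(reference, installed):
    """Return reference packages absent from a node's installed list."""
    installed_set = set(installed)
    return [name for name in reference if name not in installed_set]
```

The installed list would come from whichever of the commands above applies to the distribution, with one package per line of output.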

15.3 CONFIGURATION

15.3.1 node

The node tag specifies the path to the file with the list of installed packages to be checked on compute nodes, including head nodes acting as compute nodes.
Default value: none

15.3.2 head

The head tag specifies the path to the file containing the list of installed packages to be checked on the head node. This is not intended to be used for head nodes acting as compute nodes.
Default value: none

15.3.3 exclude

The exclude tag specifies the name of a package to be excluded from the comparison. The string is interpreted as a POSIX matching regular expression. It may be specified multiple times to exclude more than one package.
Note that to exactly match meta characters

^[.*(${\()+|?<>

they should be escaped.
Example: to exclude libstdc++-4.4.6-3.el6.i686:

<exclude>libstdc\+\+-4\.4\.6-3\.el6\.i686</exclude>

Default value: none
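Escaping a package name by hand is error-prone, so it can help to generate the pattern. The sketch below uses Python's re.escape for this; the function name exclude_pattern is hypothetical. Note the assumption that Python's regular expression syntax is close enough to POSIX here, and that re.escape may escape more characters (such as '-') than the hand-written example does; the result matches the same literal name.

```python
import re


def exclude_pattern(package_name):
    """Return a regular expression that matches the name literally."""
    return re.escape(package_name)


NAME = "libstdc++-4.4.6-3.el6.i686"
PATTERN = exclude_pattern(NAME)
```

The generated PATTERN matches NAME exactly and rejects near misses in which a '.' position holds a different character.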

15.3.4 include

The include tag explicitly verifies that the specified package is installed. It may be specified multiple times to include more than one package.
Default value: none

15.3.5 command

The command tag specifies a custom command that lists the packages installed in the system. The output of the command has to list one package per line.
If none of the default commands can list the installed packages, a custom command should be used.
Default value: none

15.4 EXAMPLE

<packages>
  <command>pkg_info</command>
  <exclude>rpm-4.3.3-9nonptl</exclude>
  <exclude>xterm-*</exclude>
  <head>head.list</head>
  <include>mpich-ch_p4-gcc-oscar-module-1.2.7-4</include>
  <node>node.list</node>
</packages>

15.5 MODULE INFORMATION

• Level: 3

• Scope: head, compute


15.6 DEPENDENCIES

• Commands: sed, rpm (Red Hat-based systems), dpkg (Debian-based systems), pacman (Arch-based systems), ls (for Gentoo and Slackware based systems).

• Test modules: remote_login

16 ping

Check that all nodes respond to ping from the head node.

16.1 DESCRIPTION

The ping module checks that all nodes respond to ping from the head node.
If running in compliance or certification mode, it checks that the number of nodes complies with the specification.
A cluster is defined as containing one or more head/login nodes, multiple compute nodes, potentially grouped logically into sub-clusters. Each node may provide multiple capabilities. A certified cluster shall contain at least four compute nodes per sub-cluster.
Minimum hardware requirements ensure functional clusters are built when following the specification.

16.2 METHOD

The ping command is used to gather status about all other nodes from the head node.

16.3 CONFIGURATION

16.3.1 time

The time tag specifies the maximum time allowed for a ping response in milliseconds.
Default value: 100

16.4 EXAMPLE

<ping>
  <time>50</time>
</ping>

16.5 MODULE INFORMATION

• Level: 1

16.6 DEPENDENCIES

• Commands: ping

17 process

Check for stale processes

17.1 DESCRIPTION

The process module checks that the process list does not contain runaway processes (in terms of CPU or memory usage), zombies, or other stale processes.

17.2 METHOD

On each compute node, the test module uses information about the currently running processes, as returned by the ps command.


17.3 CONFIGURATION

17.3.1 elapsed_time

The elapsed_time tag specifies the time (in seconds) that is used to define a stale process. See also exempt_uids.
Allowed values: Integers greater than zero.
Default value: 3600

17.3.2 exclude

The exclude tag specifies process names that are excluded from the check. The string is interpreted as a POSIX regular expression. This option may be repeated to exclude more than one process name. Note that to exactly match meta characters

^[.*(${\()+|?<>

they should be escaped.
Default value: monitoring processes and filesystem daemons

17.3.3 exempt_uids

The exempt_uids tag specifies that UIDs lower than this value are exempt from the elapsed time check. Daemons, etc., started from system accounts should not be flagged as stale regardless of how long they have been running. See also elapsed_time.
Allowed values: Integers greater than or equal to zero.
Default value: 500

17.3.4 percent_cpu

The percent_cpu tag specifies the percentage of CPU that is used to define a runaway process. Note: on some systems, the CPU percentage is defined relative to a single core; on others it is relative to all cores.
Allowed values: [1-100]
Default value: 5

17.3.5 percent_memory

The percent_memory tag specifies the percentage of memory that is used to define a runaway process.
Allowed values: [1-100]
Default value: 1

17.3.6 zombie_allowed_elapsed_time

The zombie_allowed_elapsed_time tag specifies the time (in seconds) that is used to allow transient zombies. Intel(R) Cluster Checker and other applications may create transient zombies that are quickly, but not instantly, reaped. These transient zombies are not flagged as "true" zombie processes unless their elapsed time is greater than this value.
Allowed values: Greater than zero.
Default value: 1
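To show how these thresholds might interact, here is a hedged Python sketch that classifies a single process record. Everything beyond the default values is an assumption: the function name classify, the record layout, the use of strict "greater than" comparisons, and the rule that a zombie is only ever reported as a zombie. The exclude tag is not modeled.

```python
# Default values quoted from the tag descriptions above.
DEFAULTS = {
    "elapsed_time": 3600,
    "exempt_uids": 500,
    "percent_cpu": 5,
    "percent_memory": 1,
    "zombie_allowed_elapsed_time": 1,
}


def classify(proc, cfg=DEFAULTS):
    """Return the flags raised for one process record.

    proc -- dict with keys uid, elapsed (seconds), cpu (%), mem (%), state
    """
    flags = []
    if proc["state"].startswith("Z"):
        # Transient zombies younger than the allowance are ignored.
        if proc["elapsed"] > cfg["zombie_allowed_elapsed_time"]:
            flags.append("zombie")
        return flags
    if proc["cpu"] > cfg["percent_cpu"] or proc["mem"] > cfg["percent_memory"]:
        flags.append("runaway")
    # System accounts (uid below exempt_uids) are never flagged as stale.
    if proc["uid"] >= cfg["exempt_uids"] and proc["elapsed"] > cfg["elapsed_time"]:
        flags.append("stale")
    return flags
```

Under these assumptions, a long-running daemon owned by uid 0 raises no flags, whereas the same process owned by uid 1000 is reported as stale.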

17.4 EXAMPLE

<process>
  <elapsed_time>3600</elapsed_time>
  <exclude>ntpd</exclude>
  <exclude>portmap</exclude>
  <exempt_uids>500</exempt_uids>
  <percent_cpu>5</percent_cpu>
  <percent_memory>1</percent_memory>
  <zombie_allowed_elapsed_time>1</zombie_allowed_elapsed_time>
</process>

17.5 MODULE INFORMATION

• Level: 2


17.6 DEPENDENCIES

• Commands: ps

• Test modules: remote_login

18 remote_login

Check remote connectivity

18.1 DESCRIPTION

The remote_login module checks remote connectivity to all nodes, including remote command version uniformity and proper execution time.

18.2 METHOD

The command configured with the cmd tag is used to run simple commands on all nodes.

18.3 CONFIGURATION

18.3.1 cmd

The cmd tag specifies the remote execution command to be used.
Default value: ssh

18.3.2 time

The time tag specifies the maximum time allowed for <cmd> to respond, in milliseconds.
Default value: 100

18.3.3 version

The version tag specifies the output of '<cmd> -V' that should be received. (Note that if a custom <cmd> is configured, it should support the '-V' switch and print its version.)
NOTE: The version is ignored on knc-compute nodes.
Default value: The output of a random node will be used as reference

18.4 EXAMPLE

<remote_login>
  <cmd>ssh</cmd>
  <time>1</time>
  <version>OpenSSH_5.3p1, OpenSSL 1.0.0-fips 29 Mar 2010</version>
</remote_login>

18.5 MODULE INFORMATION

• Level: 1

18.6 DEPENDENCIES

• Commands: ping

19 shells

Check that all nodes have the required shells


19.1 DESCRIPTION

The shells module checks that all nodes have the required shells.

19.2 METHOD

For all shells, the module checks that the interpreter is present in /bin and is able to run a "Hello World" script. The shells tested are: sh, bash, csh, ksh and tcsh.
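A check of this kind can be sketched in Python as shown below. This is not the module's implementation (which depends on chmod, echo, mktemp and test); the function name check_shell is hypothetical, and the script is passed to the interpreter as an argument rather than made executable.

```python
import os
import subprocess
import tempfile

SHELLS = ("sh", "bash", "csh", "ksh", "tcsh")


def check_shell(name, bindir="/bin"):
    """True if the interpreter exists in bindir and runs a Hello World script."""
    interpreter = os.path.join(bindir, name)
    if not os.access(interpreter, os.X_OK):
        return False
    # Write a one-line script to a temporary file.
    with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as script:
        script.write('echo "Hello World"\n')
    try:
        result = subprocess.run(
            [interpreter, script.name],
            capture_output=True,
            text=True,
            timeout=10,
        )
        return result.returncode == 0 and result.stdout.strip() == "Hello World"
    finally:
        os.unlink(script.name)
```

Running {name: check_shell(name) for name in SHELLS} on a node gives a quick per-shell summary.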

19.3 CONFIGURATION

19.3.1 none

19.4 MODULE INFORMATION

• Level: 1

19.5 DEPENDENCIES

• Commands: chmod, echo, mktemp, test

• Test Modules: remote_login

20 storage

Check Intel(R) Cluster Ready specification compliance

20.1 DESCRIPTION

The storage module checks that the storage capacity available to the head node meets the requirements in the Intel(R) Cluster Ready Specification. This module can check against Intel(R) Cluster Ready Specification versions 1.2 or 1.3.
Compliance with minimum hardware requirements ensures that functional clusters are built when following the specification.

20.2 METHOD

The disk space information is retrieved using the df command. Shared memory partitions are not considered. If no head node is detected, the first compute node is used.
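As a rough illustration of this method, the sketch below sums the sizes reported by df while skipping shared memory partitions. It is a guess at the general approach, not the module's code: the function names are hypothetical, the set of filesystem names treated as shared memory is an assumption, and whether the module uses total or available space is not stated in this guide (the sketch uses total size).

```python
import subprocess

# Assumption: filesystems with these names are shared memory partitions.
SHARED_MEMORY_FS = {"tmpfs", "devtmpfs", "shm"}


def read_df():
    """Return POSIX-format df output with sizes in 1024-byte blocks."""
    return subprocess.run(["df", "-P", "-k"],
                          capture_output=True, text=True).stdout


def total_disk_kib(df_output):
    """Sum the size column of df output, ignoring shared memory partitions."""
    total = 0
    for line in df_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 6 or not fields[1].isdigit():
            continue
        filesystem, blocks, mount = fields[0], fields[1], fields[5]
        if filesystem in SHARED_MEMORY_FS or mount.startswith("/dev/shm"):
            continue
        total += int(blocks)
    return total


SAMPLE = """\
Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/sda1 51475068 10240000 38597068 21% /
tmpfs 8170000 0 8170000 0% /dev/shm
/dev/sdb1 103081248 2000000 95838096 3% /data
"""

if __name__ == "__main__":
    print(total_disk_kib(SAMPLE), "KiB in the sample output")
```

The resulting total would then be compared against the minimum required by the selected specification version; that minimum is not given in this guide, so it is left out of the sketch.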

20.3 CONFIGURATION

The version of the Intel(R) Cluster Ready Specification to be checked can be set using the compliance and certification options. See the User's Guide for more information about specifying the Intel(R) Cluster Ready Specification version.

20.4 MODULE INFORMATION

• Level: 1

20.5 DEPENDENCIES

• Commands: df

• Test Modules: remote login

21 stream

Check the memory bandwidth of a node using the STREAM benchmark

21.1 DESCRIPTION

The stream module checks the memory bandwidth of each compute node using the STREAM Triad benchmark, and the deviation of that bandwidth among the cluster nodes. Deviation is checked only if there are three or more valid results from the compute nodes.

21.2 METHOD

By default, STREAM runs from a pre-compiled binary configured to use a 30 million element array, which requires nearly 687 MB of memory.
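The 687 MB figure is consistent with the usual STREAM layout of three arrays of 8-byte double-precision values, with MB read as 2^20 bytes. The guide does not state this derivation, so the short calculation below is offered as a plausible reading rather than a documented fact; the function name is hypothetical.

```python
ELEMENTS = 30_000_000   # array length stated in the METHOD section
BYTES_PER_ELEMENT = 8   # assumption: double-precision values
ARRAYS = 3              # assumption: STREAM's three arrays (a, b, c)


def stream_memory_mib(elements=ELEMENTS):
    """Approximate memory needed by the STREAM arrays, in units of 2**20 bytes."""
    return elements * BYTES_PER_ELEMENT * ARRAYS / 2**20


if __name__ == "__main__":
    print("%.1f" % stream_memory_mib())
```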

21.3 CONFIGURATION

21.3.1 bandwidth

The bandwidth tag specifies the minimum acceptable Triad memory bandwidth, in MB/s.
NOTE: If not configured, no comparison is made and the measured bandwidth is displayed.
Default value: none

21.3.2 deviation

The deviation tag specifies the number of standard deviations from the median that a result may differ by before it is treated as an outlier. The allowed range is (median +/- deviation x stddev).
Default value: 3
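The bandwidth and deviation settings can be combined into a check like the one sketched below. This is an illustration under assumptions: find_outliers is a hypothetical name, and the use of the population standard deviation is a choice made for the sketch, since the guide does not say which estimator the module uses.

```python
import statistics


def find_outliers(bandwidths, deviation=3.0, minimum=None):
    """Flag nodes whose Triad bandwidth fails the configured checks.

    bandwidths maps node name -> measured bandwidth in MB/s.
    Returns a dict of node name -> reason.
    """
    failures = {}
    if minimum is not None:
        for node, value in bandwidths.items():
            if value < minimum:
                failures[node] = "below minimum"
    # Deviation is only checked with three or more valid results.
    if len(bandwidths) >= 3:
        values = list(bandwidths.values())
        median = statistics.median(values)
        stddev = statistics.pstdev(values)  # assumption: population stddev
        low = median - deviation * stddev
        high = median + deviation * stddev
        for node, value in bandwidths.items():
            if not low <= value <= high:
                failures.setdefault(node, "outlier")
    return failures


if __name__ == "__main__":
    sample = {"n1": 10000, "n2": 10100, "n3": 9900, "n4": 10050, "n5": 5000}
    print(find_outliers(sample, deviation=1))
```

Note that on small clusters a single slow node inflates the standard deviation considerably, so with the default of 3 it may not be flagged; setting the bandwidth tag gives an absolute floor that does not depend on the other nodes.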

21.4 EXAMPLE

<stream>
  <bandwidth>1000</bandwidth>
  <deviation>3</deviation>
</stream>

21.5 MODULE INFORMATION

• Level: 3

21.6 DEPENDENCIES

• Test Modules: remote login

22 tools

Check that all nodes have the required tools

22.1 DESCRIPTION

The tools module checks that all nodes have the required tools and functionality.

22.2 METHOD

For each tool, the module checks that the interpreter is present in /usr/bin, that the versions are uniform, and that the tool runs a "Hello World" one-liner. The tools tested are: perl, python and tclsh.
NOTE: If the versions are not explicitly set in the configuration file and the run was configured to check compliance, the versions are compared against the Intel(R) Cluster Ready Specification.

22.3 CONFIGURATION

The version of the Intel(R) Cluster Ready Specification to be checked can be set using the compliance and certification options. See the User's Guide for more information about specifying the Intel(R) Cluster Ready Specification version.

22.3.1 python-path

The python-path tag specifies the path where the python interpreter is located.
Default value: /usr/bin

22.3.2 python-version

The python-version tag specifies the version of python expected.
Default values (in compliance mode):

• (Intel(R) Cluster Ready 1.2) 2.3.4

• (Intel(R) Cluster Ready 1.3) 2.4.3

22.3.3 perl-path

The perl-path tag specifies the path where the perl interpreter is located.
Default value: /usr/bin

22.3.4 perl-version

The perl-version tag specifies the version of perl expected.
Default values (in compliance mode):

• (Intel(R) Cluster Ready 1.2) 5.6.1

• (Intel(R) Cluster Ready 1.3) 5.8.8

22.3.5 tclsh-path

The tclsh-path tag specifies the path where the tclsh interpreter is located.
Default value: /usr/bin

22.3.6 tclsh-version

The tclsh-version tag specifies the version of tclsh expected.
Default values (in compliance mode):

• (Intel(R) Cluster Ready 1.2) 8.4.7

• (Intel(R) Cluster Ready 1.3) 8.4.13
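The way the expected version is chosen for each tool can be summarized in a small lookup. The sketch below is illustrative only: the function names are hypothetical, the table simply restates the compliance-mode defaults listed above, and comparing versions by exact string equality is an assumption, since the guide does not say whether newer versions are also accepted.

```python
# Compliance-mode defaults as listed in this section, keyed by
# Intel(R) Cluster Ready Specification version.
COMPLIANCE_DEFAULTS = {
    "1.2": {"python": "2.3.4", "perl": "5.6.1", "tclsh": "8.4.7"},
    "1.3": {"python": "2.4.3", "perl": "5.8.8", "tclsh": "8.4.13"},
}


def expected_version(tool, configured=None, spec=None):
    """Pick the version to compare against, or None if nothing applies.

    An explicitly configured version wins; otherwise the compliance-mode
    default for the given specification version is used.
    """
    if configured is not None:
        return configured
    if spec is not None:
        return COMPLIANCE_DEFAULTS[spec][tool]
    return None


def mismatched_nodes(versions, expected):
    """Return the nodes whose reported version differs from expected.

    versions maps node name -> version string.
    Assumption: versions are compared by exact string equality.
    """
    return sorted(node for node, found in versions.items() if found != expected)


if __name__ == "__main__":
    want = expected_version("perl", spec="1.3")
    print(want, mismatched_nodes({"n1": "5.8.8", "n2": "5.10.1"}, want))
```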

22.4 EXAMPLE

<tools>
  <perl-path>/usr/bin</perl-path>
  <perl-version>5.8.8</perl-version>
  <python-path>/usr/bin</python-path>
  <python-version>2.4.3</python-version>
  <tclsh-path>/usr/bin</tclsh-path>
  <tclsh-version>8.4.13</tclsh-version>
</tools>

22.5 MODULE INFORMATION

• Level: 2

22.6 DEPENDENCIES

• Commands: echo

• Test Modules: remote login

Optimization Notice

Intel compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804
