GRID (European Scientific Computing Infrastructure)

Upload: wilma

Post on 16-Jan-2016


TRANSCRIPT

Page 1: GRID (European Scientific Computing infrastructure)

GRID (European Scientific Computing Infrastructure)

Page 2: GRID (European Scientific Computing infrastructure)

CERN

CERN currently comprises 20 European Member States: Austria, Belgium, Bulgaria, the Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary, Italy, the Netherlands, Norway, Poland, Portugal, the Slovak Republic, Spain, Sweden, Switzerland and the United Kingdom.

Observer and Non-member States: India, Israel, Japan, the Russian Federation, the United States of America, Turkey, Algeria, Argentina, Armenia, Australia, Azerbaijan, Belarus, Brazil, Canada, China, Croatia, Cyprus, Estonia, Georgia, Iceland, India, Iran, Ireland, Mexico, Morocco, Pakistan, Peru, Romania, Serbia, Slovenia, South Africa, South Korea, Taiwan and Ukraine.

Latvia, unlike Estonia, is not a CERN participating state; students from Latvia cannot do summer internships there...

Page 3: GRID (European Scientific Computing infrastructure)
Page 4: GRID (European Scientific Computing infrastructure)
Page 5: GRID (European Scientific Computing infrastructure)
Page 6: GRID (European Scientific Computing infrastructure)
Page 7: GRID (European Scientific Computing infrastructure)

CERN LHC

An image of one of the first lead-ion collisions seen by the ALICE experiment on 7 November, 2010.

Page 8: GRID (European Scientific Computing infrastructure)
Page 9: GRID (European Scientific Computing infrastructure)

Data visualisation of a particle collision inside the Large Hadron Collider, recorded by the ALICE experiment.

Page 10: GRID (European Scientific Computing infrastructure)

LHC ATLAS: search for the Higgs boson

Greatest Mysteries: What Causes Gravity? E = mc^2; v = g * t

"Much of today's research in elementary particle physics focuses on the search for a particle called the Higgs boson. This particle is the one missing piece of our present understanding of the laws of nature, known as the Standard Model. This model describes three types of forces: electromagnetic interactions; strong interactions, which bind atomic nuclei; and the weak nuclear force, which governs beta decay. (The Standard Model does not describe the fourth force, gravity.)"

"Over the next 15 years, we should begin to find a real understanding of the origin of mass."

"Newton thought that gravity's force was instantaneous. Einstein assumed that it moved at the speed of light, but until now, no one had measured it."

General relativity (GR) suggests that gravitation (unlike electromagnetic forces) is a pure geometric effect of curved space-time, not a force of nature that propagates.

Most scientists assume that gravity travels at the speed of light and that gravitational waves and gravitational radiation exist (a single positive experiment in 2003).

I put my last dollar, the Higgs shall not be found, just as Gravity waves in the Einstein sense won't be found.

I agree with you. But all of the public tax money is on one side of the issue! The Higgs is more like a plumber with duct tape, holding the standard model together.

Page 11: GRID (European Scientific Computing infrastructure)
Page 12: GRID (European Scientific Computing infrastructure)

The ‘Standard Model’ of Particle Physics

Proposed by Abdus Salam, Glashow and Weinberg

Tested by experiments at CERN & elsewhere

Perfect agreement between theory and experiments in all laboratories

Page 13: GRID (European Scientific Computing infrastructure)

Open Questions beyond the Standard Model

What is the origin of particle masses? Due to a Higgs boson?

Why so many flavours of matter particles?

What is the dark matter in the Universe?

Unification of fundamental forces?

Quantum theory of gravity?

LHC

Page 14: GRID (European Scientific Computing infrastructure)

Why do Things Weigh?

Where do the masses come from?

Newton: Weight proportional to Mass

Einstein: Energy related to Mass

Neither explained origin of Mass

Are masses due to Higgs boson? (the physicists’ Holy Grail)

Page 15: GRID (European Scientific Computing infrastructure)
Page 16: GRID (European Scientific Computing infrastructure)

Has the Higgs Boson been Discovered?

Interesting hints around Mh = 125 GeV?

CMS sees broad enhancement

ATLAS prefers 125 GeV

Page 17: GRID (European Scientific Computing infrastructure)

CERN, 100 m underground, ATLAS, in 2007

Page 18: GRID (European Scientific Computing infrastructure)
Page 19: GRID (European Scientific Computing infrastructure)
Page 20: GRID (European Scientific Computing infrastructure)
Page 21: GRID (European Scientific Computing infrastructure)

A global, federated e-Infrastructure

EGEE infrastructure
• ~ 200 sites in 39 countries
• ~ 20 000 CPUs
• > 5 PB storage
• > 10 000 concurrent jobs per day
• > 60 Virtual Organisations

EUIndiaGrid

EUMedGrid

SEE-GRID

EELA

BalticGrid

EUChinaGrid

OSG

NAREGI

Page 22: GRID (European Scientific Computing infrastructure)

Scale of EGEE Production Service

98k jobs/day

Page 23: GRID (European Scientific Computing infrastructure)
Page 24: GRID (European Scientific Computing infrastructure)
Page 25: GRID (European Scientific Computing infrastructure)

Obtaining a Grid certificate

Obtaining an approved BalticGrid certificate is the first step towards using the Grid.

Information: http://grid.lumii.lv/section/show/12
Domain of the Institution (domain.zz): lumii.lv
Common Name (John Smith): Janis Berzins

Page 26: GRID (European Scientific Computing infrastructure)

Certification Procedure

Page 27: GRID (European Scientific Computing infrastructure)

Creating a Certification Request

Page 28: GRID (European Scientific Computing infrastructure)

BalticGridCA-user.cnf

#
# OpenSSL configuration file for generating certificate requests for Baltic Grid CA.
#

# This definition stops the following lines choking if HOME isn't
# defined.
HOME = .
###RANDFILE = $ENV::HOME/.rnd

[ req ]
default_bits = 1024
default_keyfile = userkey.pem
default_md = sha1 # which md to use.
distinguished_name = req_distinguished_name
string_mask = nombstr

[ req_distinguished_name ]
0.domainComponent = Domain Component (org)
0.domainComponent_default = org
1.domainComponent = Domain Component (BalticGrid)
1.domainComponent_default = balticgrid
organizationalUnitName = Domain of the Institution (domain.zz)
commonName = Common Name (John Smith)
commonName_max = 64

Page 29: GRID (European Scientific Computing infrastructure)

Result

-----BEGIN RSA PRIVATE KEY-----
Proc-Type: 4,ENCRYPTED
DEK-Info: DES-EDE3-CBC,C280CE744C634255

BrT3IotvrbcpTVeqKssGQnpx2dcnqqGIRb0Jt8pJEUjTX24IsdAg+LxOUEJ70y1aaXMgQmFyemRpbnMwgZ8wDQYJKoZIhvcNAQEBBQADgY0AMIGJAoGBANepPbidunic4dq8iKj1eEDlicCZ51cKX43Hn17Ca+IKvS7cTBavbFicm6mkfNoCO+erZWL3nlrhGXuhUyCHZJctA9Fu37II3ik7SZe6LahCKu55ZrCP9bEXucvQ7giI2FUcgvjEcK/I9+NnO+chkJwCTafa32SxZsG7MOnwv14XAgMBAAGgADANBgkqhkiG9w0BAQUFAAOBgQC8oV1AQv1jj2D3gb0aBUwA1CaVqJN+bq2wwmeQSP1+rJXicSlfpIEqI8TwoT6FvEt2EnPAtbXpWMjFtbuM816+tEdkrGLw0wfHdlTCwswcRtHn3QVl4jxA/wReb+CYGXuhUyCHZJctA9Fu37II3ik7SZe6LahCKu55ZrCP9bEXucvQ7giI2FUcgvjEcK/I9+NnO+chkJwCTafa32SxZsG7MOnwv14XAgMBAAGgADANBgkqhkiG9w0BAQUFAAOBgQC8oV1AQv1jj2D3gb0aBUwA1CaVqJN+bq2wwmeQSP1+rJXicSlfpIEqI8TwoT6FvEt2EnPAtbXpWMjFtbuM816+tEdkrGLw0wfHdlTCwswcRtHn3QVl4jxA/wReb+CYl/OAjuw1hvqYG6ZY6n5zmxZsCnViLMIItW2NMJGBR43CrtJuUHly13hf3eTZiIZqGVjHrRPzj8GC6AOBzQ9KkG/Gcale4ALU1czmSIjwAABL1DNUc8nF/w==-----END RSA PRIVATE KEY-----

-----BEGIN CERTIFICATE REQUEST-----MIIBnjCCAQcCAQAwXjETMBEGCgmSJomT8ixkARkWA29yZzEaMBgGCgmSJomT8ixkARkWCmJhbHRpY2dyaWQxETAPBgNVBAsTCGx1bWlpLmx2MRgwFgYDVQQDEw9HdW50aXMgQmFyemRpbnMwgZ8wDQYJKoZIhvcNAQEBBQADgY0AMIGJAoGBANepPbidunic4dq8iKj1eEDlicCZ51cKX43Hn17Ca+IKvS7cTBavbFicm6mkfNoCO+erZWL3nlrhGXuhUyCHZJctA9Fu37II3ik7SZe6LahCKu55ZrCP9bEXucvQ7giI2FUcgvjEcK/I9+NnO+chkJwCTafa32SxZsG7MOnwv14XAgMBAAGgADANBgkqhkiG9w0BAQUFAAOBgQC8oV1AQv1jj2D3gb0aBUwA1CaVqJN+bq2wwmeQSP1+rJXicSlfpIEqI8TwoT6FvEt2EnPAtbXpWMjFtbuM816+tEdkrGLw0wfHdlTCwswcRtHn3QVl4jxA/wReb+CYCSSIx0n3iP6KFP7PMzqLMiGm4jbUVoDiA6ZfKq1HAqPHig==-----END CERTIFICATE REQUEST-----

Page 30: GRID (European Scientific Computing infrastructure)
Page 31: GRID (European Scientific Computing infrastructure)

Certificate

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 13 (0xd)
        Signature Algorithm: sha1WithRSAEncryption
        Issuer: O=BalticGrid, CN=Baltic Grid Certification Authority
        Validity
            Not Before: Mar 24 12:30:32 2005 GMT
            Not After : Mar 24 12:30:32 2006 GMT
        Subject: O=BalticGrid, OU=latnet.lv, CN=Guntis Barzdins
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
            RSA Public Key: (1024 bit)
                Modulus (1024 bit):
                    00:c1:54:28:7c:de:67:95:b0:7b:53:24:85:a1:c4:
                    dd:b3:b3:12:b4:06:c4:b0:13:93:c0:5b:ad:2a:ad:
                    0a:8a:6c:d7:f3:c1:65:d5:1a:3f:f2:e8:ed:da:37:
                    a0:52:e0:05:17:3f:ee:45:91:a8:07:8d:8f:7f:96:
                    aa:fc:7c:4f:27:c6:fc:82:b8:89:54:42:60:ea:18:
                    ff:fa:a4:1e:f7:00:22:66:b2:5b:bb:85:c9:a8:12:
                    87:f3:6f:96:c2:05:c8:a0:eb:9c:54:03:f1:05:c3:
                    f4:27:ab:6b:30:47:dd:4b:12:b8:21:d9:25:fe:e6:
                    68:70:23:ae:35:15:80:b5:e7
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Key Usage: critical
                Digital Signature, Non Repudiation, Key Encipherment, Data Encipherment
            X509v3 Subject Key Identifier:
                B3:0B:DD:96:09:86:37:1F:CF:5D:D5:78:5B:6D:AB:6F:D0:BC:5A:24
            X509v3 Authority Key Identifier:
                keyid:24:4E:75:31:6A:6C:DF:AA:4D:AD:C6:34:39:23:5F:18:DB:17:47:86
                DirName:/O=BalticGrid/CN=Baltic Grid Certification Authority
                serial:00

            X509v3 Certificate Policies:
                Policy: 1.3.6.1.4.1.19974.11.1.0.1

            X509v3 Issuer Alternative Name:
                URI:http://grid.eenet.ee/BalticGridCA/

Signature Algorithm: sha1WithRSAEncryption 67:e8:50:7d:28:84:d7:cb:88:de:4a:14:da:f4:09:16:05:38: 4a:55:23:11:b5:87:77:05:7d:07:d8:1c:03:45:19:6f:6f:97: ef:7d:1b:c8:7f:29:98:c5:d8:35:cf:2e:2e:b2:16:7e:19:8c: 3c:32:79:2d:ed:9a:7b:50:e3:26:df:79:59:84:8f:c6:34:d4: 3a:c1:65:5b:79:2e:6e:eb:62:50:2f:0a:47:00:08:54:ee:54: 6d:91:9f:ff:58:f0:b5:79:aa:68:12:e9:2c:15:9d:06:41:3b: 3f:29:4b:ba:be:e1:ef:e1:aa:7c:83:5b:be:3a:e1:16:5f:02: 65:70:c6:7d:15:7b:e0:43:3e:f9:c1:b3:96:80:fb:a0:aa:a8: 83:79:0e:0b:87:b7:09:b6:60:6d:64:2c:de:de:c3:1c:4c:cc: e5:54:4c:33:26:d9:31:35:29:30:df:8b:7b:e6:a8:31:6e:a4: 57:ef:51:53:6c:df:7b:f6:6d:8e:d0:ad:ba:72:87:17:47:aa: d4:fa:ff:4d:d0:cc:45:a5:28:e5:a3:46:84:cf:c4:4b:94:f8: ba:27:b5:35:e3:79:f8:49:3d:90:b0:41:5d:71:e5:15:6c:25: d3:61:73:31:c8:c5:3d:5e:a1:68:fe:82:9a:4a:0f:ea:5b:13: b4:6a:be:be-----BEGIN CERTIFICATE-----MIIDdTCCAl2gAwIBAgIBDTANBgkqhkiG9w0BAQUFADBDMRMwEQYDVQQKEwpCYWx0aWNHcmlkMSwwKgYDVQQDEyNCYWx0aWMgR3JpZCBDZXJ0aWZpY2F0aW9uIEF1dGhvcml0eTAeFw0wNTAzMjQxMjMwMzJaFw0wNjAzMjQxMjMwMzJaMEMxEzARBgNVBAoTyH8pmMXYNc8uLrIWfhmMPDJ5Le2ae1DjJt95WYSPxjTUOsFlW3kubutiUC8KRwAIVO5UbZGf/1jwtXmqaBLpLBWdBkE7PylLur7h7+GqfINbvjrhFl8CZXDGfRV74EM++cGzloD7oKqog3kOC4e3CbZgbWQs3t7DHEzM5VRMMybZMTUpMN+Le+aoMW6kV+9RU2zfe/ZtjtCtunKHF0eq1Pr/TdDMRaUo5aNGhM/ES5T4uie1NeN5+Ek9kLBBXXHlFWwl02FzMcjFPV6haP6CmkoP6lsTtGq+vg==-----END CERTIFICATE-----

Page 32: GRID (European Scientific Computing infrastructure)

Getting Started

http://grid.lumii.lv/section/show/12

1. Get a digital certificate

2. Join a Virtual Organisation (VO). For LHC, join LCG and choose a VO.

3. Get access to a local User Interface Machine (UI) and copy your files and certificate there

Authentication – who you are

https://voms.balticgrid.org:8443/voms/balticgrid/

Authorisation – what you are allowed to do

Page 33: GRID (European Scientific Computing infrastructure)

You will need this

ssh zars.latnet.lv
User interface (certificate holders will get accounts)

openssl pkcs12 -export -in usercert.pem -inkey userkey.pem -out certificate.p12
Creates a copy of the certificate in a format usable by MS Internet Explorer

http://winscp.net/download/winscp380.exe
For copying files between zars.latnet.lv and a PC

Page 34: GRID (European Scientific Computing infrastructure)
Page 35: GRID (European Scientific Computing infrastructure)
Page 36: GRID (European Scientific Computing infrastructure)

Job Preparation

Prepare a file of Job Description Language (JDL):

############# athena.jdl #################
Executable = "athena.sh";
StdOutput = "athena.out";
StdError = "athena.err";
InputSandbox = {"athena.sh", "MyJobOptions.py", "MyAlg.cxx", "MyAlg.h", "MyAlg_entries.cxx", "MyAlg_load.cxx", "login_requirements", "requirements", "Makefile"};
OutputSandbox = {"athena.out", "athena.err", "ntuple.root", "histo.root", "CLIDDBout.txt"};
Requirements = Member("VO-atlas-release-9.0.4", other.GlueHostApplicationSoftwareRunTimeEnvironment);
################################################

Slide callouts:
• Script to run
• Input files
• My C++ Code
• Job Options
• Output Files
• Choose ATLAS Version (Satisfied by ~32 Sites)

Page 37: GRID (European Scientific Computing infrastructure)

Job Submission

Make a copy of your certificate to send out (~ once a day):

[lloyd@lcgui ~/atlas]$ grid-proxy-init
Your identity: /C=UK/O=eScience/OU=QueenMaryLondon/L=Physics/CN=steve lloyd
Enter GRID pass phrase for this identity:
Creating proxy .............................. Done
Your proxy is valid until: Thu Mar 17 03:25:06 2005
[lloyd@lcgui ~/atlas]$

Submit the Job:

[lloyd@lcgui ~/atlas]$ edg-job-submit --vo atlas -o jobIDfile athena.jdl
Selected Virtual Organisation name (from --vo option): atlas
Connecting to host lxn1188.cern.ch, port 7772
Logging to host lxn1188.cern.ch, port 9002
================================ edg-job-submit Success ====================================
 The job has been successfully submitted to the Network Server.
 Use edg-job-status command to check job current status. Your job identifier (edg_jobId) is:

 - https://lxn1188.cern.ch:9000/0uDjtwbBbj8DTRetxYxoqQ

 The edg_jobId has been saved in the following file:
 /home/lloyd/atlas/jobIDfile
============================================================================================
[lloyd@lcgui ~/atlas]$

Slide callouts on the command line: VO (--vo atlas), file to hold job IDs (-o jobIDfile), JDL (athena.jdl)

Page 38: GRID (European Scientific Computing infrastructure)

Job Status

Find out its status:

[lloyd@lcgui ~/atlas]$ edg-job-status -i jobIDfile
------------------------------------------------------------------
1 : https://lxn1188.cern.ch:9000/tKlZHxqEhuroJUhuhEBtSA
2 : https://lxn1188.cern.ch:9000/IJhkSObaAN5XDKBHPQLQyA
3 : https://lxn1188.cern.ch:9000/BMEOq90zqALvkriHdVeN7A
4 : https://lxn1188.cern.ch:9000/l6wist7SMq6jVePwQjHofg
5 : https://lxn1188.cern.ch:9000/wHl9Yl_puz9hZDMe1OYRyQ
6 : https://lxn1188.cern.ch:9000/PciXGNuAu7vZfcuWiGS3zQ
7 : https://lxn1188.cern.ch:9000/0uDjtwbBbj8DTRetxYxoqQ
a : all
q : quit
------------------------------------------------------------------
Choose one or more edg_jobId(s) in the list - [1-7]all:7
*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://lxn1188.cern.ch:9000/0uDjtwbBbj8DTRetxYxoqQ
Current Status: Done (Success)
Exit code: 0
Status Reason: Job terminated successfully
Destination: lcg00125.grid.sinica.edu.tw:2119/jobmanager-lcgpbs-short
reached on: Wed Mar 16 17:45:41 2005
*************************************************************
[lloyd@lcgui ~/atlas]$

Ran at: Taiwan

Site labels on the slide: RAL, Valencia, CERN, Taiwan

Page 39: GRID (European Scientific Computing infrastructure)

Job Retrieval

Retrieve the Output:

[lloyd@lcgui ~/atlas]$ edg-job-get-output -dir . -i jobIDfile
Retrieving files from host: lxn1188.cern.ch ( for https://lxn1188.cern.ch:9000/0uDjtwbBbj8DTRetxYxoqQ )
*********************************************************************************
 JOB GET OUTPUT OUTCOME
 Output sandbox files for the job:
 - https://lxn1188.cern.ch:9000/0uDjtwbBbj8DTRetxYxoqQ
 have been successfully retrieved and stored in the directory:
 /home/lloyd/atlas/lloyd_0uDjtwbBbj8DTRetxYxoqQ
*********************************************************************************

[lloyd@lcgui ~/atlas]$ ls -lt /home/lloyd/atlas/lloyd_0uDjtwbBbj8DTRetxYxoqQ
total 11024
-rw-r--r-- 1 lloyd hep 224 Mar 17 10:47 CLIDDBout.txt
-rw-r--r-- 1 lloyd hep 69536 Mar 17 10:47 ntuple.root
-rw-r--r-- 1 lloyd hep 5372 Mar 17 10:47 athena.err
-rw-r--r-- 1 lloyd hep 11185282 Mar 17 10:47 athena.out

Page 40: GRID (European Scientific Computing infrastructure)

A real, complete example

[guntisb@zars guntisb]$ tar -xvf tutor1.tar
[guntisb@zars guntisb]$ cd job1
[guntisb@zars job1]$ voms-proxy-init
Your identity: /DC=org/DC=balticgrid/OU=lumii.lv/CN=Guntis Barzdins
Enter GRID pass phrase:
Creating proxy ......................................................... Done
Your proxy is valid until Wed May 3 23:51:41 2006

[guntisb@zars job1]$ edg-job-submit --vo balticgrid job1.jdl

Selected Virtual Organisation name (from --vo option): balticgrid
Connecting to host grid3.mif.vu.lt, port 7772
Logging to host grid3.mif.vu.lt, port 9002

*********************************************************************************************
 JOB SUBMIT OUTCOME
 The job has been successfully submitted to the Network Server.
 Use edg-job-status command to check job current status. Your job identifier (edg_jobId) is:

 - https://grid3.mif.vu.lt:9000/NdAqueQcc5aARqLN0vwWng

*********************************************************************************************

[guntisb@zars job1]$ edg-job-status https://grid3.mif.vu.lt:9000/NdAqueQcc5aARqLN0vwWng

*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://grid3.mif.vu.lt:9000/NdAqueQcc5aARqLN0vwWng
Current Status: Ready
Status Reason: unavailable
Destination: grid2.mif.vu.lt:2119/jobmanager-lcgpbs-balticgrid
reached on: Wed May 3 13:16:56 2006
*************************************************************

[guntisb@zars job1]$ edg-job-status https://grid3.mif.vu.lt:9000/NdAqueQcc5aARqLN0vwWng

*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://grid3.mif.vu.lt:9000/NdAqueQcc5aARqLN0vwWng
Current Status: Done (Success)
Exit code: 0
Status Reason: Job terminated successfully
Destination: grid2.mif.vu.lt:2119/jobmanager-lcgpbs-balticgrid
reached on: Wed May 3 13:22:58 2006
*************************************************************

[guntisb@zars job1]$ edg-job-get-output https://grid3.mif.vu.lt:9000/NdAqueQcc5aARqLN0vwWng

Retrieving files from host: grid3.mif.vu.lt ( for https://grid3.mif.vu.lt:9000/NdAqueQcc5aARqLN0vwWng )

*********************************************************************************
 JOB GET OUTPUT OUTCOME

 Output sandbox files for the job:
 - https://grid3.mif.vu.lt:9000/NdAqueQcc5aARqLN0vwWng
 have been successfully retrieved and stored in the directory:
 /tmp/jobOutput/guntisb_NdAqueQcc5aARqLN0vwWng

*********************************************************************************

[guntisb@zars job1]$ mkdir abc
[guntisb@zars job1]$ cp /tmp/jobOutput/guntisb_NdAqueQcc5aARqLN0vwWng/* abc/.
[guntisb@zars job1]$ ls -al abc
total 12
drwxrwxr-x 2 guntisb guntisb 4096 May 3 15:29 .
drwxr-xr-x 3 guntisb guntisb 4096 May 3 15:29 ..
-rw-rw-r-- 1 guntisb guntisb 0 May 3 15:29 stderr.log
-rw-rw-r-- 1 guntisb guntisb 593 May 3 15:29 stdout.log
[guntisb@zars job1]$

[guntisb@zars job1]$ more abc/stdout.log
********************* Job Start *********************
job1 started on Wed May 3 16:18:39 EEST 2006
Executing on:
Linux stud128.mif 2.4.32-grsec-i686-r1 #1 Št Sau 14 15:54:22 EET 2006 i686 GNU/Linux
Current working directory and contents:
/home/bg035/globus-tmp.stud128.918.0/WMS_stud128_010893_https_3a_2f_2fgrid3.mif.vu.lt_3a9000_2fNdAqueQcc5aARqLN0vwWng
total 8
-rw-r--r-- 1 bg035 balticgrid 694 May 3 16:18 job1.sh
-rw-r--r-- 1 bg035 balticgrid 0 May 3 16:18 stderr.log
-rw-r--r-- 1 bg035 balticgrid 357 May 3 16:18 stdout.log
********************** Job End **********************
[guntisb@zars job1]$

Page 41: GRID (European Scientific Computing infrastructure)

Complete Grid job examples

http://www.ltn.lv/~guntis/unix/tutor1.tar
http://www.ltn.lv/~guntis/unix/mpi.tgz

Additional information: http://grid-it.cnaf.infn.it/index.php?jobsubmit&type=1
Other EGEE project documents

Page 42: GRID (European Scientific Computing infrastructure)
Page 43: GRID (European Scientific Computing infrastructure)

A global, federated e-Infrastructure

EGEE infrastructure
• ~ 200 sites in 39 countries
• ~ 20 000 CPUs
• > 5 PB storage
• > 10 000 concurrent jobs per day
• > 60 Virtual Organisations

EUIndiaGrid

EUMedGrid

SEE-GRID

EELA

BalticGrid

EUChinaGrid

OSG

NAREGI

Page 44: GRID (European Scientific Computing infrastructure)
Page 45: GRID (European Scientific Computing infrastructure)

e-Infrastructures: network layer

Page 46: GRID (European Scientific Computing infrastructure)

e-Infrastructures: grid layer

Page 47: GRID (European Scientific Computing infrastructure)

e-Infrastructures: data layer

Standards

OGF

Page 48: GRID (European Scientific Computing infrastructure)
Page 49: GRID (European Scientific Computing infrastructure)
Page 50: GRID (European Scientific Computing infrastructure)

DEISA

Page 51: GRID (European Scientific Computing infrastructure)
Page 52: GRID (European Scientific Computing infrastructure)

Microsoft Cluster Server SW

Page 53: GRID (European Scientific Computing infrastructure)

Typical Ways to Use MPI on Multicore Systems

• One MPI process per core
   • Each MPI process is a single thread
• One MPI process per node
   • MPI processes are multithreaded
   • One thread per core
   • aka Hybrid model
• Some combination of the two

[Figure labels: node, process, dual-core processor]

Page 54: GRID (European Scientific Computing infrastructure)

Scalability

• MPI can scale to very large systems
   • E.g. MPICH2 on 65K dual-core BlueGene/L nodes
• But will an existing application scale?
   • If it runs on 4K single-core nodes, it will run fine on 1K four-core nodes
   • But will it run on 4K four-core nodes?
• Hybrid programming model can help improve scalability
   • Shared-memory/threads programming on nodes
   • MPI across nodes

Page 55: GRID (European Scientific Computing infrastructure)

Single-Threaded MPI Programming

Pros
• Same paradigm developers are used to
• Existing codes can make use of multicore
• Easier to debug

Cons
• There may be a better shared-memory algorithm
• Possible duplication of large arrays

Page 56: GRID (European Scientific Computing infrastructure)

MPICH2 Features to Support Multicore Architectures

• Nemesis: new "multimethod" communication subsystem for MPICH2
• Optimized intranode and internode communication

[Figure labels: Shared memory, Network; Process 0, Process 1, Process 2; Node 0, Node 1]

Page 57: GRID (European Scientific Computing infrastructure)

Multithreaded MPI Programming

Pros
• Hybrid programming model
• Use shared-memory algorithms where appropriate
• One copy of large array shared by all threads

Cons
• In general, threaded programming can be difficult to write, debug and verify (e.g., using pthreads)

OpenMP and UPC make threaded programming easier
• Language constructs to parallelize loops, etc.

Page 58: GRID (European Scientific Computing infrastructure)

MPI Supported Thread Levels

• MPI_THREAD_SINGLE
   • Only one user thread is allowed
• MPI_THREAD_FUNNELED
   • May have one or more threads, but only the "main" thread may make MPI calls
• MPI_THREAD_SERIALIZED
   • May have one or more threads, but only one thread can make MPI calls at a time. It is the application developer's responsibility to guarantee this.
• MPI_THREAD_MULTIPLE
   • May have one or more threads. Any thread can make MPI calls at any time (with certain conditions).

MPICH2 supports MPI_THREAD_MULTIPLE

Page 59: GRID (European Scientific Computing infrastructure)

Using Multiple Threads in MPI

• The main thread must call MPI_Init_thread()
   • App requests a thread level
   • MPI returns the thread level actually provided
   • These values need not be the same on every process
   • Hint: Request only the level you need to avoid unnecessary overhead for higher thread levels.
• MPI_Init_thread()
   • Called in place of MPI_Init()
   • Only the main thread should call this
• The main thread (and only the main thread) must call MPI_Finalize
   • There is no MPI_Finalize_thread()
• MPI does not provide routines to create threads
   • That's left to the user
   • E.g., use pthreads, OpenMP, etc.

Page 60: GRID (European Scientific Computing infrastructure)

Allocating Processes/Threads to Cores

• Typically: one process/thread per core
• It may be beneficial to oversubscribe in some cases
   • If a process/thread is blocking most of the time
   • Examples:
      • Communication thread with compute threads
      • Master/slave: master process distributes work then waits for results
• Caution: Some MPI implementations actively poll while waiting for a message
   • Comm. thread would compete for cycles with compute threads

Page 61: GRID (European Scientific Computing infrastructure)

Binding Processes/Threads to Specific Cores

• Processes/threads can be bound to one or more cores
• Which core to choose?
   • For best communication, locate threads that do a lot of communication close together
      • On the same node, on processor cores sharing an L2 cache
   • But cores on the same processor share cache and pins to the frontside bus
      • Might do better splitting up threads that do a lot of memory accesses
   • Bind the communication thread to one of the cores for the compute threads of the same process
   • I/O interrupt handling is statically mapped to one core
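As a rough illustration of binding, the sketch below pins the calling thread to a single core. It assumes Linux with glibc, where the non-portable `pthread_getaffinity_np()` and `pthread_setaffinity_np()` extensions are available; the helper name `bind_self_to_first_allowed_cpu` is hypothetical and not taken from the slides. Which core is actually the best choice depends on the cache and bus layout discussed above, so picking the first allowed core is only a placeholder policy.

```cpp
#include <pthread.h>
#include <sched.h>

// Hypothetical helper: binds the calling thread to the first CPU it is
// currently allowed to run on (Linux/glibc extension, not portable POSIX).
// Returns the chosen CPU index, or -1 on failure.
int bind_self_to_first_allowed_cpu()
{
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (pthread_getaffinity_np(pthread_self(), sizeof(allowed), &allowed) != 0)
        return -1;

    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &allowed)) {
            // Build a mask containing only this CPU and apply it.
            cpu_set_t one;
            CPU_ZERO(&one);
            CPU_SET(cpu, &one);
            if (pthread_setaffinity_np(pthread_self(), sizeof(one), &one) != 0)
                return -1;
            return cpu;
        }
    }
    return -1;  // no allowed CPU found
}
```

A communication thread could call such a helper once at start-up; compute threads would normally be given different masks so that they do not compete with it.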

Page 62: GRID (European Scientific Computing infrastructure)

What is a thread?

process:
• an address space with 1 or more threads executing within that address space, and the required system resources for those threads
• a program that is running

thread:
• a sequence of control within a process
• shares the resources in that process

Page 63: GRID (European Scientific Computing infrastructure)

Advantages and Drawbacks of Threads

Advantages:
• the overhead for creating a thread is significantly less than that for creating a process (~ 2 milliseconds for threads)
• multitasking, i.e., one process serves multiple clients
• switching between threads requires the OS to do much less work than switching between processes

Page 64: GRID (European Scientific Computing infrastructure)

Drawbacks:
• not as widely available as longer established features
• writing multithreaded programs requires more careful thought
• more difficult to debug than single-threaded programs
• for single-processor machines, creating several threads in a program may not necessarily produce an increase in performance (only so many CPU cycles to be had)

Page 65: GRID (European Scientific Computing infrastructure)

POSIX Threads (pthreads)

IEEE's POSIX Threads Model:
• programming models for threads on a UNIX platform
• pthreads are included in the international standards ISO/IEC 9945-1

pthreads programming model:
• creation of threads
• managing thread execution
• managing the shared resources of the process

Page 66: GRID (European Scientific Computing infrastructure)

main thread:
• initial thread created when main() (in C) or PROGRAM (in Fortran) is invoked by the process loader
• once in main(), the application has the ability to create daughter threads
• if the main thread returns, the process terminates even if there are running threads in that process, unless special precautions are taken
• to explicitly avoid terminating the entire process, use pthread_exit()

Page 67: GRID (European Scientific Computing infrastructure)

thread termination methods:
• implicit termination: thread function execution is completed
• explicit termination:
   • calling pthread_exit() within the thread
   • calling pthread_cancel() to terminate other threads

for numerically intensive routines, it is suggested that the application create p threads if there are p available processors
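The three termination paths can be compared side by side. The following is a minimal sketch, assuming a POSIX system; the function and struct names (`demo_termination`, `TerminationResults`, and the three thread functions) are made up for illustration, and error checking on `pthread_create()` is omitted to keep it short.

```cpp
#include <pthread.h>
#include <unistd.h>

// Implicit termination: the thread function simply returns.
static void *returns_normally(void *)
{
    static int code = 1;
    return &code;
}

// Explicit termination: the thread ends itself with pthread_exit().
static void *exits_explicitly(void *)
{
    static int code = 2;
    pthread_exit(&code);
}

// Runs until another thread cancels it; pause() is a cancellation point.
static void *waits_for_cancel(void *)
{
    for (;;)
        pause();
}

struct TerminationResults {
    int implicit_code;   // value returned by the thread function
    int explicit_code;   // value passed to pthread_exit()
    bool was_canceled;   // true if the join reported PTHREAD_CANCELED
};

// Hypothetical driver: starts one thread per termination method and
// records what pthread_join() reports for each.
TerminationResults demo_termination()
{
    TerminationResults r = {0, 0, false};
    pthread_t t;
    void *status = NULL;

    pthread_create(&t, NULL, returns_normally, NULL);
    pthread_join(t, &status);
    r.implicit_code = *static_cast<int *>(status);

    pthread_create(&t, NULL, exits_explicitly, NULL);
    pthread_join(t, &status);
    r.explicit_code = *static_cast<int *>(status);

    pthread_create(&t, NULL, waits_for_cancel, NULL);
    pthread_cancel(t);            // request termination of the other thread
    pthread_join(t, &status);
    r.was_canceled = (status == PTHREAD_CANCELED);

    return r;
}
```

Returning from the thread function and calling `pthread_exit()` look the same to the joining thread; a cancelled thread is reported with the special value `PTHREAD_CANCELED`.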

Page 68: GRID (European Scientific Computing infrastructure)

Sample Pthreads Program in C++ and Fortran 90/95

The program in C++ calls the pthread.h header file. Pthreads related statements are preceded by the pthread_ prefix (except for semaphores). Knowing how to manipulate pointers is important.

The program in Fortran 90/95 uses the f_pthread module. Pthreads related statements are preceded by the f_pthread_ prefix (again, except for semaphores).

Pthreads in Fortran is still not an industry-wide standard.

Page 69: GRID (European Scientific Computing infrastructure)

//****************************************************************
// This is a sample threaded program in C++. The main thread creates
// 4 daughter threads. Each daughter thread simply prints out a message
// before exiting. Notice that I’ve set the thread attributes to joinable and
// of system scope.
//****************************************************************
#include <iostream>   // standard header; the pre-standard <iostream.h> is not shipped by current compilers
#include <stdio.h>
#include <pthread.h>

using std::cout;
using std::endl;

#define NUM_THREADS 4

void *thread_function( void *arg );

int main( void )
{
    int i, tmp;
    int arg[NUM_THREADS] = {0,1,2,3};

    pthread_t thread[NUM_THREADS];
    pthread_attr_t attr;

    // initialize and set the thread attributes
    pthread_attr_init( &attr );
    pthread_attr_setdetachstate( &attr, PTHREAD_CREATE_JOINABLE );
    pthread_attr_setscope( &attr, PTHREAD_SCOPE_SYSTEM );

Page 70: GRID (European Scientific Computing infrastructure)

    // creating threads
    for ( i=0; i<NUM_THREADS; i++ )
    {
        tmp = pthread_create( &thread[i], &attr, thread_function, (void *)&arg[i] );

        if ( tmp != 0 )
        {
            cout << "Creating thread " << i << " failed!" << endl;
            return 1;
        }
    }

    // joining threads
    for ( i=0; i<NUM_THREADS; i++ )
    {
        tmp = pthread_join( thread[i], NULL );
        if ( tmp != 0 )
        {
            cout << "Joining thread " << i << " failed!" << endl;
            return 1;
        }
    }

    return 0;
}

Page 71: GRID (European Scientific Computing infrastructure)

//***********************************************************
// This is the function each thread is going to run. It simply asks
// the thread to print out a message. Notice the pointer acrobatics.
//***********************************************************
void *thread_function( void *arg )
{
    int id;

    id = *((int *)arg);

    printf( "Hello from thread %d!\n", id );
    pthread_exit( NULL );
}

Page 72: GRID (European Scientific Computing infrastructure)

How to compile:
• in Linux use:
     > {C++ comp} -D_REENTRANT hello.cc -lpthread -o hello
• it might also be necessary for some systems to define the _POSIX_C_SOURCE (to 199506L)

Creating a thread:
     int pthread_create( pthread_t *thread, const pthread_attr_t *attr, void *(*thread_function)(void *), void *arg );
• first argument – pointer to the identifier of the created thread
• second argument – thread attributes
• third argument – pointer to the function the thread will execute
• fourth argument – the argument of the executed function (usually a struct)
• returns 0 for success
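Since the fourth argument is usually a struct, a short sketch of that pattern may help. The names `WorkerArg`, `square_worker` and `squares_in_parallel` are hypothetical; the example assumes each thread gets its own struct instance, which must stay alive until the thread has been joined.

```cpp
#include <pthread.h>
#include <vector>

// Per-thread argument: input for the thread and a slot for its result.
struct WorkerArg {
    int input;    // value to process
    int output;   // filled in by the thread
};

// Thread function: unpacks the struct passed as the fourth argument
// of pthread_create() and stores the square of the input.
static void *square_worker(void *p)
{
    WorkerArg *arg = static_cast<WorkerArg *>(p);
    arg->output = arg->input * arg->input;
    return NULL;
}

// Computes the squares of 0..n-1 using one thread per value.
// Returns an empty vector if any thread could not be created.
std::vector<int> squares_in_parallel(int n)
{
    std::vector<WorkerArg> args(n);
    std::vector<pthread_t> threads(n);
    int created = 0;
    bool ok = true;

    for (int i = 0; i < n; ++i) {
        args[i].input = i;
        args[i].output = 0;
        if (pthread_create(&threads[i], NULL, square_worker, &args[i]) != 0) {
            ok = false;   // a non-zero return value signals failure
            break;
        }
        ++created;
    }

    // Join every thread that was actually started before reading results.
    for (int i = 0; i < created; ++i)
        pthread_join(threads[i], NULL);

    std::vector<int> out;
    if (ok)
        for (int i = 0; i < n; ++i)
            out.push_back(args[i].output);
    return out;
}
```

Calling `squares_in_parallel(4)` yields the values 0, 1, 4 and 9. Giving each thread a separate struct avoids the threads sharing, and racing on, a single argument object.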

Page 73: GRID (European Scientific Computing infrastructure)

Waiting for the threads to finish:
     int pthread_join( pthread_t thread, void **thread_return )
• main thread will wait for daughter thread thread to finish
• first argument – the thread to wait for
• second argument – pointer to a pointer to the return value from the thread
• returns 0 for success
• threads should always be joined; otherwise, the main thread may finish before a daughter thread has completed its work (and if main() returns, the whole process terminates with it)
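The second argument of pthread_join() is easiest to understand with a thread that actually returns something. This is a minimal sketch under the assumption that the thread allocates its result on the heap and the joining thread frees it; `sum_to_n` and `sum_in_thread` are illustrative names only.

```cpp
#include <pthread.h>

// Thread function: adds up 1..n and returns a heap-allocated result.
// Returning the pointer is equivalent to calling pthread_exit(total).
static void *sum_to_n(void *p)
{
    int n = *static_cast<int *>(p);
    long *total = new long(0);
    for (int i = 1; i <= n; ++i)
        *total += i;
    return total;
}

// Runs sum_to_n in a daughter thread and collects its return value
// through the second argument of pthread_join().
// Returns -1 if the thread could not be created or joined.
long sum_in_thread(int n)
{
    pthread_t t;
    // Passing &n is safe here because we join before n goes out of scope.
    if (pthread_create(&t, NULL, sum_to_n, &n) != 0)
        return -1;

    void *ret = NULL;
    if (pthread_join(t, &ret) != 0)
        return -1;

    long *total = static_cast<long *>(ret);
    long value = *total;
    delete total;   // the joiner owns the result once the thread is done
    return value;
}
```

For example, `sum_in_thread(100)` gives 5050. The result must not point into the thread's own stack, since that memory is gone by the time the join returns.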

Page 74: GRID (European Scientific Computing infrastructure)

!****************************************************************
! This is a sample threaded program in Fortran 90/95. The main thread
! creates 4 daughter threads. Each daughter thread simply prints out
! a message before exiting. Notice that I've set the thread attributes to
! be joinable and of system scope.
!****************************************************************
PROGRAM hello

  USE f_pthread
  IMPLICIT NONE

  INTEGER, PARAMETER :: num_threads = 4
  INTEGER :: i, tmp, flag
  INTEGER, DIMENSION(num_threads) :: arg
  TYPE(f_pthread_t), DIMENSION(num_threads) :: thread
  TYPE(f_pthread_attr_t) :: attr

  EXTERNAL :: thread_function

  DO i = 1, num_threads
    arg(i) = i - 1
  END DO

  !initialize and set the thread attributes
  tmp = f_pthread_attr_init( attr )
  tmp = f_pthread_attr_setdetachstate( attr, PTHREAD_CREATE_JOINABLE )
  tmp = f_pthread_attr_setscope( attr, PTHREAD_SCOPE_SYSTEM )


  ! this is an extra variable needed in fortran (not needed in C)
  flag = FLAG_DEFAULT

  ! creating threads
  DO i = 1, num_threads
    tmp = f_pthread_create( thread(i), attr, flag, thread_function, arg(i) )
    IF ( tmp /= 0 ) THEN
      WRITE (*,*) "Creating thread", i, "failed!"
      STOP
    END IF
  END DO

  ! joining threads
  DO i = 1, num_threads
    tmp = f_pthread_join( thread(i) )
    IF ( tmp /= 0 ) THEN
      WRITE (*,*) "Joining thread", i, "failed!"
      STOP
    END IF
  END DO

END PROGRAM hello


!**************************************************************
! This is the subroutine each thread is going to run. It simply asks
! the thread to print out a message. Notice that f_pthread_exit() is
! a subroutine call.
!**************************************************************
SUBROUTINE thread_function( id )

  IMPLICIT NONE

  INTEGER :: id, tmp

  WRITE (*,*) "Hello from thread", id
  CALL f_pthread_exit()

END SUBROUTINE thread_function


How to compile:
• only available in AIX 4.3 and above:

> xlf95_r -lpthread hello.f -o hello

• the compiler should be thread safe

The concepts of creating and joining threads are the same as in C/C++, except that pointers are not directly involved in Fortran.

Note that in Fortran some pthread calls are function calls while some are subroutine calls.


Thread Attributes

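detach state attribute:
int pthread_attr_setdetachstate(pthread_attr_t *attr, int detachstate);

• detached (PTHREAD_CREATE_DETACHED) – the thread cannot be joined; its resources are released automatically when it terminates, so the main thread continues working without waiting for it

• joinable (PTHREAD_CREATE_JOINABLE, the default) – the main thread can wait with pthread_join() for the daughter thread to terminate before continuing further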


contention scope attribute:
int pthread_attr_setscope(pthread_attr_t *attr, int scope);

• system scope – threads are mapped one-to-one on the OS's kernel threads (kernel threads are the entities that are scheduled onto processors by the OS)

• process scope – threads share a kernel thread with other process scoped threads


Thread Synchronization Mechanisms

Mutual exclusion (mutex):
• guards against multiple threads modifying the same shared data simultaneously
• provides locking/unlocking of critical code sections where shared data is modified
• each thread waits for the mutex to be unlocked (by the thread that locked it) before executing the code section


Basic Mutex Functions:

int pthread_mutex_init(pthread_mutex_t *mutex, const pthread_mutexattr_t *mutexattr);

int pthread_mutex_lock(pthread_mutex_t *mutex);

int pthread_mutex_unlock(pthread_mutex_t *mutex);

int pthread_mutex_destroy(pthread_mutex_t *mutex);

• a new data type named pthread_mutex_t is designated for mutexes

• a mutex is like a key (to access the code section) that is handed to only one thread at a time

• the attribute of a mutex can be controlled by using the pthread_mutex_init() function

• the lock/unlock functions work in tandem


#include <pthread.h>
...
pthread_mutex_t my_mutex; // should be of global scope
...
int main()
{
    int tmp;
    ...
    // initialize the mutex
    tmp = pthread_mutex_init( &my_mutex, NULL );
    ...
    // create threads
    ...
    pthread_mutex_lock( &my_mutex );
    do_something_private();
    pthread_mutex_unlock( &my_mutex );
    ...
    return 0;
}

Whenever a thread reaches the lock/unlock block, it first determines if the mutex is locked. If so, it waits until it is unlocked. Otherwise, it locks the mutex, executes the protected code section, and unlocks the mutex when it is done, so that the next waiting thread can proceed.


Counting Semaphores:
• permit a limited number of threads to execute a section of the code
• similar to mutexes
• should include the semaphore.h header file
• semaphore functions do not have pthread_ prefixes; instead, they have sem_ prefixes


Basic Semaphore Functions:
• creating a semaphore:
int sem_init(sem_t *sem, int pshared, unsigned int value);

initializes the semaphore object pointed to by sem
pshared is a sharing option; a value of 0 means the semaphore is local to the calling process
value is the initial value given to the semaphore

• terminating a semaphore:
int sem_destroy(sem_t *sem);

frees the resources allocated to the semaphore sem
usually called after pthread_join()
an error will occur if a semaphore is destroyed while a thread is still waiting on it


• semaphore control:
int sem_post(sem_t *sem);
int sem_wait(sem_t *sem);

sem_post atomically increases the value of a semaphore by 1, i.e., when 2 threads call sem_post simultaneously, neither update is lost and the semaphore's value is increased by 2

sem_wait atomically decreases the value of a semaphore by 1, but first waits until the semaphore has a non-zero value


#include <pthread.h>
#include <semaphore.h>
...
void *thread_function( void *arg );
...
sem_t semaphore; // also a global variable just like mutexes
...
int main()
{
    int tmp;
    ...
    // initialize the semaphore
    tmp = sem_init( &semaphore, 0, 0 );
    ...
    // create threads
    pthread_create( &thread[i], NULL, thread_function, NULL );
    ...
    while ( still_has_something_to_do() )
    {
        sem_post( &semaphore );
        ...
    }
    ...
    pthread_join( thread[i], NULL );
    sem_destroy( &semaphore );

    return 0;
}


void *thread_function( void *arg )
{
    sem_wait( &semaphore );
    perform_task_when_sem_open();
    ...
    pthread_exit( NULL );
}

• the main thread increments the semaphore's count value in the while loop

• the threads wait until the semaphore's count value is non-zero before performing perform_task_when_sem_open() and anything after it

• pthread_join() does not stop the daughter threads; the main thread waits there until each of them has finished


When mixing with MPI, the simplest way is to let only 1 thread handle the communications.

So why threads? In some cases, it is the only viable approach.


MPI Programming with Six Routines

Some programs can be written with only six routines:
• MPI_Init
• MPI_Finalize
• MPI_Comm_size
• MPI_Comm_rank
• MPI_Send
• MPI_Recv


Non-blocking Communication

Communication is split into two parts:

• MPI_Isend or MPI_Irecv starts the communication and returns a request data structure.

• MPI_Wait (also MPI_Waitall, MPI_Waitany) uses the request as an argument and blocks until the communication is complete.

• MPI_Test uses the request as an argument and checks for completion.

Advantages:
• Avoids the deadlocks that can arise from the ordering of blocking sends and receives
• Overlap communication with computation
• Exploit bi-directional communication


Identifying Processes

MPI Communicator
• Defines a group (set of ordered processes) and a context (a virtual network)

Rank
• Process number within the group
• MPI_ANY_SOURCE will receive from any process

Default communicator
• MPI_COMM_WORLD: the whole group


Non-blocking send/recv buffers

May not modify or read the message buffer between MPI_Irecv and MPI_Wait calls.

May not modify or read the message buffer between MPI_Isend and MPI_Wait calls.

May not have two MPI_Irecv pending on the same buffer.

May not have two MPI_Isend pending on the same buffer.

Restrictions provide flexibility for implementers.