Third EELA Tutorial for Managers and Users www.eu-eela.org E-infrastructure shared between Europe and Latin America CE + WN installation and configuration Vanessa Hamar Universidad de Los Andes – Mérida, Venezuela Rio de Janeiro 26-30, 2006




Outline

• What is a Computing Element (CE)?
• What is a Torque Server?
• What is a Worker Node?
• How to install and configure a Computing Element with Torque Server.
• How to install and configure a Worker Node with Torque.


What is CE?

• The CE is a service representing a computing resource.

• Its main functionality is job management (job submission, job control, etc.).

• For job submission, the CE can work in:
– push model (where the job is pushed to a CE for its execution).
– pull model (where the CE asks the WMS for jobs).


What is Torque?

• TORQUE (Terascale Open-source Resource and QUEue manager) is a resource manager providing control over batch jobs and distributed compute resources.

• The Torque system is composed of:
– pbs_server, which provides the basic batch services, such as receiving/creating a batch job or protecting the job against system crashes.
– job_scheduler, which contains the site's policy used to decide which job must be executed.
– pbs_mom, which places the job into execution. It is also responsible for returning the job's output to the user.


What is a Worker Node?

• The Worker Node (WN) is a set of clients required to run jobs sent by the CE via the Local Resource Management System. It currently includes:
– the gLite I/O Client,
– the Logging and Bookkeeping Client,
– the R-GMA Client and
– the WMS Checkpointing library.


Installing CE + Torque Server

WN + Torque


Installing pre-requisites

• Start from a fresh install of SLC 3.0.X

• Installation via:
– gLite installer scripts: starting from gLite release 3.0, this method is not supported.
– APT: http://glite.web.cern.ch/glite/packages/APT.asp

Check whether apt is already installed:

rpm -qa | grep apt

Install apt if necessary:

rpm -ivh http://linuxsoft.cern.ch/cern/slc30X/i386/SL/RPMS/apt-0.5.15cnc6-8.SL.cern.i386.rpm

• Installation will install all dependencies, including:
– other necessary gLite modules
– external dependencies
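The check-then-install step for apt can be scripted. The following is a minimal sketch, assuming a POSIX shell; the helper name pkg_missing is hypothetical, and the RPM URL in the usage comment is the one given on this slide.

```shell
#!/bin/sh
# pkg_missing NAME: reads a package list (as printed by `rpm -qa`) on stdin
# and succeeds when no entry of the form NAME-<version> is present.
# (Hypothetical helper, not part of gLite or rpm.)
pkg_missing() {
    ! grep -q "^$1-[0-9]"
}

# Usage on the node (as root); shown as a comment so that nothing is installed
# when this file is merely sourced:
#   if rpm -qa | pkg_missing apt; then
#       rpm -ivh http://linuxsoft.cern.ch/cern/slc30X/i386/SL/RPMS/apt-0.5.15cnc6-8.SL.cern.i386.rpm
#   fi
```

Matching on "name-digit" rather than a bare substring avoids false positives from packages whose names merely contain "apt".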


Installing pre-requisites

• Java is not included in the distribution. Install it separately (>= 1.4.2_08): http://java.sun.com/j2se/1.4.2/download.html

chmod +x j2sdk-1_4_2_10-linux-i586-rpm.bin
./j2sdk-1_4_2_10-linux-i586-rpm.bin
rpm -ivh j2sdk-1_4_2_10-linux-i586.rpm
Preparing...   ########################################### [100%]
   1:j2sdk     ########################################### [100%]
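To verify that an installed Java meets the minimum version, the version strings can be compared field by field. This is a sketch only, assuming versions of the form major.minor.patch_update; the function name version_at_least is hypothetical.

```shell
#!/bin/sh
# version_at_least HAVE WANT: succeeds when HAVE >= WANT, comparing the
# dot/underscore-separated fields numerically (e.g. 1.4.2_10 vs 1.4.2_08).
# (Hypothetical helper; at most four fields are compared.)
version_at_least() {
    have=$(printf '%s' "$1" | tr '._' '  ')
    want=$(printf '%s' "$2" | tr '._' '  ')
    set -- $have
    h1=${1:-0} h2=${2:-0} h3=${3:-0} h4=${4:-0}
    set -- $want
    for h in "$h1" "$h2" "$h3" "$h4"; do
        w=${1:-0}
        [ $# -gt 0 ] && shift
        # `[ ... -gt ... ]` compares in decimal, so "08" is handled correctly.
        [ "$h" -gt "$w" ] && return 0
        [ "$h" -lt "$w" ] && return 1
    done
    return 0
}

# Usage (assumes the Sun JDK prints: java version "1.4.2_10" on stderr):
#   v=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
#   version_at_least "$v" 1.4.2_08 || echo "Java is too old: $v" >&2
```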


Installing pre-requisites

• Depending on the package set you selected when installing the operating system, the lam package may be installed on your WN. Please remove it:

apt-get remove lam

• There is a known installation conflict between the 'torque-clients' rpm and the 'postfix' mail transfer agent (Savannah bug #5509). If you are going to install Torque, uninstall the postfix package:

apt-get remove postfix
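Both removals can be driven from the list of installed packages, so that apt-get is only called for packages that are actually present. A minimal sketch, assuming a POSIX shell; the helper name conflicting_installed is hypothetical.

```shell
#!/bin/sh
# conflicting_installed: reads package names (one per line) on stdin and
# prints only those that conflict with the Torque installation.
# -x matches whole lines, so "lamp" or "postfix-pflogsumm" are not selected.
conflicting_installed() {
    grep -x -e lam -e postfix
}

# Usage on the node (as root); shown as a comment so nothing is removed here:
#   for pkg in $(rpm -qa --queryformat '%{NAME}\n' | conflicting_installed); do
#       apt-get remove "$pkg"
#   done
```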


Installing pre-requisites

• Check the FQDN hostname

– Ensure that the hostnames of your machines are correctly set. Run the command:

hostname -f
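The output of hostname -f can be checked automatically before continuing. The sketch below only tests that the name looks fully qualified (contains a dot, with no empty labels); it does not check DNS. The function name is_fqdn is hypothetical.

```shell
#!/bin/sh
# is_fqdn NAME: succeeds when NAME contains at least one dot and has no
# leading, trailing or doubled dots. (Hypothetical helper; purely syntactic.)
is_fqdn() {
    case $1 in
        ''|.*|*.|*..*) return 1 ;;
        *.*)           return 0 ;;
        *)             return 1 ;;
    esac
}

# Usage:
#   is_fqdn "$(hostname -f)" || echo "hostname is not fully qualified" >&2
```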


Installing pre-requisites

• Install glite-yaim and gilda_ig-yaim packages on your nodes

• Download and install the latest version of glite-yaim-3.0.0-* on all your grid nodes:

http://glitesoft.cern.ch/EGEE/gLite/APT/R3.0/rhel30/RPMS.Release3.0/

rpm -hiv glite-yaim-3.0.0-16.noarch.rpm
Preparing...      ########################################### [100%]
   1:glite-yaim   ########################################### [100%]


• Download and install the latest version of gilda_ig-yaim-3.0.0-* on all your grid nodes:

http://grid018.ct.infn.it/apt/gilda_app-i386/utils

[root@eelatut37 root]# rpm -hiv gilda_ig-yaim-3.0.0-11.noarch.rpm
Preparing...         ########################################### [100%]
   1:gilda_ig-yaim   ########################################### [100%]


Installing pre-requisites

• Request host certificates for the CE from a CA:
– https://gilda.ct.infn.it/CA/mgt/restricted/srvreq.php

• Copy the host certificate and key (hostcert.pem and hostkey.pem) into /etc/grid-security.

• Change the permissions:
– chmod 644 hostcert.pem
– chmod 400 hostkey.pem

• If you plan to use certificates issued by CAs that EGEE does not support, be sure that their public keys and CRLs (usually distributed as an RPM) are installed.
– The CRL of the GILDA VO is available from https://gilda.ct.infn.it/RPMS/ca_GILDA-1.0-1.i386.rpm
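The copy and permission steps for the host certificate can be combined in one function. This is a sketch under stated assumptions: the function name install_host_cert is hypothetical, and the destination is passed as an argument rather than hard-coded, so use the certificate directory named on this slide.

```shell
#!/bin/sh
# install_host_cert SRC_DIR DEST_DIR: copies hostcert.pem and hostkey.pem
# from SRC_DIR into DEST_DIR and applies the permissions listed above
# (644 for the certificate, 400 for the private key).
# (Hypothetical helper; fails without copying if either file is missing.)
install_host_cert() {
    src=$1 dest=$2
    [ -f "$src/hostcert.pem" ] && [ -f "$src/hostkey.pem" ] || return 1
    mkdir -p "$dest" &&
    cp "$src/hostcert.pem" "$src/hostkey.pem" "$dest"/ &&
    chmod 644 "$dest/hostcert.pem" &&
    chmod 400 "$dest/hostkey.pem"
}

# Usage (as root), with CERT_DIR set to the certificate directory:
#   install_host_cert /root "$CERT_DIR"
```

Keeping the private key readable by its owner only matters because grid services typically refuse to start with a host key that other users can read.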


Installing pre-requisites

• Synchronization among all gLite nodes is mandatory. Install ntp if it is not already available on your system:
– apt-get install ntp

• Add your time server in /etc/ntp.conf:
– restrict <time_server_IP_address> mask 255.255.255.255 nomodify notrap noquery
– server <time_server_name>
– (you can use ntp-1.infn.it, IP 193.206.144.10)

• Edit /etc/ntp/step-tickers, adding the hostname(s) of your time server(s).

• If you are running a firewall, you will have to allow inbound communication on the NTP port:
– -A INPUT -s <NTP-serverIP-1> -p udp --dport 123 -j ACCEPT

• Activate the ntpd service with the following commands:

ntpdate <your ntp server name>
service ntpd start
chkconfig ntpd on

– You can check ntpd's status with: ntpq -p
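The two /etc/ntp.conf entries can be generated from the server name and IP address so that they stay consistent. A minimal sketch, assuming a POSIX shell; the function name ntp_conf_lines is hypothetical, and the server used in the usage comment is the one suggested on this slide.

```shell
#!/bin/sh
# ntp_conf_lines NAME IP: prints the restrict and server lines for one
# time server, in the form given above. (Hypothetical helper.)
ntp_conf_lines() {
    printf 'restrict %s mask 255.255.255.255 nomodify notrap noquery\n' "$2"
    printf 'server %s\n' "$1"
}

# Usage (as root); shown as a comment so no system file is modified here:
#   ntp_conf_lines ntp-1.infn.it 193.206.144.10 >> /etc/ntp.conf
#   echo ntp-1.infn.it >> /etc/ntp/step-tickers
```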


Installing CE+Torque Server via apt

• Add the gLite apt repository:
– Put the repository line in a file (e.g. glite.list) inside the /etc/apt/sources.list.d directory
– apt-get update
– apt-get upgrade


Installing CE+Torque Server via apt

• All site-specific configuration values have to be set in a site configuration file using key-value pairs.

• This file is shared among all the different gLite node types, so edit it once and keep it in a safe place.

• Copy the site-info.def template from /opt/glite/yaim/examples/ (coming from the lcg-yaim RPM) to your reference directory for the installation (e.g. /root):
– cp /opt/glite/yaim/examples/gilda_ig-site-info.def /root/my-site-info.def

• A good syntax test for your site configuration file is to source it manually by running the command:
– source my-site-info.def
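The sourcing test can be wrapped so that it runs in a subshell and leaves the current shell's variables untouched. This is a sketch, not part of YAIM; the function name check_site_info is hypothetical. Note that sourcing executes the file, so any commands it contains will run.

```shell
#!/bin/sh
# check_site_info FILE: succeeds when FILE is readable and can be sourced
# without a shell error. The file is sourced inside a subshell, so its
# variables do not leak into the caller. (Hypothetical helper.)
check_site_info() {
    # `.` searches PATH for names without a slash, so force a path.
    case $1 in
        */*) file=$1 ;;
        *)   file=./$1 ;;
    esac
    [ -r "$file" ] || return 1
    ( . "$file" ) >/dev/null 2>&1
}

# Usage:
#   check_site_info /root/my-site-info.def || echo "site file has errors" >&2
```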


Installing CE+Torque Server via apt

• Edit the WN list file:

vi /opt/glite/yaim/examples/gilda_wn-list.conf


Installing CE+Torque Server via apt

• Install the node

/opt/glite/yaim/scripts/gilda_ig_install_node gilda_ig-site-info.def GILDA_ig_CE_torque

• Configure the node

/opt/glite/yaim/scripts/gilda_ig_configure_node gilda_ig-site-info.def GILDA_ig_CE_torque
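The install and configure steps can be chained so that configuration only runs when installation succeeds. A minimal sketch: the function install_and_configure and the YAIM_SCRIPTS variable are hypothetical conveniences, while the two script names and the default directory are the ones shown on this slide.

```shell
#!/bin/sh
# Directory holding the GILDA YAIM scripts; override for testing.
YAIM_SCRIPTS=${YAIM_SCRIPTS:-/opt/glite/yaim/scripts}

# install_and_configure SITE_INFO NODE_TYPE: runs the install script and,
# only if it exits successfully, the configure script. (Hypothetical wrapper.)
install_and_configure() {
    site_info=$1 node_type=$2
    "$YAIM_SCRIPTS/gilda_ig_install_node" "$site_info" "$node_type" &&
    "$YAIM_SCRIPTS/gilda_ig_configure_node" "$site_info" "$node_type"
}

# Usage (as root); shown as a comment so nothing is installed here:
#   install_and_configure gilda_ig-site-info.def GILDA_ig_CE_torque
```

The same wrapper applies to a Worker Node by passing GILDA_ig_WN_torque as the node type.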


Installing CE+Torque Server via apt

• If the installation is performed successfully, the following components are installed:
– gLite in /opt/glite
– Condor in /opt/condor-x.y.z (where x.y.z is the current Condor version)
– Globus in /opt/globus
– Tomcat in /var/lib/tomcat5
– Torque in /var/spool/pbs
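A quick post-installation check is to confirm that the directories listed on this slide exist. A minimal sketch, assuming a POSIX shell; the function name missing_dirs is hypothetical.

```shell
#!/bin/sh
# missing_dirs DIR...: prints every argument that is not an existing
# directory, one per line; prints nothing when all are present.
# (Hypothetical helper.)
missing_dirs() {
    for d in "$@"; do
        [ -d "$d" ] || printf '%s\n' "$d"
    done
}

# Usage on the CE; an unmatched /opt/condor-* glob stays literal and is
# therefore reported as missing:
#   missing_dirs /opt/glite /opt/condor-* /opt/globus /var/lib/tomcat5 /var/spool/pbs
```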


Installing CE+Torque Server via apt

• Edit /etc/ssh/sshd_config and add the following lines at the end:

HostbasedAuthentication yes
IgnoreUserKnownHosts yes
IgnoreRhosts yes

• Restart the server with:

/sbin/service sshd restart
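If the configuration step is re-run, appending the same lines again would duplicate them. The sketch below appends an option only when no uncommented line for it exists yet; the function name ensure_option is hypothetical, and it deliberately leaves an existing setting with a different value untouched, so check such cases by hand.

```shell
#!/bin/sh
# ensure_option FILE KEY VALUE: appends "KEY VALUE" to FILE unless an
# uncommented line starting with KEY is already present.
# (Hypothetical helper; does not change an existing value.)
ensure_option() {
    file=$1 key=$2 value=$3
    if grep -q "^[[:space:]]*$key[[:space:]]" "$file"; then
        return 0
    fi
    printf '%s %s\n' "$key" "$value" >> "$file"
}

# Usage (as root); shown as a comment so sshd is not reconfigured here:
#   ensure_option /etc/ssh/sshd_config HostbasedAuthentication yes
#   ensure_option /etc/ssh/sshd_config IgnoreUserKnownHosts yes
#   ensure_option /etc/ssh/sshd_config IgnoreRhosts yes
#   /sbin/service sshd restart
```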


Installing CE+Torque Server via apt

• On the CE, generate an updated version of /etc/ssh/ssh_known_hosts by running:

/opt/edg/sbin/edg-pbs-knownhosts

• Copy that file to all the Worker Nodes.
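Copying the file to every Worker Node can be done in a loop over a host list. This is a sketch under assumptions: it presumes the WN list file holds one hostname per line (blank lines and # comments are skipped), that root can scp to each node, and the function name wn_hosts is hypothetical.

```shell
#!/bin/sh
# wn_hosts FILE: prints the hostnames in FILE, one per line, with comments,
# surrounding whitespace and empty lines removed. (Hypothetical helper;
# the one-hostname-per-line format is an assumption.)
wn_hosts() {
    sed -e 's/#.*//' \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e '/^$/d' "$1"
}

# Usage on the CE (as root); shown as a comment so no copy is attempted here:
#   for wn in $(wn_hosts /opt/glite/yaim/examples/gilda_wn-list.conf); do
#       scp /etc/ssh/ssh_known_hosts "root@$wn:/etc/ssh/ssh_known_hosts"
#   done
```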


Installing WN Server via apt

• Install the node

/opt/glite/yaim/scripts/gilda_ig_install_node gilda_ig-site-info.def GILDA_ig_WN_torque

• Configure the node

/opt/glite/yaim/scripts/gilda_ig_configure_node gilda_ig-site-info.def GILDA_ig_WN_torque


References

• https://gilda.ct.infn.it/docs/GILDAsiteinstall-3_0_0.html
