EDG WP4: installation task

LSCCW/HEPiX hands-on, NIKHEF 5/03

German Cancio CERN IT/FIO

http://cern.ch/wp4-install

HEPiX hands-on / Installation Task / German Cancio CERN - n° 2

Agenda

Part 1:

General architectural overview

Components description and current status

Part 2:

Exercises on software distribution

Part 3:

Discussion: differences to other solutions (if time permits)

Disclaimer

This is not a repetition of the WP4 LCFGng tutorial given last year at CERN. I will describe the proposed replacement for LCFG, developed by EDG WP4-install.

This is a work in progress. Most of the subsystems presented here are currently under design or development, although some have already been deployed at CERN.

There are fewer practical exercises than theory slides ;-(

Your feedback is a most welcome source for improvements!

EDG WP4: reminder

WP4 is the ‘fabric management’ work package of the EU DataGrid project.

Objective: To develop system management tools for enabling the deployment of very large computing fabrics […] with reduced sysadmin and operation costs.

Installation task: solutions for
• automated from-scratch node installation
• node configuration/reconfiguration
• software storage, distribution and installation

Configuration task: solutions for storing, maintaining and retrieving configuration information.

WP4-install architecture

[Architecture diagram. Client nodes: CCM, Cdispd, NCM components, SPMA with SPMA.cfg and a local package cache. SWRep servers: packages (RPM, PKG), management API with ACLs, client access via HTTP, NFS, FTP. Installation server: DHCP handling, PXE handling, KS/JS generator, node install, CCM, management API with ACLs. Other labels: CDB, Registration/Notification, Node (re)install?]

Automated Installation Infrastructure
• DHCP and Kickstart (or JumpStart) are re-generated according to CDB contents
• PXE can be set to reboot or reinstall by operator

Software Repository
• Packages (in RPM or PKG format) can be uploaded into multiple Software Repositories
• Client access is using HTTP, NFS/AFS or FTP
• Management access subject to authentication/authorization

Node Configuration Manager (NCM)
• Configuration Management on the node is done by NCM Components
• Each component is responsible for configuring a service (network, NFS, sendmail, PBS)
• Components are notified by the Cdispd whenever there was a change in their configuration

Software Package Mgmt Agent (SPMA)
• SPMA manages the installed packages
• Runs on Linux (RPM) or Solaris (PKG)
• SPMA configuration done via an NCM component
• Can use a local cache for pre-fetching packages (simultaneous upgrades of large farms)

Base installation (AII)

AII (Automated Installation Infrastructure)

Subsystem to automate the node base installation via the network

Layer on top of existing technologies (base system installer, DHCP, PXE)

Modules:

AII-dhcp: manages the DHCP server for network installation information

AII-nbp (network bootstrap program): manages the PXE configuration for each node (boot from HD / start the installation via network)

AII-osinstall: manages the OS configuration files required by the OS installation procedure (KickStart, JumpStart)

More details in AII design document: http://edms.cern.ch/document/374559
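
As a rough illustration of what AII-dhcp and AII-nbp might emit for one node, the sketch below renders an ISC dhcpd host entry and a PXELINUX configuration from a small dictionary standing in for the node's CDB profile. The dictionary keys, host name, addresses, kernel/initrd file names and Kickstart URL are all hypothetical; the real formats and interfaces are defined in the AII design document, not here.

```python
def dhcp_host_entry(node):
    """Render an ISC dhcpd 'host' stanza for one node."""
    return (
        f"host {node['name']} {{\n"
        f"  hardware ethernet {node['mac']};\n"
        f"  fixed-address {node['ip']};\n"
        f"}}\n"
    )


def pxe_config(node):
    """Render a PXELINUX config: start a network install, or boot from disk."""
    if node["action"] == "install":
        return (
            "default install\n"
            "label install\n"
            "  kernel vmlinuz\n"
            f"  append initrd=initrd.img ks={node['kickstart_url']}\n"
        )
    return "default local\nlabel local\n  localboot 0\n"


# Hypothetical node profile; in the real system this would come from CDB.
node = {
    "name": "node001",
    "mac": "00:11:22:33:44:55",
    "ip": "192.0.2.10",
    "action": "install",
    "kickstart_url": "http://installserver.example.org/ks/node001.cfg",
}

print(dhcp_host_entry(node))
print(pxe_config(node))
```

Switching `action` to any other value yields the boot-from-disk configuration, which mirrors the "reboot or reinstall" choice the operator makes through PXE.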

AII: current status

Architectural design finished

Detailed design and implementation progressing

First alpha version expected mid-July

Node Configuration (NCM)

Node Configuration Management (NCM)

Client software running on the node which takes care of “implementing” what is in the configuration profile

Modules: “Components”

Invocation and notification framework

Component support libraries

NCM: Components

“Components” (like SUE “features” or LCFG ‘objects’) are responsible for updating local config files and notifying services if needed

Components register their interest in configuration entries or subtrees, and are invoked when these change

Components only configure the system
• Usually, this implies regenerating and/or updating local config files (e.g. /etc/sshd_config)

Use standard system facilities (SysV scripts) for managing services
• Components can notify services using SysV scripts when their configuration changes

Possible to define configuration dependencies between components
• E.g. configure network before sendmail
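
The registration idea above can be sketched as a lookup from changed configuration paths to interested components. This is a toy model in Python, not the cdispd implementation: the `Dispatcher` class, its method names and all paths except `/system/architecture` are made up for illustration.

```python
class Dispatcher:
    """Toy stand-in for the registration/notification idea; not the real cdispd."""

    def __init__(self):
        self.interests = {}  # component name -> list of registered path prefixes

    def register(self, component, path):
        """Record that a component is interested in an entry or subtree."""
        self.interests.setdefault(component, []).append(path.rstrip("/"))

    def affected(self, changed_paths):
        """Return the components whose registered entries or subtrees changed."""
        hit = set()
        for component, prefixes in self.interests.items():
            for prefix in prefixes:
                if any(p == prefix or p.startswith(prefix + "/") for p in changed_paths):
                    hit.add(component)
        return sorted(hit)


d = Dispatcher()
d.register("network", "/system/network")                  # a whole subtree
d.register("sendmail", "/software/components/sendmail")
d.register("sendmail", "/system/network/hostname")        # a single entry

print(d.affected(["/system/network/hostname"]))  # ['network', 'sendmail']
print(d.affected(["/system/architecture"]))      # []
```

Matching on a full path component (`prefix + "/"`) avoids treating `/system/networking` as part of the `/system/network` subtree.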

Component example

sub Configure {
    my ($self) = @_;

    # access configuration information
    my $config = NVA::Config->new();
    my $arch = $config->getValue('/system/architecture');  # NVA API
    $self->Fail("not supported") unless ($arch eq 'i386');

    # (re)generate and/or update local config file(s)
    open(myconfig, '/etc/myconfig'); …

    # notify affected (SysV) services if required
    if ($changed) {
        system('/sbin/service myservice reload'); …
    }
}

NCM (contd.)

cdispd (Configuration Dispatch Daemon)
• Monitors the config profile, and invokes components via the ncd if there were changes

ncd (Node Configuration Deployer)
• Framework and front-end for executing components (via cron, cdispd, or manually)
• Dependency ordering of components

Component support libraries
• For recurring system mgmt tasks (interfaces to system services, sysinfo), log handling, etc.

More details in NCM design document http://edms.cern.ch/document/372643
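
Dependency ordering of components amounts to a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later) to illustrate the idea; the dependency table and the `run_order` helper are hypothetical, and how the ncd actually declares and resolves dependencies is described in the design document.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# component -> components that must be configured before it (hypothetical data)
depends_on = {
    "network": [],
    "nfs": ["network"],
    "sendmail": ["network"],
    "pbs": ["nfs"],
}


def run_order(to_run, depends_on):
    """Order the requested components so that their prerequisites come first.

    Prerequisites that were not requested are ignored in this sketch.
    """
    graph = {c: [d for d in depends_on.get(c, []) if d in to_run] for c in to_run}
    return list(TopologicalSorter(graph).static_order())


order = run_order({"sendmail", "network", "pbs", "nfs"}, depends_on)
print(order)  # network always precedes nfs and sendmail; nfs precedes pbs
```

`TopologicalSorter` raises `graphlib.CycleError` if the declared dependencies are circular, which is a convenient place to report a misconfiguration.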

NCM architecture (from design doc.)

NCM: Status

Architectural design finished

Detailed (class) design progressing

First version expected mid July

Porting/coding of base configuration components expected to be completed mid-September

More than 60 components must be ported for a complete EDG solution (configuring all EDG middleware services)!

Pilot deployment on CERN central interactive/batch facilities expected at the end of the year

Software Distribution (SWRep and SPMA)

SPM (Software Package Mgmt) (I)

SWRep (Software Repository):

Client-server tool suite for the management of software packages

Universal repository:
• Extendable to multiple platforms and package formats (RH Linux/RPM, Solaris/PKG, … others like Debian dpkg)
• Multiple package versions/releases

Management (“product maintainers”) interface:
• ACL-based mechanism to grant/deny modification rights (packages associated to “areas”)
• Current implementation using SSH

Client access: via standard protocols
• HTTP (scalability), but also AFS/NFS, FTP

Replication: using standard tools (e.g. rsync)
• Availability, load balancing
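
To make the "areas" idea concrete, here is a minimal sketch of an ACL check for modification rights. The area names, user names and the rule that rights on an area also cover its sub-areas are assumptions made for the example; the slides do not say how SWRep stores or evaluates its ACLs.

```python
# area -> users allowed to modify packages in that area (hypothetical data)
acl = {
    "/base": {"alice"},
    "/experiments/atlas": {"alice", "bob"},
}


def may_modify(user, area, acl):
    """Grant modification rights if the user is listed for the area or a parent area."""
    while area:
        if user in acl.get(area, set()):
            return True
        area = area.rsplit("/", 1)[0]  # walk up to the parent area
    return False


print(may_modify("bob", "/experiments/atlas/sw", acl))  # True (inherited from parent)
print(may_modify("bob", "/base", acl))                  # False
```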

SPM (Software Package Mgmt) (II)

Software Package Management Agent (SPMA):

Runs on every target node

Multiple repositories can be accessed (e.g. division/experiment specific)

Plug-in framework allows for portability
• System packager specific transactional interface (RPMT, PKGT)

Can manage either all or a subset of packages on the nodes
• Useful for add-on installations, and also for desktops

Configurable policies (partial or full control, mandatory and unwanted packages, conflict resolution…)

Addresses scalability
• Packages can be stored ahead in a local cache, avoiding peak loads on software repository servers (simultaneous upgrades of large farms)
• HTTP protocol allows the use of web proxy hierarchies
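
The local-cache idea can be sketched as a pre-fetch step that downloads only what the cache does not yet hold, so that the later upgrade reads from local disk. This is an illustrative sketch, not SPMA code: the `prefetch` function, the package file names and the injected `fetch` callable (standing in for an HTTP download from a repository) are all hypothetical.

```python
import os
import tempfile


def prefetch(packages, cache_dir, fetch):
    """Download packages that are missing from the local cache; return their names."""
    fetched = []
    for name in packages:
        path = os.path.join(cache_dir, name)
        if not os.path.exists(path):
            data = fetch(name)  # e.g. an HTTP GET against a repository
            with open(path, "wb") as f:
                f.write(data)
            fetched.append(name)
    return fetched


def fake_fetch(name):
    """Stand-in for a repository download so the example runs without a network."""
    return b"contents of " + name.encode()


cache = tempfile.mkdtemp()
wanted = ["foo-1.0-1.i386.rpm", "bar-2.1-3.i386.rpm"]
first = prefetch(wanted, cache, fake_fetch)
second = prefetch(wanted, cache, fake_fetch)
print(first)   # both packages are downloaded on the first pass
print(second)  # [] because the cache already holds them
```

Spreading the pre-fetch over time is what avoids the peak load; the upgrade itself can then be triggered on all nodes at once.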

SPM (Software Package Mgmt) (III)

SPMA functionality:

1. Compares the packages currently installed on the local node with the packages listed in the configuration

2. Computes the necessary install/deinstall/upgrade operations

3. Invokes the packager (rpmt/pkgt) with the right operation transaction set

The SPM is driven via a local configuration file
• For batch/servers: An NCM component generates/maintains this cf file out of CDB information
• For desktops: Possible to write a GUI for locally editing the cf file
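
Steps 1 and 2 of the SPMA functionality can be sketched as a comparison of two name-to-version maps. This is a simplified illustration, not the SPMA algorithm: package names and versions are made up, a version change is reported as a generic "replace" because telling upgrades from downgrades needs RPM-style version comparison, and policies such as partial control are left out.

```python
def plan(installed, desired):
    """Compare installed vs. desired {name: version} maps and list the operations."""
    ops = []
    for name in sorted(installed.keys() - desired.keys()):
        ops.append(("deinstall", name, installed[name]))
    for name in sorted(desired.keys() - installed.keys()):
        ops.append(("install", name, desired[name]))
    for name in sorted(installed.keys() & desired.keys()):
        if installed[name] != desired[name]:
            ops.append(("replace", name, desired[name]))  # upgrade or downgrade
    return ops


# Hypothetical node state and target configuration.
installed = {"openssh": "3.1p1-6", "emacs": "21.2-2", "tcsh": "6.10-6"}
desired = {"openssh": "3.5p1-6", "tcsh": "6.10-6", "zsh": "4.0.4-5"}

for op in plan(installed, desired):
    print(op)
```

The resulting list is what step 3 would hand to the packager as a single transaction set.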

Software Package Manager (SPM)

RPMT

RPMT (RPM transactions) is a small tool on top of the RPM libraries, which allows for multiple simultaneous package operations resolving dependencies (unlike RPM)

Example: ‘upgrade X, deinstall Y, downgrade Z, install T’ and verify/resolve appropriate dependencies

Uses only basic RPM library calls; no added intelligence

Ports available for RPM 3 and 4.0.X

Will try to feed it back to the RPM user community after porting to RPM 4.2

CERN IT/PS working on equivalent Solaris port (PKGT)
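
The point of a transactional interface is that dependencies are checked against the end state of the whole operation set, not after each single operation. The toy model below illustrates this in Python; it does not use or imitate the RPM library API, versions (and hence upgrade/downgrade) are omitted, and the package names and dependency table are hypothetical.

```python
def apply_transaction(installed, ops, requires):
    """Apply all operations to a copy of the package set, then check dependencies.

    installed: set of package names; ops: list of (action, name);
    requires: name -> set of names it depends on.
    Returns the new set, or raises ValueError and leaves the input untouched.
    """
    result = set(installed)
    for action, name in ops:
        if action == "install":
            result.add(name)
        elif action == "deinstall":
            result.discard(name)
        else:
            raise ValueError(f"unknown action: {action}")
    missing = {(p, d) for p in result for d in requires.get(p, set()) if d not in result}
    if missing:
        raise ValueError(f"unsatisfied dependencies: {sorted(missing)}")
    return result


installed = {"A", "libA"}
requires = {"A": {"libA"}, "B": {"libB"}}

# Removing libA before A would fail one operation at a time, but the set as a whole is fine.
swap = [("deinstall", "libA"), ("deinstall", "A"), ("install", "B"), ("install", "libB")]
print(sorted(apply_transaction(installed, swap, requires)))  # ['B', 'libB']

try:
    apply_transaction(installed, [("deinstall", "libA")], requires)
except ValueError as err:
    print("rejected:", err)
```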

SWRep/SPMA architecture

[Architecture diagram. Repository A and Repository B: packages (RPM, PKG), management API (GUI, CLI), inventory. Client nodes: SPMA with SPMA.cfg (from NCM/GUI, using CDB config), rpmt, local cache. Client access to the repositories via HTTP (HTTP proxy), AFS/NFS, FTP.]

SPMA & SWRep: current status

First production version available

Being deployed in the CERN Computer Centre (next slide)

Enhanced functionality (package cache management) for mid-October

Solaris port progressing (cf. M. Guijarro’s talk)

SPMA/SWRep deployment @ CERN CC

Started phasing out legacy SW distribution systems (including ASIS) on the central batch/interactive servers (LXPLUS&LXBATCH)

Using HTTP as package access protocol (scalability)
• > 400 nodes currently running it in production

Deployment page: http://cern.ch/wp4-install/CERN/deploy

Server clustering solution
• For CDB (XML profiles) and SWRep (RPMs over HTTP)
• Replication done with rsync
• Load balancing done with simple DNS round-robin
• Currently, 3 servers in production (800 MHz, 500 MB RAM, FastEthernet) giving ~ 3 × 12 MByte/s throughput
• Future: may include usage of hierarchical web proxies (e.g. using squid)
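
A back-of-the-envelope calculation shows what the quoted throughput means for a farm-wide upgrade. It reads the figure as 12 MByte per second per server, takes 400 nodes as a round number, and assumes a purely hypothetical 50 MByte of packages per node; perfect load balancing and no other traffic are also assumed, so the result is a lower bound on the serving time, not a measurement.

```python
servers = 3
per_server_mb_s = 12        # from the slide, read as MByte/s per server
nodes = 400                 # round number for the nodes in production
update_mb_per_node = 50     # hypothetical size of one upgrade

aggregate_mb_s = servers * per_server_mb_s
total_mb = nodes * update_mb_per_node
seconds = total_mb / aggregate_mb_s

print(f"aggregate throughput: {aggregate_mb_s} MByte/s")
print(f"time to serve {total_mb} MByte: {seconds / 60:.1f} minutes")
```

Under these assumptions the servers need several minutes at full load, which is the kind of peak that local caches and proxy hierarchies are meant to smooth out.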

Questions / comments ?