April 6, 2005
© 2003 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice
HP Linux cluster
Joseph Pareti ([email protected])
April 7, 2005
page 2IT-Symposium 2005 www.decus.de
Overview
• Enterprise space
• Linux Reference Architecture
• Usage notes for the technical computing market
HP Systems Insight Manager
Comprehensive and complementary systems management
Discovery, inventory, fault management, enhanced security, roles, distributed tasks, …
HP Systems Insight Manager core services
Server management
Storage management: CommandView
Client management: Client Manager
Printer management: Web JetAdmin
And more…
Enterprise management integration modules
• BMC • HP OpenView • CA • Tivoli
Workload Management
Deployment
Performance Management
Partition Management
Security Management
Cluster Management
System-specific Administration
3rd Party/Home grown
Adaptable to your environment
Breadth of hardware management
Complete life-cycle management
HP Serviceguard for Linux with Cluster Extension
World’s First Disaster Recovery Solution on Linux!
– Protects against downtime, planned or unplanned
– Automatic failover & failback to reduce disaster recovery complexity
– Ensures high data integrity by utilizing Remote Mirroring in HP StorageWorks XP Disk Arrays
[Diagram: Data Center A and Data Center B, at metropolitan-wide distances or greater, connected over MAN/WAN by IP network(s) and storage network(s), with mirrored data between the sites]
Comprehensive HP Linux Services
5,000 HP Linux Professionals
Providing: Architecting, Consulting, Project Mgt, Porting & Migration, Deployment, Support, Training, Managed Services
Consulting:
• Enterprise Integration Security Consulting
• Express Linux Assessment
• Open Source Process Management
Porting & Migration:
• Unix-to-Linux Application Porting and Migration
Deployment:
• Installation, Start-up, and Complex Solution Implementation
Managed Services:
• Planning, Implementation, Staffing, and Ongoing Operations
Support:
• Mission critical Support (7x24x365)
• Multi-vendor hardware
• Red Hat, SUSE, Turbolinux, Conectiva, Debian, & United Linux Distributions
• Popular Linux Applications
Training:
• Linux Education
• Industry (LPI) Certification
Key Linux Solutions Focus:
• Database Applications
• Financial Services
• HPTC
• Telco
• Web Infrastructure
HP is the single source for Linux hardware, software and support for many ISVs, including Red Hat and SuSE
– Single point accountability for Linux solutions in heterogeneous multi-technology, multi-vendor environment
– Strategic relationships and support agreements with key ISVs
– Proven service level agreements
– Expanded global alliance with Red Hat and SuSE to be single point of purchase, support and maintenance
Single point of accountability with HP Services
Red Flag
HPS migration methodology
• Migration assessment questionnaire and workshop
• Investment justification (TCO)
• Architectural blueprint
• Detailed design
• Implementation
• Test and pilot
• Install and support
• Acceptance
• Executive reviews
Deliverables
• Business Value
• Migration guide
• Implementation proposal (HL PoC)
• Transition planning
• Migration tools for specific environment
• Pilot Application migrated
• Code migration
• Application package migration
• Data migration
• Environment migration
• Tested environment
• Final System
• Education
• Knowledge transfer
HP-Customer-Support portfolio
Same offers multi OS: Unix, Linux, Windows
• Availability services: Mission Critical services, Nonstop support services, hardware support services, software support services
• Support management services: Integrated support, ITSM services, SAN support management, software licensing, On-Demand support
• Deployment services: Installation and start-up, implementation, integration, deployment, PC Lifecycle services
• Optimization services: Assessment and review, performance-tuning, performance support, education
HP support services for Linux environments
Basic services
• Hardware reactive support – 8x5/24x7
• Call-to-Repair – 6 hours/4 hours
• Software technical support (incident based or unlimited): 2 hr response; 9x5 & 24x7; software updates
• Installation, start-up and education services
Integrated services
• Support Plus: HW 4 hr response; 13x5; software technical assistance; software updates
• Support Plus 24: HW 4 hr response; 24x7; software technical assistance; software updates
Mission critical services
• Proactive 24: 24x7; account mgt.; 4 hour hardware response; software technical assistance; software updates
• Critical service: 24x7; account mgt.; proactive svcs; change mgmt; 6 hr CTR; SW tech assist; SW updates
• Mission critical partnership: assessment; service improvement; proactive deliverables above P24 and CS
Multivendor heterogeneous environments
Our HP Linux training portfolio
http://education.itrc.hp.com
Getting started
Fundamentals
Configuring the system
Basic networking
A business perspective
Leading to Linux Professional
Institute certification
We offer high-quality, on-site, classroom and virtual classroom training for:
System administrators
End users
HP - Linux Courses
• Linux End User
– UNIX (Linux) System Basics I
– Linux - getting started
– Linux - fundamentals
– Linux - configuring the system
– Linux - basic networking
• Linux System Administrator (all instructor-led classroom)
– UNIX (LINUX) Fundamentals
– Linux Administration I
– Linux Administration II
– Accelerated Linux Administration for HP-UX/Tru64 Admins
– Linux Network Security
– Linux Development Tools and Techniques
– Linux Device Driver Development (onsite only!)
• Advanced Linux Courses
– Linux Troubleshooting
– Linux Performance Tuning
– MC/ServiceGuard for Linux
– Linux Install and Troubleshooting on IA-64
A full suite of Linux courses for you and your customers that includes LPI and RH certification courses
• Classroom courses
• Online, instructor led courses
• Online, self-paced courses
LINUX + HP Services help you WW
• 24 hours a day, 7 days a week
• More than 27,000 people
• 600 support offices, 120 countries
• 35 HP Response Center Network locations in 34 countries
• More than 80 HP Customer Education Centers Worldwide
Locations shown: Bristol, Dusseldorf, Stockholm, Winnersh, Geneva, Madrid, Milan, Paris, Seoul, Hong Kong, Melbourne, Singapore, Tokyo
Linux HP-S by region: EMEA 1400, AP 400, Japan 700, Americas 2500
HP Customer support
Multi-OS, Multi-vendor, Multi-technology, WW
Capability and scope
• Operating in over 160 countries
• Largest channel partner network
• #1 in mission critical support services
• #1 in open system support services
• 65,000 Professionals
• 28,000 Microsoft Experts
• 18,000 Unix Experts
• 5,000 Storage Experts
• 7,500 System Management Experts
• 5,000 Linux Experts
• 2,000 Security Experts
• 10,000 Printing Imaging Experts
• 4,500 Cisco Experts
• 2,500 non stop MC Experts
Linux partners
The open source community and development partners
Industry leading independent software partners
Why a reference architecture?
What are your biggest concerns in using Linux and Open Source software?
[Bar chart]
• None: 18%
• Lack of skills: 8%
• Unsecure: 10%
• Version splintering/lack of standards: 16%
• Lack of applications: 16%
• Immaturity of products and Lack of support: 32% and 46% (pairing not recoverable from the extracted chart)
Base: 50 $1B+ companies using Linux (multiple responses accepted)
Source: Forrester, March 2003
Reference architectures
Customer support
Tested
Solutions & partners
HP and Open Source
– Leadership of Linux Standard Base (LSB)
– Founding/sponsoring member: OSDL, Linux International
– Open Source Software Institute – HP 2004 Cornerstone Sponsor
– Over 40 on-going internal Open Source projects
– 100 – 150 projects annually (+ growing)
– Substantial adoption of Open Source technologies in HP businesses
– Extensive support for Samba & Apache projects
– “Count the hops”
Linux Reference Architecture: Open Source Middleware
• Architecture
– Linux distribution
– Associated system software (drivers, agents)
– Optional value-added components (HP OpenView)
– Application infrastructure – application server, web server, database server, directory server, etc.
• Run on industry standard servers
– 32-bit ProLiant servers
– 64-bit Integrity servers
• Tested, validated and managed by HP
• Supported through HP consulting and integration and customer support services
Linux Reference Architecture: Open Source Middleware
[Stack diagram]
Application1, Application2, Application3, Application4
Development Env
Management
Web Server: Apache
J2EE Application Server: JBOSS
Identity Management Server: Ping ID
XML Messaging Server: Jabber
Database Server: MySQL
Directory Server: Open LDAP
Java runtime
Linux: Red Hat / SuSE
Hardware (ProLiant servers, storage, networks)
HP Open Source Middleware Services
[Diagram: services arranged along two axes, Solution timeline and Open Source maturity]
• Open Source Workshops
• Open Source Assessment Services
• Pilot Services and Proof-of-Concepts
• Application Migration & Re-engineering Services
• Application Integration Services
• Application Development Services
• HP Open Source Middleware Installation & Configuration Services
• HP Open Source Middleware IT Consolidation Services
• HP Open Source Middleware Support Services
• HP Linux and Value-Add Services
Use Cases: Application Migration to Open Source
• Customers are moving applications to Open Source middleware platform
• HP is providing support services
• HP is providing consulting services (porting/migration and development)
Commercial Stacks: SUN ONE / SOLARIS; WEBSPHERE / AIX; WEBLOGIC / HP-UX
Open Source Stack: APACHE; TOMCAT/JBOSS; MYSQL / ORACLE; LINUX DISTRIBUTION
Use Cases: Open Source Server Farms
• Customers are consolidating to Linux Server Farms with Open Source Middleware
– Single platform for simplified management
– Resource virtualization
– Infrastructure on tap
• HP is providing support services, infrastructure and application migration services
• Opportunity for outsourcing services
[Diagram: Linux/Open Source Cluster .... Linux/Open Source Cluster]
Use Cases: Hybrid Environments
Business Scenario:
• Customer needs to provide both low-volume/high-revenue and high-volume/low-revenue services (example: Sabre/Travelocity booking service and search service)
• Revenue is made only from the low-volume/high-revenue service
– Willing to invest in higher-cost infrastructure
• No revenue streams from the high-volume/low-revenue service; in addition, the service needs to be scalable to support drastic changes in workloads (e.g. advertising campaigns)
– Infrastructure cost must be kept at lowest price point
NEED FOR HYBRID ENVIRONMENT
Use Cases: Hybrid Environments (cont.)
High Volume / Low Revenue (HVLR) compartment:
• Need to minimize infrastructure costs
• Requires flexible resource re-allocation
– Use of Open Source platforms; need for scale-out capabilities
Low Volume / High Revenue (LVHR) compartment:
• Must guarantee transaction recoverability
• More constant workload but stringent SLA requirements
– Highly redundant commercial platforms; possibly scale-up needs
Between the compartments: Transaction Transfers, Data Replication
Customer quote
Sabre Holdings (Transportation Industry)
Highly Transactional Airline Reservation System; Long-Term Migration of Mainframe Applications
“By moving our core reservations operations from mainframes to Intel-based servers from Hewlett-Packard, we have been able to increase capacity as well as lower programming costs.”
Bob Offutt, Senior Vice President and Chief Architect at Sabre Holdings
Customer Business Challenge
• Cost-Effectively Manage Exponential Surge in Transactional Activity
• Preserve System Reliability for Core eCommerce Business
• Horizontal Scalability to Accommodate Growth
Solution
• 28 PA-RISC & Multiple NSK S86000s Servers w/NS ZLE SW
• Cluster of 45 HP Integrity rx5670 Linux Servers
• 100 Clustered MySQL DB tables replicated 7X24
Results
• Lowered total cost of ownership through mainframe downsizing & workload segmentation across Linux, HP-UX & NSK
• Faster customer fulfillment & increased customer sat
The HP Difference:
• Vigilant Customer Support and Global Capabilities
• HP Linux Value Proposition within Heterogeneous Solution
• One-Stop Shopping / Accountability from Desktop to Datacenter
XC v1 / user's perspective (under the hood there is a resource mgmt system)
runit:
#!/bin/csh -f
setenv pc $1
bsub -n $pc -o foo.$pc.out runsvc ./myscript $pc
myscript:
#!/bin/csh -f
setenv pc $1
applaunch -sz $pc foo.exe < input
Beowulf / the user's perspective
% bsub -n 16 -R "cr && model == DL360" -o outfile ./run.csh
% cat run.csh
#!/bin/csh -f
set cpu=16
set prog=mpi.x
set runcmd=/cluster/mpich-1.2.5..12_gm-2.1.1_pgi/bin/mpirun
# lsbparse.pl manipulates LSB_MCPU_HOSTS to create the machinefile
lsbparse.pl
$runcmd -machinefile mf -np $cpu ./${prog}
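The lsbparse.pl helper itself is not shown in the slides. As a rough illustration of what such a step might do, the sketch below expands LSB_MCPU_HOSTS into a machinefile, assuming the usual LSF layout of alternating host names and slot counts (for example "nodeA 2 nodeB 1"). The function name make_machinefile is hypothetical and is not part of the original script.

```shell
#!/bin/sh
# Hypothetical stand-in for lsbparse.pl (the original helper is not shown).
# Assumes LSB_MCPU_HOSTS looks like "nodeA 2 nodeB 1": host, slot count, ...

# make_machinefile HOSTS_STRING OUTPUT_FILE
# Writes each host name once per slot, one name per line.
make_machinefile() {
    printf '%s\n' "$1" |
        awk '{ for (i = 1; i < NF; i += 2)
                   for (n = 0; n < $(i + 1); n++) print $i }' > "$2"
}

# Only act inside an LSF job, where LSB_MCPU_HOSTS is set.
if [ -n "${LSB_MCPU_HOSTS:-}" ]; then
    make_machinefile "$LSB_MCPU_HOSTS" mf
fi
```

The resulting file has one line per slot, which is the form the mpirun -machinefile option in run.csh expects.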
Development on Beowulf / TotalView™ (1/2)
• Launch a "sleep" job through LSF (bsub) to reserve a set of nodes
• Log in (ssh) to one of those nodes and start TotalView interactively
crms0> cat ./runH.csh
#!/bin/csh -f
/scratch/joe/pallas/PMB2.2.1/SRC_PMB/gen_appfile >& machinefile
/cluster/mpich-1.2.5..12_gm-2.1.1_pgi/bin/mpirun.ch_gm -machinefile machinefile -np 1 ./xxx.csh
crms0> cat xxx.csh
#!/bin/csh
hostname
sleep 7600
crms0>
Development on Beowulf /TotalView™(2/2)
OpenPBS
• www.openpbs.org
• http://grtzky.zko.dec.com/~cholmes/jm/pbs/index.html
• Open Source variant (if you want more features, use the Altair version; contact Jochen Krebs [email protected])
• Batch job-management tool; can be used to define queues, submit jobs to low-loaded hosts, prevent process oversubscription, etc.
• Limited scheduling capabilities
OpenPBS /basic commands
• qsub (job submission)
• qstat (gather job statistics)
• qdel (delete a job submitted with qsub)
• Some basic on-line help, e.g. qsub -h
OpenPBS / integrated with an app
• The application gets started in a shell script
• The script references PBS environment variables to create a machine-file and do other house-keeping tasks
• The machine-file is used as an option in the mpirun command
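The three steps above can be sketched in a few lines of sh. This is only an illustration under stated assumptions: the function name build_mpirun_cmd and the program ./my_app are hypothetical, the mpirun options are the MPICH-style -machinefile and -np seen elsewhere in this talk, and the sketch prints the command rather than launching it.

```shell
#!/bin/sh
# Hypothetical sketch: derive a machine file and an mpirun command line
# from the node file that PBS provides in $PBS_NODEFILE.

# build_mpirun_cmd NODEFILE MACHINEFILE PROGRAM
# Copies the node file to MACHINEFILE, counts one process per
# non-empty line, and prints the resulting mpirun command.
build_mpirun_cmd() {
    cp "$1" "$2" || return 1
    np=$(grep -c . "$2")
    echo "mpirun -machinefile $2 -np $np $3"
}

# Only act inside a PBS job, where PBS_NODEFILE and PBS_JOBID are set.
if [ -n "${PBS_NODEFILE:-}" ]; then
    build_mpirun_cmd "$PBS_NODEFILE" "./mf.$PBS_JOBID" ./my_app
fi
```

Tagging the machine file with the job id keeps concurrent jobs started from the same directory from overwriting each other's files.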
OpenPBS / sample integration (1/3)
• Usage:
qsub -l nodes=5 pjobs_joe
The script “pjobs_joe” manipulates the environment variable $PBS_NODEFILE to create a machine file for the parallel application
Sample code: FLUENT
OpenPBS / sample integration (2/3)
#!/bin/sh
# usage:
#   qsub -l nodes=5 pjobs_joe
# use the command below to specify 2 cpus per node:
#   qsub -l nodes=5:ppn=2 pjobs_joe

### ----- Pre execution -----

# one process per line of the node file; reading from stdin keeps
# the file name out of the wc output
NB_PROC=`wc -l < $PBS_NODEFILE`

# use PBS_NODEFILE directly to use the Fast Ethernet
# instead of Gigabit Ethernet
/bin/sed -e "s/\(.\+\)/\\1-2/" $PBS_NODEFILE > ./mf.$PBS_JOBID
echo "----------"
cat ./mf.$PBS_JOBID
echo "----------"
#
# tag output file with pbs job id
#
/bin/sed "s/junk\.gz/junk\.gz`echo $PBS_JOBID`/g" /home/cfdtest/joe/fluent_pbs_integration/script_womonitoring > /home/cfdtest/joe/fluent_pbs_integration/script_womonitoring.$PBS_JOBID
#
OpenPBS / sample integration (3/3)
# start fluent
fluent 3d -g -mpthostfile=/home/cfdtest/joe/fred1.txt -t$NB_PROC -cnf=./mf.$PBS_JOBID -i /home/cfdtest/joe/fluent_pbs_integration/script_womonitoring.$PBS_JOBID

### ----- Post Execution -----

# kill stray fluent processes on every host listed in the node file
for H in `cat $PBS_NODEFILE` ; do
    echo " host " $H
    # ps runs on the remote host; grep and awk filter its output locally
    FLUENTS_ID=`/usr/bin/rsh $H /bin/ps -ef | /bin/grep fluent | /bin/awk '{print $2}'`
    for FLUENT_ID in $FLUENTS_ID ; do
        echo "fluent id" $FLUENT_ID
        /usr/bin/rsh $H /bin/kill -9 $FLUENT_ID
    done
done

# remove the per-job machine file and the per-job input script
rm ./mf.$PBS_JOBID
rm /home/cfdtest/joe/fluent_pbs_integration/script_womonitoring.$PBS_JOBID
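The host-name rewrite in the pre-execution step relies on \+, which is a GNU sed extension. A portable variant might look like the sketch below; the function name suffix_hosts is hypothetical, and the assumption that the -2 names select the second network interface is taken from the script's own comment about Fast Ethernet versus Gigabit Ethernet.

```shell
#!/bin/sh
# Hypothetical helper: append a suffix to every host name in a node file
# using only POSIX sed syntax.

# suffix_hosts SUFFIX INPUT OUTPUT
# Appends SUFFIX to every non-empty line of INPUT and writes OUTPUT.
# SUFFIX must not contain "/" or "&", which are special to sed here.
suffix_hosts() {
    sed -e "/./s/\$/$1/" "$2" > "$3"
}

# Only act inside a PBS job, where PBS_NODEFILE and PBS_JOBID are set.
if [ -n "${PBS_NODEFILE:-}" ]; then
    suffix_hosts -2 "$PBS_NODEFILE" "./mf.$PBS_JOBID"
fi
```

Anchoring on the end of the line instead of capturing the whole line avoids the back-reference and leaves blank lines untouched.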
XC v 2
• Fully supported by HP.
• Based on open source software components.
• The idea is to leverage open source and use HP capabilities around testing, integrating and supporting.
• XC details are covered in other presentations.
XC v2 using slurm/lsf
• Assume you just want to reserve one SMP (e.g. one rx1620) for an OMP job
• bsub -n2 -o run_fft_pass1.2t srun -n 1 ./do_run_fft
• bsub -n 1 -o run_fft_pass1.1t srun -n 1 ./do_run_fft
XC v2 using slurm/lsf
• Assume you just want to reserve one SMP (e.g. one rx1620) for an OMP job
• The Gaussian job requires bigmem, so LSF should help
• lkg135> bsub -Is -R "mem> 4000" -n1 /bin/bash
• Job <4355> is submitted to default queue <normal>.
• <<Waiting for dispatch ...>> hangs ... Why?
XC v2 using slurm/lsf (continued)
• bsub -ext "SLURM[nodelist=lkgA,lkgB,lkgC]"
XC v2 using slurm/lsf (capability)
• There is good coverage in http://techmktg.rsn.hp.com/people/sdevere/XC_How_to_Guide.htm#_Multinode_LSF_jobs