SUSE® High Performance Computing: Roadmap and Update

Kai Dupke, Senior Product Manager, SUSE Linux Enterprise

Meike Chabowski, Senior Product Marketing Manager, SUSE Linux Enterprise


Post on 26-Apr-2020



History Future

SUSE – Strong in HPC Market!

SUSE® HPC

SUSE: since 1992, strong presence in Top500

• HIGH PRODUCTIVITY COMPUTING
‒ Total, Baker Hughes, Texas Instruments, …
• MULTI- and MANY-CORE PROCESSOR SUPPORT
‒ Intel, AMD, POWER, …
• TECHNOLOGY
‒ Kernel 3.x, Lustre File System, Ceph storage platform
‒ Highly scalable - up to 4096 cores, …
• COOPERATION
‒ IBM, SGI, HP, Dell, NEC, Cray, Cisco
• ACADEMIC AND RESEARCH
‒ LRZ / SuperMUC, BSC / MareNostrum, Tokyo Institute of Technology, Beijing Computing Center, NASA, …

Overview

HPC Overview

SUSE® High Performance Computing

• Solving computational, data-intensive, or numerically-intensive tasks

• Reducing the time and effort required to set-up and maintain HPC clusters

• Ensuring that all components of the HPC stack work together

HPC Development

• Yesterday
‒ Academia and Research
• Today
‒ Academia and Research
‒ Financial Services
‒ Oil and Gas
‒ Semiconductor
‒ Life Sciences
‒ Manufacturing
• Tomorrow
‒ Departmental and workgroup clusters
‒ High Productivity Computing


High Productivity Computing

Hollywood and HPC


High Productivity Computing

Big Data – or HPC??

HPC Market

Linux Preferred for HPC

• Linux
‒ runs on more than 90% of the world's top 500 supercomputers*
‒ is used by nearly 90% of general clusters
‒ is used in the majority of HPC systems, from smaller departmental implementations to larger, integrated cluster solutions

*top500.org June 2013

Split Market

Segments: Scientific (Top 500-class) and Commercial (High Productivity Computing)

• Lighthouse projects
• Government sponsored
• Generic workloads
• Often self-supported by academic staff
• Specialized hardware
• Highly specialized application
• ROI and reliability are key
• Data center support
• Commodity hardware

Market Segmentation

• Super Computer ('top 500')
‒ Budget: >500K$
‒ Ready: +++
‒ Key drivers: special-build HW, only performance counts, self-supported, partner supported
‒ GTM: HPC-IHV
• Divisional
‒ Budget: <500K$
‒ Ready: +++
‒ Key drivers: customized HW, partner driven, partner supported, SUSE supported
‒ GTM: IHV, ISV, SI
• Departmental
‒ Budget: <250K$
‒ Ready: ++
‒ Key drivers: commodity HW, business driven, SUSE supported
‒ GTM: Channel, SUSE
• Work Group
‒ Budget: <100K$
‒ Ready: +
‒ Key drivers: customer driven, home brewed
‒ GTM: Channel, Shop

SUSE Linux Enterprise HPC

Why Linux?

• Open Source benefits
‒ Easy to customize, maintain and improve
• Innovation
‒ Beowulf clusters “born” on Linux
• Modularity
‒ GUI overhead not required
‒ Appliance form factors
• Linux Standards
‒ Large base of tools, including remote management
‒ Hardware availability
‒ Large vendor ecosystem surrounding Linux HPC clusters

Why SUSE® Linux Enterprise Server for High Performance Computing

• Early player in HPC, pushing innovation and new technologies

• Highly reliable, interoperable and manageable server operating system

• Built to power mission-critical workloads in physical, virtual and cloud environments

• The natural successor to UNIX, backed by proven services for UNIX migration

• Special features to improve performance

• Backed by established ecosystem – support and certificates

• The only Linux recommended by Microsoft

SUSE Additional Features

• Up-to-date 3.x Linux Kernel for optimal performance
• CPU Management and System Activity
‒ CPUset System, CPUset command line tool
‒ Sysstat package
‒ IRQbalance
• OpenFabrics Enterprise Distribution (OFED)
‒ Remote Direct Memory Access (RDMA) switched fabric technologies, high-speed data transport technologies for server and storage connectivity
• SystemTap, LTTng 2.0
• Packaged Lustre
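Cpusets describe the processors assigned to a set of tasks as a CPU list such as `0-3,8`. As a minimal sketch of how a cluster script might expand such a list before pinning work to cores, the helper below parses that list format. The function name `parse_cpu_list` is an illustrative assumption and not part of any SUSE tool.

```python
from __future__ import annotations


def parse_cpu_list(spec: str) -> list[int]:
    """Expand a CPU list such as '0-3,8,10-11' into sorted CPU numbers.

    Hypothetical helper for illustration; it only handles plain numbers
    and inclusive 'start-end' ranges separated by commas.
    """
    cpus: set[int] = set()
    for part in spec.strip().split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty input or trailing commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if end < start:
                raise ValueError(f"descending range: {part!r}")
            cpus.update(range(start, end + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


if __name__ == "__main__":
    print(parse_cpu_list("0-3,8,10-11"))
```

The resulting list can be handed to whatever affinity mechanism a site uses; that choice is left open here.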

SUSE Advanced I/O Processing

• Asynchronous I/O (AIO)
‒ Input/output processing that permits other processing to continue before the transmission has finished
• Modular I/O Scheduler
‒ The algorithm most suitable for the workload can be chosen dynamically
• Multi-core/hyper-threading processor support
‒ Executes threads in parallel within each individual processor
‒ Supports up to 8192 cores per system
• Intel I/O Acceleration
‒ Offloads work from the CPU to the network card, so the system can continue processing data while I/O is taking place
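On Linux, the I/O scheduler of a block device is typically exposed through sysfs as a single line such as `noop deadline [cfq]`, with the active algorithm in brackets. The sketch below shows one way to inspect that setting. The function names are illustrative assumptions, and the device name passed to `read_scheduler` is whatever exists on the local system.

```python
from __future__ import annotations

from pathlib import Path


def parse_scheduler_line(line: str) -> tuple[str | None, list[str]]:
    """Split a sysfs scheduler line into (active, available) schedulers."""
    active = None
    available = []
    for token in line.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]  # the bracketed entry is the active one
            active = token
        available.append(token)
    return active, available


def read_scheduler(device: str, sysfs_root: str = "/sys/block"):
    """Read the scheduler line for one block device (requires Linux sysfs)."""
    path = Path(sysfs_root) / device / "queue" / "scheduler"
    return parse_scheduler_line(path.read_text())


if __name__ == "__main__":
    # Sample line used instead of real hardware so the sketch runs anywhere.
    print(parse_scheduler_line("noop deadline [cfq]"))
```

Switching the scheduler is done by writing one of the available names back to the same sysfs file as root; the sketch deliberately stops at reading.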

Recent Enhancements

• Storage
‒ Support for btrfs
‒ Improved support for iSCSI and FCoE
‒ Major filesystem performance increases
• Management
‒ Faster, more powerful control groups for resource isolation
‒ Improved power management
• Kernel
‒ Newest processors and chipsets
‒ Better idle-load balancing
‒ Transparent huge pages
‒ Improved scaling of incoming network traffic
‒ Up to 8192 cores
• Network
‒ Higher network throughput
‒ Added tunables in the IP stack (for lower latency)

Customers, Partners

Customers and Partners

Customers

Partners

Fionn – SUSE benefits

• Irish Centre for High-End Computing
• Power efficiency
‒ 1st for x86 in top500 (June 2014)
• Winning partnership
‒ SGI, SUSE, Intel working together
• 3 use cases
‒ Thin: latest Intel Ivy Bridge
‒ Fat: large shared-memory
‒ Hybrid: Xeon Phi & NVIDIA Tesla

“The stability is impressive”

“SUSE Linux Enterprise Server doesn’t get in the way of the computational workload”

“... great tools for set up and configuration, but gives us the flexibility to use other tools, which simplifies maintenance.”

“In our view, ... very well suited to high-performance computing.”

– Niall Wilson, Infrastructure Manager, ICHEC

Intel Cluster Ready Program

• Designed to simplify purchasing, deployment and management of HPC clusters
• SUSE Linux Enterprise Server is Intel Cluster Ready and powers many certified Intel Cluster Ready systems
• Intel® Cluster Ready “recipes” are available with SUSE Linux Enterprise Server
‒ Reference designs to help hardware vendors, platform integrators, and system integrators design and build certified Intel Cluster Ready systems

Business Update

Simplify Projects!

• Simplified model
‒ Only the number of socket pairs matters
‒ Socket pairs are accumulated per cluster
‒ Head nodes and compute nodes are treated equally
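The socket-pair model on this slide comes down to simple arithmetic: add up the sockets of every node in the cluster, head and compute nodes alike, and count pairs. The sketch below illustrates that counting. Rounding an odd socket total up to the next full pair is an assumption made for the example, not a rule stated in the deck, and `socket_pairs` is a hypothetical helper name.

```python
import math


def socket_pairs(node_groups):
    """Count socket pairs for a cluster.

    node_groups: iterable of (node_count, sockets_per_node) tuples.
    Head nodes and compute nodes are counted the same way.
    Assumption: an odd number of sockets is rounded up to a full pair.
    """
    total_sockets = sum(count * sockets for count, sockets in node_groups)
    return math.ceil(total_sockets / 2)


if __name__ == "__main__":
    cluster = [
        (2, 2),   # 2 head nodes, 2 sockets each
        (64, 2),  # 64 compute nodes, 2 sockets each
        (3, 1),   # 3 single-socket nodes
    ]
    print(socket_pairs(cluster))
```

Actual subscription terms should be taken from SUSE's own documentation; the function only shows how per-cluster accumulation differs from counting each node separately.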

Keep It Running!

• SUSE Vendor Support
‒ All levels of support for the whole system
‒ Maintenance, Standard, Priority
• Intel Enterprise Lustre Support
‒ Get Lustre support from Intel/Whamcloud

Technical Update

SUSE Linux Enterprise Server 12

• Four major virtualization technologies
‒ Xen, KVM, LXC, Docker¹
• Workload management with systemd
‒ Prioritization with cgroups
• Tracing tools for software optimization
‒ LTTng with graphical frontend

¹Docker provided as technical preview – see release notes
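Prioritization with control groups is commonly expressed as relative CPU weights (the `cpu.shares` style): when all groups compete for the CPU, each receives time in proportion to its weight. The sketch below works through that proportion. The group names and weights are hypothetical, and the result only describes the fully contended case, since idle capacity is normally redistributed.

```python
from __future__ import annotations


def cpu_fractions(shares: dict[str, int]) -> dict[str, float]:
    """Return each group's fraction of CPU time under full contention.

    shares maps a group name to its relative weight; the fraction is
    weight / sum of all weights.
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one group needs a positive weight")
    return {name: weight / total for name, weight in shares.items()}


if __name__ == "__main__":
    # Hypothetical workload groups on one compute node.
    weights = {"simulation": 2048, "postprocessing": 1024, "monitoring": 1024}
    for name, fraction in cpu_fractions(weights).items():
        print(f"{name}: {fraction:.0%}")
```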

LTTng viewer

SUSE Linux Enterprise Server 12

• Machinery module
‒ KIWI image creation
‒ cfengine, puppet
‒ System verification & analysis
• Updated Stack
‒ Kernel, Tools, pNFS, OFED, openmpi, chipset support

Machinery


www.suse.com/products/server/hpc.html

Backup

SuperMUC

SuperMUC – SUSE benefits

• Great support experience
‒ Cooperation for more than 15 years
‒ Backed by SUSE's winning support
• Support for Itanium2 and x86
‒ Smooth migration of old to new system
‒ No additional staff training needed
• Easy deployment methods
‒ SUSE's AutoYaST used today
‒ Other SUSE offerings – SUSE Cloud, SUSE Manager – considered

“We have relied on SUSE Linux Enterprise Server for 15 years, and have always been very satisfied. The SUSE team is close at hand, should we require support or guidance. We have received highly competent support over the years, and look forward to collaborating with them.”

– Dr. Herbert Huber, Division Head of Supercomputing, Leibniz Rechenzentrum

LRZ - Leibniz Rechenzentrum

Europe’s supercomputer runs SUSE Linux Enterprise Server

Business challenge:
LRZ is part of the Gauss Centre for Supercomputing (GCS), which operates the most powerful HPC infrastructure in Europe. It needs to provide researchers across Europe with a reliable and powerful HPC platform that enables users to make faster progress in their complex research projects. To reduce the environmental impact of HPC, the institution aimed to improve energy efficiency and to leverage established automation solutions to maximise the efficiency and manageability of the new supercomputing platform.

Benefits:
• Completed an easy and smooth migration from the previous Itanium 2 infrastructure to the new x86 processor architecture
• Considerably simplified configuration and automation of the new system, using the automation capabilities of AutoYaST (integrated with SLES)
• Improved energy efficiency: SuperMUC delivers approx. 20 times more performance per watt than its predecessor
• Boosted overall performance by a factor of 60

Solution:
Working with SUSE and IBM, LRZ implemented SuperMUC with approx. 9,400 general-purpose computing nodes, a peak performance of three Petaflop/s, 155,000 Intel Xeon processor cores and more than 300 TB of main memory. LRZ chose to run SuperMUC on SUSE Linux Enterprise Server, leveraging SUSE’s proven HPC expertise and leading automation tools such as AutoYaST, which allows systems to be installed without manual intervention.


SuperMUC – Facts

• 60x faster, one of the fastest HPC systems in Europe
• 20x better performance per watt, providing green HPC
• > 155,000 Intel Xeon processor cores, migration from Itanium2 to x86
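The two headline factors can be combined into a rough estimate of how power draw changed. Since performance per watt is performance divided by power, the power ratio is the performance factor divided by the efficiency factor. The sketch below does that division, assuming both figures compare SuperMUC with the same predecessor and use the same performance measure, which the slides suggest but do not state.

```python
def implied_power_ratio(performance_factor: float, efficiency_factor: float) -> float:
    """Estimate new_power / old_power from two improvement factors.

    performance_factor: new performance / old performance
    efficiency_factor:  new (performance per watt) / old (performance per watt)
    """
    if efficiency_factor <= 0:
        raise ValueError("efficiency factor must be positive")
    return performance_factor / efficiency_factor


if __name__ == "__main__":
    # Factors quoted on the slide: 60x performance, 20x performance per watt.
    print(implied_power_ratio(60, 20))
```

Under those assumptions the system delivers 60 times the performance for roughly three times the power of its predecessor.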


SuperMUC – System Overview

SuperMUC – Business Aspects

• Hot Water Cooling – reduce cooling cost
‒ Use free air cooling
‒ Use of system heat for heating and technical processes
• RAS driven – high system availability
‒ Fully maintained SUSE Linux Enterprise Server
‒ Full support via IBM and SUSE
• Automated deployment – less management cost
‒ Full use of SUSE's AutoYaST feature

HPC Stack

Challenge

HPC market still developing

Stack components provided by various vendors

Some stack components run in parallel

Mix of small and big vendors

Segmented into commercial and scientific

HPC Stack

[Layer diagram; its legend distinguished SUSE partner, SUSE supported, and SUSE future components. The per-component marking is not recoverable from the transcript.]

• Application
• Queuing / Management Software & Tools: PBS Pro, Moab, IBM LSF, Bright CM
• Message Passing Interface (MPI): SGI, Intel, Parastation, HP, MPICH, openMPI
• Network: OFED, 10G, TCP offload
• Storage: OCFS2, NFS, pNFS, GPFS, IBRIX, cephFS, EXT3, XFS, BTRFS, Lustre
• SUSE Linux Enterprise Server
• Hardware


Storage Options

File Systems – Today

• Local storage
‒ Maintain existing capabilities (e.g. EXT3, XFS)
‒ Full btrfs support, improving manageability
‒ Maximum flexibility for customers
• Expand network filesystem capabilities (NFSv4.x/pNFS)
‒ Improve performance, reliability and security
‒ pNFS client support today; server support planned for a later version of SUSE Linux Enterprise

File Systems – BTRFS

• Integrated Volume Management
• Support for copy on write
• Powerful snapshot capabilities
• Scalability
• Other Capabilities:
‒ Compression
‒ Data integrity (checksums)
‒ SSD optimization
• Status:
‒ SLE 11: Fully supported
‒ SLE 12: Planned as default file system

File Systems – CEPH

• Ceph is a scalable open source storage platform comprised of an object store (RADOS), a block store (RBD), a POSIX-compatible distributed file system (Ceph FS), and an Amazon S3 integration
• Ceph has been integrated with OpenStack and is included in the Linux kernel

Distributed Storage System Market

• IDC predicts that by 2015, combined spending for public and private cloud storage will be $22.6 billion worldwide
• Gartner predicts that by 2016
‒ more than one third of consumer data will be stored in cloud storage
‒ storage will grow from 329 exabytes in 2011 to 4.1 zettabytes (12x)

File Systems – Cluster

• OCFS2
‒ Superior cluster file system for up to 32 nodes
‒ Scalable network access via CTDB
‒ Used for big storage and user directories
• GPFS
‒ 3rd party offering by IBM
• IBRIX
‒ 3rd party offering by HP

File Systems – Lustre

• Maintenance Release 2.1
‒ Available for SUSE Linux Enterprise Server 11 SP1+
• Maintenance Release 2.4
‒ Available for SUSE Linux Enterprise Server 11 SP2+
• Client already accepted by Whamcloud (http://downloads.whamcloud.com)
• SUSE sponsored and developed port provided to the community

http://drivers.suse.com/lustre/


Thank you.

Learn More

www.suse.com/products/server/hpc.html

Corporate Headquarters
Maxfeldstrasse 5
90409 Nuremberg
Germany

+49 911 740 53 0 (Worldwide)
www.suse.com

Join us on: www.opensuse.org
