disclaimers & notices - suse · 2017-11-20 · • intro to suse openstack cloud and red hat...
1 1 ©2016 SGI
Disclaimers & Notices
This presentation contains forward-looking statements and plans that are subject to risks and uncertainties. All products, dates and information are preliminary and subject to change without notice. SGI may choose not to make generally available any product or features discussed in this presentation. SGI, the SGI logo, SGI UV, SGI ICE, SGI InfiniteData, NUMAlink, SGI InfiniteStorage, Rackable, OpenSHMEM, and MEMlog are trademarks or registered trademarks of Silicon Graphics International Corp. or its subsidiaries in the United States and/or other countries. Intel and Xeon are registered trademarks of Intel Corporation. All other trademarks are property of their respective holders.
SGI Cloud Reference Architecture on SUSE OpenStack Cloud 6
Brian Payton, SGI
Bryan Gartner, SUSE
SUSEcon 2016
SGI is Production Supercomputing
Building complete solutions
• Decades of experience from silicon to systems to software
• We listen to customers and analyse workflows
Delivering complete solutions
• Compute, Storage, Data Management, System Management
• Optimal solutions for any HPC requirement
Supporting complete solutions
• End-to-end services and support covering planning, implementation, and management
• Qualified experts who advise customers on applications with any level of complexity
THIS is the box it comes in
Best in class software ecosystem
• Best of breed SGI and third party software
• Running on industry standard OS
• SGI Optimized, SGI Validated, SGI Integrated
Solid system management tools
• Fast multicast OS Provisioning
• Efficient and effective power management
• Health Management
• Live integration capability
• SGI Remote Services
Best User Experience
• Robust development suite
• Efficient workload scheduling
• Optimized Applications
• Ease of Use
• Provision thousands of nodes in minutes
• Limit power use at the job, node, rack, or system level to fit the power envelope
• Insulate users and avoid costly downtime from memory errors
• Scale systems seamlessly with multi-generation processors
• Real-time scheduling of service as required
SGI HPC Software
Commercial HPC Software
Open Source Software
Best in class Services and Support
• Expertise in HPC, Data Management, Visualization, Systems Management, and HPDA
• ~330 Service Professionals covering >25 countries
• 63 Depots Worldwide
• Follow-The-Sun Model
• Customer Service Centers in US, UK, Australia, and Japan
Responsive and Flexible
• Custom Solutions – One size does not fit all
• Committed Account and Technical Resource Teams
An opportunity exists to integrate OpenStack Cloud and High Performance Computing, providing cloud tools to HPC users and HPC resources to cloud users. Success starts with a strong foundation.
SGI Cloud Reference Architecture
SGI HPC Cloud
• Horizon, Nova, Heat
• Cinder, Neutron, Glance
• Keystone, Pacemaker, RabbitMQ, Database
• Tempest
• Ceph
• Network
Certified and Validated HPC Hardware
• SGI UV, SGI Rackable, high performance networks, validated and certified to be production ready and enterprise class
SGI Services and Support
• Deployment, knowledge transfer, on-site and remote services
Goals
• Open source community
• UV Enterprise Class compute node
• Validation
• Scalability
• Fault Tolerance
SGI UV™ 300 Enterprise Class Compute Node
• Fully functional compute node
• Linearly scalable
  – 4 to 64 processors
  – 2 TB to 64 TB of memory
• Reliability, Availability, and Serviceability
  – SGI MEMlog™
  – Hardware Event Tracker - Phone home support
Validation
• SUSE OpenStack Cloud 6 certification
• Tempest full test suite
• 3-node, 3-network Heat stack
• Evolve above tests to validate new features
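As an illustration of the 3-node, 3-network Heat stack mentioned on this slide, here is a minimal sketch that builds a HOT-style template as a Python dictionary. The slide does not show the actual template, so the layout (every node attached to every network), the CIDR ranges, and the image and flavor names are all assumptions made for the example.

```python
import json


def build_heat_template(node_count=3, network_count=3,
                        image="hypothetical-image",
                        flavor="hypothetical-flavor"):
    """Build a HOT-style dict with node_count servers on network_count networks.

    Image, flavor, and CIDR values are placeholders, not values from the deck.
    """
    resources = {}
    for n in range(network_count):
        resources[f"net{n}"] = {"type": "OS::Neutron::Net"}
        resources[f"subnet{n}"] = {
            "type": "OS::Neutron::Subnet",
            "properties": {
                "network": {"get_resource": f"net{n}"},
                "cidr": f"10.0.{n}.0/24",  # assumed address range
            },
        }
    subnet_names = [f"subnet{n}" for n in range(network_count)]
    for s in range(node_count):
        resources[f"node{s}"] = {
            "type": "OS::Nova::Server",
            # Wait for the subnets so each network can hand out an address.
            "depends_on": subnet_names,
            "properties": {
                "image": image,
                "flavor": flavor,
                "networks": [
                    {"network": {"get_resource": f"net{n}"}}
                    for n in range(network_count)
                ],
            },
        }
    return {"heat_template_version": "2015-10-15", "resources": resources}


template = build_heat_template()
print(json.dumps(template, indent=2))
```

The printed JSON could be saved to a file and adapted; a real validation stack would likely add key pairs, security groups, and outputs that are not shown on the slide.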
Scalability
• SLES12 SP1 High Availability Extension
  – Add control and compute nodes for additional service resources
• SUSE Enterprise Storage 2.1
  – Add nodes and disks for performance
• Separate, bonded networks
  – Team mode 6
  – Automatic load balancing
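The "Team mode 6" bullet most likely refers to the Linux bonding driver's mode numbering, in which mode 6 is balance-alb (adaptive load balancing). That reading is an assumption; the slide does not spell it out. The sketch below lists the standard bonding modes and notes which ones generally need matching switch configuration.

```python
# Linux bonding driver modes, keyed by their numeric identifier.
BONDING_MODES = {
    0: "balance-rr",
    1: "active-backup",
    2: "balance-xor",
    3: "broadcast",
    4: "802.3ad",
    5: "balance-tlb",
    6: "balance-alb",
}

# Modes that generally require port grouping or LACP on the switch side.
NEEDS_SWITCH_SUPPORT = {0, 2, 3, 4}


def describe_mode(mode):
    """Return a short description of a bonding mode number."""
    name = BONDING_MODES[mode]
    if mode in NEEDS_SWITCH_SUPPORT:
        return f"mode {mode} ({name}): needs switch configuration"
    return f"mode {mode} ({name}): no special switch configuration"


print(describe_mode(6))
```

Under this reading, mode 6 balances both transmit and receive traffic across the bonded links without special switch support, which fits the "automatic load balancing" bullet.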
Fault Tolerance
• SLES12 SP1 High Availability Extension
  – Pacemaker failover for multiple control and compute nodes
• SUSE Enterprise Storage 2.1
  – Ceph replication and resiliency
• Separate, bonded networks
  – Network survives hardware failures
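To make the cost of Ceph replication concrete, the following sketch estimates usable capacity for a replicated pool. The replica count of 3 and the 120 TB raw figure are assumptions chosen for the example; the deck does not state the pool settings or cluster size, and the calculation ignores overhead such as free-space headroom.

```python
def usable_capacity_tb(raw_tb, replicas=3):
    """Approximate usable capacity when every object is stored `replicas` times."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return raw_tb / replicas


def copies_that_can_be_lost(replicas=3):
    """Number of copies of an object that can be lost while one copy remains."""
    return replicas - 1


raw = 120  # hypothetical raw capacity in TB
print(usable_capacity_tb(raw), copies_that_can_be_lost())
```

The trade-off is direct: each extra replica improves resiliency but divides usable capacity further, which is why adding nodes and disks is the scaling path named on the previous slide.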
Production Ready OpenStack from SGI and SUSE!
• The SGI OpenStack HPC Cloud Reference Architecture pairs innovative SGI HPC hardware with the leading OpenStack cloud software distributions
• For those who want to build out cloud capabilities in their own data center
• With SGI UV, we are able to deploy the largest virtual machines available on the market today for flexible in-memory compute applications
• Platform fully tested and certified to run SUSE OpenStack Cloud 6 as a production ready, enterprise-class cluster
• We are the only SUSE OpenStack Cloud Reference Architecture to use 2x database nodes in an HA configuration for increased reliability and platform stability
• We are the only SUSE OpenStack Cloud Reference Architecture to use separate networks for storage, management, and compute – increasing performance where it counts
• We are the only SUSE OpenStack Cloud Reference Architecture to use HPC-specific hardware with SGI UV
SGI OpenStack Reference Architecture
SGI Rackable and SGI UV: Certified and Validated Systems, Production ready
SGI Services and Support: Deployment, knowledge transfer, on-site and remote services
Commercially Supported SUSE OpenStack Cloud: Nova, Ceph, Cinder, Keystone, Glance, Heat, Ceilometer, Tempest, Crowbar
Running Distributed Virtual Machines
Distributed HPC Applications
Data Analytics Applications
Infrastructure support systems
SGI Hardware and SUSE OpenStack Cloud
User defined
SGI OpenStack Reference Architecture
Node roles (hardware: services)
• Compute (SGI UV / SGI Rackable): Nova
• Storage (SGI Rackable): Ceph
• Control (SGI Rackable): Keystone, Glance, Cinder, Heat, Ceilometer, Tempest
• Admin (SGI Rackable): Crowbar
• Control Data (SGI Rackable): RabbitMQ, Database
Networks: Admin Network, Storage Network, External Network, Customer Network
Switches: three shown in the diagram
• Bonded interconnects for performance and reliability
• High performance networks designed for HPC workloads
• Compute, storage, and networking in ratios optimal for HPC workloads
• A proven, certified, and scalable foundation for your HPC Cloud.
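The role layout on this slide can be written down as a small data structure, which is handy when checking where a given service is expected to run. This is only a sketch of the diagram's labels; the dictionary keys and the `roles_for` helper are names invented for the example.

```python
# Node roles from the reference architecture diagram: hardware and services.
ROLES = {
    "admin": {"hardware": ["SGI Rackable"], "services": ["Crowbar"]},
    "control": {
        "hardware": ["SGI Rackable"],
        "services": ["Keystone", "Glance", "Cinder", "Heat",
                     "Ceilometer", "Tempest"],
    },
    "control data": {
        "hardware": ["SGI Rackable"],
        "services": ["RabbitMQ", "Database"],
    },
    "compute": {"hardware": ["SGI UV", "SGI Rackable"], "services": ["Nova"]},
    "storage": {"hardware": ["SGI Rackable"], "services": ["Ceph"]},
}


def roles_for(service):
    """Return the sorted list of roles that host the given service."""
    return sorted(role for role, spec in ROLES.items()
                  if service in spec["services"])


print(roles_for("Nova"))
print(roles_for("RabbitMQ"))
```

Keeping the message queue and database on their own "control data" role mirrors the deck's point about dedicated database nodes in an HA configuration.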
SGI OpenStack RA Use Cases
HPC Users
Problem: HPC queue times are long for smaller job runs
Solution: A virtual HPC cluster provides a centralized, secure and flexible infrastructure to support more users and lower queue times
IT
Problem: Infrastructure applications (wikis, websites, knowledge bases, etc.) are running on dedicated systems without central management
Solution: Use an internal cloud infrastructure to run the applications in virtual machines for central resource management and dynamic scaling
Private HPC Cloud
Problem: Multiple users and groups require compartmentalized HPC compute resources
Solution: Design an OpenStack cluster with necessary HPC hardware resources: SGI UV and SGI Rackable
Resource Setup
Problem: Virtual infrastructure requires IT support which may take days or weeks depending on queue size
Solution: Users are given a self service portal to spin up their own virtual infrastructure in minutes
Next Steps
• Current Investigation Areas
  – Certify SUSE OpenStack Cloud 7
  – InfiniBand
  – Virtual Abaqus solution
  – 512 node CXFS engineering test cluster
  – Add metrics to validation
• Input?
Resources
• SGI.COM
  – sgi.com > products > software > openstack
  – http://www.sgi.com/products/software/openstack.html
    • Intro to SUSE OpenStack Cloud
    • Intro to Red Hat OpenStack Platform
    • Data Sheet
• SALES.CORP.SGI.COM
  – sales.corp.sgi.com > products > software > hpc cloud
  – http://sales.corp.sgi.com/products/software/hpc_cloud.html
    • Intro to SUSE OpenStack Cloud and Red Hat OpenStack Platform
    • SGI product support
    • Sales trainings
    • Software compatibility matrices
    • Tony’s notes and links to sessions from the last 2 OpenStack Summits
• SUSE.COM
  – suse.com > products & solutions > SUSE OpenStack Cloud > Resources
  – https://www.suse.com/products/suse-openstack-cloud
    • SUSE customer facing reference architecture documentation
SGI CONFIDENTIAL