Introduction to HPC in Canada
Erming Pei
Research Computing Group, UAlberta
Compute Canada / WestGrid
Outline & Schedule
• 10:00 Introduction to Compute Canada (15’)
• 10:15 Introduction to WestGrid (15’)
• 10:30 Q&A 1 (5’)
• 10:35 Break (10’)
• 10:45 Introduction to HPC (40’)
• 11:25 Q&A 2 (5’)
Introduction to Compute Canada
About Compute Canada
• Compute Canada integrates 4 regional HPC consortia across the country
  – provides a shared HPC/ARC infrastructure across Canada
  – supports world-class, leading-edge research activities.
• CC aggregates petaflops of computing power and petabytes of storage capacity over Canada's high-performance networks.
• CC provides comprehensive services, including infrastructure, applications, operations, and user support, for users nationwide.
Compute Consortia
Previously, there were 7 consortia:
• ACENET
• CLUMEQ
• RQCHP
• HPCVL
• SciNet
• SHARCNET
• WestGrid
These have now been consolidated into four consortia:
• WestGrid
• Compute Ontario
• Calcul Québec
• ACENET
Existing Systems & Resources
• ~40 universities
• ~27 data centres
• ~50 systems
• ~200,000 cores, 2 Pflops, 20 PB
• ~100 research software packages
• ~200 experts in the use of ARC for research
https://www.westgrid.ca/events/responding-to-canadas-research-computing-needs
New CC Systems
• UVic, GP1 (Cloud)
• SFU, GP2 (General Purpose)
• UW, GP3 (General Purpose)
• UofT, LP (Large Parallel)
Schedule of New CC Systems

Site/Service | Description | Availability | Resource
GP1 - UVic | Large OpenStack cloud | Sept. 2016 | 3,000 cores + 40% expansion (2017)
GP2 - SFU | General-purpose cluster + cloud partition | Feb. 2017 | 18,000 cores + 40% expansion (2017); 192 GPU nodes
GP3 - Waterloo | General-purpose cluster + cloud partition | May 2017 | 19,000 cores + 40% expansion (2017); 64 GPU nodes
LP - UToronto | Large parallel cluster | Dec. 2017 | 66,000 cores
National Storage Infrastructure | HSM + object storage (all 4 sites) | Oct. 2016 | Dozens of PBs (10 PB to start)
https://www.computecanada.ca/renewing-canadas-advanced-research-computing-platform/new-systems-at-four-national-sites/
Continuing Development
• Consolidation by 2018:
  – 5-10 data centres
  – 300,000 cores, 12 Pflops, 50+ PB
• 2016-17: commissioning new systems while decommissioning old ones
CC New Organization Chart
https://staff.computecanada.ca/national_teams/chart
[Organization chart figure: Compute Canada national teams and services (cloud, monitoring, networking, storage, visualization, bioinformatics, administration, etc.); see the link above]
CC Cloud Service
• Compute Canada currently has two main cloud systems: Cloud West and Cloud East
Access CC clouds
• Cloud East: http://east.cloud.computecanada.ca
• Cloud West: http://west.cloud.computecanada.ca
• Can access with your CC account
OwnCloud
• A Dropbox-like cloud storage service
  – hosted by WestGrid
• Accessible with your WestGrid username/password
Globus Online
• High performance data transfer service• https://globus.computecanada.ca
Globus Online
• Requires MyProxy authentication (WestGrid login/password)
• Can select existing endpoints (GridFTP services at sites)
• Can create your own personal endpoint with “Globus Connect Personal”
Intro to WestGrid
About WestGrid
• WestGrid is one of the four regional HPC consortia of Compute Canada
• WestGrid itself has 15 partner institutions across British Columbia, Alberta, Saskatchewan and Manitoba.
[Map of partner institutions: UBC, SFU, UVic, UNBC, TRU, UofA, UofC, ULeth, AU, Banff Centre, USask, UofR, UofM, UofW, BU]
Overall Resources
• To date, WestGrid has more than 40,000 compute cores and 9 PB of storage space.
• About 1,000 Compute Canada users from 475 projects are currently using WestGrid systems.
Text and image source: Lindsay Sill, Intro to WestGrid 2013 (figures from 2012/13)
* HQP stands for highly qualified personnel
WestGrid Staff
• Executive Director (Lindsay Sill)
• Director of Operations (Patrick Mann)
• Collaboration Coordinator
• Visualization Coordinator
• Site Leads
• Programmers
• System Analysts
• System Administrators
WestGrid Facilities, UofA (Jasper)
• Processors: 4160 cores
  – 240 nodes with Xeon X5675 processors, 12 cores (2 x 6) and 24 GB of memory
  – 160 nodes with Xeon L5420 processors, 8 cores (2 x 4) and 16 GB of memory
• Interconnect:
  – InfiniBand QDR, 40 Gbit/s, with a 1:1 blocking factor
  – InfiniBand DDR, 20 Gbit/s, with a 2:1 blocking factor
• Storage: ~830 TB (356 TB Lustre + 280 TB storage servers + 192 TB IS10K)
• Quickstart: http://www.westgrid.ca/support/quickstart/jasper
WestGrid Facilities, UofA (Hungabee)
• Processors: 2048 cores
  – A shared-memory multiprocessor comprising an SGI UV100 login node and an SGI UV1000 computational node, with 16 TB of memory
• Interconnect: ccNUMA (cache-coherent non-uniform memory access), a combination of Intel's QuickPath and SGI's NUMAlink
• Storage: 53 TB NFS, and 356 TB Lustre shared with Jasper
• Quickstart: www.westgrid.ca/support/quickstart/hungabee
WestGrid Facilities, UBC (Orcinus)
• Processors: 9600 cores (3072 Intel Xeon E5450 quad-core/16 GB RAM + 6528 Xeon X5650 six-core/24 GB RAM)
• Storage: ~450 TB, Lustre
• Quickstart: www.westgrid.ca/support/quickstart/orcinus
WestGrid Facilities, UofC (Breezy)
• Processors: 384 cores (16-node Appro AMD cluster; quad-socket, 6-core AMD Istanbul processors (24 cores @ 2.4 GHz) per node; 256 GB RAM/node)
• Interconnect: 4X DDR InfiniBand
• Storage: ~450 TB, IBRIX
• Quickstart: http://www.westgrid.ca/support/quickstart/breezy
WestGrid Facilities, UofC (Lattice)
• Processors: 4096 cores
  – 512 nodes, each with two 4-core Intel Xeon L5520 processors (8 cores/node) and 12 GB of memory
• Interconnect:
  – InfiniBand 4X QDR (Quad Data Rate), 40 Gbit/s, 2:1 blocking
• Storage: 160 TB shared with Parallel and Breezy
• Quickstart: http://www.westgrid.ca/support/quickstart/lattice
WestGrid Facilities, UofC (Parallel)
• Processors: 7056 cores
  – 528 standard nodes, each with 12 cores (two 6-core Xeon E5649) and 24 GB of RAM
  – 60 special nodes with 3 GPGPUs each (NVIDIA Tesla M2070, 5.5 GB of memory per GPU)
• Interconnect:
  – InfiniBand 4X QDR (Quad Data Rate), 40 Gbit/s, 2:1 blocking
• Storage: 160 TB shared with Lattice and Breezy
• Quickstart: http://www.westgrid.ca/support/quickstart/parallel
WestGrid Facilities, UM (Grex)
• Processors: 3792 cores (316-node SGI Altix XE cluster; each node has two 6-core Intel Xeon X5650 2.66 GHz processors and 48-96 GB of RAM)
• Interconnect: non-blocking InfiniBand 4X QDR
• Storage: >100 TB
• Quickstart: www.westgrid.ca/support/quickstart/grex
WestGrid Facilities, UVic (Hermes/Nestor)
• Processors: 4416 cores [2112 (Hermes), 2304 (Nestor)]
  – IBM iDataPlex servers with eight 2.67-GHz Xeon X5550 cores and 24 GB of RAM
  – Dell C6100 servers with twelve 2.66-GHz Xeon X5650 cores and 24 GB of RAM
• Interconnect:
  – 84 Hermes nodes use two bonded Gigabit/s Ethernet links
  – Newer Hermes nodes: 4X QDR non-blocking, 32-40 Gb/s
• Storage: 1.2 PB, GPFS
• Quickstart: www.westgrid.ca/support/quickstart/hermes_nestor
WestGrid Facilities, SFU (Bugaboo)
• Processors: 4584 cores
  – 16 nodes with 4-core Intel Xeon E5430 processors, 16 GB/node
  – 254 nodes with 6-core Xeon X5650 processors, 24 GB/node
  – 16 nodes with quad-core Xeon X5355 processors, 16 GB/node
• Interconnect: InfiniBand using a 288-port QLogic switch
• Storage: ~700 TB
• Quickstart: www.westgrid.ca/support/quickstart/bugaboo
WestGrid Facilities, USask (Silo)
• Disk: 4.2 PB raw total, 3.15 PB usable
  – 600 x 1 TB SATA drives, RAID 6
  – 1800 x 2 TB SATA drives, RAID 6
• Tape: IBM 3584 tape library (LTO)
  – ~3 PB total; 1460 x LTO4 tapes, 920 x LTO5 tapes
• Backup system: IBM Tivoli Storage Manager (TSM)
• Quickstart: http://www.westgrid.ca/support/quickstart/silo
Site Status
https://www.westgrid.ca/support/system_status
Use CC/WestGrid
• Apply for a CC/WestGrid account
• Get a Grid certificate / proxy
• Existing resource classification
• New resource allocation
• Software
• Site status
• Technical support
CC/WestGrid Account
1. First, ask your PI to apply for a Compute Canada account if he/she doesn't already have one.
2. Then apply for your own Compute Canada account as part of your PI's project:
https://www.westgrid.ca/support/accounts/getting_account
3. Your PI approves your application.
4. Apply for a consortium account, e.g. WestGrid or ACENET.
Note: It takes a couple of days for your account to be created on all sites.
Grid Certificate
1. Log in to http://portal.westgrid.ca and click “Request a Grid Certificate”.
2. On the “My Account” webpage, you will see two buttons for downloading your Grid certificate and private key.
Grid Proxy
• A Grid proxy is used for submitting Grid jobs or transferring files across the Grid (it has a limited lifetime and limited privileges).
• Users just need to log in to any WestGrid site and run:
  – myproxy-logon
Resource Classification
Program Type | Sites
Serial | Bugaboo, Hermes, Jasper
Parallel | Bugaboo, Nestor, Orcinus, Lattice, Parallel, Jasper, Grex
SMP Parallel | Breezy, Hungabee
Large memory | Grex, Breezy, Hungabee
Visualization | Parallel
Gaussian | Grex
Matlab | Orcinus (Distributed Computing Toolbox), Jasper/Hungabee (UofA license), etc.
Storage | Silo, Bugaboo
Software
• WestGrid has both free and commercial software.
• You can use the software packages installed on WestGrid.
  – Check the software list webpage to see whether a given software release is already available on WestGrid.
• Software list webpage: https://www.westgrid.ca/support/software
WestGrid support
For any questions, you can email [email protected]
New Resource Allocation
• RAC (Resource Allocation Competition)– https://www.westgrid.ca/support/accounts/resource_allocations
• RAC = RPP + RRG
  – RPP: Research Platforms and Portals (scientific/technical review needed)
  – RRG: Resources for Research Groups (scientific/technical review needed)
• RAS: Rapid Access Service (formerly “Default Allocation”); no scientific/technical review needed
Email: [email protected] / [email protected]
New RAC Schedule
Introduction to HPC
Outline
• What is HPC
• Capability vs. Capacity
• Programming model
  – Serial/Parallel
• Architecture
  – SMP/DSM/MPP, UMA/NUMA/COMA
• Interconnect
  – PCI(E)/InfiniBand/NUMAlink
• Storage
  – RAID, Multipathing, Data Bus
  – DAS/NAS/SAN
  – Parallel File Systems
• Evolution of Computing
  – Mainframe, Cluster, Grid, Cloud, Big Data
What is HPC?
• High Performance Computing most generally refers to the practice of aggregating computing power in a way that delivers much higher performance than one could get out of a typical desktop computer or workstation, in order to solve large problems in science, engineering, or business.
Capability vs. Capacity
• Capability computing is typically thought of as using the maximum computing power to solve a single large problem in the shortest time.
  – e.g. a real-time weather simulation and prediction application.
• Capacity computing, in contrast, is typically thought of as using many cost-effective computing resources to solve a large number of small problems or a small number of big problems.
  – e.g. huge numbers of users accessing a web service simultaneously, or
  – analyzing a huge amount of HEP data by splitting it into many small pieces and distributing them across multiple cluster nodes (see the sketch below).
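To make the capacity pattern concrete, here is a minimal sketch (the chunking scheme and the analyze() workload are invented for illustration, and a local process pool stands in for cluster nodes): the dataset is split into independent pieces that are processed in parallel and then combined.

```python
# Capacity computing in miniature: many independent small tasks, farmed out
# to a pool of workers. analyze() is a toy stand-in for a real analysis job.
from multiprocessing import Pool

def analyze(chunk_id):
    # Each chunk is processed independently of all the others.
    return sum(i * i for i in range(chunk_id * 1000, (chunk_id + 1) * 1000))

if __name__ == "__main__":
    chunks = range(100)                      # 100 independent work units
    with Pool(processes=8) as pool:          # 8 workers stand in for nodes
        results = pool.map(analyze, chunks)  # scatter chunks, gather results
    print("combined result:", sum(results))
```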
Spectrum
• Capability → Capacity
Hungabee
• Single system
• 2048 cores
• 16 TB memory
• High-speed interconnect

Breezy
• 16 fat-node cluster
• 256 GB/node

Bugaboo
• 256+ node cluster
• 16-24 GB/node

BlueGene/Q
• 4096 low-power nodes
• 65536 processor cores
Architectures
• By processor
  – SMP (Symmetric Multi-Processors)
  – DSM (Distributed Shared Memory)
  – MPP (Massively Parallel Processors)
• By memory
  – UMA (Uniform Memory Access)
  – NUMA (Non-Uniform Memory Access)
  – COMA (Cache Only Memory Access)
Evolution of Architectures
[Diagram: evolution of architectures across message passing, UMA, NUMA, and COMA]
Programming Model
• Serial
  – Instructions are executed one after another on a single CPU.
• Parallel
  – Computations are carried out concurrently on multiple processors.
  – SPMD: single program, multiple data (see the sketch below)
  – MPMD: multiple programs, multiple data
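As a minimal SPMD sketch (assuming the mpi4py package, which the talk does not mention): every rank runs the same program and uses its rank to pick its own share of the data.

```python
# SPMD sketch: the SAME program is launched N times, e.g.
#   mpiexec -n 4 python spmd.py
# Assumes the mpi4py package. Each copy ("rank") works on its own data.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()   # this copy's ID: 0 .. size-1
size = comm.Get_size()   # how many copies were launched

# One program, different data per rank: each rank sums a strided slice.
local_sum = sum(range(rank, 1_000_000, size))

# A collective reduction combines the partial sums on rank 0.
total = comm.reduce(local_sum, op=MPI.SUM, root=0)
if rank == 0:
    print("sum of 0..999999 =", total)
```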
Parallel Programming Paradigms/Tools
– Data Parallel
  • HPF (High Performance Fortran)
– Task Parallel
  • OpenMP (Open Multi-Processing)
– Message Passing (see the sketch below)
  • PVM (Parallel Virtual Machine)
  • MPI (Message Passing Interface)
    – MPICH, Open MPI, etc.
– Hybrid (MPI+OpenMP, MPI+GPGPU)
– Advanced: Chapel, PGAS (Partitioned Global Address Space)
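For the message-passing paradigm, a hedged point-to-point sketch (again assuming mpi4py): rank 0 hands out tasks with send() and collects answers with recv(), a simple task-farm pattern.

```python
# Point-to-point message passing (task farm), assuming mpi4py:
# rank 0 sends each worker a task; workers compute and send results back.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

if rank == 0:
    # Coordinator: one task per worker, then collect the replies.
    for worker in range(1, size):
        comm.send({"start": worker * 10, "count": 10}, dest=worker, tag=1)
    for worker in range(1, size):
        print("result from rank", worker, "=", comm.recv(source=worker, tag=2))
else:
    # Worker: receive a task, do the work, return the result.
    task = comm.recv(source=0, tag=1)
    result = sum(range(task["start"], task["start"] + task["count"]))
    comm.send(result, dest=0, tag=2)
```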
Interconnect
• PCI
• PCI Express
• InfiniBand
• HyperTransport (AMD)
• QPI/Omni-Path (Intel)
• NUMAlink (SGI)
Serial vs. Parallel
• In the early days, serial connections were reliable but quite slow, so parallel connections were developed to send multiple pieces of data simultaneously.
• Later it turned out that parallel connections have their own problems: electromagnetic interference between wires.
• So the pendulum swung back to highly-optimized serial connections.
Serial → Parallel → Serial
PCI/PCI-X
• PCI: Peripheral Component Interconnect (32-bit)
• PCI-X: PCI-eXtended (64-bit)
Image source: http://www.altera.com/products/ip/altera/t-alt-pci_soln.html
Electromagnetic interference and signal degradation are common in parallel connections, and they slow the connection down. The additional bandwidth of the PCI-X bus means it can carry more data, but it also generates even more noise.
PCI-Express
A single PCI Express lane can handle 200 MB/s; a 16X PCI-E connector can reach 6.4 GB/s (16 lanes at 200 MB/s, counting both directions).
• Instead of using parallel connections, PCI-E uses switch-controlled point-to-point serial connections.
• Every device has its own dedicated connection, so devices no longer share bandwidth as they do on a normal data bus.
Image source: http://computer.howstuffworks.com/pci-express2.htm
Infiniband
Image source: http://wiki.expertiza.ncsu.edu/index.php/CSC/ECE_506_Fall_2007/wiki4_001_a1
InfiniBand
• The internal connections in most computers are inflexible and relatively slow.
• As I/O increases, the existing bus system becomes a bottleneck.
• Through InfiniBand switches, InfiniBand channels are created to connect hosts (HCAs) and I/O targets (TCAs).
• Instead of sending data in parallel across the backplane bus, InfiniBand specifies a serial bus.
  – The serial bus can also carry multiple channels of data at the same time by multiplexing the signal.
[Table: InfiniBand theoretical throughput in Gb/s]
Infiniband vs. PCI/PCI-Express
http://www.mellanox.com/pdf/whitepapers/PCI_3GIO_IB_WP_120.pdf
Storage
• Storage protocols
• I/O bus
  – Serial vs. Parallel
• Redundancy
  – RAID (Redundant Array of Inexpensive Disks)
  – Multipathing (redundant physical paths)
• Storage attachment approaches
  – DAS (Direct Attached Storage)
  – NAS (Network Attached Storage)
  – SAN (Storage Area Network)
Storage Protocols
• CIFS/SMB (Common Internet File System)
  – an application-layer network protocol mainly used to provide shared access to files, printers, etc. between nodes
• NFS (Network File System)
  – an application-layer network protocol that only allows access to files, over an Ethernet network
• SCSI/iSCSI (Internet Small Computer System Interface)
  – iSCSI is a mapping of the regular SCSI protocol over TCP/IP
• FC (Fibre Channel)
  – a transport protocol which mainly transports SCSI commands over Fibre Channel networks
• FCoE (Fibre Channel over Ethernet)
  – allows Fibre Channel to use 10 Gigabit Ethernet networks (or higher speeds) while preserving the Fibre Channel protocol
I/O Bus
• ATA → SATA
  – ATA (Advanced Technology Attachment)
  – SATA (Serial ATA)
• SCSI → SAS
  – SCSI (Small Computer System Interface)
  – SAS (Serial Attached SCSI)
Parallel → Serial: driven by synchronization issues, electromagnetic interference, and cost.
http://www.denali.com/wordpress/index.php/dmr/2010/02/02/ssd-interfaces-and-performance-effects
RAID (Redundant Array of Independent Disks)
• RAID 0: striping, without parity or mirroring
• RAID 1: mirroring, without parity or striping
• RAID 2: bit-level striping with dedicated parity
• RAID 3: byte-level striping with dedicated parity
• RAID 4: block-level striping with dedicated parity
• RAID 5: striping with single distributed parity
• RAID 6: block-level striping with double distributed parity
• Nested RAID: RAID 10, RAID 50, RAID 60, etc.
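The "distributed parity" in RAID 5/6 is, at its core, an XOR across the data blocks of a stripe. A toy sketch (not modeled on any real controller) of how a lost block is rebuilt:

```python
# Toy RAID 5 parity demo: the parity block is the XOR of the data blocks,
# so any ONE lost block can be rebuilt by XOR-ing parity with the survivors.
from functools import reduce

def xor_blocks(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"      # data blocks in one stripe
parity = reduce(xor_blocks, (d0, d1, d2))   # stored on a rotating disk

# Simulate losing d1, then rebuild it from the remaining blocks + parity.
rebuilt = reduce(xor_blocks, (d0, d2, parity))
assert rebuilt == d1
print("rebuilt block:", rebuilt)            # b'BBBB'
```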
Example: RAID 0, 1, 5
Example: Nested RAID 10, 50
Comparison
http://www.techwarelabs.com/10-things-to-consider-before-setting-up-raid/
Multipathing
• Multipath I/O
  – is a fault-tolerance and performance-enhancement technique
  – creates multiple logical paths between the server and the storage devices
    • via adapters, cables, switches, etc.
  – in the event that one path fails, multipathing switches to an alternate path so that applications can still access their data.
https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html-single/DM_Multipath/
DAS/NAS/SAN
http://abdullrhmanfarram.wordpress.com/2013/04/08/storage-technologies-das-nas-and-san/
• DAS
  – Storage directly attached
  – High cost of management
  – Inflexible
  – Expensive to scale
• NAS
  – Storage access through Ethernet
  – Scalable and flexible
• SAN
  – Storage access through FC/IB
  – Much better performance
  – More flexible and scalable
  – Increases data availability
Parallel File System
• Distributes data across multiple storage nodes, accessed via a high-speed network (see the striping sketch below)
• Concurrent (often coordinated) access from many clients
• Provides globally shared metadata (locations, file names, sizes, etc.)
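A sketch of the round-robin striping idea behind Lustre-style parallel file systems (the stripe size, stripe count, and function name are invented for illustration): a byte offset in a file maps deterministically to one storage target.

```python
# Round-robin striping sketch: which storage target (e.g. a Lustre OST)
# holds a given byte of a file? Stripe size/count here are illustrative.
def locate(offset, stripe_size=1 << 20, stripe_count=4):
    """Map a file byte offset to (target index, offset within that target)."""
    stripe_index = offset // stripe_size        # which stripe holds this byte
    target = stripe_index % stripe_count        # stripes rotate over targets
    local = ((stripe_index // stripe_count) * stripe_size
             + offset % stripe_size)            # position on that target
    return target, local

for off in (0, 1 << 20, 5 << 20):
    print(f"byte offset {off:>8} -> (target, local offset) = {locate(off)}")
```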
Parallel File Systems
• Lustre
• GPFS
• Panasas
• NFSv4??
Parallel File Systems
• Lustre
• GlusterFS
• OrangeFS
• GPFS
• IBRIX
• CXFS
• Panasas
• PVFS2
• pNFS (NFSv4.1)
• GoogleFS
• Ceph
Example: Lustre
Image source: http://wiki.lustre.org/manual/LustreManual18_HTML/figures/LustreArch.png
Object Storage
• Object storage appears as a collection of objects.
• An object typically includes not only the data itself, but extra information such as metadata, an OID, attributes, etc.
• It moves lower-level functionality, such as space management and security functions, into the storage device itself; the device is accessed through a standard object interface.
• Especially good for storing unstructured data such as photos, songs, etc. (see the toy sketch below)
[Diagram: block storage vs. object storage]
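A minimal sketch of the object model described above (a flat namespace of OID → data + metadata; the store and function names are invented for illustration):

```python
# Toy object store: a flat namespace mapping an object ID (OID) to data
# plus metadata, mirroring the "data + metadata + OID" description above.
import uuid

store = {}  # flat namespace: OID -> object

def put(data: bytes, **metadata) -> str:
    oid = str(uuid.uuid4())                  # globally unique object ID
    store[oid] = {"data": data, "meta": metadata}
    return oid

def get(oid: str):
    obj = store[oid]
    return obj["data"], obj["meta"]

oid = put(b"...jpeg bytes...", content_type="image/jpeg", owner="alice")
data, meta = get(oid)
print(oid, meta)
```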
Comparison of 3 storage types
File storage: NFS and SMB/CIFS; block storage: Fibre Channel/iSCSI; object storage: AWS S3
https://insights.ubuntu.com/2015/05/18/what-are-the-different-types-of-storage-block-object-and-file/
Comparison of 3 storage types
http://blog.sungardas.com/CTOLabs/2015/10/object-storage-the-alternator-of-cloud-computing/
Evolution of Computing
• Mainframe: super power
• Cluster: worker bees
• Grid: global orchestration
• Cloud: everything as a service
• Big Data: find the needle in the sea
Evolution of Computing
• Mainframe
  – Single machine
  – Shared memory
• PC/Cluster
  – Multiple nodes
  – Batch job scheduling
  – Parallel computing
• Grid
  – Multiple sites (geographically distributed)
  – Global scheduling
  – Virtualized organization
  – Transparent data access
  – Unified security infrastructure
• Cloud
  – Virtualized resources
  – Elastic computing
  – Build everything as a service!
• Big Data
  – Big volume, big variety, big velocity
  – Fast analysis/decision
Image and text source: http://www.wikipedia.org/
Mainframe
• Originally referred to the large cabinets that housed the central processing unit and main memory.
• Modern design:
  – Redundant internal engineering, resulting in high reliability and security
  – Extensive I/O facilities
  – High utilization rates
  – Uses virtualization technology to support massive throughput
[Image: Amdahl 470V/6]
Cluster
• Tightly connected computers that work together as a single system
  – Low cost
  – Scalability
  – Flexibility
• Batch job scheduling/management
• Parallel computing
Grid
• Grid computing is the coordination of massive computing resources from multiple locations to reach a common goal. The resources are:
  – loosely coupled
  – heterogeneous
  – geographically dispersed
  – dynamic
• Main features:
  – High-level scheduling/workload management
  – Unified security infrastructure
  – Global information system
  – Virtualized organization
  – Transparent data transfer interface
Example: WLCG (Worldwide LHC Computing Grid)
Cloud
• Initially
  – IaaS (Infrastructure as a Service)
  – PaaS (Platform as a Service)
  – SaaS (Software as a Service)
• Subsequently
  – HaaS (Hardware as a Service)
  – NaaS (Network as a Service)
  – DaaS (Database as a Service)
  – CaaS (Communication as a Service)
  – BPaaS (Business Process as a Service)
• Eventually
  – XaaS (Everything as a Service!)
Image source: www.telezent.com
Image source: http://blueatoll.com/blog/the-next-generation-enterprise-business-as-a-service-in-the-cloud/
Big Data
• What is Big Data?
  – refers to technologies for handling data that is too diverse, fast-changing, or massive for conventional technologies to address efficiently.
  – Today, new technologies make it possible to realize value from Big Data.
Big Data’s Four V’s
http://www.ibmbigdatahub.com/blog/how-big-data-and-cognitive-computing-are-transforming-insurance-part-2
Big Data: Core Technology
• Foundation stone
  – Google (GFS, MapReduce, BigTable)
• Free version
  – Apache (HDFS, YARN, HBase, Hive, Pig…)
Big Data: MapReduce
Image source: http://www.slideshare.net/tothc/introduction-to-hadoop-and-map-reduce
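The canonical MapReduce example is word count. A single-process Python sketch of the map → shuffle → reduce stages (real frameworks such as Hadoop distribute each stage across many nodes):

```python
# Word count in the MapReduce style: map emits (word, 1) pairs, the shuffle
# groups pairs by key, and reduce sums each group. Runs in one process here.
from collections import defaultdict

docs = ["the quick brown fox", "the lazy dog", "the fox"]

# Map: each document -> a list of (word, 1) pairs.
mapped = [(word, 1) for doc in docs for word in doc.split()]

# Shuffle: group the emitted values by key.
groups = defaultdict(list)
for word, one in mapped:
    groups[word].append(one)

# Reduce: sum the values for each key.
counts = {word: sum(ones) for word, ones in groups.items()}
print(counts)  # {'the': 3, ..., 'fox': 2, ...}
```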
Big Data: Evolution
• New Troika– Google (Dremel, Pregel, Caffeine)
• Free version– Apache Drill, Apache Giraph, Stanford GPS
Image source: http://blog.mikiobraun.de/2013/02/big-data-beyond-map-reduce-googles-papers.html
Example: MapReduce vs. Dremel
Query: SELECT SUM(CountWords(txtField)) / COUNT(*) FROM T1 (T1: 85 billion records, 87 TB, 3000 nodes)
Image source: http://www.cubrid.org/blog/dev-platform/meet-impala-open-source-real-time-sql-querying-on-hadoop/
Big Data Ecosystem
Summary
• Introduced Compute Canada and its consortia
• Introduced WestGrid and its member sites
• Introduced high performance computing from different angles, such as architecture, memory, interconnect, storage, and file systems
• Briefly covered the evolution of computing technologies from mainframe, cluster, grid, and cloud to the current hot topic: Big Data.
Follow-up Talks
• Sept. 15, 2016: Tips for Submitting Jobs & Moving Data (with hands-on session)
  – Masao Fujinaga
• Sept. 27, 2016: Scheduling & Job Management (with hands-on session)
  – Kamil Marcinkowski
See more details in: https://www.westgrid.ca/events
Thanks! Questions?