Irish Centre for High-End Computing

Rapid ICT prototyping in Ireland with ICHEC

Overview

Jean-Christophe “JC” Desplat

11th February 2015

Agenda

•  Centre overview
•  Technology walkthrough
•  Training & education programme
•  Data analytics
•  Business engagement model

Scalability

Performance

Active industry engagement: Making technology work for you
•  Experts have a wide-ranging skill set, with career backgrounds in industry and academia
•  Teams combining software engineers, domain experts and accredited PRINCE2/Agile project managers
•  Flexible and responsive industry engagement model

National Technology Centre
•  Established in 2005
•  University hosted (with national remit)
•  28 staff in Dublin & Galway

Mandate includes:
•  High-Performance Computing (HPC) & Big Data / analytics
•  Industry & Public Sector engagement

National e-Infrastructure

Storage
•  Nexsan E60: 574TB (formatted)
•  DDN SFA12k-20: 550TB (formatted)
•  Panasas AS5200: 175TB (formatted)

Fionn
•  SGI ICE X: 7,680 E5-2660v2 cores, c.148 Tflops, 20.5TB RAM
•  SGI Pyramid: 640 E5-2660v2 cores, 32 Xeon Phi 5110P, 32 NVIDIA K20m
•  SGI UV2000: 1.7TB RAM, 2 Xeon Phi 5110P

Near-mission critical services with emergency service failover

Industry Test beds
•  R&D
•  Production (in procurement)

ICHEC uses novel ICT technologies to enable new data intensive applications

•  Nvidia: Many integrated core GPU
•  Intel: Multi integrated core CPU
•  Xilinx: programmable logic

Storage

Compute

Cloud

Screenshot from DDN web blog, Nov 2014

Data analytics Molecular dynamics

Weather forecasting

“ICHEC joined our globally competitive Intel Parallel Computing Centre programme. ICHEC were selected because of their exceptional parallel software skills and their problem solving approach to HPC and big data challenges.”

Brian Quinn, Director, strategic programs at Intel Labs Europe

Financial and Genomics

High frequency trading E-commerce Cloud computing

“Xilinx needs a new generation of design environments that abstract away the complexity of the hardware. ICHEC really helps us with the proliferation of the design expertise as they can develop the education material that is required to train up the next generation of programmers.”

Michaela Blott, Principal Engineer, Xilinx Research

ECTS accredited graduate modules

Custom training for 11 cloud system admin positions (in year 1)

Public sector R courses

Education and outreach

ICHEC designed and delivered personalised R courses across 14 public sector organisations to address a skills demand and enable innovation.

In progress

CSO UNECE Partnership

Partnership with Irish Central Statistics Office & United Nations Economic Commission for Europe. High-level group to develop best practices in the analysis of civic data through Hadoop workflows.

•  Managing Data Analytics 'Sandbox' for UNECE programme on Big Data in Official Statistics

•  Hadoop Cluster (20 nodes)
   -  Hortonworks data platform
   -  Ancillary tools e.g. R

Engagement Model

•  Feasibility project – €9K (funded by EI)
•  Innovation Partnership – up to 80%
•  Strategic Partnership programme – 50%
•  Consultancy
   –  Software development, process validation, training
   –  Flexible access to resources
   –  Value for money

SME Programme

[email protected] www.ichec.ie

See our industry testimonials on YouTube

Engineering ICT Training & Education

http://www.youtube.com/user/ichecireland

Exploiting novel ICT technologies with ICHEC

Dr. Michael Lysaght

11th February 2015

Accelerating Solutions on Emerging Technologies

More cores, More Performance

Intel Xeon Phi Coprocessor, NVIDIA K80

•  1997: The 1st Intel Teraflop Computer
•  9298 Intel Processors
•  72 Server Cabinets

•  2013: The Intel Xeon Phi Coprocessor
•  1 Teraflop of Performance
•  1 PCIe Slot

Logical layout of functional components

[Figure: Xeon Phi core (Source: Intel). Labels: 60 cores; Instruction Decode; 32k/32k L1 Cache inst/data; 512k L2 Cache; Scalar Unit; Scalar Registers; Vector Unit; Vector Registers; 512b VPU; Ring. Embedded from “Xeon Phi Introduction”, Tim Cramer, Rechen- und Kommunikationszentrum, slide “Architecture (1/2)”.]

Intel Xeon Phi Coprocessor
•  1 x Intel Xeon Phi @ 1090 MHz
•  60 Cores (in-order)
•  ~ 1 TFLOPS DP Peak
•  4 hardware threads per core
•  8 GB GDDR5 memory
•  512-bit SIMD vectors (32 registers)
•  Fully-coherent L1 and L2 caches
•  Plugged into PCI Express bus
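The "~ 1 TFLOPS DP Peak" figure can be cross-checked against the other numbers on the slide. The sketch below is a rough estimate only. It assumes each 64-bit vector lane retires one fused multiply-add (counted as 2 floating-point operations) per clock cycle, which the slide does not state, and `peak_dp_gflops` is a hypothetical helper name.

```cpp
#include <cassert>

// Theoretical double-precision peak in GFLOPS.
// Assumption (not on the slide): every 64-bit lane completes one fused
// multiply-add, i.e. 2 floating-point operations, per clock cycle.
double peak_dp_gflops(int cores, double clock_ghz, int simd_bits) {
    assert(cores > 0 && clock_ghz > 0.0);
    assert(simd_bits > 0 && simd_bits % 64 == 0);
    const int lanes = simd_bits / 64;   // doubles per vector register
    const int flops_per_lane = 2;       // multiply + add
    return cores * clock_ghz * lanes * flops_per_lane;
}
```

With the slide's values (60 cores, 1.09 GHz, 512-bit vectors) this gives 60 x 1.09 x 8 x 2, roughly 1046 GFLOPS, which is consistent with the quoted peak.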

Intel Xeon Phi Coprocessor and Intel Xeon Phi Processor

Accelerated software is a differentiator!

Accelerating Solutions on GPGPU

•  Quantitative Finance*:
   -  In-house code (NDA)
   -  London-based
   -  Real-time risk simulations

•  Applications:
   -  Profit-margin analysis

•  Project: 3 week time-frame

•  Requirements: 500ms response time

•  Challenge: How many more simulations on GPU?

Source: Talk “Real-Time Risk Simulation The GPU Revolution In Profit Margin Analysis” (http://bit.ly/13sWgH1), May 2012

Accelerating Solutions on GPGPU

__device__ float getProb(int timeIndex1, int indexToUse, int timeIndex2) {
    // Clamp the lookup index to the range [30, 80].
    indexToUse = max(min(indexToUse, 80), 30);
    if (timeIndex1 >= my_Constants.get2D(delimiter, timeIndex2, 1) && timeIndex2 > 2) {
        // Look up gProb at the offset of timeIndex1 from the delimiter entry.
        return my_Constants.get2D(gProb,
            timeIndex1 - my_Constants.get2D(delimiter, timeIndex2, 1), indexToUse);
    } else
        return 0.0f;
}

240x speedup

Same 500ms time window: 100x to 1000x more simulations

*Winner of HPCWire RCA Award 2012 for ‘Most Innovative Use of HPC in Finance’

Deployable solution delivered within 3 weeks
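When porting a function like `getProb` to the GPU, a plain host-side reference of the same logic is a common way to check the kernel's output. The sketch below mirrors the slide's control flow in ordinary C++. The real `my_Constants` object is not shown (the code is under NDA), so the `Table2D` type, its row-major layout, the element types and the name `getProbRef` are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the tables behind my_Constants.get2D:
// a dense, row-major 2D array. The real layout is not shown in the talk.
template <typename T>
struct Table2D {
    std::size_t rows, cols;
    std::vector<T> data;  // rows * cols entries

    T get(int r, int c) const {
        assert(r >= 0 && c >= 0);
        assert(static_cast<std::size_t>(r) < rows);
        assert(static_cast<std::size_t>(c) < cols);
        return data[static_cast<std::size_t>(r) * cols + static_cast<std::size_t>(c)];
    }
};

// CPU reference with the same control flow as the slide's getProb:
// clamp the index to [30, 80], then read gProb only once timeIndex1 has
// reached the delimiter entry for timeIndex2 and timeIndex2 exceeds 2.
float getProbRef(const Table2D<int>& delimiter, const Table2D<float>& gProb,
                 int timeIndex1, int indexToUse, int timeIndex2) {
    indexToUse = std::max(std::min(indexToUse, 80), 30);
    const int start = delimiter.get(timeIndex2, 1);
    if (timeIndex1 >= start && timeIndex2 > 2) {
        return gProb.get(timeIndex1 - start, indexToUse);
    }
    return 0.0f;
}
```

Running the reference and the kernel on the same inputs and comparing results is one way to gain confidence in a port delivered on a short timescale.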

Accelerating solutions on Burst-Buffer

•  Oil & Gas Project 1:
   -  In-house code
   -  Seismic Imaging
   -  Petabytes of data

•  Applications:
   -  Oil & Gas

•  Project: 3 week time-frame

•  Requirements: Reduce I/O bottlenecks

•  Challenge: Space and power savings

Accelerating solutions on Burst-Buffer

[Figure: storage architecture. Labels: Compute nodes; IB FDR; IME Servers (IME1, IME2, IME3, IME4); Object Storage Servers (OSS1, OSS2); OST1, OST2, ..., OST6; SFA7700.]

[Chart: total execution time (axis 0.00 to 1.00) for the small case (80GB), medium case (950GB) and large case (8.4 TB). Series: In memory, Lustre, IME Burst Buffer. Annotation: up to 3x speedup.]

DDN-ICHEC Whitepaper: Experimenting on IME with an Oil & Gas imaging code (2014)

I/O requests are re-ordered through a cache layer
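One way to picture re-ordering through a cache layer: small writes arrive in arbitrary order, are held in a buffer, then sorted by file offset and merged into fewer, larger sequential requests before reaching the parallel file system. The sketch below illustrates that general idea only. It is not DDN's IME implementation, and `WriteReq` and `coalesce` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// A buffered write request: byte offset into the file and length in bytes.
struct WriteReq {
    std::uint64_t offset;
    std::uint64_t length;
};

// Sort buffered requests by offset and merge those that touch or overlap,
// so the backing store sees fewer, larger, sequential writes.
std::vector<WriteReq> coalesce(std::vector<WriteReq> pending) {
    std::sort(pending.begin(), pending.end(),
              [](const WriteReq& a, const WriteReq& b) { return a.offset < b.offset; });
    std::vector<WriteReq> merged;
    for (const WriteReq& r : pending) {
        assert(r.length > 0);
        if (!merged.empty() &&
            r.offset <= merged.back().offset + merged.back().length) {
            // Extend the previous request to cover this one.
            const std::uint64_t end =
                std::max(merged.back().offset + merged.back().length,
                         r.offset + r.length);
            merged.back().length = end - merged.back().offset;
        } else {
            merged.push_back(r);
        }
    }
    return merged;
}
```

For example, four 4 KiB writes at offsets 4096, 0, 16384 and 8192 collapse into one 12 KiB write at offset 0 plus one 4 KiB write at offset 16384.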

Accelerating Solutions on Xeon Phi

•  Materials under extreme conditions:
   -  UK STFC code
   -  500k LOC
   -  Extreme-scale code

•  Applications:
   -  Novel Materials
   -  Biotech

•  Project: 2-year R&D project (IPCC)

•  Requirements: Extreme scalability

•  Challenge: Exascale-ready – Knights Landing ready

[Figure: “Xeon Phi Nodes at RWTH Aachen”, embedded from “Xeon Phi Introduction”, Tim Cramer, Rechen- und Kommunikationszentrum. Labels: Compute Node; Host System with 2 x Xeon (8 Cores @ 2 GHz), 2 x DDR3 (16 GB), Shared Memory, QPI; 2 x MIC System, each a Xeon Phi (60 Cores @ 1GHz) with GDDR5 (8 GB); PCI Express.]

“Dynamic load-balancing for iterative algorithms”

Increasing heterogeneity in the Datacentre

6x speedup over 2 socket IvyBridge
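The "dynamic load-balancing for iterative algorithms" idea can be sketched as follows: after each iteration, measure how long the host and the coprocessor took on their shares of the work, then move the split so that both would finish at the same time in the next iteration. This is a generic illustration, not the scheme used in the project. `rebalance` and its parameters are hypothetical.

```cpp
#include <cassert>

// Returns the fraction of work to give the coprocessor in the next iteration.
// frac:     fraction it processed in the iteration just finished (0 < frac < 1)
// t_host:   seconds the host took for its (1 - frac) share
// t_device: seconds the coprocessor took for its frac share
double rebalance(double frac, double t_host, double t_device) {
    assert(frac > 0.0 && frac < 1.0);
    assert(t_host > 0.0 && t_device > 0.0);
    const double rate_device = frac / t_device;      // work per second
    const double rate_host = (1.0 - frac) / t_host;  // work per second
    // Both sides finish together when shares are proportional to rates.
    return rate_device / (rate_device + rate_host);
}
```

If an even split left the host taking 3 s and the coprocessor 1 s, the next iteration would give the coprocessor 75% of the work. Once both sides take equal time, the split stops changing.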

Accelerating Solutions on FPGA

•  Webscale apps:
   -  Memcached
   -  Kernels (NDA)

•  Applications:
   -  Analytics
   -  Stream processing

•  Project: 6-month R&D project

•  Requirements: Open Standards

•  Challenge: Acceleration & Perf/Watt

ADM-XRC-7V3 board

Blended ICHEC-Xilinx Team working at Xilinx EMEA HQ

FPGA Server node available at ICHEC: Q1

Full SDAccel Toolchain available