Intel Confidential

The architecture for Discovery

June, 2016

Intel® Solutions Summit 2016

Caught in the Vortex…?

Growth Enablers/Inhibitors

Business Efficiency & Agility

DATA: Trust, Privacy, Sovereignty

Innovation: New Economy Biz Models

Macro Economic Effect

Data Center Blocks

HPC

HPC Compute Block

Cloud

VSAN Ready Node

Enterprise

SMB Server Block

Storage

Reduce Complexity

Intel engineering, validation, support

Speed time to market

Begin with a higher level of integration

Increase Value

Reduce TCO, value pricing

Fuel innovation

Focus R&D on value-add and differentiation

Server blocks for specific segments

A Holistic Design Solution for All HPC Needs

Small Clusters Through Supercomputers
Compute and Data-Centric Computing
Standards-Based Programmability
On-Premise and Cloud-Based

Intel® Scalable System Framework

Compute
  Intel® Xeon® Processors
  Intel® Xeon Phi™ Processors
  Intel® Xeon Phi™ Coprocessors
  Intel® Server Boards and Platforms

Memory/Storage
  Intel® Solutions for Lustre*
  Intel® SSDs
  Intel® Optane™ Technology
  3D XPoint™ Technology

Fabric
  Intel® Omni-Path Architecture
  Intel® True Scale Fabric
  Intel® Ethernet
  Intel® Silicon Photonics

Software
  HPC System Software Stack
  Intel® Software Tools
  Intel® Cluster Ready Program
  Intel® Visualization Toolkit

Parallel is the Path Forward

Intel® Xeon® and Intel® Xeon Phi™ Product Families are both going parallel

Processor                                                 Core(s)   Threads    SIMD Width
Intel® Xeon® processor 64-bit                             1         2          128
Intel® Xeon® processor 5100 series                        2         2          128
Intel® Xeon® processor 5500 series                        4         8          128
Intel® Xeon® processor 5600 series                        6         12         128
Intel® Xeon® processor code-named Sandy Bridge EP         8         16         256 AVX
Intel® Xeon® processor code-named Ivy Bridge EP           12        24         256 AVX
Intel® Xeon® processor code-named Haswell EP              18        36         256 AVX2
Intel® Xeon Phi™ coprocessor Knights Corner               57-61     228-244    512
Intel® Xeon Phi™ coprocessor Knights Landing (1)          72        288        2 x 512

More cores, more threads, wider vectors

How do we attain extremely high compute density for parallel workloads AND maintain the robust programming models and tools that developers crave?

*Product specification for launched and shipped products available on ark.intel.com.
1. Not launched - in planning.
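
The "more cores, more threads, wider vectors" trend multiplies together. The following is a minimal sketch that derives threads per core and the number of 64-bit (double-precision) SIMD lanes per chip from the figures on this slide. The `Part` class and its field names are hypothetical, introduced only for illustration, and the 61-core configuration is used for Knights Corner. Treating "2 x 512" as 1024 SIMD bits per core is a reading of the slide, not a vendor specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    """One column of the slide's table (hypothetical helper type)."""
    name: str
    cores: int
    threads: int
    simd_bits: int  # SIMD width per core as listed on the slide

    def threads_per_core(self) -> int:
        return self.threads // self.cores

    def dp_lanes(self) -> int:
        # Number of 64-bit lanes across the whole chip.
        return self.cores * self.simd_bits // 64


PARTS = [
    Part("Xeon 64-bit", 1, 2, 128),
    Part("Xeon 5100", 2, 2, 128),
    Part("Xeon 5500", 4, 8, 128),
    Part("Xeon 5600", 6, 12, 128),
    Part("Sandy Bridge EP", 8, 16, 256),
    Part("Ivy Bridge EP", 12, 24, 256),
    Part("Haswell EP", 18, 36, 256),
    Part("Knights Corner (61 cores)", 61, 244, 512),
    Part("Knights Landing", 72, 288, 2 * 512),
]

if __name__ == "__main__":
    for p in PARTS:
        print(f"{p.name:28s} threads/core={p.threads_per_core()} "
              f"DP lanes/chip={p.dp_lanes()}")
```

Lane count is only a proxy for parallel width; it says nothing about clock frequency or how many vector operations issue per cycle.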

Tick-Tock Development Model
Sustained Microprocessor Leadership

Microarchitecture   Codename       Process   Step   Change
Nehalem             Nehalem        45nm      TOCK   New microarchitecture (SSE)
Nehalem             Westmere       32nm      TICK   New process technology
Sandy Bridge        Sandy Bridge   32nm      TOCK   New microarchitecture (AVX)
Sandy Bridge        Ivy Bridge     22nm      TICK   New process technology
Haswell             Haswell        22nm      TOCK   New microarchitecture (AVX2)
Haswell             Broadwell      14nm      TICK   New process technology
SkyLake             SkyLake        14nm      TOCK   New microarchitecture (AVX512)
SkyLake             Future         XXnm      TICK   New process technology

Typically, Increase in Transistor Density Enables New Capabilities, Higher Performance Levels, and Greater Energy Efficiency
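
The cadence on this slide can be stated as two rules: a TICK moves to a new process node, and a TOCK introduces a new microarchitecture on the node of the preceding TICK. The sketch below encodes the sequence as data and checks both rules. The function name `follows_tick_tock` is hypothetical, and the unnamed "Future / XXnm" step is left out because the slide gives no node for it.

```python
# (codename, process node, step kind) in the order shown on the slide.
STEPS = [
    ("Nehalem", "45nm", "TOCK"),
    ("Westmere", "32nm", "TICK"),
    ("Sandy Bridge", "32nm", "TOCK"),
    ("Ivy Bridge", "22nm", "TICK"),
    ("Haswell", "22nm", "TOCK"),
    ("Broadwell", "14nm", "TICK"),
    ("SkyLake", "14nm", "TOCK"),
]


def follows_tick_tock(steps):
    """Return True if steps alternate TICK/TOCK and obey the node rules."""
    for (_, prev_node, prev_kind), (_, node, kind) in zip(steps, steps[1:]):
        if kind == prev_kind:
            return False  # steps must alternate
        if kind == "TICK" and node == prev_node:
            return False  # a TICK must change the process node
        if kind == "TOCK" and node != prev_node:
            return False  # a TOCK must stay on the current node
    return True


if __name__ == "__main__":
    print(follows_tick_tock(STEPS))
```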

Intel® Xeon® processor E5-2600 v4 product family

Grantley-Refresh Overview

Broadwell microarchitecture

Built on 14nm process technology

Socket compatible# replacement for Intel® Xeon® processor E5-2600 v3 on Grantley

Several new features and capabilities

Feature                        Xeon E5-2600 v3 (Haswell-EP)                 Xeon E5-2600 v4 (Broadwell-EP)
Cores Per Socket               Up to 18                                     Up to 22
Threads Per Socket             Up to 36 threads                             Up to 44 threads
Last-level Cache (LLC)         Up to 45 MB                                  Up to 55 MB
QPI Speed (GT/s)               2x QPI 1.1 channels 6.4, 8.0, 9.6 GT/s (both generations)
PCIe* Lanes/Controllers/       40 / 10 / PCIe* 3.0 (2.5, 5, 8 GT/s) (both generations)
  Speed (GT/s)
Memory Population              4 channels of up to 3 RDIMMs or 3 LRDIMMs    + 3DS LRDIMM&
Max Memory Speed               Up to 2133                                   Up to 2400
TDP (W)                        160 (Workstation only), 145, 135, 120, 105, 90, 85, 65, 55 (both generations)

# Requires BIOS and firmware update

& Depends on market availability

All products, computer systems, dates and figures specified are preliminary based on current expectations, and are subject to change without notice.

Intel may make changes to specifications and product descriptions at any time, without notice
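
As a worked example of what the memory rows imply, here is a rough sketch of theoretical peak memory bandwidth per socket. It assumes the "Max Memory Speed" figures are DDR4 transfer rates in MT/s and that each channel moves 8 bytes (64 bits) per transfer; the slide states neither unit, so both are labeled assumptions. The function name is hypothetical.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int,
                       bytes_per_transfer: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes).

    Assumes a 64-bit (8-byte) data path per channel.
    """
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000


if __name__ == "__main__":
    v3 = peak_bandwidth_gbs(channels=4, mega_transfers_per_s=2133)
    v4 = peak_bandwidth_gbs(channels=4, mega_transfers_per_s=2400)
    print(f"E5-2600 v3: {v3:.1f} GB/s per socket")
    print(f"E5-2600 v4: {v4:.1f} GB/s per socket")
    print(f"Uplift:     {100 * (v4 / v3 - 1):.1f}%")
```

These are upper bounds; sustained bandwidth depends on DIMM population and access pattern.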

Intel® Xeon Phi™ Product Family
Highly-parallel processing to power your breakthrough innovations

Available Today: Knights Corner, Intel® Xeon Phi™ x100 Product Family
  22 nm process
  Coprocessor only
  >1 TF DP Peak
  Up to 61 Cores
  Up to 16GB GDDR5

Coming Soon: Knights Landing, Intel® Xeon Phi™ x200 Product Family
  14 nm process
  Host Processor & Coprocessor
  >3 TF DP Peak (1)
  Up to 72 Cores
  Up to 16GB HBM
  Up to 384GB DDR4 (2)
  ~500 GB/s STREAM
  Integrated Fabric (2)

Future: Knights Hill, 3rd generation
  10 nm process
  Integrated Fabric (2nd Generation)
  In Planning…

“Meet Knight's Landing: Intel's most powerful chip ever is overflowing with cutting-edge technologies”
– PC World 06/2014

*Results will vary. This simplified test is the result of the distillation of the more in-depth programming guide found here: https://software.intel.com/sites/default/files/article/383067/is-xeon-phi-right-for-me.pdf

All products, computer systems, dates and figures specified are preliminary based on current expectations, and are subject to change without notice.

1 Over 3 Teraflops of peak theoretical double-precision performance is preliminary and based on current expectations of cores, clock frequency and floating point operations per cycle. FLOPS = cores x clock frequency x floating-point operations per cycle.

2 Host processor only
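
The FLOPS formula in footnote 1 can be turned around to ask what clock frequency a 72-core part would need to exceed 3 TF. The sketch below does that under stated assumptions: 2 vector units per core (taken from the Knights Landing architecture slide in this deck), 8 double-precision lanes per 512-bit vector, and a fused multiply-add counted as 2 floating-point operations. The FMA counting convention and all function names are assumptions for illustration; the deck does not state a clock frequency, so none is asserted here.

```python
def peak_flops(cores: int, clock_hz: float, flops_per_cycle: int) -> float:
    """FLOPS = cores x clock frequency x floating-point operations per cycle."""
    return cores * clock_hz * flops_per_cycle


def min_clock_hz(target_flops: float, cores: int, flops_per_cycle: int) -> float:
    """Clock frequency at which peak_flops() reaches target_flops."""
    return target_flops / (cores * flops_per_cycle)


VPUS_PER_CORE = 2          # from the deck's architecture slide
DP_LANES = 512 // 64       # 8 doubles per 512-bit vector
FLOPS_PER_FMA = 2          # assumption: multiply + add counted separately
FLOPS_PER_CYCLE = VPUS_PER_CORE * DP_LANES * FLOPS_PER_FMA

if __name__ == "__main__":
    needed = min_clock_hz(3.0e12, cores=72, flops_per_cycle=FLOPS_PER_CYCLE)
    print(f"DP flops per cycle per core: {FLOPS_PER_CYCLE}")
    print(f"Clock needed for 3 TF with 72 cores: {needed / 1e9:.3f} GHz")
```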

Three (3) Knights Landing Products

Solution for future clusters with both Xeon and Xeon Phi
Binary-compatible with Intel® Xeon® processor (Skylake)
Higher performance density for highly parallel applications (2)
Reduced system power consumption (2)
Higher perf/Watt & perf/$$ (3)

Solution for general purpose servers and workstations
Targeted for applications with larger sections of serial work (1)
Upgrade path from Knights Corner as PCIe* card

Knights Landing Processor: “Self-boot” Intel® Xeon Phi™ processor platform
Knights Landing Coprocessor: Requires Intel® Xeon® processor host
Ingredient in Grantley/Purley Platform
Groveport Platform

For more info, download the Groveport (KNL) Snapshot: https://sharepoint.amr.ith.intel.com/sites/snapshot/Groveport

*Other names and brands may be claimed as the property of others.
1 Projections based on early product definition and as compared to prior generation Intel® Xeon Phi™ Coprocessors.
2 Based on Intel internal analysis. Lower power based on power consumption estimates between (2) HCAs compared to 15W additional power for KNL-F. Higher density based on removal of PCIe* slots and associated HCAs populated in those slots.
3 Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance.

Knights Landing Architectural Diagram

[Diagram: processor package with 8 MCDRAM devices, 6 DDR4 channels, HFI, a DMI link to the Wellsburg PCH, PCIe Gen3 x36, and 2 Micro-Coax Cables (IFP)]

Compute
  Up to 72 cores
  2D mesh architecture
  Over 3 TF DP peak
  Full Xeon ISA compatibility through AVX-512
  ~3x single-thread performance compared to Knights Corner

Tile
  2 cores, each with 2 VPU
  1MB L2
  HUB

Core
  2x 512b VPU per core (Vector Processing Units)
  Based on Intel® Atom Silvermont processor with many HPC enhancements
  Deep out-of-order buffers
  Gather/scatter in hardware
  Improved branch prediction
  4 threads/core
  High cache bandwidth
  & more

On-package memory
  Up to 16GB high-bandwidth on-package memory (MCDRAM)
  Exposed as NUMA node
  ~500 GB/s sustained BW

DDR4
  6 channels DDR4
  Up to 384GB

PCH
  Wellsburg PCH
  Common with Grantley PCH

Integrated Fabric
  2 ports Storm Lake Integrated Fabric
  On-package
  50 GB/s bi-directional

I/O
  PCIe Gen3 x36
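
One way to relate the "Over 3 TF DP peak" and "~500 GB/s sustained BW" figures is a roofline-style estimate. The roofline model is not mentioned in the deck; it is used here only as an illustration. The sketch takes both numbers as round approximations (3.0e12 flop/s, 500e9 byte/s) and computes the arithmetic intensity at which a kernel stops being limited by MCDRAM bandwidth. All names are hypothetical.

```python
PEAK_FLOPS = 3.0e12   # "Over 3 TF DP peak", rounded down to 3 TF
MCDRAM_BW = 500e9     # "~500 GB/s sustained BW", in bytes per second


def ridge_point(peak_flops: float, bandwidth: float) -> float:
    """Arithmetic intensity (flops/byte) where compute and memory limits meet."""
    return peak_flops / bandwidth


def attainable_flops(intensity: float,
                     peak_flops: float = PEAK_FLOPS,
                     bandwidth: float = MCDRAM_BW) -> float:
    """Roofline bound: min(compute limit, bandwidth x intensity)."""
    return min(peak_flops, bandwidth * intensity)


if __name__ == "__main__":
    print(f"Ridge point: {ridge_point(PEAK_FLOPS, MCDRAM_BW):.1f} flops/byte")
    for intensity in (0.5, 1.0, 6.0, 12.0):
        gflops = attainable_flops(intensity) / 1e9
        print(f"intensity {intensity:5.1f} flops/byte -> {gflops:7.1f} GFLOP/s bound")
```

Kernels below the ridge point are bandwidth-bound in this model, which is why data placement in MCDRAM versus DDR4 matters for them.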

Intel® True Scale Fabric

Network Infrastructure - Optimized Price/Performance interconnect for HPC

Host Architecture - High MPI message rate & low end-to-end latency

Scalable Switch Solution - Performance & Latency scales with network

~10% of Verbs-based Instructions

Intel® Omni-Path Architecture: Changing the Fabric Landscape

[Roadmap figure, ordered over time: Intel® OPA HFI Card + CPU-Fabric Integration]

Intel® Xeon® processor E5-2600 v3: Discrete PCIe HFI
Next Intel® Xeon® processor: Discrete PCIe HFI
Intel® Xeon Phi™ processor (Knights Landing): Multi-chip package integration
Next Generation: Next Intel® Xeon® processor, Next Intel® Xeon Phi™ processor (Knights Hill)

Optimizing
• Performance
• Density
• Power
• Cost

Intel® Omni-Path Architecture Product Family

Product Line

HFI ASIC: Wolf River
  Intel OPA Gen1 Host Fabric Interface (HFI) Silicon
  2 x 100 Gbps, 50 GB/sec Fabric Bandwidth

Switch ASIC: Prairie River
  Intel OPA Gen1 Switch Silicon
  48 ports, 9.6Tb/s, 1200 GB/sec Fabric Bandwidth

Standard PCIe Board (1) (Chippewa Forest)
  Low Profile PCIe v3.0 x16
  Low Profile PCIe v3.0 x8
  Single Port QSFP28

Integrated Xeon® and Xeon Phi™
  Knights Landing: 2 x 100 Gbps ports
  Skylake (Xeon®): 1 x 100 Gbps port

Custom Mezz & PCIe Cards
  OEM products based on Wolf River ASIC TBD

Edge Switch (1) (Eldorado Forest)
  24 / 48 port individual QSFP28 ports
  Short reach: QSFP28 Cu cables
  Long reach: QSFP28 AOC
  Air cooling, N+1 redundant fans
  Optional redundant power supply
  In-band management supported
  Optional management card

Director Class Switches (DCS) (1) (Sawtooth Forest)
  192-port (7U); 264-port (2) (in planning)
  768-port (20U); 1056-port (2) (in planning)
  Full Bisection Bandwidth
  QSFP-based leaf module
  In planning: Micro-QSFP and embedded 4x4 Optical transceiver options
  N+1 Redundant Power
  Air cooling, N+1 red. fans and chassis mgmt, hot plug

Custom Switches
  OEM products based on Prairie River ASIC TBD

Software
  Intel® Fabric Suite [based on OFA with Intel® OPA support]

Cables
  Passive Copper & Active Optical Cable (AOC)
  AOC*
  Passive Cu Cable

1 Available as a reference design and Intel product. Director class switch features and introduction in planning.
2 192- and 768-port DCS products are QSFP-based with 32-port leaf modules. In planning: 264- and 1056-port DCS products with uQSFP-based (44-port leaf modules), and 288- and 1152-port DCS products, Intel® Silicon Photonics-based with onboard 4x4 optical transceivers (48-port leaf modules).
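
The bandwidth figures for the two ASICs are consistent with a simple conversion from per-port line rate. The sketch below reproduces them, assuming that the quoted "Fabric Bandwidth" counts both directions (the deck elsewhere calls the 50 GB/s figure "bi-directional") and that 1 byte = 8 bits with no encoding overhead. The function names are hypothetical.

```python
def bidirectional_gbits_per_s(ports: int, gbps_per_port: int) -> int:
    """Aggregate line rate in Gb/s, counting both directions."""
    return 2 * ports * gbps_per_port


def bidirectional_gbytes_per_s(ports: int, gbps_per_port: int) -> float:
    """Same aggregate expressed in GB/s (8 bits per byte, no overhead)."""
    return bidirectional_gbits_per_s(ports, gbps_per_port) / 8


if __name__ == "__main__":
    # HFI silicon: 2 ports at 100 Gbps each.
    print("HFI:   ", bidirectional_gbytes_per_s(2, 100), "GB/s")
    # Switch silicon: 48 ports at 100 Gbps each.
    print("Switch:", bidirectional_gbits_per_s(48, 100) / 1000, "Tb/s,",
          bidirectional_gbytes_per_s(48, 100), "GB/s")
```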

Thank You.