

© 2021 Liqid, Inc. All rights reserved. 1

Dynamic Infrastructure for Modern Workloads

LIQID CDI IN THE ARC CLOUD™


Unlock Significant Utilization Improvement: Leapfrogging Hyperscale

- Industry Average: 12% Datacenter Utilization
- Hyperscale Average: 30% Datacenter Utilization
- Liqid Enables Up To: 90% Datacenter Utilization
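To make the three utilization figures concrete, the sketch below estimates how many servers a fixed amount of useful work would need at each level. Only the 12%, 30%, and 90% values come from the slide; the demand figure and the `servers_needed` helper are illustrative assumptions.

```python
# Utilization figures quoted on the slide, as whole percentages.
UTILIZATION_PCT = {
    "industry average": 12,
    "hyperscale average": 30,
    "Liqid (up to)": 90,
}


def servers_needed(busy_server_equivalents: int, utilization_pct: int) -> int:
    """Servers required to deliver a given amount of useful work.

    `busy_server_equivalents` is a hypothetical demand figure: the number of
    servers the workload would occupy if every server were 100% utilized.
    """
    if not 0 < utilization_pct <= 100:
        raise ValueError("utilization_pct must be in (0, 100]")
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-busy_server_equivalents * 100 // utilization_pct)


if __name__ == "__main__":
    demand = 90  # assumed demand, chosen only for illustration
    for label, pct in UTILIZATION_PCT.items():
        print(f"{label}: {servers_needed(demand, pct)} servers")
```

Under this simple model, the server count scales inversely with utilization, so moving from 12% to 90% cuts the required fleet by a factor of 7.5.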


Datacenter Roadmap

Three-Tier → Hyperconverged → Composable

"When you’re able to disaggregate the converged server… then compose and reconfigure… that’s a revolution."

Jensen Huang, CEO of Nvidia


Markets & Customers

Next Gen DC/MSP
- CSP, ISP, Private Cloud
- Flexibility, Resource Utilization, TCO
- Bare Metal Cloud Product Offering

Accelerated M&E
- Accelerated Interactive Composite Editing
- High Performance Edge Compute
- Accelerated 3D Virtual Production

AI & Deep Learning
- GPU Scale Out
- Enable GPU Peer-2-Peer at Scale
- GPU Dynamic Reallocation/Sharing

Research Labs & HPC
- High Performance Computing
- Lowest Latency Interconnect
- Massive GPU/FPGA Scale Out


What is Composability?


What is Disaggregation?

Resource Pools
- Storage: NVMe Flash / Optane
- Hosts: Intel / AMD / ARM
- Networking: NIC
- Accelerators: GPU / FPGA / DPU

Composable Software & PCIe/Eth Fabric

Composed Bare Metal Servers

Converged Infrastructure (Bare Metal Servers)
• Weeks to deploy
• Limited or no scale
• Poor resource utilization

Composable Disaggregated Infrastructure
• Deploy in seconds
• Scale in any dimension, on-demand
• Maximize resource utilization
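The idea of drawing devices from shared pools to compose a bare metal server, then returning them, can be sketched with a small in-memory model. This is not Liqid's software or API; `ResourcePool`, `ComposedServer`, and the resource names are hypothetical and exist only to illustrate the bookkeeping.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ComposedServer:
    """A host plus the pooled devices currently attached to it (hypothetical)."""
    name: str
    devices: Dict[str, int] = field(default_factory=dict)


@dataclass
class ResourcePool:
    """Free device counts per resource type (hypothetical in-memory model)."""
    free: Dict[str, int]

    def compose(self, name: str, request: Dict[str, int]) -> ComposedServer:
        # Validate the whole request first so a failure leaves the pool untouched.
        for kind, count in request.items():
            available = self.free.get(kind, 0)
            if available < count:
                raise ValueError(f"not enough {kind}: want {count}, have {available}")
        for kind, count in request.items():
            self.free[kind] -= count
        return ComposedServer(name, dict(request))

    def release(self, server: ComposedServer) -> None:
        # Return every attached device to the pool and detach it from the server.
        for kind, count in server.devices.items():
            self.free[kind] = self.free.get(kind, 0) + count
        server.devices.clear()


if __name__ == "__main__":
    pool = ResourcePool({"gpu": 8, "nvme": 4, "nic": 2})
    server = pool.compose("train-01", {"gpu": 4, "nvme": 2})
    print(server, pool.free)
    pool.release(server)
    print(pool.free)
```

Checking the full request before taking anything keeps the pool consistent when a composition fails partway through, which matters once several hosts draw on the same pool.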


Putting it All Together

Host

Expansion Chassis

Fabric Switch (PCIe or Ethernet)

Adapter Card

(rear)


Composable Portfolio

PCIe Fabric: 48-Port PCIe Fabric Switch

Resource Expansion: 32-Bay JBOX | 10/20 Slot JBOX

Management & Matrix Software: 16-Port Director

Form factors shown: 1U | 1U | 4U | 1U

Supported resources: GPU | FPGA | SSD | NIC and SSD | Storage Class Memory (SCM)


THE ARC CLOUD™

Resource Pools: Storage | Host | NIC | GPU

Liqid Mgmt. SW (REST API)

Bare Metal Hosts: Host 1 | Host 2 | Host N


New Workloads, New Solutions: Composable A.I.

Data Pool

Data Ingest → Data Clean and Tag → Training → Inference

Dynamic Resource Allocation for Each Stage of AI Workflow

Liqid Mgmt. SW


Up To 10x Higher Performance With Peer-2-Peer


High Performance GPU Scale-Out Solutions

Resnet | Octane | V-Ray

Benchmark Record Holder

GPU: 20x | CPU: 1x

20 GPU Supercomputer: Resnet50 – Images/Sec (NVIDIA RTX8000 GPU)

GPUs | Images/Sec
1 | 960
2 | 1,908
4 | 3,747
8 | 7,369
16 | 13,386
20 | 15,814
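The Resnet50 figures on this slide (960 images/sec on 1 GPU, rising to 15,814 on 20 GPUs) can be turned into speedup and scaling-efficiency numbers. The sketch below does that arithmetic; the throughput values are from the slide, while the `scaling_table` helper and the definition of efficiency as speedup divided by GPU count are assumptions made for illustration.

```python
# Resnet50 throughput (images/sec) by GPU count, as listed on the slide.
IMAGES_PER_SEC = {1: 960, 2: 1908, 4: 3747, 8: 7369, 16: 13386, 20: 15814}


def scaling_table(results):
    """Return (gpus, speedup, efficiency) rows relative to the 1-GPU result."""
    base = results[1]
    rows = []
    for gpus in sorted(results):
        speedup = results[gpus] / base
        efficiency = speedup / gpus  # 1.0 would be perfectly linear scaling
        rows.append((gpus, speedup, efficiency))
    return rows


if __name__ == "__main__":
    for gpus, speedup, efficiency in scaling_table(IMAGES_PER_SEC):
        print(f"{gpus:>2} GPUs: {speedup:5.2f}x speedup, {efficiency:6.1%} efficiency")
```

With these inputs, the 20-GPU configuration works out to roughly a 16.5x speedup over a single GPU, or about 82% scaling efficiency.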


Large In System Composable Memory

Command Center Software for Orchestration

Managed Fabric Switch

3TB DRAM Added | 3TB DRAM Added | 6TB DRAM Added | 9TB DRAM Added | 12TB DRAM Added

Metric | Intel® Optane™ Technology | Liqid Powered Intel® Optane™ Technology
Capacity | 1.5TB | 3.0TB
Seq Read 128KB | 2,500 | 20,000
Seq Write 128KB | 2,200 | 20,000
Rnd Read 4KB | 550,000 | 4,000,000
Rnd Write 4KB | 550,000 | 4,000,000
Interface | Gen3 x4 | Gen4 x16

DRAM Compared to DRAM + Optane

Configuration | Cost
3TB of DRAM | $27,360
3TB of DRAM + Optane | $6,130
12TB of DRAM + Optane | $24,520
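The cost comparison on this slide ($27,360 for 3TB of DRAM, $6,130 for 3TB of DRAM + Optane, $24,520 for 12TB of DRAM + Optane) is easier to read as cost per terabyte. The sketch below does that division; the prices and capacities are from the slide, and the `cost_per_tb` helper is an illustrative assumption.

```python
# (configuration, capacity in TB, price in USD), as shown on the slide.
CONFIGS = [
    ("3TB of DRAM", 3, 27360),
    ("3TB of DRAM + Optane", 3, 6130),
    ("12TB of DRAM + Optane", 12, 24520),
]


def cost_per_tb(capacity_tb: float, price_usd: float) -> float:
    """Price divided by capacity, in USD per TB."""
    if capacity_tb <= 0:
        raise ValueError("capacity_tb must be positive")
    return price_usd / capacity_tb


if __name__ == "__main__":
    for name, capacity_tb, price_usd in CONFIGS:
        print(f"{name}: ${cost_per_tb(capacity_tb, price_usd):,.2f} per TB")
```

On these figures, the DRAM-only configuration costs $9,120 per TB, while both DRAM + Optane configurations come to about $2,043 per TB, so 12TB of DRAM + Optane costs less than 3TB of DRAM alone.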


Use Case: Dynamic GPU Sharing

Radically Improve GPU Utilization

GPU Pool
- Daytime Workload (Training): 7am to 4pm, 8-20 GPUs per node
- Nighttime Workload (Inference): 4pm to 7am, 1/7th - 1 GPU per node using MIG
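The day/night sharing pattern on this slide amounts to a time-based lookup: training from 7am to 4pm with 8-20 GPUs per node, inference from 4pm to 7am with 1/7th to 1 GPU per node using MIG. The sketch below encodes that schedule. The `workload_at` helper is hypothetical, and treating 4pm as the first moment of the inference window is an assumption, since the slide does not say which side the boundary belongs to.

```python
from datetime import time

# Window boundaries taken from the slide.
DAY_START = time(7, 0)   # 7am
DAY_END = time(16, 0)    # 4pm


def workload_at(t: time) -> dict:
    """Return the workload profile the slide assigns to a given time of day."""
    # Assumption: the daytime window is half-open, [7am, 4pm).
    if DAY_START <= t < DAY_END:
        return {"workload": "training", "gpus_per_node": "8-20"}
    return {"workload": "inference", "gpus_per_node": "1/7th - 1 (MIG)"}


if __name__ == "__main__":
    for hour in (3, 7, 12, 16, 22):
        print(f"{hour:02d}:00 ->", workload_at(time(hour, 0)))
```

A scheduler built on this kind of lookup would release GPUs back to the pool at each boundary and recompose nodes for the next window.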


Large Public Clouds – Underutilized Silos

Standard Compute Nodes | High Memory Servers | AI / ML GPU Servers


Best in Class Dynamic Data Center RDK PODS

48TB Storage Class Memory

144x Hardware Accelerators (GPUs, FPGAs, NICs, etc.)

1.1PB NVMe Storage

96x High-Performance Compute Nodes

Liqid Matrix Fabric Switch

Legacy IB/Eth Networking


Composable Disaggregated Infrastructure

Optane Memory

Expansion: GPU | FPGA | NVMe | NIC | DPU

Hosts

Command Center

Hosts

NVMe Flash


Benefits of Liqid CDI In The Arc Cloud™


DC Evolution: Brands to commodity pooled components

2000 – 2020 (THE “OLD WAY” SOLUTIONS)

Compute (GPU/CPU) | Storage | Networking

2021 (TODAY)

Liqid CDI OS

ON


Example Training & Inference Solution on Arc Cloud™

One Rack View – 50 Racks Total

Composable 4,000 GPUs / 24K AMD Cores / 4.5PB NVMe / 600TB RAM

- (4) Liqid Gen4 20 Slot Double Width Chassis
- (1) Liqid Gen4 10 Slot Double Width Chassis
- (80) Nvidia A100 GPUs
- (4) Honeybadger Optane 3TB
- (6) Honeybadger NVMe 15TB
- (3) Liqid Directors
- (3) Liqid Gen4 48 Port Switches
- (4) Dell Servers w/512GB / 160 Cores Each
- (4) Liqid Gen4 HBAs

Legend
- PCIe Management Link (Note: Not all Mgmt Links & PDUs shown)
- 64 GB/s PCIe Connection (8 Ports)
- 32 GB/s PCIe Connection (4 Ports)
- Cross Over PCIe Connection (8 Ports)

Rack Power: 30kW-35kW
- Dell Server: 1200-2400W
- Liqid Populated Chassis: 6kW
- Liqid Director: 225W
- Liqid Switch: 225W

Rack Dimensions: 42U
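The 50-rack totals on this slide can be cross-checked against the per-rack parts list. The sketch below multiplies out the GPU, NVMe, and Optane counts. It makes two assumptions that the slide does not state: that the 600TB figure corresponds to the Optane capacity, and that each GPU or Honeybadger card occupies one chassis slot. The core count and power budget are left out because the slide does not give enough detail to reproduce them.

```python
RACKS = 50  # "50 Racks Total" on the slide

# Per-rack device counts and per-device capacities from the parts list.
GPUS_PER_RACK = 80
OPTANE_CARDS_PER_RACK, OPTANE_TB_PER_CARD = 4, 3
NVME_CARDS_PER_RACK, NVME_TB_PER_CARD = 6, 15

# Four 20-slot chassis plus one 10-slot chassis per rack.
CHASSIS_SLOTS_PER_RACK = 4 * 20 + 1 * 10


def cluster_totals(racks: int = RACKS) -> dict:
    """Multiply the per-rack parts list out to the full deployment."""
    return {
        "gpus": GPUS_PER_RACK * racks,
        "optane_tb": OPTANE_CARDS_PER_RACK * OPTANE_TB_PER_CARD * racks,
        "nvme_pb": NVME_CARDS_PER_RACK * NVME_TB_PER_CARD * racks / 1000,
    }


def slots_used_per_rack() -> int:
    # Assumption: every GPU and Honeybadger card takes exactly one slot.
    return GPUS_PER_RACK + OPTANE_CARDS_PER_RACK + NVME_CARDS_PER_RACK


if __name__ == "__main__":
    print(cluster_totals())
    print(slots_used_per_rack(), "of", CHASSIS_SLOTS_PER_RACK, "slots used per rack")
```

Under those assumptions the parts list reproduces the headline 4,000 GPUs, 4.5PB of NVMe, and 600TB, and the 90 devices per rack exactly fill the 90 available chassis slots.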


Thank You

LIQID CDI IN THE ARC CLOUD™

[email protected]