tesla: fastest processor adoption in hpc tesla gpu computing products tesla s1070 system tesla c1060

Download Tesla: Fastest Processor Adoption in HPC Tesla GPU Computing Products Tesla S1070 System Tesla C1060

Post on 22-Jan-2020

7 views

Category:

Documents

4 download

Embed Size (px)

TRANSCRIPT

  • Tesla: Fastest Processor Adoption in HPC History

    Teratec 2009

  • 1995

    2008

  • 240 cores 4 cores

    Advent of GPU Computing

    CPU + GPU Co-Processing

  • GPU Computing Applications

    C C++ Java

    FortranOpenCLtm DirectX Compute

    NVIDIA GPU CUDA Parallel Computing Architecture

    OpenCL is trademark of Apple Inc. used under license to the Khronos Group Inc.

    CUDA GPU Computing Architecture

  • Tesla GPU Computing Products

    Tesla S1070 System Tesla C1060 Processor

    GPUs 4 Tesla GPUs 1 Tesla GPU 1 Tesla GPU

    Single Precision Perf 4.14 TFlops 933 GFlops 933 GFlops

    Double Precision Perf 346 GFlops 78 GFlops 78 GFlops

    Memory 4 GB / GPU 4 GB 4 GB

    Form Factor 1U Chassis, cables to a Host Standard PCIe board Custom Module

    Tesla M1060 Processor

  • Tesla GPU Computing Solutions

    Datacenter

    Departmental Cluster

    Scientific Desktop

    Integrated 1U Servers

    Personal Supercomputer

    Pre-Configured Clusters

  • Tesla S1070 & MS Server 2008

    Current HIC PCI x16 Gen2

    New GHICx16 PCIe x16 Gen2 with GPU & Fan

    Connector extends out from I/O plane High Density connector for

    VGA/DVI dongle

  • NVIDIA GPU Computing Ecosystem

    NVIDIA Hardware Solutions CUDA SDK & ToolsGPU Architecture

    Customer Application DeploymentCustomerRequirementsHardware Architecture

    OEMs CUDA

    Development Specialist

    ISV

    CUDA Training

    Company Hardware Architect

    VAR

  • CUDA Ecosystem

    Applications Libraries FFT

    BLAS LAPACK

    Image processing Video processing Signal processing

    Vision

    Consultants OEMs

    Languages C, C++ DirectX Fortran Java

    OpenCL Python

    Compilers PGI Fortran

    CAPS HMPP MCUDA

    MPI NOAA Fortran2C

    OpenMP

    UIUC MIT

    Harvard Berkeley

    Cambridge Oxford

    IIT Delhi Dortmundt ETH Zurich

    Moscow Ecole Centrale Paris 6 Jussieu

    Over 200 Universities Teaching CUDA

    Oil & Gas Finance

    Medical Biophysics

    Numerics

    Imaging

    CFD

    DSP EDA

    Debuggers Allinea

    TotalView

  • 200 6

    201 0

    201 2

    201 5

    200 8

    G PU

    T8 128

    core

    T10 240

    core

    A 2015 GPU ~20x the performance of today’s GPU ~5,000 cores at ~3GHz (50mW each) ~20 TFLOPS ~1.2TB/s of memory bandwidth

    *This is a sketch of a what a GPU in 2015 might look like, it does not reflect any actual product plans

    GPU Revolutionizing Computing GFlops

  • Click to edit Master subtitle style

    Diapo 1 Diapo 2 Diapo 3 Diapo 4 Diapo 5 Diapo 6 Diapo 7 Diapo 8 Diapo 9 Diapo 10 Diapo 11 Diapo 12