CUDA Fortran for Scientists and Engineers || Tesla Specifications


Appendix A: Tesla Specifications

Floating-point performance

| Tesla product | C870 | C1060 | C2050 | C2070 | M2090 | K10 | K20 | K20X |
|---|---|---|---|---|---|---|---|---|
| Compute capability | 1.0 | 1.3 | 2.0 | 2.0 | 2.0 | 3.0 | 3.5 | 3.5 |
| Number of multiprocessors | 16 | 30 | 14 | 14 | 16 | 2 × 8 | 13 | 14 |
| Core clock (GHz) | 1.35 | 1.296 | 1.15 | 1.15 | 1.3 | 0.745 | 0.706 | 0.732 |
| Single-precision cores per multiprocessor | 8 | 8 | 32 | 32 | 32 | 192 | 192 | 192 |
| Total single-precision cores | 128 | 240 | 448 | 448 | 512 | 2 × 1536 | 2496 | 2688 |
| Single-precision GFlops (Multiply + Add) | 346 | 622 | 1030 | 1030 | 1331 | 2 × 2289 | 3524 | 3935 |
| Double-precision cores per multiprocessor | – | 1 | 16* | 16* | 16* | 8 | 64 | 64 |
| Total double-precision cores | – | 30 | 224* | 224* | 256* | 2 × 64 | 832 | 896 |
| Double-precision GFlops (Multiply + Add) | – | 78 | 515* | 515* | 665* | 2 × 95 | 1175 | 1312 |

*GeForce GPUs have fewer double-precision units.

CUDA Fortran for Scientists and Engineers. http://dx.doi.org/10.1016/B978-0-12-416970-8.00016-X © 2014 Elsevier Inc. All rights reserved.
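The GFlops rows follow from the core counts and clocks in the same table. As a minimal sketch, assuming each core completes one multiply-add (two floating-point operations) per clock cycle, the figures can be recomputed in Python. The function name `peak_gflops` and the dictionary `SINGLE_PRECISION` are illustrative and not part of the appendix, and results may differ from the printed values by one unit because of rounding.

```python
# Peak throughput sketch: each core is assumed to retire one
# multiply-add (2 floating-point operations) per clock cycle.
FLOPS_PER_CORE_PER_CYCLE = 2


def peak_gflops(total_cores, core_clock_ghz):
    """Theoretical peak GFlops for a core count and a clock given in GHz."""
    return FLOPS_PER_CORE_PER_CYCLE * total_cores * core_clock_ghz


# (total single-precision cores, core clock in GHz), copied from the table.
SINGLE_PRECISION = {
    "C870": (128, 1.35),
    "C1060": (240, 1.296),
    "M2090": (512, 1.3),
    "K20": (2496, 0.706),
    "K20X": (2688, 0.732),
}

if __name__ == "__main__":
    for name, (cores, clock) in SINGLE_PRECISION.items():
        print(f"{name}: {peak_gflops(cores, clock):.0f} GFlops")
```

The same function applies to the double-precision rows when given the total double-precision core count.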




Memory

Device Memory (DRAM)

| Tesla product | C870 | C1060 | C2050 | C2070 | M2090 | K10 | K20 | K20X |
|---|---|---|---|---|---|---|---|---|
| Compute capability | 1.0 | 1.3 | 2.0 | 2.0 | 2.0 | 3.0 | 3.5 | 3.5 |
| Total global memory (GB) | 1.5 | 4 | 3* | 6* | 6* | 2 × 4* | 5* | 6* |
| Constant memory (KB) | 64 | 64 | 64 | 64 | 64 | 64 | 64 | 64 |
| Memory clock (MHz) | 800 | 800 | 1,500 | 1,566 | 1,848 | 2,500 | 2,600 | 2,600 |
| Bus width (bits) | 384 | 512 | 384 | 384 | 384 | 2 × 256 | 320 | 384 |
| Theoretical peak bandwidth (GB/s) | 76.8 | 102.4 | 144* | 150.3* | 177.4* | 2 × 160* | 208* | 249.6* |

On-Chip Memory

| Compute capability | 1.0 | 1.3 | 2.0 | 3.0 | 3.5 |
|---|---|---|---|---|---|
| 32-bit registers per multiprocessor | 8 K | 16 K | 32 K | 64 K | 64 K |
| Maximum registers per thread | 127 | 127 | 63 | 63 | 255 |
| Shared memory per multiprocessor | 16 K | 16 K | 48 K/16 K | 48 K/32 K/16 K | 48 K/32 K/16 K |
| L1 cache per multiprocessor | – | – | 16 K/48 K | 16 K/32 K/48 K** | 16 K/32 K/48 K** |
| Constant memory cache per multiprocessor (KB) | 8 | 8 | 8 | 8 | 8 |

*With ECC enabled the available global memory and peak bandwidth will be less than the numbers listed.

**For the K10, K20, and K20X GPUs, the L1 cache is used for local memory only.
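The theoretical peak bandwidth figures can be reproduced from the memory clock and bus width. The sketch below assumes a double data rate memory interface (two transfers per clock). That factor is not stated in the table, but it reproduces every listed value. The function name `peak_bandwidth_gbs` and its `data_rate` parameter are illustrative.

```python
def peak_bandwidth_gbs(memory_clock_mhz, bus_width_bits, data_rate=2):
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes).

    data_rate=2 is an assumption: double data rate memory moving
    two transfers per memory clock cycle.
    """
    transfers_per_second = memory_clock_mhz * 1e6 * data_rate
    bytes_per_transfer = bus_width_bits / 8
    return transfers_per_second * bytes_per_transfer / 1e9


# (memory clock in MHz, bus width in bits), copied from the table.
DEVICE_MEMORY = {
    "C870": (800, 384),
    "C1060": (800, 512),
    "C2050": (1500, 384),
    "K20": (2600, 320),
    "K20X": (2600, 384),
}

if __name__ == "__main__":
    for name, (clock, width) in DEVICE_MEMORY.items():
        print(f"{name}: {peak_bandwidth_gbs(clock, width):.1f} GB/s")
```

As the ECC footnote indicates, these are upper bounds; achieved bandwidth on real hardware is lower.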


Execution configuration limits

| Compute capability | 1.0 | 1.3 | 2.0 | 3.0 | 3.5 |
|---|---|---|---|---|---|
| Tesla products | C870 | C1060 | C2050, C2070, M2090, M2050, M2070 | K10 | K20, K20X |
| Maximum thread blocks per multiprocessor | 8 | 8 | 8 | 16 | 16 |
| Maximum threads per thread block | 512 | 512 | 1024 | 1024 | 1024 |
| Maximum threads (warps) per multiprocessor | 768 (24) | 1024 (32) | 1536 (48) | 2048 (64) | 2048 (64) |
| Maximum grid dimensions | 65536 × 65536 × 1 | 65536 × 65536 × 1 | 65536 × 65536 × 65536 | 2147483647 × 65536 × 65536 | 2147483647 × 65536 × 65536 |
| Maximum block dimensions | 512 × 512 × 64 | 512 × 512 × 64 | 1024 × 1024 × 64 | 1024 × 1024 × 64 | 1024 × 1024 × 64 |
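The limits above bound how many thread blocks can be resident on one multiprocessor for a given block size. The following Python sketch considers only the block-count and thread (warp) limits from the table and deliberately ignores register and shared-memory usage, which can lower the result further. The names `LIMITS` and `resident_blocks` are illustrative, and the warp size of 32 is inferred from the threads (warps) row.

```python
WARP_SIZE = 32  # consistent with the table: 768 threads = 24 warps

# Per-multiprocessor limits by compute capability, copied from the table.
LIMITS = {
    "1.0": {"blocks_per_mp": 8, "threads_per_block": 512, "threads_per_mp": 768},
    "1.3": {"blocks_per_mp": 8, "threads_per_block": 512, "threads_per_mp": 1024},
    "2.0": {"blocks_per_mp": 8, "threads_per_block": 1024, "threads_per_mp": 1536},
    "3.0": {"blocks_per_mp": 16, "threads_per_block": 1024, "threads_per_mp": 2048},
    "3.5": {"blocks_per_mp": 16, "threads_per_block": 1024, "threads_per_mp": 2048},
}


def resident_blocks(compute_capability, threads_per_block):
    """Upper bound on resident thread blocks per multiprocessor.

    Only the block-count and warp limits are applied; register and
    shared-memory constraints are not modeled in this sketch.
    """
    lim = LIMITS[compute_capability]
    if not 0 < threads_per_block <= lim["threads_per_block"]:
        raise ValueError("block size exceeds the limit for this device")
    # Threads are scheduled in whole warps, so round the block size up.
    warps_per_block = -(-threads_per_block // WARP_SIZE)
    max_warps = lim["threads_per_mp"] // WARP_SIZE
    return min(lim["blocks_per_mp"], max_warps // warps_per_block)


if __name__ == "__main__":
    for cc in LIMITS:
        print(f"compute capability {cc}: {resident_blocks(cc, 256)} blocks of 256 threads")
```

For example, on compute capability 2.0 a block of 128 threads is capped by the 8-block limit rather than by the thread limit, so only 1024 of the 1536 possible threads are resident.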
