Accelerated Linear Algebra Libraries

James Wynne III, NCCS User Assistance


Post on 24-Feb-2016


Accelerated Linear Algebra Libraries

• Collection of functions to perform mathematical operations on matrices
• The designers have rewritten the standard LAPACK functions to use GPU accelerators, speeding up execution on large matrices


MAGMA HOST INTERFACE


MAGMA - Host

• MAGMA stands for Matrix Algebra on GPU and Multicore Architectures
• Developed by the Innovative Computing Laboratory at the University of Tennessee
• Host interface allows easy porting from CPU libraries (like LAPACK) to MAGMA's accelerated library
  – Automatically manages data allocation and transfer between CPU (Host) and GPU (Device)


MAGMA - Host

• Fortran:
  – To call MAGMA functions from Fortran, an Interface block needs to be written for each MAGMA function that is called. These interfaces are defined in the file magma.f90
  – Example:

    module magma
      Interface
        Integer Function magma_sgesv(n,nrhs,…) &
            BIND(C,name="magma_sgesv")
          use iso_c_binding
          implicit none
          integer(c_int), value :: n
          integer(c_int), value :: nrhs
          …
        end Function
      end Interface
    end module


MAGMA - Host

• Pseudo-code for a simple SGESV operation in MAGMA

    Program SGESV
      !Include the module that hosts your interface
      use magma
      use iso_c_binding
      !Define your arrays and variables
      Real(C_FLOAT) :: A(3,3), b(3)
      Integer(C_INT) :: piv(3), ok, status
      !Fill your `A` and `b` arrays then call
      !MAGMA_SGESV
      status = magma_sgesv(3,1,A,3,piv,b,3,ok)
      !Loop through and write(*,*) the contents of
      !array `b`
    end Program


MAGMA - Host

• Before compiling, make sure the MAGMA module, CUDA toolkit, and the GNU programming environment are loaded
  – magma.f90: Contains the Interface block module
  – sgesv.f90: Contains the Fortran source code
• To compile:

    $ module swap PrgEnv-pgi PrgEnv-gnu
    $ module load cudatoolkit magma
    $ ftn magma.f90 -lcuda -lmagma -lmagmablas sgesv.f90


MAGMA - Host

• C:
  – In C source code, no interface block is needed, unlike in Fortran
  – Simply #include <magma.h> in your code
  – When declaring variables to use with MAGMA functions, use magma_int_t instead of C's int type. Matrices for MAGMA's SGESV are of type float


MAGMA - Host

• Example code for C:

    #include <magma.h>
    #include <stdio.h>

    int main()
    {
        // Define arrays and variables for MAGMA
        // (MAGMA expects column-major storage, as in LAPACK)
        float b[3], A[3][3];
        magma_int_t m = 3, n = 1, piv[3], info;

        // Fill matrices A and b and call magma_sgesv()
        magma_sgesv(m, n, &A[0][0], m, piv, b, m, &info);
        // Loop through and print out returned array b
        return 0;
    }


MAGMA - Host

• Before compiling, make sure the MAGMA module, CUDA toolkit, and the GNU programming environment are loaded
• To compile:

    $ module swap PrgEnv-pgi PrgEnv-gnu
    $ module load cudatoolkit magma
    $ cc -lcuda -lmagma -lmagmablas sgesv.c


MAGMA DEVICE INTERFACE


MAGMA - Device

• MAGMA Device interface allows direct control over how the GPU (device) is managed
  – Memory allocation and transfer
  – Keeping matrices on the device


MAGMA - Device

• Fortran:
  – To call MAGMA device functions from Fortran, an Interface block needs to be written for each MAGMA function that is called. This is not required in C source code
  – Device functions are suffixed with _gpu
  – CUDA Fortran is used to manage memory on the device
  – All interface blocks need to be defined in a module (module magma)
    • If the module exists in a separate file, the file extension must be .cuf, just like the source file


MAGMA - Device

• Example:

    module magma
      Interface
        Integer Function magma_sgesv_gpu(n,nrhs,dA…) &
            BIND(C,name="magma_sgesv_gpu")
          use iso_c_binding
          use cudafor
          implicit none
          integer(c_int), value :: n
          integer(c_int), value :: nrhs
          real(c_float), device, dimension(*) :: dA
          …
        end Function
      end Interface
    …


MAGMA - Device

• The MAGMA initialize function is also needed (defined in the same interface block module):

    …
      Interface
        Integer Function magma_init() &
            BIND(C,name="magma_init")
          use iso_c_binding
          implicit none
        end Function
      end Interface
    end module


MAGMA - Device

• Pseudo-code for a simple SGESV operation in MAGMA

    Program SGESV
      !Include the module that hosts your interface
      use magma
      use cudafor
      use iso_c_binding
      !Define your arrays and variables
      Real(C_FLOAT) :: A(3,3), b(3)
      Real(C_FLOAT), device :: dA(3,3)
      Real(C_FLOAT), device :: dB(3)
      Integer(C_INT) :: piv(3), ok, status


MAGMA - Device

• Pseudo-code for a simple SGESV operation in MAGMA (continued)

      !Fill your `A` and `b` arrays then initialize
      !MAGMA
      status = magma_init()
      !Copy filled host arrays `A` and `b` to `dA`
      !and `dB` using CUDA Fortran
      dA = A
      dB = b
      !Call the device function
      status = magma_sgesv_gpu(3,1,dA,3,piv,dB,3,ok)
      !Copy results back to CPU
      b = dB
      !Loop through and write(*,*) the contents of
      !array `b`
    end Program


MAGMA - Device

• Before compiling, make sure the MAGMA module, CUDA toolkit, and the PGI programming environment are loaded
  – magma.cuf: Contains the module of Interface blocks
  – sgesv.cuf: Contains the Fortran source
• To compile:

    $ module load cudatoolkit magma
    $ ftn magma.cuf -lcuda -lmagma -lmagmablas sgesv.cuf


MAGMA - Device

• C:
  – In C device source code, no interface block is needed, unlike in Fortran
  – Simply #include <magma.h> in your code
  – When declaring variables to use with MAGMA functions, use magma_int_t instead of C's int type. Matrices for MAGMA's SGESV are of type float
  – Before running any MAGMA Device code, magma_init() must be called


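MAGMA - Device

• C:
  – To interact with the device (allocate matrices, transfer data, etc.), use the built-in MAGMA functions. The prefix letter selects the precision: s for float, d for double
    • Allocate on the device: magma_smalloc() (magma_dmalloc() for double)
    • Copy matrix to device: magma_ssetmatrix() (magma_dsetmatrix() for double)
    • Copy matrix to host: magma_sgetmatrix() (magma_dgetmatrix() for double)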


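MAGMA - Device

• Example code for C:

    #include <magma.h>
    #include <stdio.h>

    int main()
    {
        // Define arrays and variables for MAGMA
        // (MAGMA expects column-major storage, as in LAPACK)
        float b[3], A[3][3];
        float *A_d, *b_d;  // Device pointers
        magma_int_t m = 3, n = 1, piv[3], info;

        // Initialize MAGMA before any device call
        magma_init();

        // Fill matrices A and b and allocate device matrices
        // (single-precision variants, since the data is float)
        magma_smalloc(&A_d, m*m);
        magma_smalloc(&b_d, m);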


MAGMA - Device

• Example code for C (continued):

        // Transfer matrices to device
        magma_ssetmatrix(m, m, &A[0][0], m, A_d, m);
        magma_ssetmatrix(m, n, b, m, b_d, m);
        // Call the device sgesv function
        magma_sgesv_gpu(m, n, A_d, m, piv, b_d, m, &info);
        // Copy back computed matrix
        magma_sgetmatrix(m, n, b_d, m, b, m);
        // Loop through and print out returned array b
        return 0;
    }


MAGMA - Device

• Before compiling, make sure the MAGMA module, CUDA toolkit, and the GNU programming environment are loaded
• To compile:

    $ module swap PrgEnv-pgi PrgEnv-gnu
    $ module load cudatoolkit magma
    $ cc -lcuda -lmagma -lmagmablas sgesv_gpu.c