OpenCL Programming 101

www.dsp-ip.com Fast Forward Your Development OpenCL Host Programming

Upload: yossi-cohen

Post on 15-Jan-2015


Category: Technology



DESCRIPTION

OpenCL Programming, Part 1. This lecture reviews host programming in OpenCL, a Khronos standard for parallel programming.

TRANSCRIPT

Page 1: OpenCL Programming 101


OpenCL Host Programming

Page 2: OpenCL Programming 101


OPENCL™ EXECUTION MODEL

Page 3: OpenCL Programming 101


OpenCL™ Execution Model

• Kernel

▫ Basic unit of executable code, similar to a C function

▫ Data-parallel or task-parallel

▫ A complete application such as an H.264 encoder is not a kernel

▫ A kernel should be a small, separate function, e.g. SAD (sum of absolute differences)

• Program

▫ Collection of kernels and other functions

▫ Analogous to a dynamic library

• Applications queue kernel execution instances

▫ Queued in-order

▫ Executed in-order or out-of-order

Page 4: OpenCL Programming 101


Data-Parallelism in OpenCL™

• Define N-dimensional computation domain (N = 1, 2 or 3)

▫ Each independent element of execution in the N-D domain is called a work-item

▫ The N-D domain defines the total number of work-items that execute in parallel

▫ Example: for a 1024 x 1024 image the problem dimensions are 1024 x 1024, with 1 kernel execution per pixel: 1,048,576 total executions

Scalar:

    void
    scalar_mul(int n,
               const float *a,
               const float *b,
               float *result)
    {
        int i;
        for (i = 0; i < n; i++)
            result[i] = a[i] * b[i];
    }

Data-Parallel:

    kernel void
    dp_mul(global const float *a,
           global const float *b,
           global float *result)
    {
        int id = get_global_id(0);
        result[id] = a[id] * b[id];
    }
    // execute dp_mul over "n" work-items
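The contrast between the scalar loop and the data-parallel kernel can be tried on the host without an OpenCL runtime. The following is a minimal C++ sketch, not OpenCL code: `run_over_range`, `dp_mul_emulated` and `work_items_2d` are hypothetical names introduced here for illustration, and the sequential loop only stands in for work-items that a real device may execute in parallel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the OpenCL runtime: calls `kernel` once per
// work-item of a 1-D domain, passing the index get_global_id(0) would return.
template <typename Kernel>
void run_over_range(std::size_t global_size, Kernel kernel) {
    for (std::size_t id = 0; id < global_size; ++id) {
        kernel(id);  // a real device may run these in parallel, in any order
    }
}

// Host-side equivalent of dp_mul: each work-item handles exactly one element.
std::vector<float> dp_mul_emulated(const std::vector<float>& a,
                                   const std::vector<float>& b) {
    assert(a.size() == b.size());  // both inputs must cover the same domain
    std::vector<float> result(a.size());
    run_over_range(a.size(), [&](std::size_t id) {
        result[id] = a[id] * b[id];
    });
    return result;
}

// Total number of work-items in a 2-D domain (one per pixel for an image).
std::size_t work_items_2d(std::size_t width, std::size_t height) {
    return width * height;
}
```

Calling `dp_mul_emulated` on {1, 2, 3} and {4, 5, 6} yields {4, 10, 18}, and `work_items_2d(1024, 1024)` gives the 1,048,576 executions of the image example.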

Page 5: OpenCL Programming 101


Compiling Kernels

• Create a program

▫ Input: String (source code) or precompiled binary

▫ Analogous to a dynamic library: A collection of kernels

• Compile the program

▫ Specify the devices for which kernels should be compiled

▫ Pass in compiler flags

▫ Check for compilation/build errors

• Create the kernels

▫ Returns a kernel object used to hold arguments for a given execution

Page 6: OpenCL Programming 101


EX-1: OPENCL "HELLO WORLD"

Page 7: OpenCL Programming 101


Page 8: OpenCL Programming 101


Basic program structure

Include

Get Platform Info

Create Context

Load & compile program

Create Queue

Load and Run Kernel


Page 9: OpenCL Programming 101


Includes

    #include <cstdio>
    #include <cstdlib>
    #include <cstring>   // strcmp, used when matching the platform vendor
    #include <iostream>
    #include <SDKFile.hpp>
    #include <SDKCommon.hpp>
    #include <SDKApplication.hpp>
    #include <CL/cl.hpp>

• Make sure to include all the required OpenCL header files

Page 10: OpenCL Programming 101


GetPlatformInfo

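    cl_int err;
    std::vector<cl::Platform> platforms;

    // Query all OpenCL platforms installed on this machine
    err = cl::Platform::get(&platforms);
    if (err != CL_SUCCESS)
    {
        std::cerr << "Platform::get() failed (" << err << ")" << std::endl;
        return SDK_FAILURE;
    }

    // Look for the AMD platform by its vendor string
    std::vector<cl::Platform>::iterator i;
    for (i = platforms.begin(); i != platforms.end(); ++i)
    {
        if (!strcmp((*i).getInfo<CL_PLATFORM_VENDOR>(&err).c_str(),
                    "Advanced Micro Devices, Inc."))
        {
            break;
        }
    }
    // Stop if it was not found: the context creation step dereferences i
    if (i == platforms.end())
    {
        std::cerr << "AMD platform not found\n";
        return SDK_FAILURE;
    }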

• Detects the OpenCL platforms in the system and selects the AMD one

▫ Each platform exposes its "Devices": CPUs, GPUs & DSPs
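The vendor search above finds nothing on a machine without the AMD platform, so the selection policy deserves a decision. One possible policy, sketched below in plain C++ on vendor strings only, is to prefer a given vendor and otherwise fall back to the first platform. `pick_platform` is a hypothetical helper introduced for illustration; it is not part of the OpenCL API or of the SDK sample.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the index of the first vendor string equal to
// `preferred`, else 0 as a fallback, or -1 when no platform exists at all.
int pick_platform(const std::vector<std::string>& vendors,
                  const std::string& preferred) {
    assert(!preferred.empty());  // an empty vendor name can never match
    if (vendors.empty()) {
        return -1;  // nothing installed: the caller must report an error
    }
    for (std::size_t idx = 0; idx < vendors.size(); ++idx) {
        if (vendors[idx] == preferred) {
            return static_cast<int>(idx);
        }
    }
    return 0;  // preferred vendor missing: fall back to the first platform
}
```

In a real host program the vendor strings would come from `getInfo<CL_PLATFORM_VENDOR>()` on each platform, and a result of -1 would be treated as a failure before any context is created.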

Page 11: OpenCL Programming 101


Create Context

    // Context properties: use the platform selected in the previous step
    cl_context_properties cps[3] =
        { CL_CONTEXT_PLATFORM, (cl_context_properties)(*i)(), 0 };

    std::cout << "Creating a context on the AMD platform\n";
    // Create a context containing the CPU devices of that platform
    cl::Context context(CL_DEVICE_TYPE_CPU, cps, NULL, NULL, &err);
    if (err != CL_SUCCESS)
    {
        std::cerr << "Context::Context() failed (" << err << ")\n";
        return SDK_FAILURE;
    }

• The context is the environment in which command queues and memory objects are created and shared between devices

Page 12: OpenCL Programming 101


Load Program

    std::cout << "Loading and compiling CL source\n";

    // Read the kernel source file into memory (SDKFile is an SDK helper class)
    streamsdk::SDKFile file;
    if (!file.open("HelloCL_Kernels.cl"))
    {
        std::cerr << "We couldn't load CL source code\n";
        return SDK_FAILURE;
    }

    // Wrap the source text and create a program object from it
    cl::Program::Sources
        sources(1, std::make_pair(file.source().data(),
                                  file.source().size()));
    cl::Program program = cl::Program(context, sources, &err);
    if (err != CL_SUCCESS)
    {
        std::cerr << "Program::Program() failed (" << err << ")\n";
        return SDK_FAILURE;
    }

• Loads the kernel source file (*.cl) and creates a program object from it

Page 13: OpenCL Programming 101


Compile program

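    // Assumed definition (not shown on the slide): all devices in the context
    std::vector<cl::Device> devices = context.getInfo<CL_CONTEXT_DEVICES>();

    err = program.build(devices);
    if (err != CL_SUCCESS) {
        if (err == CL_BUILD_PROGRAM_FAILURE) {
            // Handle error, e.g. print the compiler log for the first device
            std::cerr << program.getBuildInfo<CL_PROGRAM_BUILD_LOG>(devices[0]) << "\n";
        }
        std::cerr << "Program::build() failed (" << err << ")\n";
        return SDK_FAILURE;
    }

• The host program compiles the kernel program for each device.

• Why compile at run time? As in Java, the target device is not known until the program runs. The host can then decide at run time, for example based on load balancing, which device to run on.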

Page 14: OpenCL Programming 101


Create Kernel with program

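    // Look up the kernel function named "hello" in the built program
    cl::Kernel kernel(program, "hello", &err);
    if (err != CL_SUCCESS)
    {
        std::cerr << "Kernel::Kernel() failed (" << err << ")\n";
        return SDK_FAILURE;
    }
    // Kernel arguments would be set here with kernel.setArg(); only the
    // error check that follows such calls is shown
    if (err != CL_SUCCESS) {
        std::cerr << "Kernel::setArg() failed (" << err << ")\n";
        return SDK_FAILURE;
    }

• Associates a kernel object with a function in our loaded and compiled program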

Page 15: OpenCL Programming 101


Create Queue per device & Run it

    // Create a command queue for the first device in the context
    cl::CommandQueue queue(context, devices[0], 0, &err);

    std::cout << "Running CL program\n";
    err = queue.enqueueNDRangeKernel(…..);

    // Block until every queued command has completed
    err = queue.finish();
    if (err != CL_SUCCESS) {
        std::cerr << "CommandQueue::finish() failed (" << err << ")\n";
    }

• Creates a command queue for the device and enqueues the kernel on it. Execution does not have to start immediately

• Attention: enqueue calls are asynchronous: when the function returns, the kernel has not necessarily finished, or even started, executing
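The asynchronous behaviour can be illustrated without an OpenCL device. The sketch below is a plain C++ emulation under a simplifying assumption: `DeferredQueue` is a hypothetical class that postpones every command until `finish()` is called, whereas a real OpenCL runtime may start a command at any time after it is enqueued. It shows only the contract that matters to the host: results may be relied on after `finish()` returns, not after `enqueue()` returns.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>

// Hypothetical stand-in for an in-order command queue: enqueue() only records
// the command, and finish() runs every pending command in submission order.
class DeferredQueue {
public:
    // Returns immediately; the command has not run when this call returns.
    void enqueue(std::function<void()> command) {
        assert(static_cast<bool>(command));  // an empty function cannot run
        pending_.push(std::move(command));
    }

    // Returns only after all queued commands have completed.
    void finish() {
        while (!pending_.empty()) {
            std::function<void()> command = std::move(pending_.front());
            pending_.pop();
            command();
        }
    }

    // Number of commands that were enqueued but have not run yet.
    std::size_t pending() const { return pending_.size(); }

private:
    std::queue<std::function<void()>> pending_;
};
```

A host that enqueues two commands and inspects their output before calling `finish()` sees no effect yet; after `finish()` both have run, in the order they were queued.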

Page 16: OpenCL Programming 101


And that’s All Folks?

• Naaaa... We still need to learn:

▫ Writing Kernel functions

▫ Synchronizing Kernel Functions

▫ Setting arguments to kernel functions

▫ Passing data from/to Host


Page 17: OpenCL Programming 101


References

• “OpenCL Hello World” is an ATI OpenCL SDK programming exercise

• ATI OpenCL slides


Page 18: OpenCL Programming 101


DSP-IP Contact information

Download slides at: www.dsp-ip.com

Course materials & lecture request

www.dsp-ip.com

Mail: [email protected], Phone: +972-9-8850956, Fax: +972-50-8962910

Yossi Cohen

[email protected]

+972-9-8850956