indicthreads-pune12-accelerating computation in html 5

Upload: indicthreads

Post on 04-Apr-2018


  • 7/30/2019 IndicThreads-Pune12-Accelerating Computation in HTML 5

    1/26

    Accelerating Computation in HTML 5

    Ashish Shah

    SAS R&D India


    Outline

    Multicore Computing

    Problem statement

    Demo

    Introduction to OpenCL and WebCL

    Conclusion

    References


    Multicore Computing


    Problem statement

    Layout algorithm for node-linked graphs

    Algorithm

    Layout
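A minimal sketch of why a graph layout step is a natural candidate for parallel execution. The slides do not name the layout algorithm; this sketch assumes a force-directed layout in which every node (particle) repels every other node. The function name `repulsionStep` and the strength constant `k` are hypothetical.

```javascript
// Assumption: a force-directed layout where every particle repels every other.
// positions is a flat array [x0, y0, x1, y1, ...]; k is a hypothetical
// repulsion strength. Returns new positions without mutating the input.
function repulsionStep(positions, k) {
  const n = positions.length / 2;
  const next = new Array(positions.length);
  // Each outer iteration reads only the old positions and writes only its
  // own slot, so every iteration could be handed to a separate work-item.
  for (let i = 0; i < n; i++) {
    let fx = 0;
    let fy = 0;
    for (let j = 0; j < n; j++) {
      if (i === j) continue;
      const dx = positions[2 * i] - positions[2 * j];
      const dy = positions[2 * i + 1] - positions[2 * j + 1];
      const distSq = dx * dx + dy * dy + 1e-9; // avoid division by zero
      fx += (k * dx) / distSq;
      fy += (k * dy) / distSq;
    }
    next[2 * i] = positions[2 * i] + fx;
    next[2 * i + 1] = positions[2 * i + 1] + fy;
  }
  return next;
}
```

Under this assumption the work grows quadratically with the number of particles, which is the kind of load where spreading the outer loop over many cores pays off.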


    DEMO

    Demo 1 - Serial version

    Demo 2 - Parallel version with multi-core CPU

    Demo 3 - Parallel version with many-core GPU


    Performance analysis

    Chart: time in ms against number of particles


    Introduction to OpenCL

    Open Computing Language, a C-like language

    Framework for writing parallel algorithms

    Targets heterogeneous platforms

    Initially developed by Apple

    An open standard, maintained by the Khronos Group


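    Example of adding two vectors

    Serial version

    for (i = 0; i < n; i++)
        c[i] = a[i] + b[i];

    Using OpenCL

    // element type float is assumed; the slide leaves the arguments untyped
    __kernel void add(__global const float *a, __global const float *b, __global float *c)
    {
        int i = get_global_id(0); // get thread id
        c[i] = a[i] + b[i];
    }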
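The contrast between the two versions can be sketched in plain JavaScript. This is only an emulation: `launch` is a hypothetical stand-in for an OpenCL kernel launch and calls the kernel body once per index in sequence, whereas a real device would run the work-items concurrently.

```javascript
// Serial version: one loop visits every element in order.
function addSerial(a, b) {
  const c = new Float32Array(a.length);
  for (let i = 0; i < a.length; i++) {
    c[i] = a[i] + b[i];
  }
  return c;
}

// Kernel-style version: the body handles exactly one index and has no loop.
function addKernel(globalId, a, b, c) {
  c[globalId] = a[globalId] + b[globalId];
}

// Hypothetical stand-in for a kernel launch: one call per work-item.
function launch(globalSize, kernel, ...args) {
  for (let id = 0; id < globalSize; id++) {
    kernel(id, ...args);
  }
}

function addParallelStyle(a, b) {
  const c = new Float32Array(a.length);
  launch(a.length, addKernel, a, b, c);
  return c;
}
```

The point of the rewrite is that the loop disappears from the kernel: the runtime supplies the index, so no work-item depends on another.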


    OpenCL Platform

    Host

    Devices: Intel CPU, Compute Device 1 (GPU 1), GPU 2

    Compute units (cores)


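    OpenCL Execution Model

    1. Kernel

    2. Work-items

    3. Work group

    4. ND-range

    5. Program

    6. Memory objects

    7. Command queues

    // element type float is assumed; the slide leaves the arguments untyped
    __kernel void add(__global const float *a, __global const float *b, __global float *c)
    {
        int i = get_global_id(0); // get thread/work-item id
        c[i] = a[i] + b[i];
    }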
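How work-items, work groups and the ND-range relate can be sketched for the one-dimensional case. With no global offset, a work-item's global id is its group id times the work-group size plus its local id. The function name `enumerateWorkItems` is hypothetical, and the code only lists the ids rather than running anything on a device.

```javascript
// Enumerates the work-items of a one-dimensional ND-range.
// globalSize must be a multiple of localSize (the work-group size).
function enumerateWorkItems(globalSize, localSize) {
  if (globalSize % localSize !== 0) {
    throw new Error("globalSize must be a multiple of localSize");
  }
  const items = [];
  const numGroups = globalSize / localSize;
  for (let groupId = 0; groupId < numGroups; groupId++) {
    for (let localId = 0; localId < localSize; localId++) {
      // Same relation a kernel sees through get_group_id, get_local_id
      // and get_global_id when the global offset is zero.
      items.push({ groupId, localId, globalId: groupId * localSize + localId });
    }
  }
  return items;
}
```

For example, `enumerateWorkItems(8, 4)` describes two work groups of four work-items each, with global ids 0 to 7.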


    Memory Model in OpenCL

    Compute Device

    Global memory (DRAM), shared by all compute units

    Global constant memory (DRAM)

    Compute unit 0, Compute unit 1, Compute unit 2: each has its own local memory/cache and private registers
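One common way to use this hierarchy is to stage a tile of global data in local memory, combine it there, and write a single value per work group back to global memory. The sketch below emulates that pattern in plain JavaScript for a sum; the name `groupSums` is hypothetical and the work-group size is assumed to be a power of two.

```javascript
// Emulates a per-work-group sum. On a device each inner loop iteration would
// be a work-item, and a barrier would separate the reduction strides.
function groupSums(globalData, localSize) {
  const numGroups = Math.ceil(globalData.length / localSize);
  const partial = new Float64Array(numGroups); // results in "global" memory
  for (let g = 0; g < numGroups; g++) {
    // Stage one tile in "local" memory; pad with zeros past the end.
    const local = new Float64Array(localSize);
    for (let lid = 0; lid < localSize; lid++) {
      const gid = g * localSize + lid;
      local[lid] = gid < globalData.length ? globalData[gid] : 0;
    }
    // Tree reduction: halve the number of active work-items each step.
    for (let stride = localSize >> 1; stride > 0; stride >>= 1) {
      for (let lid = 0; lid < stride; lid++) {
        local[lid] += local[lid + stride];
      }
    }
    partial[g] = local[0];
  }
  return partial;
}
```

Each input element is read from global memory once; all intermediate sums stay in the faster local tile.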


    Programming model

    1. Data parallel: a single function on multiple data

    2. Task parallel: multiple functions on single data
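The two models can be illustrated with a small JavaScript sketch. The helper names `dataParallel` and `taskParallel` are hypothetical, and the calls run one after another here; the point is only which pieces of work are independent and could run concurrently on a device.

```javascript
// Data parallel: one function applied to every element independently.
function dataParallel(data, fn) {
  return data.map((x) => fn(x));
}

// Task parallel: several different functions applied to the same data.
function taskParallel(data, tasks) {
  const results = {};
  for (const [name, task] of Object.entries(tasks)) {
    results[name] = task(data);
  }
  return results;
}

const squared = dataParallel([1, 2, 3, 4], (x) => x * x);
const stats = taskParallel([1, 2, 3, 4], {
  sum: (d) => d.reduce((acc, x) => acc + x, 0),
  max: (d) => Math.max(...d),
});
```

In the first case the unit of parallelism is the element; in the second it is the function.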


    OpenCL Stack

    Applications (HTML, Java, .NET, C, C++)

    Kernels (string data)

    OpenCL API (Java, C, .NET, WebCL)

    OpenCL Framework: context and memory APIs

    OpenCL Runtime: command queues, buffer objects, kernel execution

    Compiler

    Device driver

    OpenCL device (GPU/CPU hardware)


    Essential Development Tasks

    1. Parallelize code into kernels: C code with restrictions

    2. Initialize the OpenCL environment: query the compute device, create a context, compile the kernels

    3. Initialize kernels and data: create memory objects, map data structures to OpenCL-supported data structures, initialize kernel parameters

    4. Execute the kernel: specify the number of threads to execute the task, trigger the execution of the kernel (sync or async)

    5. Read back data to the host: map to the application data structure
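Specifying the number of threads usually involves a small calculation. In OpenCL 1.x, when a work-group size is given, the global work size has to be evenly divisible by it, so the item count is commonly rounded up. A minimal sketch, with the hypothetical helper name `paddedGlobalSize`:

```javascript
// Rounds the number of work-items up to a multiple of the work-group size.
function paddedGlobalSize(itemCount, localSize) {
  if (!Number.isInteger(itemCount) || itemCount < 0) {
    throw new RangeError("itemCount must be a non-negative integer");
  }
  if (!Number.isInteger(localSize) || localSize < 1) {
    throw new RangeError("localSize must be a positive integer");
  }
  return Math.ceil(itemCount / localSize) * localSize;
}
```

When the size is padded this way, the kernel needs a bounds check (for example `if (i < n)`) so that the extra work-items do not touch memory past the end of the buffers.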


    Introduction to WebCL

    JavaScript bindings for OpenCL

    First announced in March 2011 by Khronos

    API definition underway

    Prototype plugin is available only for the Firefox browser


    Binding OpenCL to WebCL

    Host application (JavaScript), running on the CPU

    WebCL

    OpenCL Framework

    OpenCL-compliant device


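    Coding with WebCL

    // Pick a platform and create a context on its CPU device
    platforms = WebCL.getPlatformIDs();
    context = WebCL.createContextFromType([WebCL.CL_CONTEXT_PLATFORM, platforms[0]],
                                          WebCL.CL_DEVICE_TYPE_CPU);
    devices = context.getContextInfo(WebCL.CL_CONTEXT_DEVICES);

    // Create the program from the kernel source string and get a kernel from it
    // (a program must be built for the device before kernels are created; the slide does not show that call)
    program = context.createProgramWithSource(kernelSrc);
    kernelfunction1 = program.createKernel("function1");

    // Allocate a device buffer and copy the input into it (blocking write)
    buffparam = context.createBuffer(WebCL.CL_MEM_READ_WRITE, bufSize);
    cmdQueue = context.createCommandQueue(devices[0], 0);
    cmdQueue.enqueueWriteBuffer(buffparam, true, 0, bufSize, parameter, []);

    // Bind the buffer to kernel argument 0 and run the kernel over a 1-D range
    kernelfunction1.setKernelArg(0, buffparam, WebCL.types.float2);
    cmdQueue.enqueueNDRangeKernel(kernelfunction1, 1, [], totalWorkitems, totalWorkgroups, []);
    cmdQueue.finish();

    // Copy the result buffer back to the host (blocking read)
    cmdQueue.enqueueReadBuffer(xyz, true, 0, bufSize, xyzParam, []);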
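The host code passes the kernel to the runtime as string data through `kernelSrc`, which the slide does not show. A minimal sketch of what such a string might hold, reusing the vector-add kernel from the earlier slide with `float` elements assumed; the helper `kernelNames` is hypothetical and is only a convenience for checking the name handed to `createKernel`.

```javascript
// Hypothetical contents for kernelSrc: the vector-add kernel from the earlier
// slide, with float elements assumed.
const kernelSrc = [
  "__kernel void add(__global const float *a,",
  "                  __global const float *b,",
  "                  __global float *c)",
  "{",
  "    int i = get_global_id(0); // work-item id",
  "    c[i] = a[i] + b[i];",
  "}",
].join("\n");

// Lists the kernel names declared in a source string.
// Uses a simple regular expression, not a real OpenCL C parser.
function kernelNames(src) {
  const names = [];
  const pattern = /__kernel\s+void\s+(\w+)\s*\(/g;
  let match;
  while ((match = pattern.exec(src)) !== null) {
    names.push(match[1]);
  }
  return names;
}
```

With this source, the name passed to `createKernel` would be `"add"`; a mismatch between that name and the source is an easy mistake to catch before the program is built.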


    Applications of OpenCL

    Database mining

    Neural networks

    Physics-based simulation, mechanics

    Image processing

    Speech processing

    Weather forecasting and climate research

    Bioinformatics


    Conclusion

    Significant performance gains from using OpenCL for computations in client-side environments like HTML5

    Algorithms need to be parallelizable

    Further optimizations can be achieved by exploiting the memory model


    Software/Hardware used in demo application

    Hardware

    Intel(R) Core(TM)2 Quad core CPU Q8400 @ 2.66GHz

    Nvidia Quadro NVS 160M, 8 cores @ 580 MHz

    Software

    OpenCL runtime for CPU: http://software.intel.com/en-us/articles/vcsource-tools-opencl-sdk/

    OpenCL runtime for GPU: http://www.nvidia.com/object/quadro_nvs_notebook.html

    WebCL plugin for Firefox: http://webcl.nokiaresearch.com/


    References

    http://www.macresearch.org/opencl

    http://en.wikipedia.org/wiki/GPGPU

    http://www.khronos.org/webcl/
