Optimizing Film, Media with OpenCL & Intel Quick Sync Video

Petter Larsson, Senior Software Engineer

Ryan Tabrah, Product Manager

SIGGRAPH 2012

The Intel® Vision: Enriching the lives of every person on earth through technology

Visual Tools for Developers

Both available free of charge:

• High-performance GPU acceleration for complete media pipelines

• Seamless surface sharing between Media and OpenCL contexts

• Effortless fallback to CPU processing on legacy platforms

• Complete GPU workload analysis via Intel Graphics Performance Analyzer

Intel SDK for OpenCL Applications 2012 is a comprehensive development environment for OpenCL applications

• An open standard compute model: enables applications with cross-architecture functional portability on 3rd Generation Intel Core™ processor-based platforms

• Extends programmability options on Intel platforms: augments developers' choice of programming tools on Intel platforms

Intel Media SDK 2012 is a great way to optimize applications to utilize the power of Intel Quick Sync Video

• Hardware-accelerated video encoding, decoding, and transcoding: fully utilize the power of Intel Core HD Graphics

Visual Tools for Developers

Here’s where you would expect a roadmap… You don’t need one.

• API future-proofs your software: code NOW and optimize for tomorrow’s platforms

• Deliver your products on your own timeline

• Single interface for all compute devices

• Span across OS and platform versions

Deliver the best, most efficient user experience to your customers, utilizing the full power of Intel Core CPU and HD Graphics technology.

Intel Quick Sync Video & OpenCL*: The Speed You Need

Live Demo with Sony Movie Studio 12

Media Conversion With Intel Quick Sync Video & OpenCL* Hardware Acceleration

Live Demo

Graphics and Media Interoperability with OpenCL*

| APIs | Extensions |
|---|---|
| DirectX <-> OpenCL | cl_khr_d3d10_sharing |
| OpenGL <-> OpenCL | cl_khr_gl_sharing, cl_khr_gl_event |
| DirectX Video Acceleration (DXVA) <-> OpenCL | cl_intel_dx9_media_sharing |
| Intel Media SDK <-> OpenCL | cl_intel_dx9_media_sharing |

[Columns "Intel HD Graphics support" and "CPU Device support": per-row marks not recoverable]

Interoperability with the Intel Media SDK, DirectX*, and OpenGL* APIs allows OpenCL developers to make better use of platform resources for graphics tasks

SDK Interoperability Sample/Demo - using cl_intel_dx9_media_sharing extension

Video stream

• H.264 (AVC)

• MPEG2

• VC1

• MJPEG

Media SDK video decode

• Decode to D3D NV12 surfaces

• DirectXVideoDecoderService surfaces

OpenCL frame processing

• Color effect (NV12)

• Water ripples (NV12)

• Twirl (RGB)

• Flip (RGB)

Render

• Rendered to window or full screen via VideoProcessBlt

[Diagram labels: "Shared surface" (shown twice), "GPU accelerated"]
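The decode stage above produces NV12 surfaces, and two of the demo effects operate on NV12 directly. As a rough CPU-side illustration of that layout (a full-resolution Y plane followed by a half-height plane of interleaved U/V bytes), the sketch below computes the plane sizes and applies a simple desaturation as a stand-in color effect. The names `Nv12Layout`, `nv12_layout`, and `desaturate_nv12`, the tightly packed buffer, and the choice of effect are all assumptions for illustration; the slides do not show the demo's kernels, and real D3D surfaces usually have a row pitch larger than the width.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sizes of the two NV12 planes for a tightly packed frame.
// Assumption: no row padding; width and height are even.
struct Nv12Layout {
    std::size_t luma_bytes;    // Y plane: one byte per pixel
    std::size_t chroma_bytes;  // interleaved UV plane: two bytes per 2x2 pixel block
    std::size_t total_bytes() const { return luma_bytes + chroma_bytes; }
};

Nv12Layout nv12_layout(std::size_t width, std::size_t height) {
    assert(width % 2 == 0 && height % 2 == 0);
    const std::size_t luma = width * height;
    return Nv12Layout{luma, luma / 2};
}

// Hypothetical "color effect": remove all color by setting every U and V
// sample to the neutral value 128. The luma plane is left untouched.
void desaturate_nv12(std::vector<std::uint8_t>& frame,
                     std::size_t width, std::size_t height) {
    const Nv12Layout layout = nv12_layout(width, height);
    assert(frame.size() == layout.total_bytes());
    for (std::size_t i = layout.luma_bytes; i < frame.size(); ++i) {
        frame[i] = 128;
    }
}
```

For a 1440x1088 frame, `nv12_layout` gives a 1,566,720-byte luma plane and 2,350,080 bytes in total, before any pitch padding.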

Setup buffers:

clCreateFromDX9MediaSurfaceINTEL

Processing:

1. clSetKernelArg

2. clEnqueueAcquireDX9ObjectsINTEL

3. clEnqueueNDRangeKernel/clEnqueueTask

4. clEnqueueReleaseDX9ObjectsINTEL

5. clFlush
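The five processing steps above have a fixed order: the shared surface must be acquired before the kernel touches it and released afterwards. The sketch below models only that ordering, using plain callbacks in place of the OpenCL calls so it runs without an OpenCL runtime. `FrameOps`, `process_frame`, `logged_step`, and `make_logging_ops` are hypothetical names, and the error policy (always release and flush once the acquire has succeeded) is an assumption, not something the slides specify.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins for the per-frame calls. A real application would
// invoke the OpenCL entry point named in each comment.
struct FrameOps {
    std::function<bool()> set_args;    // 1. clSetKernelArg
    std::function<bool()> acquire;     // 2. clEnqueueAcquireDX9ObjectsINTEL
    std::function<bool()> run_kernel;  // 3. clEnqueueNDRangeKernel / clEnqueueTask
    std::function<bool()> release;     // 4. clEnqueueReleaseDX9ObjectsINTEL
    std::function<bool()> flush;       // 5. clFlush
};

// Runs one frame in slide order. Once the surface has been acquired, release
// and flush are always issued, even if the kernel enqueue fails.
bool process_frame(const FrameOps& ops) {
    assert(ops.set_args && ops.acquire && ops.run_kernel && ops.release && ops.flush);
    if (!ops.set_args()) return false;
    if (!ops.acquire()) return false;
    bool ok = ops.run_kernel();
    ok = ops.release() && ok;  // release() is evaluated first, so it always runs
    ok = ops.flush() && ok;
    return ok;
}

// Test helper: a step that records its name in `log` and returns `result`.
std::function<bool()> logged_step(std::vector<std::string>& log,
                                  std::string name, bool result) {
    return [&log, name, result]() { log.push_back(name); return result; };
}

FrameOps make_logging_ops(std::vector<std::string>& log, bool kernel_ok) {
    return FrameOps{logged_step(log, "set_args", true),
                    logged_step(log, "acquire", true),
                    logged_step(log, "run_kernel", kernel_ok),
                    logged_step(log, "release", true),
                    logged_step(log, "flush", true)};
}
```

Passing `make_logging_ops(log, false)` to `process_frame` simulates a failed kernel enqueue: the call returns false, but the log still contains the release and flush steps.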

SDK Interoperability Sample/Demo - integration specifics

• Code based on Intel Media SDK “sample_decode”

– Includes common file access, memory, and device management functions

• OpenCL processing class

– Handles OpenCL device setup/teardown and frame processing

– OpenCL processing applied on D3D surface before rendering

– Integrated with the “sample_decode” renderer class

• Features and effects selected via keyboard input

Next… Demo… Quick code walkthrough…

SDK Interoperability Sample Code walkthrough

Code “fly-by” will showcase the following

• How to set up the OpenCL environment with surface sharing specifics

– DX9_MEDIA_SHARING / “cl_ext.h”

– Check for “cl_intel_dx9_media_sharing” extension availability

– Hook up OCL D3D sharing extensions

• How to set up Media SDK sessions and the basic decode process

• DXVA surface allocation: how to create shared handles

• Media SDK/OpenCL: key integration points

• OpenCL kernel code for the demo effects
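The extension availability check mentioned above usually comes down to searching the space-separated extension string that `clGetDeviceInfo` returns for `CL_DEVICE_EXTENSIONS`. A minimal sketch of that search follows; `has_extension` is a hypothetical helper, not part of OpenCL or the sample code, and it takes the already-retrieved string so that it runs without an OpenCL runtime.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Returns true if `name` appears as a whole space-separated token in
// `extensions`. Matching whole tokens avoids a false positive when one
// extension name is a prefix of another.
bool has_extension(const std::string& extensions, const std::string& name) {
    assert(!name.empty());
    std::istringstream tokens(extensions);
    std::string token;
    while (tokens >> token) {
        if (token == name) return true;
    }
    return false;
}
```

A caller would pass the device's extension string and "cl_intel_dx9_media_sharing", and fall back to a non-shared path when the result is false.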

SDK Interoperability Sample/Demo - performance

• Demo Benchmark : 1440x1088 AVC video clip

• Analysis of GPU performance using Intel Graphics Performance Analyzer (GPA)

| Workload | fps | CPU (%) | GPU EU (%) | GPU decode (%) |
|---|---|---|---|---|
| HW decode + OCL color effect | 225 | 28 | 95 | 20 |
| HW decode (30 fps) + OCL color effect | 30 | 4 | 45 | 6 |
| SW decode + OCL color effect | 110 | 95 | 65 | 0 |
| SW decode (30 fps) + OCL color effect | 30 | 25 | 40 | 0 |
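One way to read the unthrottled rows of the table is to compare throughput and CPU cost directly. The sketch below does that arithmetic on the fps and CPU figures copied from the table; the `Workload` struct, the two constants, and the "fps per CPU percent" measure are assumptions introduced here for illustration, not metrics reported in the slides.

```cpp
#include <cassert>

// One benchmark row: frames per second and utilization percentages.
struct Workload {
    double fps;
    double cpu_percent;
    double gpu_eu_percent;
    double gpu_decode_percent;
};

// Values copied from the unthrottled rows of the benchmark table.
const Workload kHwDecode{225.0, 28.0, 95.0, 20.0};
const Workload kSwDecode{110.0, 95.0, 65.0, 0.0};

// Throughput of `a` relative to `b`.
double fps_ratio(const Workload& a, const Workload& b) {
    assert(b.fps > 0.0);
    return a.fps / b.fps;
}

// Frames per second delivered per percentage point of CPU load
// (a rough efficiency measure chosen for this sketch).
double fps_per_cpu_percent(const Workload& w) {
    assert(w.cpu_percent > 0.0);
    return w.fps / w.cpu_percent;
}
```

With these figures, hardware decode delivers roughly twice the throughput of software decode (225 vs 110 fps) at under a third of the CPU load (28% vs 95%).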

What’s in the future?

• DirectX 11

– Unifies API for both video and 3D content

– ID3D11VideoDevice: Decode directly tied to DX11 / DXGI

– Direct Flip : Save a memory copy during playback of a video frame

• OpenCL 1.2

– DX11 buffer sharing extension (cl_khr_d3d11_sharing)

– cl_intel_dx9_media_sharing promoted to cl_khr_dx9_media_sharing

• GPA improvements

– DX11 support

THANK YOU!

• Download these tools now for free:

http://intel.com/software/vcsource

• Follow us on Twitter:

@IntelMediaSDK

@IntelOpenCL

@IntelVCDev

Accelerate Visual Development Faster

SDK Interoperability Resources

Media SDK

• OpenCL plug-in sample: Simple rotate kernel (not using shared surfaces)

OpenCL SDK

• ResourceSharing sample: D3D10 buffer & DXVA surface sharing

• MediaSDKInterop sample: Media SDK plug-in; OpenCL post processing effects

Session demo

• Media SDK decode – OpenCL post processing

Additional Resources

• Collecting OpenCL*-related Metrics with Intel® Graphics Performance Analyzers

• Using Intel® Graphics Performance Analyzer (GPA) to analyze Intel® Media Software Development Kit-enabled applications

• Performance Interactions of OpenCL* Code and Intel® Quick Sync Video on Intel® HD Graphics 4000

• Forums

– Intel Media SDK

– Intel OpenCL SDK

Legal Disclaimer and Optimization Notice

INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. Copyright © , Intel Corporation. All rights reserved. Intel, the Intel logo, Xeon, Core, VTune, and Cilk are trademarks of Intel Corporation in the U.S. and other countries.

Optimization Notice

Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804