How to realize high-performance compute with Multicore DSP

Upload: antonio-hagan

Post on 26-Mar-2015

Page 1: 1 How to realize high-performance compute with Multicore DSP

How to realize high-performance compute with Multicore DSP

Page 2: 1 How to realize high-performance compute with Multicore DSP

TI Confidential – NDA Restrictions

C667x Target Applications (Non-Telecom)

Emerging Others

Test and Automation

Mission Critical

Infrastructure Audio

HPC, Imaging and Medical

Video Infrastructure

Emerging Broadband

Innovations

Page 3: 1 How to realize high-performance compute with Multicore DSP

RF and Communication Applications

Key Customer Careabouts
• Long Term Partnership
• Financial Stability
• Strong Roadmap and R&D
• Floating Point Performance
• Size, Weight, and Power (SWaP)
• I/O Bandwidth
• Longevity of supply (10+ yrs)

Market segments: Military & Defense, Avionics, Govt & Public Safety

Applications
• ISR (Intelligence/Surveillance/Reconnaissance)
  o SIGINT/COMINT/Signal Generators
• Military Communications
  o SDR (JTRS) - Manpack/LMR/Fixed
  o Comm. Infra - VoIP/Video Gateways
• Satellite/Avionics Communications
  o Ground Receiver/Repeaters
  o Weather Radar
• FAA – Civil Aviation/Govt Comm.
• Conventional PS – TETRA/APCO/E911
  o Wireless Infrastructure
  o Comm. Infra - VoIP/Video Gateways
• Emerging Broadband (OFDM/LTE/WiMAX)
  o Utilities/Transport/Smart Grid

Page 4: 1 How to realize high-performance compute with Multicore DSP

RF and Comm. Product Requirements

End Product Need
• Support Multiple Waveforms
• Common Platform for TDMA/CDMA/OFDMA
• Multi-channel VoIP/Video capability
• Support FEC and Modulation
• TCP/IP Networking support

DSP Requirement
• Needs Raw Performance in terms of MIPS/GHz/MMACS
• Floating Point Capable ISA to achieve “precision” and high GFLOPS
• Large On Chip RAM – Reduce accesses to slow external memory
• High Speed External Memory Interface
• Large addressable memory
• Efficient DMA architecture
• Wireless specific accelerators and TCP/IP Offload

Page 5: 1 How to realize high-performance compute with Multicore DSP

Imaging Product Requirements

End Product Need
• Reliability in Mission Critical Designs
• Low Power Design
• High BW Interface RF Front End and Telecom ports
• Connect Multiple DSPs on a board e.g. in ATCA Card
• High BW Backplane and Network Connectivity
• Ease of Use

DSP Requirement
• Needs multiple high speed interfaces
  – PCIe, Serial RapidIO
  – OBSAI/CPRI Interface
  – Gigabit Ethernet etc.
• Memory Error Correction & Checking (ECC)
• Efficient Low Power DSPs
• Support Extended Temp ranges from -40°C to 105°C and other temperature ranges
• Dev and Debug Tools
• Multicore S/W Frameworks
• Signal/Image Processing functions
• VoIP Library
• Audio/Video Codecs

Page 6: 1 How to realize high-performance compute with Multicore DSP

Introducing “Keystone Architecture” (C66x)
The Best Combination of Performance (GHz) and Power Consumption in the Industry

• 16 GFLOPS & 32 GMACS per Core @ 1GHz
• Fixed and Floating-point Core @ 1.25 GHz
• 4x C64x+ MAC (32), 4x C67x Fl pt MAC (8)
• 16 FLOP/cy compared to 6 FLOP/cy
• 8 Core C6678 based on C66x core delivers 320 GMACS/160 GFLOPS @ 1.25GHz/Core (effectively a 10GHz DSP)
• 100% Code Compatible with all C64x (fixed) & C67x (floating) Devices
• Similar Power Profiles as C64x Core
• Supported by Code Composer Studio IDE
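The headline numbers on this slide follow from simple peak-rate arithmetic: operations per cycle, times clock rate, times core count. The sketch below reproduces them from the per-core figures quoted above. The function and variable names are illustrative only, and the results are theoretical peaks, not measured throughput.

```python
# Theoretical peak throughput from the slide figures (not measured performance).
MACS_PER_CYCLE = 32   # fixed-point MACs per cycle per C66x core, per the slide
FLOPS_PER_CYCLE = 16  # floating-point operations per cycle per core, per the slide


def peak_giga_ops(ops_per_cycle: int, clock_ghz: float, cores: int = 1) -> float:
    """Peak giga-operations/s = operations per cycle x clock in GHz x core count."""
    return ops_per_cycle * clock_ghz * cores


# Per-core figures at 1 GHz.
per_core_gmacs = peak_giga_ops(MACS_PER_CYCLE, 1.0)
per_core_gflops = peak_giga_ops(FLOPS_PER_CYCLE, 1.0)

# 8-core C6678 at 1.25 GHz per core.
c6678_gmacs = peak_giga_ops(MACS_PER_CYCLE, 1.25, cores=8)
c6678_gflops = peak_giga_ops(FLOPS_PER_CYCLE, 1.25, cores=8)

# The "effectively a 10GHz DSP" figure is the sum of the core clocks.
aggregate_clock_ghz = 1.25 * 8

if __name__ == "__main__":
    print(f"Per core @ 1 GHz: {per_core_gmacs} GMACS, {per_core_gflops} GFLOPS")
    print(f"C6678 @ 1.25 GHz: {c6678_gmacs} GMACS, {c6678_gflops} GFLOPS")
    print(f"Aggregate clock:  {aggregate_clock_ghz} GHz")
```

Real workloads reach these peaks only when every functional unit issues on every cycle, so the figures are best read as upper bounds.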

[Diagram: the C64x+ core (fixed point; lowest power, highest performance DSP core) and the C67x core (floating point; industry’s lowest power FP DSP core, high precision and wide dynamic range) combine into the next-generation C66x DSP core, the new multicore DSP of the KeyStone Architecture.]

Page 7: 1 How to realize high-performance compute with Multicore DSP

Unmatched Performance

[Chart: BDTImark2000™ score for floating-point processors, horizontal axis 0 to 14000. Processors compared: TMS320C66xx, TMS320C67x, Renesas SH77xx (SH-4), Intel Pentium III, ADI TS202S/203S (TigerSHARC), ADI TS201S (TigerSHARC), ADI 213xx (SHARC), ADI 2126x (SHARC), ADI 2116x (SHARC).]

[Chart: BDTImark2000™ score for fixed-point processors, horizontal axis 0 to 25000. Processors compared: TMS320C66xx, TMS320C64x+, Freescale MSC815x (SC3850), Freescale MSC814x (SC3400), Freescale MSC81xx (SC140), ADI TS202S/203S (TigerSHARC), ADI TS201S (TigerSHARC), ADI BF5xx (Blackfin), NEC uPD77050.]

Algorithm | Baseline (C67x @ 300MHz or C64x+ @ 1.2GHz) | C66x @ 1.25GHz | Gain
Single Precision Floating Point FFT, 2048 pt, Radix 4 | 86.84 us | 14.00 us* | ~600%
Fixed Point FFT, 2048 pt, Radix 4 | 8.23 us | 4.46 us* | ~200%
FIR Filter, 40 samples, 40 taps | 0.69 us | 0.34 us* | ~200%
Matrix Multiply 32 x 32 | 17.92 us | 6.16 us* | ~300%
Matrix Inverse 4 x 4 | 0.53 us | 0.13 us* | ~400%
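The Gain column can be cross-checked from the two execution times in each row. The sketch below assumes the gain is the baseline time divided by the C66x time, rounded to the nearest whole multiple and expressed as a percentage. That reading is an assumption, but it matches all five rows. The dictionary keys are shortened labels, not names from the slide.

```python
# Execution times in microseconds as (baseline, C66x @ 1.25 GHz), copied from the table.
BENCHMARKS_US = {
    "SP floating-point FFT, 2048 pt, radix 4": (86.84, 14.00),
    "Fixed-point FFT, 2048 pt, radix 4": (8.23, 4.46),
    "FIR filter, 40 samples, 40 taps": (0.69, 0.34),
    "Matrix multiply 32 x 32": (17.92, 6.16),
    "Matrix inverse 4 x 4": (0.53, 0.13),
}


def speedup(baseline_us: float, c66x_us: float) -> float:
    """Ratio of baseline execution time to C66x execution time."""
    return baseline_us / c66x_us


def gain_percent(baseline_us: float, c66x_us: float) -> int:
    """Speedup rounded to the nearest whole multiple, as a percentage."""
    return round(speedup(baseline_us, c66x_us)) * 100


gains = {name: gain_percent(*times) for name, times in BENCHMARKS_US.items()}

if __name__ == "__main__":
    for name, (base, new) in BENCHMARKS_US.items():
        print(f"{name}: {speedup(base, new):.2f}x (~{gains[name]}%)")
```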

Page 8: 1 How to realize high-performance compute with Multicore DSP

The first network on chip infrastructure to unleash full multicore entitlement

[Diagram: KeyStone network on chip. TeraNet 2 links the C66x and ARM processing cores, the Multicore Shared Memory Controller with shared memory, the Multicore Navigator, application accelerators, high-speed I/O, HyperLink 50, and system management (debug, clocking, power).]

TI Multicore KeyStone Architecture

• Highest Integration – Cost & Power
• Common Architecture – Portable Software
• Scalable Tailored Solutions
• Navigator – Innovative Multi-core
• Floating Point – Development Time
• Tools & Debugging – R&D Efficiency
• Quality Software – Solutions & Libraries

Page 9: 1 How to realize high-performance compute with Multicore DSP

Product Highlights: C6670 and C6678

C6678: Power Optimized Core
• Next Generation C66x Core
  - Up to 8 C66x Cores @ 1GHz - 1.25GHz
  - Available Options: 1, 2, 4, and 8 Core Devices
• Memory Architecture
  - 4MB Local L2 in total (512KB per Core)
  - 4MB Multicore Shared Memory
• Power Optimized Core
  - <10W at 1GHz, nominal temp

C6670: Performance Optimized Core
• Next Generation C66x Core
  - 4 C66x Cores @ 1GHz - 1.2GHz
• Memory Architecture
  - 4MB Local L2 in total (1MB per Core)
  - 2MB Multicore Shared Memory
• Communication Accelerators
  - TCP3e (Turbo Encode) – Up to 550Mbps
  - TCP3d (Turbo Decode) – Up to 600Mbps
  - FFTC – 2048 FFT every 4.6µs
  - VCP2 for voice channel decoding

[Block diagram, C6678: 8 x CorePac (C66x DSP with L1 and L2) on TeraNet with Multicore Navigator. Memory subsystem: Multicore Shared Memory Controller (MSMC), 4MB shared memory, DDR3-64b. Network coprocessors: Crypto, Packet Accelerator. IP interfaces: GbE switch, 2 x SGMII. Peripherals & IO: SRIO x4, PCIe x2, EMIF16, TSIP x2, I2C, SPI, UART. System elements: power management, debug, EDMA, SysMon. HyperLink.]

[Block diagram, C6670: C66x DSP cores (each with L1 and L2) on TeraNet with Multicore Navigator. Memory subsystem: Multicore Shared Memory Controller (MSMC), 2MB shared memory, DDR3-64b. Communications coprocessors: 4x VCP2, 3x TCP3d, 2x RAC, 1x TAC, 3x FFTC, BCP. Network coprocessors: Crypto, Packet Accelerator. Peripherals & IO: SRIO x4, PCIe x2, AIF2 x6, SGMII x2, I2C, SPI, UART. System elements: power management, debug, EDMA, SysMon. HyperLink.]

Page 10: 1 How to realize high-performance compute with Multicore DSP

Memory Architecture
• 0.5 MB of local memory per core
• 4 MB of shared memory
• Enhanced memory architecture through an enhanced Multicore Shared Memory Controller
• Bottleneck-free fast on- and off-chip memory access, including a DDR3-1333MHz (64-bit) interface
• L1/L2/L3 ECC
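For a rough sense of what the external memory interface can deliver, the peak DDR3 bandwidth is the transfer rate times the bus width in bytes. The sketch below assumes that “DDR3-1333MHz” means 1333 mega-transfers per second, the usual DDR3-1333 naming. It also evaluates the 1600 figure quoted elsewhere in this deck. These are theoretical peaks that ignore refresh, bank conflicts and controller overhead, and the function name is illustrative.

```python
def ddr_peak_gbytes_per_s(mega_transfers_per_s: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s = transfers per second x bytes per transfer."""
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9


# Assumption: the "MHz" figures on the slides are data rates in MT/s.
ddr3_1333_gbs = ddr_peak_gbytes_per_s(1333, 64)
ddr3_1600_gbs = ddr_peak_gbytes_per_s(1600, 64)

if __name__ == "__main__":
    print(f"DDR3-1333, 64-bit: {ddr3_1333_gbs:.2f} GB/s peak")
    print(f"DDR3-1600, 64-bit: {ddr3_1600_gbs:.2f} GB/s peak")
```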

[Block diagram, C6678: 8 x CorePac, Multicore Navigator, TeraNet, MSMC with 4MB shared memory, DDR3-64b, network coprocessors, peripherals and I/O, and HyperLink.]

Innovation & Integration via C6678 DSP Highlights

• C66x Core: Next generation fixed/floating-point DSP core with clock speeds ranging from 1GHz to 1.25GHz and up to 8 core options
• Multicore Navigator: Data transfer engine architected to move data between system elements without CPU overhead, so maximum system efficiency is achieved
• TeraNet: Switch fabric with 2 Terabits of bandwidth, which allows maximum data transfer between system components to realize full system entitlement
• HyperLink: Ultra high-speed (up to 50 Gbaud), low latency serial interface that connects to other DSPs and FPGAs in the system
• Network Co-Processor and Accelerators: A cost effective implementation to off-load the TCP/IP and secure networking functions from the DSP
• Peripherals and I/O Interfaces: High bandwidth peripherals that operate independently (NOT shared), allowing simultaneous data transfer to prevent bottlenecks, featuring:
  – RapidIO v2.1: 4 lanes @ 5Gbps with 1x, 2x and 4x support
  – PCIe x2: 2 lanes, running independently of RapidIO
• Improved Debug: S/W Dev and Debug Support Leveraged by CCS

Page 11: 1 How to realize high-performance compute with Multicore DSP

Competitive Analysis

Value Prop against FPGA
• C66x Performance
  – 320 GMACS/160 GFLOPS
  – Baseband on a chip. Handles multiple waveforms supporting OFDM, CDMA, TDM
  – L1/L2/L3 Processing capability
  – Wireless Accelerators (VCP/TCP/FFT)
• Software Programmability
  – Time To Market
• Smaller Package (more DSP/Board)
• Lower Power – smaller battery, simpler cooling
• Low Cost - MIPS/$

Value Prop against other DSPs
• C66x Fixed & Floating Point Core @ 1.25GHz
  – Industry’s Fastest DSP at 10GHz
• On-Chip RAM up to 8MB
• DDR3
  – 1600MHz, 64Bit, 8GB Address space
• Multiple Independent High Speed IO
  – 4x sRIO v2.1, 2x PCIe Gen II, 2x SGMII, 2x TSIP
• High BW FPGA connectivity
  – Hyperlink @ 50Gbps
• 1/2/4/8 Core Option (Pin Compatible)
• L1/L2/L3 Memory ECC – System Reliability
• Low Power per GFLOPS and GMACS
• Extended Temp support -40°C to 105°C
• CCS Tools + S/W Collateral
• 3rd Party Network

Page 12: 1 How to realize high-performance compute with Multicore DSP

TMDXEVM6678L EVM: Single wide AMC form factor

Code Composer Studio™ IDE
* Design * Code and Build * Debug * Analyze * Tune

CCSv5 allows designers of all experience levels to move quickly through application development (www.ti.com/ccstudio)
• Time Limited FREE Evaluation Versions available for download. Includes C667x Simulator

EVM Kit includes
• BIOS 6.x
• BIOS-MCSDK / LINUX-MCSDK 2.0 (NDK, PDK, LIB etc.)
• Sample Program and Out of box demo (OOB), e.g. I/O Benchmark, Imaging Processing Pipeline and High Performance DSP Utility Application (HUA)
• User Guide, Starter guide, Tech Ref Guide, App Notes etc.

H/W Development Tools

• TMDXEVM6678L – EVM with XDS100 emulation - $399

• TMDXEVM6678LE – EVM with XDS560V2 emulation - $599

• TMDXEVM6678LXE – EVM with XDS560V2 emulation –Encryption Enabled - $599

• TMDSEMU560v2STM-UE - XDS560v2 System Trace Emulator with 128Mb System Trace buffer and Ethernet / USB support

• Optional PCIe adapter card to connect the C6678 EVM to a standard PCI header of a desktop.


Page 13: 1 How to realize high-performance compute with Multicore DSP

TI’s Multicore Hardware Ecosystem

[Diagram: hardware ecosystem categories. Standardized Boards (ATCA, Advanced Mezzanine (AMC), PCI Express (with Gen 2), Others); Chassis / System; Custom; Other.]

Page 14: 1 How to realize high-performance compute with Multicore DSP

TI’s Multicore Software Ecosystem

[Diagram: software stack. Customer Application; Layer 2+; Layer 1 UMTS and Layer 1 LTE; TI Layer 1 Libraries; TI BIOS, Linux, OSE(ck); IP Network Stack; TI Runtime; TI’s Device Entitlement Libraries; Multicore Entitlement.]

Page 15: 1 How to realize high-performance compute with Multicore DSP

Multicore Tools and Software (MC-SDK)
• Tools
  – Codegen with OpenMP support
  – Emulator/Debugger
  – Simulator
  – Profiler / DVT
  – 3rd party tools
• Software
  – BIOS/Linux SDK
    • Multicore Demonstration
    • 6.x DSP BIOS
  – Platform Abstraction
  – Basic Networking
  – Inter core communication
• Application Specific Libraries
  – Audio/Video CODECS
  – VoIP Components
  – WiMAX Toolkit, LTE Toolkit
  – DSPLib
• Others

[Diagram: Eclipse-based Code Composer Studio™ on the host computer, with editor/IDE, compiler/linker (codegen), profiler, debugger, remote debug, SoC Analyzer and third-party plug-ins (Polycore, ENEA Optima, 3L), connected to the target board through XDS 560 V2 / XDS 560 Trace.]

[Diagram: Multicore Software Development Kit. Operating system with boot loader (BIOS, Linux); Platform Development Kit; Inter Core Communication; NDK; DSPLIB; IMGLIB; speech, audio and video codecs; multicore demo apps for BIOS, for Linux, and for BIOS and Linux; Customer Application. Labels: Full Silicon Entitlement, Multicore Entitlement.]

Page 16: 1 How to realize high-performance compute with Multicore DSP

KeyStone Multicore Software – Libraries & Codecs

Libraries

Digital Signal Processing
• FFT
• Adaptive Filtering
• Filtering and convolution
• Others
• Available free from TI

MATLAB
• Image processing
• Math operations

Image Processing
• Edge Detection
• Boundary
• Morphology
• Others
• Available free from TI

Voice and Fax
• Line Echo Cancellation
• Voice Activity Detection
• Others
• Available free from TI

Security/Cryptography
• AES, SHA1, 3DES

Vision Analytics

Vision Lib (object only)
• 50+ royalty-free kernels:
  • Background modeling & subtraction
  • Object feature extraction
  • Tracking, recognition
  • Low-level pixel processing

Codecs

Voice
• G.711, G.722
• G.723, G.729
• CDMA, AMR (NB/WB), EVRC-B
• Others

Audio
• MPEG1 Layer2
• AAC LC/HE
• AC3 2.0/5.1
• Sample Rate Conversion

Video
• H.263
• H.264
• MPEG2
• MPEG4
• VC1/WMV9 Decode
• Others

Fax
• T.38
• Fax Modem

Page 17: 1 How to realize high-performance compute with Multicore DSP

High-Performance and Multicore Processor

High Value

Easy to Use

Quick to Market

Low-Cost EVM High-Performance at the Right Power & Price

Open & Affordable Tools

User Community

Drivers & Example Code

Product Collateral

Training

Enabler Software

Frameworks & Abstraction

Generic Libraries

Application Libraries

Benchmarks & Functional Understanding

Quick-Start Hardware

Keystone Architecture

Page 18: 1 How to realize high-performance compute with Multicore DSP

Getting Started – More Information/Links
• Product Folders:
  – C66X Informational Wiki Page
  – All C6000 Multicore DSPs
    • TMS320C6670
    • TMS320C6678
• EVMs and Software Tools:
  – TMS320C6678 EVM
  – TMS320C6670 EVM
  – AMC to PCIe Adapter Card
  – Multicore Software Development Kit for BIOS & Linux
    • MCSDK Wiki
    • CCS v5 Wiki
    • C66x Linux Wiki
  – DSP Signal Processing Library (DSPLIB)
  – Image and Video Processing Library (IMGLIB)
  – LTE/WiMAX Toolkit – Discuss with BDM
• Technical Support
  – TI E2E Community (Online Support)
  – Product Training

Page 19: 1 How to realize high-performance compute with Multicore DSP

Online Video Training
http://focus.ti.com/docs/training/catalog/events/event.jhtml?sku=OLT110027

Page 20: 1 How to realize high-performance compute with Multicore DSP

Mission Critical DSP Market
“What Customers Like about TI”

• Undisputed #1 DSP and SoC supplier
  – Strong Growth for 8 years in a row, even in 2009
  – Higher R&D spending than DSP revenue of most competitors
• KeyStone SoC Architecture secures future success
  – Rich Product Portfolio & Strong Roadmap
  – 2 Families with multiple devices and growing
    • Nyquist (6670), Shannon (6678/4/2)
    • 40nm -> 28nm
    • Tools/Software & Compilers
    • 3rd Party Eco-System
  – Multiple Design Wins Pre-Announcement
• Secure Supply – No DSP product discontinuation (end of life)
• History of delivering on promises (Power, GHz, ..)
• Field Experience - Completeness of system analysis, Architecture, Internal Switch, ….
• Customer Support
• Business Model - Long Term relationships with key customers
  – Actively seek and incorporate customer feedback in roadmap devices.

[Diagram: TI SoC Architecture covering Layer 1 (PHY), Layer 2 (MAC) and Layer 3+ (Layer 3, 4), with Radio, IP Network and Software, across Macro, Pico and Femto. Inset chart: revenue, 2002 to 2009.]

Page 21: 1 How to realize high-performance compute with Multicore DSP

Backup Slides: Product Details

Page 22: 1 How to realize high-performance compute with Multicore DSP

C6678 (Shannon) “Lightning” Half-Length PCIe Card Feature Set

• TI TMS320C6678 (8-core) x 4
  – C66x Core Frequency: 1.25GHz
  – DDR3 Memory
    – Data Frequency: 1600MHz
    – Data Bus Width: 64-bit
  – Serial RapidIO Gen-2 Interface
  – PCIe Gen-2 Interface
  – 10/100/1000Mbps Ethernet w/ SGMII
  – Hyperlink50 Interface
• 1024 MB DDR3-1333 on board
• PLX PEX8624 PCIe Gen-2 Switch
• Serial RapidIO daisy-chain
• Ethernet daisy-chain
• Each DSP device is linked to PCIe switch by x2 lanes
• Dual DSPs linked by Hyperlink50
• Power: Max 54 Watts
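A back-of-the-envelope view of the whole card combines the per-device peaks quoted earlier in this deck (320 GMACS and 160 GFLOPS per C6678 at 1.25GHz) with the four devices and the 54 W maximum listed here. The sketch below is illustrative only. It divides theoretical peak throughput by maximum board power, so it shows a nominal efficiency figure and not a measured one.

```python
# Figures taken from the slides; names are illustrative.
DSPS_PER_CARD = 4          # four TMS320C6678 devices on the card
GFLOPS_PER_DSP = 160.0     # peak per 8-core C6678 @ 1.25 GHz
GMACS_PER_DSP = 320.0      # peak per 8-core C6678 @ 1.25 GHz
MAX_CARD_POWER_W = 54.0    # maximum card power

card_gflops = DSPS_PER_CARD * GFLOPS_PER_DSP
card_gmacs = DSPS_PER_CARD * GMACS_PER_DSP
card_cores = DSPS_PER_CARD * 8

# Nominal efficiency: theoretical peak divided by maximum board power.
gflops_per_watt = card_gflops / MAX_CARD_POWER_W

if __name__ == "__main__":
    print(f"Cores on card: {card_cores}")
    print(f"Peak: {card_gflops} GFLOPS, {card_gmacs} GMACS")
    print(f"Nominal efficiency: {gflops_per_watt:.2f} GFLOPS/W")
```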

Page 23: 1 How to realize high-performance compute with Multicore DSP

What is Hyperlink?
“high-speed, low-latency, and low-pin-count communication interface”

• Low pin count (24 pins)
• Point to Point Connection
• Interconnect
  • DSP-to-DSP
  • DSP-to-FPGA
• SerDes for data transfer
  • x1, x4 modes for Tx and Rx
  • 12.5 GBaud/lane
  • Effectively 8b9b encoding
• LVCMOS sideband signals for flow control & power mgmt - errors/events/timeouts
• Simple packet-based transfer protocol for memory-mapped access
• Read/Write to DSP/FPGA local memory
  - discrete memory access of any byte aligned width up to 64 bits
  - burst transfer modes
• Write (Maximum Burst Size 256 Bytes)
  – Write Request --->
  – Data Packet --->
• Read (Maximum Burst Size 256 Bytes)
  – Read Request --->
  – Read Response <---
• Interrupt Request <-->
• Up to 64 memory-mapped regions, each region up to 256MB
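The “50” in Hyperlink50 and the size of the remote address window both follow from the numbers on this slide. The sketch below assumes the x4 mode, and reads “effectively 8b9b encoding” as 8 payload bits carried per 9 line bits. Packet headers and flow-control overhead are ignored, so the payload rate is an upper bound. All names are illustrative.

```python
LANES = 4                   # x4 mode
GBAUD_PER_LANE = 12.5       # line rate per lane
PAYLOAD_BITS, LINE_BITS = 8, 9   # "effectively 8b9b encoding"

MAX_REGIONS = 64            # memory-mapped regions
REGION_SIZE_MB = 256        # maximum size of each region

# Raw line rate across all lanes, per direction.
raw_gbaud = LANES * GBAUD_PER_LANE

# Payload rate after encoding overhead (protocol overhead not included).
payload_gbit_per_s = raw_gbaud * PAYLOAD_BITS / LINE_BITS
payload_gbyte_per_s = payload_gbit_per_s / 8

# Total remote memory addressable through the mapped regions.
window_mb = MAX_REGIONS * REGION_SIZE_MB
window_gb = window_mb / 1024

if __name__ == "__main__":
    print(f"Raw line rate: {raw_gbaud} GBaud")
    print(f"Payload bound: {payload_gbit_per_s:.2f} Gbit/s "
          f"({payload_gbyte_per_s:.2f} GB/s)")
    print(f"Mapped window: {window_mb} MB ({window_gb} GB)")
```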

Page 24: 1 How to realize high-performance compute with Multicore DSP

Universal Parallel Port (uPP)

• What is it?
  – Parallel bus, two independent channels (separate data buses)
  – I/O speeds up to 75 MHz with 8-16 bit data width per channel
  – 1 or 2 channel parallel interface operating in RX, TX or FD mode
  – Supports Double data rate mode of operation (Bandwidth does not change/increase)
• Application
  – Each channel can interface cleanly with high-speed ADCs and/or DACs with up to 16-bit data width (per channel).
  – Useful as low cost interface with FPGAs. Can run up to 120 MByte/s per channel in single channel or bi-directional mode (240 MByte/s for both channels in unidirectional mode)
  – Can also be used to interface two C6655/57 devices or to connect C6655/57 with C674x or OMAP-L13x family of devices.
• Other benefits
  – Internal DMA – leaves CPU EDMA free
  – Simple protocol with few control pins (configurable: 2-4 per channel)
  – Multiple data packing formats for 9-15 bit data widths
  – Interleave mode (single channel only)
  – Simple interface: IO Queued by software

Throughput Estimates:

Note: Max. clock of 50 MHz in (*) configuration

Page 25: 1 How to realize high-performance compute with Multicore DSP


Thank You