methodology to standardize the development of fpga- based ...€¦ · why is it expensive to...
TRANSCRIPT
Methodology to standardize the development of FPGA-based intelligent DAQ and processing systems on
heterogeneous platforms using OpenCL
M. Astrain1, M. Ruiz1, S. Esquembri1, A. Carpeño1, E. Barrera1, J. Vega2
1Instrumentation and Applied Acoustic Research Group, Universidad Politécnicade Madrid, Madrid, Spain
2Laboratorio Nacional de Fusión, CIEMAT, Madrid, Spain
[email protected] Slide 1
OUTLINE
• Context
• System Architecture Example
• OpenCL standard
• Development cycle
• Results
• Conclusions
[email protected] Slide 2
Context
[email protected] Slide 3
Context
Advantages of Data Acquisition (DAQ) systems based on FPGAs+ Flexibility to modify the design
+ Low latency
+ Deterministic behavior
+ High-performance processing
+ Parallelization
[email protected] Slide 4
But developing for FPGA is complex and time consuming (expensive …)
PXI Chassis MTCA ChassisFPGA Board
I/O
Interface
AMC + FPGA
I/O Interface
NI Linux DriverMTCA Driver
IRIO
NDS-CORE
Custom
IRIO Templates
NDS-EPICS
User Device Driver
Custom process
User customizationUser customization
Networks
Fast
Controller
Remote I/O
PCIe
Application
independant
Application
dependant
Hardware &
device
drivers
Software
SDN
, DA
N
Main control, Other systems, etc
Why is it expensive to develop for FPGAs
[email protected] Slide 5
Desing UsingLabVIEW/FFPA
Design UsingHDLsOnly Products
from OneManufacturer
IRIO-OpenCL
[email protected] Slide 6
PXI Chassis MTCA ChassisFPGA Board
I/O
Interface
AMC + FPGA
FMC
I/O
NI Linux Driver OpenCL RTE
IRIO
NDS-CORE
Custom
Kernels
IRIO Templates
NDS-EPICS
IRIO-OpenCL
IRIO-OpenCL
BSP
IRIO-OpenCL
KernelsCustom process
User customizationUser customization
Networks
Fast
Controller
Remote I/O
PCIe
Application
independant
Application
dependant
Hardware &
device
drivers
Software
SDN
, DA
N
Main control, Other systems, etcControl
System
Desing UsingOpenCL(80%)
HDL(20%)
Products frommultiple
Manufacturers
System Architecture Example
• PC + PCIe + MTCA + AMC + (PROCESSING) + (I/O)
[email protected] Slide 7
COMPUTER CODAC Core System 6.0
MCH carrier
PCIe
optical link
Device
AMC ARRIA10 FPGA
FMC DAQ Board
MTCA4 CHASSIS
HOST+
+ NDSv3
1
OpenCL: Basic overview
[email protected] Slide 8
High level language + COMPUTING MODEL • A host and multiple devices (CPU, GPU, FPGA).• Computation is divided into functions called Kernels.• There is one or several queues that send the Kernels to execute concurrently.• Memory organized in buffers/images and transfers are explicit.• Parallelization is a big focus.
OpenCL: Kernels
• Short: The part of OPENCL that goes into the FPGA and is allocated in the FPGA in partial reconfigurable partition.
• Kernels have access to all device memory layers.
• Key to performance is optimizing this memory usage.
• Parallelization is achieved in the form of a pipeline for FPGA.
[email protected] Slide 9
Intel RTE
HOST Interface
KernelPipeline
DDR
Host
JESDE204B
K_DAQ
K_GEN
K_TIMING
K_NFM
BSP
[email protected] Slide 10
OpenCL: BSP
• Short: The part of OPENCL that goes into the FPGA and is FIXED.
• Manages DDR memory.
• It interfaces with the host
• JESD204B interface
• Requires HDL to modify (usually given)
Intel RTE
HOST Interface
Kernelsmanagement
DDR
Ke
rne
lsHost
JESD204B
SPI
OpenCL: Host
• Short: The part of OPENCL that goes into the C++ drivers.
• Host sends commands and can set Global Memory (DDR) (SLOW!)
• Critical processes can be organized with a chain of pipes (HDL=AXI ST)
• Data can be gathered from I/O pins, but kernels are queued from host.
[email protected] Slide 11
Host
Transfer Transfer
I/O I/O
AXI ST/AVALON ST
Development cycle: OpenCL
[email protected] Slide 12
Three main scenarios for a new application:
1- New algorithm
2- FMC module
3- AMC + FPGA
JESD204B controller
(.qip)
Kernels Source
Code
Custom
(.cl)
OpenCL System IP
Files (.qip, .qsys)
Board Config Files(board_env.xml)(board_spec.xml)
Intel Quartus
Prime Design
Software
Board
Support
Package
Intel FPGA SDK
for OpenCL
Kernels Offline
Compiler (aoc)
FPGA bitstreamFixed partition
(.aocx)
HOST
Applicatio
n Code
(.cpp,.h)
System host
compiler
HOST Binary
HOST DESIGN FLOW
DEVICE DESIGN FLOW
KERNELS
FPGA bitstreamPartial
Reconfiguration(.aocx)
Intel
OpenCL
RTE
Intel FPGA SDK
for OpenCL
Kernels Offline
Compiler (aoc)
SPI controller
(.qip)
IRIO-OpenCL
Kernels
DAQ
GEN
(.cl)
Application
dependant
FMC moduledependant
Software
and tools
AMC FPGA
board
dependant
Board
Test
Kernel
(.cl)
Results
• Implementation: more on ID 494 !!
• Basic kernels FPGA resource utilization
[email protected] Slide 13
Under evaluation:o Timingo Triggeringo Routing
Functionality:DAQWFGProcessingRouting
• Totally integrated in EPICS using NDS
• Tested in CODAC CORE SYSTEM 6.0
Results
• Working example: fission chamber neutron flux measurement.
• Why this example:• The measurements benefit from high sampling rates.
• High data rate but simple operations (floating point).
• Logic inside the FPGA can classify different pulses
• Parallelization enables on-line comparison of different algorithms making it and intelligent system.
• Alternatively other algorithms based in machine learning techniques can be executed in parallel.
[email protected] Slide 14
Conclusions
• DAQ combined with OpenCL reduces development of high-performance processing.
• The DAQ and processing systems developed with OpenCL are less manufacturer dependent.
• OpenCL enables C-like development of FPGA with lots of OpenCLalgorithms examples.
• OpenCL handles data transfers and device interface, hardware abstraction.
• Combined with NDSv3 a modular solution was developed, abstracted from the control system. IRIO-OpenCL
• You only need to do the algorithm.
[email protected] Slide 15
Future Work
• Future work:• Implementation of Machine Learning Applications using machine learning and Deep
Learning Tools*.
• Expand IRIO-OpenCL functionalities
• Looking for more use cases (possible collaborations) using IRIO-OpenCL -> Contact us!
[email protected] Slide 16
* Dr Jesus Vega (484. Automatic recognition of plasma relevant events: implications for ITER) 14 may. 2019 9:20
Acknowledgements
This work was supported in part by the Spanish Ministry of Economyand Competitiveness, Projects Nº ENE2015-64914-C3-3-R andMadrid regional government (YEI fund), Grant Nº PEJD-2018-PRE/TIC-8571.
[email protected] Slide 17
The Intel® FPGA SDK for OpenCL™ is based on a published Khronos Specification.Altera, Arria, Intel, the Intel logo, Nios, Quartus and Stratix words and logos are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries.OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission of the Khronos Group™.*Other names and brands may be claimed as the property of others.
[email protected] Slide 18
Thank You! Questions?
Backup
[email protected] Slide 19
FPGA
FPGA
FPGA
HOST APPLICATION
Ke
rnel A
rgu
men
ts
12 bits ADC
(1 GS/s)
JESD204B
Pulse Counting
Kernel
DaqStartStop
SamplingRate
BlockSize
FIFO
IO PIPESAI0 (FC)
AI1
Thre
sh
old
s
(m
in,m
ax)
Wid
th
(min
,max)
ST
Ava
lon
Po
rt
CPU
FMC DAQ Device
Threshold
Parameters
Global Buffers
CODAC Core System 6.0
AO0 (FC)
AO1
FC
SAMPLES * 8
125MHz
FC
SAMPLES * 8
125MHz
FC
SAMPLES * 8
125MHz
Campbelling
Kernel
Current
Kernel
FIFO
FIFO
FIFO
PIPES
IRIO-
OpenCL
DAQ
Kernel
Results
Global Buffers
Re
sults
JESD204B
ST Avalon Port
IRIO-OpenCL
SPI Config
12 bits DAC
(1 GS/s)
IRIO-OpenCL
GEN
Kernel
Samples
Global Buffers
Sam
ple
s
PIPESFIFO
FC
SA
MP
LE
S *
4
25
0 M
Hz
NDSv3
Main control, Other systems, etc
PIPES
Low-jitter
Clock GeneratorCLKREF, SYSREF
SPI Master
ST Avalon Port
IRIO-OpenCL
SPI CONFIG
Kernel
IRIO-OpenCL
DAQ
IRIO-OpenCL
GEN
IRIO-OpenCL
NFM
IRIO-OpenCL
BSP
FPGA Logic
ARRIA 10
Reconfigurable
Kernels
Backup
[email protected] Slide 20
asynArrayInt8
asynFloat64
<reason>DAQ.CH1
NDS-Core and NDS-EPICS Libraries
asynDriver Module
<reason>UpdateRate
<reason>UpdateRate_RBV
<more PVs)……...
asynFloat32Array
<reason>DAQ.CH0
asynDriver
Management of NELM for waveforms
Other functionalities specific for HWA
<node>Firmware
<node>Health Monitoring
<PV>SamplingRate_RBV(VariableIn)
<node>Waveform Generation (WFG)
<PV>UpdateRate (DelegateOut)
<PV>UpdateRate_RBV(VariableiN)
<node>Data Acquisition (DAQ)
<PV>SamplingRate (DelegateOut)
<node>Device
<more PVs)……...
Hardware Device A: HWA
EPICS Database
HW Function: DAQ HW FuncTion: WFG
Kernel
Device Driver
(manufacturer)
AP
I
HW Function:TRG-CLK HW Function:Firmware
NDS-EPICS
EPICS-IOC
NDS-EPICS-Ext for HWA
X-Series
PXI6683
PTM-DAMC-1588-10F
PXIe
MTCA.4
FILTERA/D
CONVERTERD/A
CONVERTER FILTER
FILTER
100 MHz
PLL
DIVIDER
MEMORY
MUX
FILTERA/D
CONVERTERMUX D/A
CONVERTER
HW Function:Health Monitoring
Temp Power
NDS-CORE
FlexRIO+NI5761