Download - LabVIEW FPGA and GPU
![Page 1: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/1.jpg)
FYS3240
PC-based instrumentation and microcontrollers
LabVIEW FPGA and GPU
Spring 2011 – Lecture #11
Bekkeng, 11.5.2011
![Page 2: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/2.jpg)
FPGA = Field Programmable Gate Array
![Page 3: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/3.jpg)
FPGA advantages
• High reliability
• High determinism
• High performance
• True parallelism
• Reconfigurable
![Page 4: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/4.jpg)
Common Applications for FPGAs
• High-speed control
• Intelligent DAQ
• Digital communication protocols
– e.g. SPI and Wireless
• Sensor simulation
• Onboard processing and data reduction
– e.g. video processing
• Co-processing
– offload the CPU
![Page 5: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/5.jpg)
![Page 6: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/6.jpg)
To configure FPGAs
on NI hardware
![Page 7: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/7.jpg)
LabVIEW FPGA Hardware Targets
• NI CompactRIO
• NI Single-Board RIO
• NI FlexRIO
• NI R Series Multifunction RIO
• NI Compact Vision System
FlexRIO
FPGA ModulesNI FlexRIO
Adapter Modules
RIO = Reconfigurable I/O
![Page 8: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/8.jpg)
FPGAs in DAQ-systems (”intelligent DAQ”)
• DAQ-cards with a programmable FPGA
• Multi-rate sampling
– Allows different sampling frequencies on the I/O channels
– For comparison, when using an “ordinary” DAQ-card
(without a user reconfigurable FPGA) all channels must
have the same sampling frequency
• User defined processing in the FPGA
• FPGA-based hardware timing/synchronization
Both for PXI and PCI
![Page 9: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/9.jpg)
LabVIEW FPGA VIs
![Page 10: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/10.jpg)
LabVIEW FPGA VIs
![Page 11: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/11.jpg)
LabVIEW FPGA VIs
![Page 12: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/12.jpg)
With the LabVIEW FPGA Module you develop FPGA applications on
a host computer running Windows, and then LabVIEW compiles and
implements the code in hardware
open closecom with FPGA
http://zone.ni.com/devzone/cda/tut/p/id/3358
Including GUI
![Page 13: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/13.jpg)
Host Application for Live
Communication with the FPGA
• The FPGA interface pallet makes it easy to perform real-time
communication between the FPGA and the real-time or Windows host
application
• The Open FPGA Reference function is first used to open and run a
specified FPGA application
• The FPGA Read/Write Control can be used to read data from the
FPGA indicators (outputs) or write data to the controls (inputs)
![Page 14: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/14.jpg)
LabVIEW FPGA – Event counter
Digital I/O nodes can take advantage of the specialized structure
called the single-cycle timed loop in LabVIEW FPGA, with which
you can execute code at specified rates ranging from 2.5 to 200 MHz.
Using a 40 MHz clock, for example, you can use a single-cycle timed
loop to create a 40 MHz counter on any digital line
![Page 15: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/15.jpg)
LabVIEW FPGA - Parallel Operations
• Two parallel loops with
different sampling rates
– Run in parallel
– No shared resources
between the two loops
![Page 16: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/16.jpg)
Single Cycle Timed Loop (SCTL)
• All operations within the SCTL loop must fit into one clock cycle
• Some functions are not supported in SCTL, among others:
– Long sequences of serial code
– Loop timer, wait functions
– Analog input an analog output I/O nodes
– Loop structures (While Loop, For Loop)
• LabVIEW Help informs which functions are supported in SCTL
![Page 17: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/17.jpg)
Optimization
Loop rates limited by longest path
• A0 takes about 35 ticks
• DI takes 1 tick (HW Specific)
DI limited by AO when in same loop
NOTE: A while loop takes 3 ticks
Clock frequency = 40 MHz
![Page 18: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/18.jpg)
Optimization - Parallel loops
• Separate (parallel) loops allow DI
to run independent of AO
• This allows DI to be sampled 10
times faster by using a separate
loop
![Page 19: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/19.jpg)
From LabVIEW to Hardware
![Page 20: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/20.jpg)
Include HDL-code in LabVIEW
• The IP Integration Node (replaces the HDL Interface from
LabVIEW 2010) can bring in third-party Xilinx IPs.
– a wizard to import files and configure the interface step by step
• You can also use the IP Integration Node to include your own
VHDL code
• Once you have configured the node, you can use the IP just
like any other LabVIEW node with inputs and outputs.
IP = Intellectual Property
![Page 21: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/21.jpg)
Cycle-Accurate Simulation with ModelSim
• New in LabVIEW FPGA 2010
![Page 22: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/22.jpg)
LabVIEW FPGA IPNet
www.ni.com/ipnet
![Page 24: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/24.jpg)
GPUs
• GPU = Graphics Processing Unit
• GPUs can be used as hardware accelerators for
numerical/computational tasks
• Can be used in Real-Time High-Performance Computing
systems
NVIDIA® TeslaTM C1060
240 cores
![Page 25: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/25.jpg)
Need for Speed ...
• GPUs have more transistors
dedicated for processing
than a CPU
– The performance gain when
using GPUs can be
significant
• A single workstation with
two x16 PCIe slots can have
two GPUs
• Two TeslaTM C1060 GPUs
gives 480 cores ....
![Page 26: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/26.jpg)
CUDA
• CUDA = Compute Unified Device Architecture
• CUDA is developed by Nvidia and is a GPU interface
for C
![Page 27: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/27.jpg)
CUDA example
void saxpy_serial(int n, float a, float* x, float* y)
{
for (int i=0; i<n; ++i)
y[i] = a*x[i] + y[i];
}
// Kjør seriell saxpy
saxpy_serial(n, 2.0, x, y);
__global__ void saxpy_parallel(int n, float a, float* x, float* y)
{
int i = blockIdx.x * blockDim.x + threadIdx.x;
if (i < n) y[i] = a*x[i] + y[i];
}
// Kjør parallell saxpy med 256 tråder/blokk
int nBlocks = (n + 255) / 256;
saxpy_parallel<<<nBlocks, 256>>>(n, 2.0, x, y);
Standard C
Parallel
CUDA C
![Page 28: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/28.jpg)
Application examples
![Page 29: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/29.jpg)
Application example II
http://www.top500.org/
![Page 30: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/30.jpg)
LabVIEW and GPUs
• GPUs can not be directly programmed with
LabVIEW
• However, a framework is designed to integrate GPU
execution into LabVIEW's parallel execution, to
execute the CUDA code
• Some applications are tailor made for deployment to
GPUs, such as those related to matrix operations
![Page 31: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/31.jpg)
LabVIEW and GPUs II
• The GPU interface is made up of two LabVIEW libraries –
lvcuda.lvlib and lvcublas.lvlib – that contain LabVIEW data
types and VIs
http://zone.ni.com/devzone/cda/tut/p/id/11972
![Page 32: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/32.jpg)
Hybrid architecture using COTS
• Numerical Computing
– two NI 8353 multicore
computers
– one nVIDIA® Tesl GPU
computing system (with four
GPUs)
• Real-Time Measurement and
Control
– two PXIe chassis
– embedded controllers with a
multicore CPU
– FPGA-based data
acquisition/control boards
![Page 33: LabVIEW FPGA and GPU](https://reader031.vdocuments.mx/reader031/viewer/2022020706/61fca3719d50e757a521f364/html5/thumbnails/33.jpg)
GPUs – A
threat?