Hardware solutions for radar processing: FPGA / GPU / CPU trade-offs
J.F. Degurse, Y. Mourot, H. Cantalloube, L. Savy
Sondra Workshop in Singapore, March 21st, 2012




Hardware components for radar processing

FPGA GPU CPU


Characteristics of candidate embedded products

                 AMD Phenom II    AMD Radeon     NVIDIA GeForce GT240    Xilinx Virtex-6
                 1090T CPU        E6760 GPU      GPU (GRA111)            SX475T FPGA
SP GFLOPS        153.6            576            250                     550
GFLOPS / W       1.23             16.5           7.1                     13.7
GFLOPS / $       0.54             N/A            2.7                     0.14
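The efficiency ratios in the table above can be turned back into an approximate power budget and unit price per product. The sketch below only divides the figures given on the slide; the function names are illustrative, and the results are implied values, not figures quoted by the authors.

```python
# Figures copied from the table above: (SP GFLOPS, GFLOPS/W, GFLOPS/$).
# None marks the value the table gives as N/A.
PRODUCTS = {
    "AMD Phenom II 1090T CPU": (153.6, 1.23, 0.54),
    "AMD Radeon E6760 GPU": (576.0, 16.5, None),
    "NVIDIA GeForce GT240 GPU (GRA111)": (250.0, 7.1, 2.7),
    "Xilinx Virtex-6 FPGA": (550.0, 13.7, 0.14),
}


def implied_power_w(gflops, gflops_per_w):
    """Power budget implied by peak throughput and efficiency."""
    return gflops / gflops_per_w


def implied_price_usd(gflops, gflops_per_usd):
    """Unit price implied by the GFLOPS/$ ratio, or None when unknown."""
    if gflops_per_usd is None:
        return None
    return gflops / gflops_per_usd


for name, (gflops, per_w, per_usd) in PRODUCTS.items():
    power = implied_power_w(gflops, per_w)
    price = implied_price_usd(gflops, per_usd)
    price_text = "N/A" if price is None else f"{price:.0f} $"
    print(f"{name}: about {power:.0f} W, {price_text}")
```

Read this way, the FPGA and the two GPUs sit in a similar power class (tens of watts), while the implied FPGA price is far higher than that of the GPU board.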


Why use FPGAs in radar processing?

Modern ADCs can sample signals at rates of up to a few gigasamples per second, which allows sampling directly on the carrier frequency or at the first intermediate frequency of the radar receiver.

Two configurations are often encountered on modern versatile radars (different modes at different times):

1. Wideband + a few channels (1 to 4)
2. Narrowband + high dynamic range + numerous channels (10 to 100, or even 1000 for a fully digital radar)

Need to reduce the data rate:
• Real-time digital beamforming
• Real-time filtering and decimation

[Figure: FPGA digital receiver]
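To give an order of magnitude for the data rates involved, the sketch below computes the raw ADC output rate for the two configurations. The sample rates and channel counts are assumptions chosen for illustration (the slides only say "a few gigasamples per second" and give channel ranges); the 16-bit sample width is the one shown on the digital receiver slide.

```python
def raw_data_rate_gbps(sample_rate_hz, bits_per_sample, channels):
    """Raw ADC output rate in gigabits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1e9


# Assumed figures, for illustration only.
wideband = raw_data_rate_gbps(2e9, 16, 4)        # 2 GS/s, 4 channels
narrowband = raw_data_rate_gbps(100e6, 16, 100)  # 100 MS/s, 100 channels
print(f"wideband: {wideband:.0f} Gb/s, narrowband: {narrowband:.0f} Gb/s")
```

Under these assumptions both modes produce more than 100 Gb/s before any processing, which is why the rate has to be reduced close to the ADC.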


Block diagram of a digital receiver (one channel)

[Figure: block diagram. Labels: analog signal, ADC (16 bits), FPGA, 16 bits.]


Benefits of Digital Receiver

Benefits
• Cost-effective compared to an analog solution
• I and Q are well balanced, as they are digitally filtered
• Highly reconfigurable

BUT
• Requires highly skilled people to program FPGAs
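A minimal sketch of digital down-conversion illustrates why digitally produced I and Q are balanced: both are derived from the same samples, mixed with numerically generated quadrature references, and passed through the same filter. The slides do not describe the actual FPGA design; the boxcar filter, the function name and the test signal below are assumptions made to keep the example short.

```python
import math


def digital_down_convert(samples, f_carrier, f_sample, decimation):
    """Mix real samples to baseband, low-pass filter, and decimate.

    The low-pass filter is a plain moving average over `decimation`
    samples (a boxcar), chosen only to keep the sketch short.
    """
    # Mix with a numerically generated complex oscillator: I and Q use
    # the same samples and exactly quadrature references.
    mixed = [
        x * complex(math.cos(2 * math.pi * f_carrier * n / f_sample),
                    -math.sin(2 * math.pi * f_carrier * n / f_sample))
        for n, x in enumerate(samples)
    ]
    # Average each block of `decimation` samples, one output per block.
    out = []
    for start in range(0, len(mixed) - decimation + 1, decimation):
        block = mixed[start:start + decimation]
        out.append(sum(block) / decimation)
    return out


# Hypothetical test signal: a real tone 1 kHz above a 250 kHz carrier,
# sampled at 1 MHz.
f_s, f_c, f_off = 1_000_000.0, 250_000.0, 1_000.0
signal = [math.cos(2 * math.pi * (f_c + f_off) * n / f_s) for n in range(4000)]
baseband = digital_down_convert(signal, f_c, f_s, decimation=40)
print(len(baseband), abs(baseband[10]))
```

The output rate is 40 times lower than the input rate, and the complex baseband samples keep a nearly constant magnitude of about half the input amplitude, as expected for a real tone.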


Why use GPUs for radar processing instead of CPUs?

Potential benefits
• Best GFLOPS/W and GFLOPS/$ ratios: 500 GFLOPS for 200 W on the Tesla C2075
• Cost (~2 k$)
• No ITAR issue (mass market)
• 100% COTS and open source
• Scientific libraries available (cuBLAS for linear algebra)
• Existing programming environment (CUDA)

Open questions
• How should radar processing be formalized for an efficient GPU implementation?
• What effective computing power can be reached for which radar processing?

[Figure: NVIDIA GPU roadmap, double-precision GFLOPS. Labels: T20 / Tesla C2075 (448, 515), Kepler, C3070; other values shown: 77, 630, 1000, 20 000; board size 27 cm x 12 cm.]


CPU / GPU Comparison

CPU
• Fast on iterative calculations
• Low-level parallelization tasks
• e.g. Intel X5670 (6 cores): 70 GFLOPS (double precision), 0.7 GFLOPS per watt

GPU
• High performance on massively parallel algorithms
• High-level parallelization tasks
• e.g. NVIDIA Tesla C2075: 500 GFLOPS (double precision), 2.5 GFLOPS per watt


Application case 1 at Onera: space surveillance ground radar

The current supercomputer (100+ PowerPC G4 processors) is being replaced by a single GPU card.

            Mercury SC       Tesla C2075
Price       1 M$ (2003)      2.5 k$ (2011)
TDP         > 8 kW           215 W
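The scale of the replacement is easier to see as ratios. The short calculation below uses only the price and TDP figures quoted on the slide; since the Mercury TDP is given as "more than 8 kW", the power ratio is a lower bound.

```python
# Figures from the slide; the Mercury TDP is a lower bound (> 8 kW).
mercury = {"price_usd": 1_000_000, "tdp_w": 8_000}
tesla = {"price_usd": 2_500, "tdp_w": 215}

price_ratio = mercury["price_usd"] / tesla["price_usd"]
power_ratio = mercury["tdp_w"] / tesla["tdp_w"]
print(f"price ratio: {price_ratio:.0f}x, power ratio: more than {power_ratio:.0f}x")
```

The prices are from different years (2003 and 2011) and are not adjusted, so the price ratio is indicative only.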


Application case 1 at Onera: space surveillance ground radar

Pulse compression + Doppler processing

Parameters: 16384 Doppler filters, 100 receivers, 20 Doppler rates, 1.3 GFLOPS

Performance speedup GPU/CPU = x7.2 (CPU: all 8 cores active)
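The processing chain named on this slide can be sketched on a toy scene: a matched filter along range (pulse compression), then a DFT across pulses for each range gate (Doppler filtering). This is not the Onera implementation; it uses direct correlation and a direct DFT where a GPU version would typically use FFTs, it ignores the Doppler-rate dimension, and the code, scene and function names are hypothetical.

```python
import cmath


def pulse_compress(echo, replica):
    """Matched filter: correlate the echo with the transmitted replica."""
    n_out = len(echo) - len(replica) + 1
    return [
        sum(echo[k + i] * replica[i].conjugate() for i in range(len(replica)))
        for k in range(n_out)
    ]


def doppler_dft(slow_time):
    """DFT across pulses for one range gate (one output per Doppler filter)."""
    n = len(slow_time)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * f * p / n)
            for p, x in enumerate(slow_time))
        for f in range(n)
    ]


# Hypothetical scene: a 4-chip code, 8 pulses, one target at range gate 3
# whose phase advances by 2 Doppler bins' worth from pulse to pulse.
replica = [1 + 0j, 1 + 0j, -1 + 0j, 1 + 0j]
n_pulses, n_samples, gate, doppler_bin = 8, 12, 3, 2
pulses = []
for p in range(n_pulses):
    phase = cmath.exp(2j * cmath.pi * doppler_bin * p / n_pulses)
    echo = [0j] * n_samples
    for i, chip in enumerate(replica):
        echo[gate + i] = chip * phase
    pulses.append(echo)

compressed = [pulse_compress(echo, replica) for echo in pulses]
n_gates = len(compressed[0])
range_doppler = [doppler_dft([compressed[p][g] for p in range(n_pulses)])
                 for g in range(n_gates)]
peak = max(((g, f) for g in range(n_gates) for f in range(n_pulses)),
           key=lambda gf: abs(range_doppler[gf[0]][gf[1]]))
print(peak)
```

Every range gate and every receiver can be processed independently, which is the kind of regular, massively parallel workload where a GPU is expected to do well.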


Application case 1 at Onera: space surveillance ground radar

Digital beamforming + detection

Parameters: 1200 beams

Performance speedup GPU/CPU = x16.5 (CPU: all 8 cores active)
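Digital beamforming amounts to one weighted sum of the receiver channels per beam. The sketch below assumes a uniform linear array with half-wavelength spacing and plain phase-only steering weights; the slides do not give the array geometry or the weighting actually used, and the detection step is left out.

```python
import cmath
import math


def steering_vector(n_elements, spacing_wavelengths, angle_deg):
    """Phase progression across a uniform linear array for a plane wave."""
    phase_step = (2 * math.pi * spacing_wavelengths
                  * math.sin(math.radians(angle_deg)))
    return [cmath.exp(1j * phase_step * k) for k in range(n_elements)]


def form_beams(snapshot, beam_angles_deg, spacing_wavelengths=0.5):
    """One output per beam: conjugate-weighted sum of the channel samples."""
    n = len(snapshot)
    beams = []
    for angle in beam_angles_deg:
        weights = steering_vector(n, spacing_wavelengths, angle)
        beams.append(sum(w.conjugate() * x
                         for w, x in zip(weights, snapshot)) / n)
    return beams


# Hypothetical case: 16 receivers, source at +20 degrees, 7 beams.
angles = [-60, -40, -20, 0, 20, 40, 60]
snapshot = steering_vector(16, 0.5, 20)
outputs = form_beams(snapshot, angles)
best = angles[max(range(len(angles)), key=lambda i: abs(outputs[i]))]
print(best)
```

With 1200 beams and 100 receivers, the same operation becomes a large matrix product repeated for every range-Doppler cell, which maps naturally onto a GPU.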


Application case 2 at Onera: STAP on Airborne Radar

Rugged GPGPU products are available for the defense and aerospace market.

Products shown: GE-IP's rugged GRA111 (NVIDIA GeForce GT240 GPU) and MAGIC 1.

CPU only: 16 GFLOPS peak, 60 W, 0.27 GFLOPS/W
CPU + GPU: 250 GFLOPS, 100 W, 2.5 GFLOPS/W

(Images courtesy of GE Intelligent Platforms)


Application case 2 at Onera: STAP on Airborne Radar

Reference processing

[Figure: STAP data cube (K range gates x N spatial channels x Mp pulses), followed by an FFT and a bank of STAP transversal filters. Each STAP channel is a transversal filter with M taps (delays τ) on each of the N channels, with weights w*_1 to w*_KM, summed at the output.]

[Equations, partly lost in extraction: the weights are obtained from the inverse of the matrix Γ (matrix inversion) and the vector w_q, where w_q^H = [1 1 ... 1 0 0 ... 0], with the ones repeated N times and the zeros repeated (N-1)M times.]


Application case 2 at Onera: STAP on Airborne Radar

Matrix inversion (double precision)

Parameters: 1000 matrices of size 60x60

Performance speedup GPU/CPU = x6.0 (CPU: all 4 cores active)
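The matrix step can be illustrated on a small case. The sketch below assumes the usual distortionless (MVDR) form of the adaptive weights, w = Γ⁻¹ s / (s^H Γ⁻¹ s); the equations on the previous slide were lost in extraction, so this may not match them exactly. It also solves the linear system instead of forming the inverse explicitly, which is a common substitute. The 3x3 matrix and the vector s are hypothetical.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augmented copy so the inputs are left untouched.
    aug = [list(map(complex, row)) + [complex(b)]
           for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) == 0:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def adaptive_weights(covariance, steering):
    """w = G^-1 s / (s^H G^-1 s), assumed here as the weight formula."""
    x = solve(covariance, steering)
    gain = sum(s.conjugate() * xi for s, xi in zip(steering, x))
    return [xi / gain for xi in x]


# Hypothetical 3x3 Hermitian covariance and a vector with a leading 1
# followed by zeros.
covariance = [[2, 1j, 0],
              [-1j, 3, 1],
              [0, 1, 2]]
steering = [1 + 0j, 0j, 0j]
w = adaptive_weights(covariance, steering)
response = sum(s.conjugate() * wi for s, wi in zip(steering, w))
print(response)
```

In the benchmark above the same operation is repeated for 1000 independent 60x60 matrices, so the parallelism comes mostly from processing many small systems at once.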


Application case 3 at Onera: SAR Processing

Processing time for one SAR image

Parameters: stripmap mode, acquisition time Tacq = 2 min, image size (range x azimuth) = 10 000 x 80 000 pixels

To match the performance of the Tesla C2050 (+ 1 Intel X5650), 11 CPUs are needed!
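The image dimensions give a feel for the workload. The calculation below uses the image size and acquisition time from the slide; the 8 bytes per pixel (single-precision complex) is an assumption, as the slides do not state the pixel format.

```python
range_pixels, azimuth_pixels = 10_000, 80_000
acquisition_s = 2 * 60            # Tacq = 2', read here as 2 minutes
bytes_per_pixel = 8               # assumption: complex64 (2 x 32-bit floats)

pixels = range_pixels * azimuth_pixels
image_gb = pixels * bytes_per_pixel / 1e9
pixel_rate = pixels / acquisition_s   # rate needed to keep up with acquisition

print(f"{pixels / 1e6:.0f} Mpixels, {image_gb:.1f} GB, "
      f"{pixel_rate / 1e6:.2f} Mpixel/s")
```

Under this assumption a single image is several gigabytes, so data transfers between host and GPU are part of the processing time, not a side issue.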


Application case 3 at Onera: SAR Processing

GPU/CPU comparison at equal performance:

Reaching the same performance with CPUs costs and consumes 5 times more!

[Figure: two bar charts comparing the Nvidia C2050 (+ 1 Intel X5650) with 11 x X5650. Left: price (€), axis from 0 to 12000. Right: electricity consumption (W), axis from 0 to 1200.]


Hardware solutions for radar processing

Summary

FPGAs are the only solution for handling high data rates (digital receiver)
+ Wideband, data rate reduction (DDC), performance
- Highly complex programming, hard to modify
- Cost

CPUs do sequential work efficiently (and control the GPU)
+ Easy programming, portability, easy evolution of algorithms
+ Fast on iterative / not easily parallelizable operations (CFAR, ...)
- Performance, power consumption, cost

GPUs perform very well on massively parallel tasks
+ Performance on parallel processing, easy evolution of algorithms
+ Efficiency, small size
- Data transfers, iterative operations