2016.04.21 - state of the art

Upload: hamsproject

Post on 10-Apr-2017

TRANSCRIPT

Page 1: 2016.04.21 - State of the Art

Politecnico di Milano
Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB)

XOHW16 Meeting

project HAMS

Chiara Gatti
[email protected]

Guido Lanfranchi
[email protected]

STATE OF THE ART
April 21st, 2016

NECST Lab, Politecnico di Milano

Credits: Shahriar Emil from the Noun Project

Page 2: 2016.04.21 - State of the Art

State of the Art

Matlab HDL Coder
HW matrix inversion

Page 3: 2016.04.21 - State of the Art

State of the Art

Matlab HDL Coder
HW matrix inversion

- Matrices cannot be passed directly as I/O (but can be managed internally)
- Requires fixed-point conversion (not directly available for the functions «inv» and «pinv»)
- Requires HW-adapted algorithms (e.g. CORDIC)

Not trivial!
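
To illustrate what "fixed-point conversion" and "HW-adapted algorithms" mean in practice, here is a minimal Python sketch of a rotation-mode CORDIC that computes sine and cosine using only integer shifts and additions. The 16-bit fractional format, the iteration count, and the names `to_fixed` and `cordic_sin_cos` are assumptions made for this example; they are not taken from the slides or from HDL Coder.

```python
import math

FRAC_BITS = 16          # assumed fixed-point format: 16 fractional bits
ONE = 1 << FRAC_BITS


def to_fixed(value):
    """Quantize a float to the assumed fixed-point integer format."""
    return int(round(value * ONE))


def cordic_sin_cos(theta, iterations=16):
    """Rotation-mode CORDIC; valid for |theta| up to about 1.74 rad."""
    # Small lookup table of atan(2^-i), as a hardware ROM would hold.
    angles = [to_fixed(math.atan(2.0 ** -i)) for i in range(iterations)]
    # Pre-scale x by the constant CORDIC gain so no final multiply is needed.
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = to_fixed(gain), 0, to_fixed(theta)
    for i in range(iterations):
        # Each step uses only shifts and additions, no multiplier.
        if z >= 0:
            x, y, z = x - (y >> i), y + (x >> i), z - angles[i]
        else:
            x, y, z = x + (y >> i), y - (x >> i), z + angles[i]
    return y / ONE, x / ONE   # (sin, cos) converted back to float


sin_val, cos_val = cordic_sin_cos(0.5)
print(sin_val, math.sin(0.5))
print(cos_val, math.cos(0.5))
```

The result only approximates the floating-point value: accuracy depends on the word length and on the number of iterations, which is part of why the conversion is not trivial.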

Page 5: 2016.04.21 - State of the Art

State of the Art

Matlab HDL Coder
HW matrix inversion

- Matrices cannot be passed directly as I/O (but can be managed internally)
- Requires fixed-point conversion (not directly available for the functions «inv» and «pinv»)
- Requires HW-adapted algorithms (e.g. CORDIC)

Not trivial!

HW Devices
Applicative domains
Algorithms

Page 6: 2016.04.21 - State of the Art

Hardware Devices (*)

Xilinx: 84%
Altera: 11%
Other: 5%

- Virtex II
- Virtex 4 FXGO
- Virtex 5
- Virtex 7
- RC-1000

(*) Data extracted from 27 papers related to our topic. References at the end.

Page 7: 2016.04.21 - State of the Art

Applicative domains

APPLICATIONS

DIGITAL SIGNAL PROCESSING (DSP)

PURE MATHS

OTHER SIMULATIONS

(*) Data extracted from 27 papers related to our topic. References at the end.

Page 8: 2016.04.21 - State of the Art

APPLICATIONS

DIGITAL SIGNAL PROCESSING (DSP)
- Image processing
- Communications: tele, radio, wireless…
- Data detection

PURE MATHS

OTHER SIMULATIONS

75%

(*) Data extracted from 27 papers related to our topic. References at the end.

Page 9: 2016.04.21 - State of the Art

APPLICATIONS

DIGITAL SIGNAL PROCESSING (DSP)
- Image processing
- Communications: tele, radio, wireless…
- Data detection

PURE MATHS

OTHER SIMULATIONS

14%

(*) Data extracted from 27 papers related to our topic. References at the end.

Page 10: 2016.04.21 - State of the Art

APPLICATIONS

DIGITAL SIGNAL PROCESSING (DSP)
- Image processing
- Communications: tele, radio, wireless…
- Data detection

PURE MATHS

OTHER SIMULATIONS

1%

(*) Data extracted from 27 papers related to our topic. References at the end.

Page 11: 2016.04.21 - State of the Art

APPLICATIONS

DIGITAL SIGNAL PROCESSING (DSP)
- Image processing
- Communications: tele, radio, wireless…
- Data detection

PURE MATHS

OTHER SIMULATIONS

(*) Data extracted from 27 papers related to our topic. References at the end.

Page 12: 2016.04.21 - State of the Art

Algorithms

Moore-Penrose Pseudo Inverse*

- SVD method°
- Greville’s algorithm
- Full rank QR factorization

* Courrieu P., «Fast Computation of Moore-Penrose Inverse Matrices», Neural Information Processing, 2005

Page 13: 2016.04.21 - State of the Art

Algorithms

Moore-Penrose Pseudo Inverse*

- SVD method°
- Greville’s algorithm
- Full rank QR factorization

Let A = U*Σ*V’; then pinv(A) = V*pinv(Σ)*U’

* Courrieu P., «Fast Computation of Moore-Penrose Inverse Matrices», Neural Information Processing, 2005
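
The SVD relation stated on this slide can be written out explicitly. What follows is the standard textbook formulation, not something taken from the slides: the prime is read as the (conjugate) transpose, as in Matlab, and the exact test for a zero singular value is an idealization, since a numerical implementation would compare against a small tolerance.

```latex
A = U \Sigma V^{H}, \qquad
A^{+} = V \Sigma^{+} U^{H}, \qquad
(\Sigma^{+})_{ii} =
\begin{cases}
  1/\sigma_i & \text{if } \sigma_i > 0 \\
  0          & \text{otherwise}
\end{cases}

% Worked example with U = V = I:
A = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}
\;\Longrightarrow\;
A^{+} = \begin{pmatrix} 1/2 & 0 \\ 0 & 0 \end{pmatrix},
\qquad A A^{+} A = A .
```

For an m x n matrix A, Σ is m x n and Σ⁺ is n x m, so the pseudo-inverse exists even when A is rectangular or rank deficient.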

Page 14: 2016.04.21 - State of the Art

Algorithms

Moore-Penrose Pseudo Inverse*

- SVD method°
- Rank Decomposition
- QR Method

QR algorithm
- Computationally efficient
- Hemkumar, "A systolic VLSI architecture for complex SVD", 1992

Jacobi method
- More accurate, parallelism
- Luk, Park, "A proof of convergence for two parallel Jacobi SVD algorithms", 2002

° Rahmati et al., “FPGA Based Singular Value Decomposition for Image Processing Applications”, 2008
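
As an illustration of the Jacobi approach named on this slide, here is a minimal pure-Python sketch of the one-sided (Hestenes) Jacobi iteration for singular values. It is one common variant chosen for brevity, not necessarily the one used in the cited works; the function name, sweep limit, and tolerance are assumptions.

```python
import math


def jacobi_singular_values(matrix, sweeps=30, tol=1e-12):
    """One-sided (Hestenes) Jacobi: rotate column pairs until orthogonal."""
    a = [list(map(float, row)) for row in matrix]  # work on a copy
    rows, cols = len(a), len(a[0])
    for _ in range(sweeps):
        rotated = False
        for p in range(cols - 1):
            for q in range(p + 1, cols):
                alpha = sum(a[k][p] ** 2 for k in range(rows))
                beta = sum(a[k][q] ** 2 for k in range(rows))
                gamma = sum(a[k][p] * a[k][q] for k in range(rows))
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue  # columns p and q are already orthogonal
                rotated = True
                # Rotation angle that zeroes the inner product of the pair.
                zeta = (beta - alpha) / (2.0 * gamma)
                sign = 1.0 if zeta >= 0 else -1.0
                t = sign / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                for k in range(rows):
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
        if not rotated:
            break
    # After convergence the column norms are the singular values.
    norms = [math.sqrt(sum(a[k][j] ** 2 for k in range(rows)))
             for j in range(cols)]
    return sorted(norms, reverse=True)


print(jacobi_singular_values([[4.0, 0.0], [3.0, -5.0]]))
```

Rotations that act on disjoint column pairs do not interfere with each other, which is the property that makes this family of methods attractive for parallel hardware.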

Page 15: 2016.04.21 - State of the Art

Some results

«Reconfigurable FPGA-Based Unit for Singular Value Decomposition of Large m x n Matrices», Ledesma-Carrillo et al., 2011

vs Matlab 7.3.0.267 running on a 2.4 GHz Intel Core Duo processor

Page 16: 2016.04.21 - State of the Art

Some results

“Reconfigurable FPGA-Based Unit for Singular Value Decomposition of Large m x n Matrices”, Ledesma-Carrillo et al., 2011

Singular Value    Matlab*     FPGA         % error
σ1                2.6603      2.7500       3.3718
σ2                2.3113      2.3125       0.0519
Elapsed Time      2.7141 s    24.3143 ms

SVD computation of a 32 x 127 matrix: the table lists the singular values with the maximum and minimum estimation errors, together with the elapsed time for the software and hardware implementations.

* Matlab 7.3.0.267 running on a 2.4 GHz Intel Core Duo processor
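
The % error column and the speed-up implied by the two elapsed times can be re-derived from the figures on this slide. The sketch below assumes the error is the relative difference with the Matlab value taken as the reference; `percent_error` is a name introduced here for illustration.

```python
def percent_error(reference, measured):
    """Relative error of `measured` with respect to `reference`, in percent."""
    return abs(measured - reference) / abs(reference) * 100.0


# Values copied from the table (Matlab column used as the reference).
sigma1_err = percent_error(2.6603, 2.7500)
sigma2_err = percent_error(2.3113, 2.3125)

# Elapsed times converted to seconds before taking the ratio.
matlab_seconds = 2.7141
fpga_seconds = 24.3143e-3
speedup = matlab_seconds / fpga_seconds

print(f"sigma1 error: {sigma1_err:.4f} %")
print(f"sigma2 error: {sigma2_err:.4f} %")
print(f"speed-up: {speedup:.1f}x")
```

Under that assumption the two errors agree with the 3.3718 and 0.0519 reported in the table, and the time ratio comes out at roughly two orders of magnitude in favour of the FPGA.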

Page 17: 2016.04.21 - State of the Art

Some results

“Reconfigurable FPGA-Based Unit for Singular Value Decomposition of Large m x n Matrices”, Ledesma-Carrillo et al., 2011

Resource Utilization     Xilinx Spartan 3 3S1000ft256-4    Altera Cyclone II EP2C35F672C6
Programmable Logic       78%                               14%
Memory                   100%                              75%
Multipliers              100%                              39%
Max. Op. Freq.           57.981 MHz                        65.928 MHz

Resource utilization of the proposed FPGA-based SVD computation unit for the 32x127 case study matrix

Page 18: 2016.04.21 - State of the Art

Some results

«Reconfigurable FPGA-Based Unit for Singular Value Decomposition of Large m x n Matrices», Ledesma-Carrillo et al., 2011

o Before this work:
  • non-symmetric matrices up to 8x8
  • larger symmetric matrices

o After this work:
  • large m x n matrices…
  • but up to 32x127

Page 19: 2016.04.21 - State of the Art

Matlab HDL Coder vs Our contribution vs HW matrix inversion

Page 20: 2016.04.21 - State of the Art

Our contribution vs

Matlab HDL Coder
• Management of the whole interface
• No need to write HDL-friendly Matlab code (only the function)

HW matrix inversion

Page 21: 2016.04.21 - State of the Art

Our contribution vs

Matlab HDL Coder
• Management of the whole interface
• No need to write HDL-friendly Matlab code (only the function)

HW matrix inversion
Applicative domains: fluid dynamics simulation of an oxygenator for ECC

Page 22: 2016.04.21 - State of the Art

Our contribution vs

Matlab HDL Coder

HW matrix inversion
Management of larger matrices (up to 8000x8000)

Page 23: 2016.04.21 - State of the Art

Our contribution vs

Matlab HDL Coder

HW matrix inversion
Management of larger matrices (up to 8000x8000) through
(i) strong parallelism
(ii) streaming in data transfer
(iii) Xilinx Virtex 7 VC707
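
A quick size estimate suggests why streaming matters for matrices of this order. The sketch below is only illustrative: the 4 bytes per element (single precision), the square tiling, and the 500x500 tile size are assumptions, since the slides specify neither the data format nor the transfer scheme.

```python
def footprint_bytes(n, bytes_per_element=4):
    """Storage for a dense n x n matrix (4 bytes = assumed single precision)."""
    return n * n * bytes_per_element


def stream_tiles(n, tile):
    """Yield (row_start, row_end, col_start, col_end) for each square tile."""
    for r in range(0, n, tile):
        for c in range(0, n, tile):
            yield r, min(r + tile, n), c, min(c + tile, n)


n = 8000
print(footprint_bytes(n) / 1e6, "MB for the full matrix")
tiles = list(stream_tiles(n, tile=500))   # tile size is a hypothetical choice
print(len(tiles), "tiles of", footprint_bytes(500) / 1e6, "MB each")
```

Sending the matrix tile by tile keeps the amount of data resident on the device at any moment small, and independent tiles are natural units for parallel processing.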

Page 24: 2016.04.21 - State of the Art

HAMSproject

Contact us!

You can find us…

[email protected]

[email protected]

[email protected]

www.facebook.com/hams.project

https://twitter.com/HAMS_project

http://www.slideshare.net/HAMSproject

https://www.youtube.com/channel/UCaovqRpUc7D_Uf2WJHL0rvA

ANY QUESTIONS?

Page 25: 2016.04.21 - State of the Art

References

[1] Wang et al., “A CORDIC-Based Dynamically Reconfigurable FPGA Architecture for Signal Processing Algorithms”, 2008
[2] Burian et al., “A Fixed-Point Implementation of Matrix Inversion Using Cholesky Decomposition”, 2004
[3] Bigdeli et al., “A New Pipelined Systolic Array-Based Architecture for Matrix Inversion in FPGAs with Kalman Filter Case Study”, 2005
[4] Edmann et al., “A Scalable Pipelined Complex Valued Matrix Inversion Architecture”, 2005
[5] Garcia et al., “A Suitable FPGA Implementation of Floating-Point Matrix Inversion Based on Gauss-Jordan Elimination”, 2011
[6] Ahmedsaid et al., “Accelerating SVD on Reconfigurable Hardware for Image Denoising”, 2004
[7] Kumar et al., “An Approach to Design a Matrix Inversion HW Module using FPGA”, 2014
[8] Irturk et al., “An Efficient FPGA Implementation of Scalable Matrix Inversion Core using QR Decomposition”, 2009
[9] Norton et al., “An Evaluation of the Xilinx Virtex-4 FPGA for On-Board Processing in an Advanced Imaging System”, 2009
[10] Irturk et al., “An FPGA Design Space Exploration Tool for Matrix Inversion Architectures”, 2008
[11] Ma et al., “An FPGA-based Singular Value Decomposition Processor”, 2006
[12] Wu et al., “Approximate Matrix Inversion for High-Throughput Data Detection in the Large-Scale MIMO Uplink”, 2013
[13] Irturk et al., “Automatic Generation of Decomposition based Matrix Inversion Architectures”, 2008
[14] Szekowka et al., “CORDIC and SVD Implementation in Digital Hardware”, 2010
[15] Sergiyenko et al., “Error-Free Computation of Inverse Matrices in FPGA”, 2013
[16] Rahmati et al., “FPGA Based Singular Value Decomposition for Image Processing Applications”, 2008
[17] Grammenos et al., “FPGA Design of a Truncated SVD Based Receiver for the detection of SEFDM Signals”, 2011
[18] Karkooti et al., “FPGA Implementation of Matrix Inversion Using QRD-RLS Algorithm”, 2005
[19] Blace et al., “High level Prototyping and FPGA Implementation of the Orthogonal Matching Pursuit Algorithm”, 2012
[20] Ahmedsaid et al., “Improved SVD Systolic Array and Implementation on FPGA”, 2003
[21] S. Hu and Q. Yan, “Inversion of Vandermonde Matrices in FPGAs”, 2004
[22] Ohta et al., “Matrix Decomposition Suitable for FPGA Implementation of N-continuous OFDM”, 2014
[23] Chisty et al., “Matrix Inversion Using QR Decomposition by Parabolic Synthesis”, 2012
[24] Ma et al., “QR Decomposition-Based Matrix Inversion for High Embedded MIMO Receivers”, 2011
[25] Wernke et al., “Real-Time Data Processing for an Advanced Imaging System Using the Xilinx Virtex-5 FPGA”, 2009
[26] Ledesma-Carrillo et al., “Reconfigurable FPGA-Based Unit for Singular Value Decomposition of Large m x n Matrices”, 2011
[27] Wang et al., “Singular Value Decomposition Hardware for MIMO - State of the Art and Custom Design”, 2010