2016.04.21 - state of the art
Politecnico di Milano, Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB)
XOHW16 Meeting
HAMS project
Chiara Gatti
Guido [email protected]
STATE OF THE ART
April 21st, 2016
NECST Lab, Politecnico di Milano
Credits: Shahriar Emil from the Noun Project
State of the Art: Matlab HDL Coder and HW matrix inversion

Matlab HDL Coder
- Matrices cannot be passed directly as I/O (but can be managed internally)
- Requires fixed-point conversion (not directly available for the functions «inv» and «pinv»)
- Requires HW-adapted algorithms (e.g. CORDIC)
Not trivial!
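
To give a feel for what "fixed-point" and "HW-adapted" mean in practice, here is a minimal sketch of rotation-mode CORDIC computing sine and cosine with integer shifts and adds only. It is not what HDL Coder generates; the 16-bit fractional format, the iteration count and all names (`FRAC_BITS`, `cordic_sin_cos`, ...) are assumptions chosen for illustration.

```python
import math

FRAC_BITS = 16            # assumed fixed-point format: 16 fractional bits
ONE = 1 << FRAC_BITS      # fixed-point representation of 1.0
N_ITER = 16               # assumed number of CORDIC iterations

# Lookup table of atan(2^-i), stored as fixed-point integers
ATAN_TABLE = [round(math.atan(2.0 ** -i) * ONE) for i in range(N_ITER)]

# The micro-rotations stretch the vector by a constant gain;
# starting from x = K cancels it so the result has unit length.
K = 1.0
for i in range(N_ITER):
    K /= math.sqrt(1.0 + 2.0 ** (-2 * i))
K_FIXED = round(K * ONE)


def cordic_sin_cos(angle):
    """Return (sin, cos) of `angle` in radians, for angle in [-pi/2, pi/2].

    Only integer additions, subtractions and shifts are used in the loop,
    which is what makes the scheme attractive in hardware.
    """
    x, y = K_FIXED, 0
    z = round(angle * ONE)            # residual angle, fixed point
    for i in range(N_ITER):
        if z >= 0:                    # rotate counter-clockwise
            x, y = x - (y >> i), y + (x >> i)
            z -= ATAN_TABLE[i]
        else:                         # rotate clockwise
            x, y = x + (y >> i), y - (x >> i)
            z += ATAN_TABLE[i]
    return y / ONE, x / ONE           # convert back to float for inspection


if __name__ == "__main__":
    s, c = cordic_sin_cos(0.5)
    print(s, math.sin(0.5))
    print(c, math.cos(0.5))
```

The accuracy is bounded by the word length and the number of iterations, which is one reason the fixed-point conversion step needs care.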
HW matrix inversion
- HW devices
- Applicative domains
- Algorithms
Hardware Devices (*)
- Xilinx: 84%
- Altera: 11%
- other: 5%
Devices listed:
- Virtex II
- Virtex 4 FXGO
- Virtex 5
- Virtex 7
- RC-1000
(*) data extracted from 27 papers related to our topic. References at the end
Applicative domains (*)
APPLICATIONS
- DIGITAL SIGNAL PROCESSING (DSP)
  - Image processing
  - Communications (tele, radio, wireless…)
  - Data detection
- PURE MATHS
- OTHER
- SIMULATIONS
Shares highlighted in turn: 75%, 14%, 1%
(*) data extracted from 27 papers related to our topic. References at the end
Algorithms
Moore-Penrose Pseudo Inverse*
- SVD method°: let A = U*Σ*V'; then pinv(A) = V*pinv(Σ)*U'
- Greville’s algorithm
- Full rank QR factorization
* Courrieu P, «Fast Computation of Moore-Penrose Inverse Matrices», Neural Information Processing, 2005
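
The SVD formula above hides one detail: how pinv(Σ) is formed. As a small worked case, assume a real 3x2 matrix A with two non-zero singular values (the size is an assumption chosen only to keep the example short). Σ is then transposed and its non-zero diagonal entries are replaced by their reciprocals; any zero singular value would stay zero.

```latex
\begin{align*}
A &= U \Sigma V^{\top}, \qquad A^{+} = V \Sigma^{+} U^{\top} \\
\Sigma &= \begin{pmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \\ 0 & 0 \end{pmatrix}
\;\Longrightarrow\;
\Sigma^{+} = \begin{pmatrix} 1/\sigma_1 & 0 & 0 \\ 0 & 1/\sigma_2 & 0 \end{pmatrix},
\qquad \sigma_1, \sigma_2 > 0
\end{align*}
```

So once the SVD is available, the pseudo inverse costs only reciprocals and two matrix products, which is why the hardware effort concentrates on the SVD itself.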
Algorithms
Moore-Penrose Pseudo Inverse*
- SVD method°
- Rank Decomposition
- QR Method
Two ways to compute the SVD:
- QR algorithm: computationally efficient
  (Hemkumar, "A systolic VLSI architecture for complex SVD", 1992)
- Jacobi method: more accurate, suited to parallelism
  (Luk, Park, "A proof of convergence for two parallel Jacobi SVD algorithms", 2002)
° Rahmati et al, “FPGA Based Singular Value Decomposition for Image Processing Applications”, 2008
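
As an illustration of the Jacobi approach, here is a minimal software sketch of the one-sided (Hestenes) Jacobi method, which orthogonalizes pairs of columns by plane rotations until the column norms are the singular values. It is a plain-Python sketch under our own assumptions (function name, tolerance, sweep limit), not the design of any of the cited papers. Rotations on disjoint column pairs are independent, which is where the parallelism comes from.

```python
import math


def jacobi_singular_values(a, sweeps=30, tol=1e-12):
    """Singular values of `a` (list of rows) via one-sided Jacobi rotations.

    `sweeps` and `tol` are illustrative defaults, not tuned values.
    """
    m, n = len(a), len(a[0])
    u = [row[:] for row in a]                 # work on a copy of the input
    for _ in range(sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                # Squared norms of columns p and q, and their inner product
                alpha = sum(u[k][p] * u[k][p] for k in range(m))
                beta = sum(u[k][q] * u[k][q] for k in range(m))
                gamma = sum(u[k][p] * u[k][q] for k in range(m))
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue                  # columns already orthogonal
                rotated = True
                # Rotation angle that makes the two columns orthogonal
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (
                    abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                for k in range(m):
                    up, uq = u[k][p], u[k][q]
                    u[k][p] = c * up - s * uq
                    u[k][q] = s * up + c * uq
        if not rotated:
            break                             # converged: a full sweep did nothing
    # Column norms of the orthogonalized matrix are the singular values
    norms = [math.sqrt(sum(u[k][j] ** 2 for k in range(m))) for j in range(n)]
    return sorted(norms, reverse=True)


if __name__ == "__main__":
    print(jacobi_singular_values([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]))
```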
Some results
«Reconfigurable FPGA-Based Unit for Singular Value Decomposition of Large m x n Matrices», Ledesma-Carrillo et al., 2011
FPGA vs Matlab 7.3.0.267 utilizing 2.4GHz Intel Core Duo Processor

SVD Computation of a 32x127 Matrix

Singular Value    Matlab*     FPGA         % error
σ1                2.6603      2.7500       3.3718
σ2                2.3113      2.3125       0.0519
Elapsed Time      2.7141 s    24.3143 ms

This table shows the singular values with the minimum and maximum estimation errors for the case of a 32 x 127 matrix, together with the elapsed time for the software and hardware implementations.
* Matlab 7.3.0.267 utilizing 2.4GHz Intel Core Duo Processor
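
The figures in this table can be cross-checked with a few lines of arithmetic. The sketch below assumes the % error column is the relative deviation of the FPGA value from the Matlab value, and derives the speedup from the two elapsed times; variable and function names are ours.

```python
# Values copied from the results table (Ledesma-Carrillo et al., 2011)
matlab = {"sigma1": 2.6603, "sigma2": 2.3113}
fpga = {"sigma1": 2.7500, "sigma2": 2.3125}


def percent_error(reference, measured):
    """Relative deviation of `measured` from `reference`, in percent."""
    return abs(measured - reference) / reference * 100.0


errors = {name: percent_error(matlab[name], fpga[name]) for name in matlab}

software_time_s = 2.7141        # Matlab elapsed time, seconds
hardware_time_s = 24.3143e-3    # FPGA elapsed time, converted from milliseconds
speedup = software_time_s / hardware_time_s

if __name__ == "__main__":
    for name, err in errors.items():
        print(f"{name}: {err:.4f} % error")
    print(f"speedup: {speedup:.1f}x")
```

Under that assumption the computed errors match the table, and the hardware unit comes out roughly two orders of magnitude faster than the software run.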
Resource Utilization of the Proposed FPGA-Based SVD Computation Unit for the 32x127 case study matrix

Resources Utilization    Xilinx Spartan 3 (3S1000ft256-4)    Altera Cyclone II (EP2C35F672C6)
Programmable Logic       78%                                  14%
Memory                   100%                                 75%
Multipliers              100%                                 39%
Max. Op. Freq.           57.981 MHz                           65.928 MHz
o Before this work:
  • non-symmetric matrices up to 8x8
  • larger symmetric matrices
o After this work:
  • large m x n matrices…
  • but only up to 32x127
Our contribution
Matlab HDL Coder vs Our contribution vs HW matrix inversion

Our contribution vs Matlab HDL Coder
• Management of the whole interface
• No need to write HDL-friendly Matlab code (only the function)

Our contribution vs HW matrix inversion
• Applicative domain: fluid dynamics simulation of an oxygenator for ECC
• Management of larger matrices (up to 8000x8000) through
  (i) strong parallelism
  (ii) streaming in data transfer
  (iii) Xilinx Virtex 7 VC707
HAMSproject
Contact us!
You can find us…
www.facebook.com/hams.project
https://twitter.com/HAMS_project
http://www.slideshare.net/HAMSproject
https://www.youtube.com/channel/UCaovqRpUc7D_Uf2WJHL0rvA
ANY QUESTIONS?
References
[1] Wang et al, “A CORDIC-Based Dynamically Reconfigurable FPGA Architecture for Signal Processing Algorithms”, 2008
[2] Burian et al, “A Fixed-Point Implementation of Matrix Inversion Using Cholesky Decomposition”, 2004
[3] Bigdeli et al, “A New Pipelined Systolic Array-Based Architecture for Matrix Inversion in FPGAs with Kalman Filter Case Study”, 2005
[4] Edmann et al, “A Scalable Pipelined Complex Valued Matrix Inversion Architecture”, 2005
[5] Garcia et al, “A Suitable FPGA Implementation of Floating-Point Matrix Inversion Based on Gauss-Jordan Elimination”, 2011
[6] Ahmedsaid et al, “Accelerating SVD on Reconfigurable Hardware for Image Denoising”, 2004
[7] Kumar et al, “An Approach to Design a Matrix Inversion HW Module using FPGA”, 2014
[8] Irturk et al, “An Efficient FPGA Implementation of Scalable Matrix Inversion Core using QR Decomposition”, 2009
[9] Norton et al, “An Evaluation of the Xilinx Virtex-4 FPGA for On-Board Processing in an Advanced Imaging System”, 2009
[10] Irturk et al, “An FPGA Design Space Exploration Tool for Matrix Inversion Architectures”, 2008
[11] Ma et al, “An FPGA-based Singular Value Decomposition Processor”, 2006
[12] Wu et al, “Approximate Matrix Inversion for High-Throughput Data Detection in the Large-Scale MIMO Uplink”, 2013
[13] Irturk et al, “Automatic Generation of Decomposition based Matrix Inversion Architectures”, 2008
[14] Szekowka et al, “CORDIC and SVD Implementation in Digital Hardware”, 2010
[15] Sergiyenko et al, “Error-Free Computation of Inverse Matrices in FPGA”, 2013
[16] Rahmati et al, “FPGA Based Singular Value Decomposition for Image Processing Applications”, 2008
[17] Grammenos et al, “FPGA Design of a Truncated SVD Based Receiver for the detection of SEFDM Signals”, 2011
[18] Karkooti et al, “FPGA Implementation of Matrix Inversion Using QRD-RLS Algorithm”, 2005
[19] Blace et al, “High level Prototyping and FPGA Implementation of the Orthogonal Matching Pursuit Algorithm”, 2012
[20] Ahmedsaid et al, “Improved SVD Systolic Array and Implementation on FPGA”, 2003
[21] S. Hu and Q. Yan, “Inversion of Vandermonde Matrices in FPGAs”, 2004
[22] Ohta et al, “Matrix Decomposition Suitable for FPGA Implementation of N-continuous OFDM”, 2014
[23] Chisty et al, “Matrix Inversion Using QR Decomposition by Parabolic Synthesis”, 2012
[24] Ma et al, “QR Decomposition-Based Matrix Inversion for High Embedded MIMO Receivers”, 2011
[25] Wernke et al, “Real-Time Data Processing for an Advanced Imaging System Using the Xilinx Virtex-5 FPGA”, 2009
[26] Ledesma-Carrillo et al, “Reconfigurable FPGA-Based Unit for Singular Value Decomposition of Large mxn Matrices”, 2011
[27] Wang et al, “Singular Value Decomposition Hardware for MIMO - State of the Art and Custom Design”, 2010