compressive sensing: theory, algorithms and applications · 2017. 5. 6. · compressive sensing /...
TRANSCRIPT
Compressive sensing: Theory,
Algorithms and Applications
University of Montenegro
Faculty of Electrical Engineering
Prof. dr Srdjan Stanković
About CS group
Project CS-ICT supported by the Ministry of Science of
Montenegro
12 Researchers
◦ 2 Full Professors
◦ 3 Associate Professors
◦ 1 Assistant Professor
◦ 6 PhD students
Partners:
◦ INP Grenoble, FEE Ljubljana, University of Pittsburgh, Nanyang
Technological University, Technical faculty Split, Zhejiang
University, Zhejiang University of Technology, Hangzhou Normal
University
Shannon-Nyquist sampling
Standard acquisition approach
◦ sampling
◦ compression/coding
◦ decoding
• Sampling frequency - at least twice higher than
the maximal signal frequency (2fmax)
• Standard digital data acquisition approach
Audio, Image, Video Examples
Audio signal:
- sampling frequency 44,1 KHz
- 16 bits/sample
Color image:
-256x256 dimension
– 24 bits/pixel
- 3 color channels
• Video:
- CIF format (352x288)
-NTSC standard (25 frame/s)
-4:4:4 sampling scheme (24 bits/pixel)
Uncompressed:
86.133 KB/s
Uncompressed:
576 KB
Uncompressed:
60.8 Mb/s
MPEG 1 – compression
ratio 4:
21.53 KB/s
JPEG – quality 30% :
7.72 KB
MPEG 1- common
bitrate 1.5 Kb/s
MPEG 4
28-1024 Kb/s
Compressive Sensing / Sampling
• Is it always necessary to sample the signals according to the
Shannon-Nyquist criterion?
• Is it possible to apply the compression during the acquisition
process?
Compressive Sensing:
• overcomes constraints of the traditional sampling theory
• applies a concept of compression during the sensing procedure
CS Applications
Biomedical
Appl.
MRI
CS promises SMART acquisition and processing
and SMART ENERGY consumption
Make entire “puzzle” having just a few pieces:
Reconstruct entire information from just few measurements/pixels/data
Compressive sensing is useful in the
applications where people used to make a large
number of measurements
Standard sampling CS reconstruction
using 1/6 samples
CS
Applic
atio
ns
L-statistics based Signal Denoising
0 50 100 150 200 250 300 350 400 450 5000
5
10
15
0 50 100 150 200 250 300 350 400 450 500-30
-20
-10
0
10
20
30
40
Remaining samples
0 50 100 150 200 250 300 350 400 450 5000
10
20
30
40
Noisy samples
Sorted samples - Removing the extreme values
Denoised signal
0 50 100 150 200 250 300 350 400 450 5000
5
10
15
Non-noisy signal
Discarded samples are declared
as “missing samples” on the corresponding
original positions in non-sorted sequence
This corresponds to CS formulation
After reconstructing “missing samples”
the denoised version of signal is obtained
Video sequences
*Video Object
Tracking
*Velocity
Estimation
*Video
Surveillance
CS Applications
Reconstruction of
the radar images
50% available pulses reconstructed image
30% available pulses reconstructed image
ISAR image with full
data set
Mig 25 example
Compressive Sensing Based Separation of
Non-stationary and Stationary Signals
Absolute STFT values Sorted values
CS mask in TF
STFT of the
composite signal
STFT sorted
values
STFT values that
remain after the L-
statistics
Reconstructed
STFT values
Fourier transform of the
original composite signal
The reconstructed
Fourier transform
by using the CS
values of the STFT
(c)
CS Applications
• Simplified case: Direct search reconstruction
of two missing samples (marked with red)
Time domain Frequency domain
If we have more missing samples, the direct search would
be practically useless
CS Applications-Example ( ) sin 2 2/ 0,..,20xf n N n for n • Let us consider a signal:
[0 0 0 0 0 0 0 0 10.5 0 0 0 10.5 0 0 0 0 0 0 0 0];x i i F
• The signal is sparse in DFT, and vector of DFT values is:
0 5 10 15 20-1
-0.5
0
0.5
1
time
-10 -5 0 5 100
5
10
frequency
Signal fx DFT values of the signal fx
1. Consider the elements of inverse and direct Fourier
transform matrices, denoted by and -1 respectively
(relation holds)
• CS reconstruction using small set of samples:
2 2 2 22 3 20
21 21 21 21
2 2 2 22 4 6 40
21 21 21 21
2 2 2 219 38 57 380
21 21 21 21
2 2 2 220 40 60 400
21 21 21 21
1 1 1 1 ... 1
1 ...
1 ...1
... ... ... ... ... ...21
1 ...
1 ...
j j j j
j j j j
j j j j
j j j j
e e e e
e e e e
e e e e
e e e e
Ψ
2 2 2 22 3 20
21 21 21 21
2 2 2 22 4 6 40
21 21 21 211
2 2 2 219 38 57 380
21 21 21 21
2 2 2 220 40 60 400
21 21 21 21
1 1 1 1 ... 1
1 ...
1 ...
... ... ... ... ... ...
1 ...
1 ...
j j j j
j j j j
j j j j
j j j j
e e e e
e e e e
e e e e
e e e e
Ψ
x xf ΨF
2. Take M random samples/measurements in the time domain
It can be modeled by using matrix : xy Φf
• is defined as a random permutation matrix
• y is obtained by taking M random elements of fx
• Taking 8 random samples (out of 21) on the positions:
5 9 10 12 13 15 18 20
x x y ΦΨF AF
A=ΦΨ
obtained by using the 8
randomly chosen rows
in
2 2 24 8 80
21 21 21
2 2 28 16 160
21 21 21
2 2 29 18 180
21 21 21
2 2 211 22 220
21 21 21
2 2 212 24 240
21 21 21
2 2 214 28 280
21 21 21
2 217 34
21 21
1 ...
1 ...
1 ...
1 ...1
211 ...
1 ...
1 ...
j j j
j j j
j j j
j j j
MxNj j j
j j j
j j
e e e
e e e
e e e
e e e
e e e
e e e
e e
A ΦΨ
2340
21
2 2 219 38 380
21 21 211 ...
j
j j j
e
e e e
0 5 10 15 20-1
-0.5
0
0.5
1
time
• The system with 8 equations and 21 unknowns is obtained
Blue dots – missing samples
Red dots – available samples
The initial Fourier transform
-10 -5 0 5 100
2
4
6
frequency
2 24 72
21 21
2 28 144
21 21
2 29 162
21 21
2 211 198
21 21
2 212 216
21 21
2 214 253
21 21
2 217 306
21 21
2 219 342
21 21
1
21
j j
j j
j j
j j
j j
j j
j j
j j
e e
e e
e e
e e
e e
e e
e e
e e
A
{2,19}
Components are on the
positions -2 and 2 (center-
shifted spectrum), which
corresponds to 19 and 2 in
nonshifted spectrum
A
A is obtained by taking
the 2nd and the 19th column of A
Least square solution
0 5 10 15 20-1
-0.5
0
0.5
1
time
-10 -5 0 5 100
5
10
frequency
Reconstructed Reconstructed
xA F =y
1( )
T Tx
T Tx
A A F =A y
F A A A y
2 212 216
21 21
2 29 162
21 21
2 214 252
21 21
2 211 198
21 21
2 28 144
21 21
2 217 306
21 21
2 24 72
21 21
2 219 342
21 21
1
21
j j
j j
j j
j j
j j
j j
j j
j j
e e
e e
e e
e e
e e
e e
e e
e e
A
Problem
formulation:
CS Applications
0 5 10 15 20-1
-0.5
0
0.5
1
0 5 10 15 200
1
2
3
4
0 5 10 15 20-1
-0.5
0
0.5
1
0 5 10 15 200
5
10
15
Randomly undersampled
FFT of the randomly
undersampled signal
Math.
algorithms
Reconstructed signal
Signal frequencies
CS problem formulation
The method of solving the undetermined system of equations
, by searching for the sparsest solution can be described as: y=ΦΨx Ax
0min subject to x y Ax
0x l0 - norm
• We need to search over all possible sparse vectors x with K
entries, where the subset of K-positions of entries are from the
set {1,…,N}. The total number of possible K-position subsets is
N
K
A more efficient approach uses the near optimal solution
based on the l1-norm, defined as:
1min subject to x y Ax
CS problem formulation
• In real applications, we deal with noisy signals.
• Thus, the previous relation should be modified to include the
influence of noise:
1 2min subject to x y Ax
2e
L2-norm cannot be used because the minimization problem
solution in this case is reduced to minimum energy solution,
which means that all missing samples are zeros
CS conditions
CS relies on the following conditions:
Sparsity – related to the signal nature;
Signal needs to have concise representation when expressed in a
proper basis (K<<N)
Incoherence – related to the sensing modality; It
should provide a linearly independent measurements
(matrix rows)
Random undersampling is crucial
Restriced Isometry Property – is important for
preserving signal isometry by selecting an appropriate
transformation
Signal - linear combination of the
orthonormal basis vectors
Summary of CS problem formulation
1
( ) ( ), : .N
i ii
f t x t or
f =Ψx𝐟
y=ΦfSet of random measurements:
random measurement
matrix
transform
matrix
transform
domain vector
CS conditions
Restricted isometry property
◦ Successful reconstruction for a wider range of sparsity
level
◦ Matrix A satisfies Isometry Property if it preserves the
vector intensity in the N-dimensional space:
◦ If A is a full Fourier transform matrix, i.e. :
2 2
2 2Ax x
NA Ψ
2 2
2
2 2
21
N Ψx x
x
RIP
CS conditions
• For each integer number K the isometry constant K
of the matrix A is the smallest number for which the
relation holds:
2 2 2
2 2 2(1 ) (1 ) K Kx Ax x
2 2
2
2 2
2 K
Ax x
x
0 1 K - restricted isometry constant
CS conditions
Matrix A satisfies RIP the Euclidian length of
sparse vectors is
preserved
• For the RIP matrix A with (2K, δK) and δK < 1, all subsets
of 2K columns are linearly independent
( ) 2spark KA
spark - the smallest number of dependent columns
CS conditions
2 ( ) 1spark M A
( ) 1spark A - one of the columns has all zero
values
( ) 1 spark MA - no dependent columns
A (MXN)
1 1
( ) 12 2
K spark MA
the number of measurements should be at
least twice the number of components K:
2M K
Incoherence Signals sparse in the transform domain , should be dense in the
domain where the acquisition is performed
Number of nonzero samples in the transform domain and the
number of measurements (required to reconstruct the signal)
depends on the coherence between the matrices and Φ.
and Φ are maximally coherent - all coefficients would be
required for signal reconstruction
22
,( , ) max
i j
i ji j
Mutual coherence: the
maximal absolute value of
correlation between two
elements from and Φ
Incoherence
Mutual coherence:
22,1 ,
,( ) max ,
i j
i j i j Mi j
A A
A A
A A = ΦΨ
maximum absolute value of normalized inner product
between all columns in A
Ai and Aj - columns of matrix A
• The maximal mutual coherence will have the value 1 in the
case when certain pair of columns coincides
• If the number of measurements is: ( , ) logM C K N Φ Ψ
then the sparsest solution is exact with a high probability (C is a
constant)
Reconstruction approaches
• The main challenge of the CS reconstruction: solving an
underdetermined system of linear equations using
sparsity assumption
1 - optimization, based on linear programming methods,
provide efficient signal reconstruction with high accuracy
• Linear programming techniques (e.g. convex
optimization) may require a vast number of iterations
in practical applications
• Greedy and threshold based algorithms are fast
solutions, but in general less stable
f
Transform matrix
Measurement matrix
y Measurement vector
Greedy algorithms –
Orthogonal Matching
Pursuit (OMP)
Influence of missing samples to the
spectral representation
• Missing samples produce noise
in the spectral domain. The
variance of noise in the DFT
case depends on M, N and
amplitudes Ai:
2
2( ) 1 exp( )
N K
MS
TP T
2 2
1
var{ }1i
K
MS k k i
i
N MF M A
N
• The probability that all (N-K)
non-signal components are
below a certain threshold value
defined by T is (only K signal
components are above T):
Consequently, for a fixed value of P(T) (e.g.
P(T)=0.99), threshold is calculated as:
12
12
log(1 ( ) )
log(1 ( ) )
N KMS
NMS
T P T
P T
When ALL signal
components are above the
noise level in DFT, the
reconstruction is done
using a Single-Iteration
Reconstruction algorithm
using threshold T
How can we determine the number of available samples
M, which will ensure detection of all signal components?
Assuming that the DFT of the i-th signal component (with the
lowest amplitude) is equal to Mai, then the approximate
expression for the probability of error is obtained as:
argmin{ }opt errM
M P
• For a fixed Perr, the optimal value of M (that allows to detect all
signal components) can be obtained as a solution of the
minimization problem:
For chosen value of Perr and expected value of minimal amplitude ai, there
is an optimal value of M that will assure components detection.
2 2
21 1 1 exp
N K
ierr i
MS
M aP P
Optimal number of available samples M
Algorithms for CS reconstruction of sparse signals
y – measurements
M - number of measurements
N – signal length
T - Threshold
• DFT domain is assumed as sparsity domain
• Apply threshold to initial DFT components
(determine the frequency support)
• Perform reconstruction using identified support
Single-Iteration Reconstruction Algorithm in DFT domain
20 40 60 80 100 120-4
-2
0
2
4
6Original signal, time domain
20 40 60 80 100 1200
20
40
60
80
100
120
140
20 40 60 80 100 120-4
-2
0
2
4
6Original signal, time domain
20 40 60 80 100 120-4
-2
0
2
4
6Recovered, time domain, w ith 31 samples
20 40 60 80 100 120
20
40
60
80
100
120
140
160
180
200Recovered, DFT w ith 31 samples
20 40 60 80 100 120
20
40
60
80
100
120
140
160
180
200Original signal, DFT
Exam
ple
1: S
ingle
ite
rati
on
25% random measurements
Original DFT is sparse
Incomplete DFT is not sparse
Threshold
Reconstructed signal in frequency Reconstructed signal in time
In each iteration we
need to remove the
influence of previously
detected components
and to update the
value of threshold
Case 2: Threshold cannot
select desired components
– iterative solution
• External noise + noise
caused by missing samples
2 2 2 2 2
11
K
MS N i N
i
N MM M A M
N
2 2 22 1 2
22 2 2 2
1 2
( ... )1 1
( ... )1
KMS
NN K N N
N MM A A A
NN M
M A A A MN
Case 3: External noise
12 log(1 ( ) )NT P T
• To ensure the same probability of
error as in the noiseless case we need
to increase the number of
measurements M such that:
( )1
( ) ( 1)
N N
M N M SNR
M SNR N M N
2 2( 1) ( ) 0 N NM SNR M SNR N N SNR MN M
Solve the equation:
Dealing with a set of noisy data – L-estimation
approach
20 40 60 80 100 120
-100
-50
0
50
100
Original signal, time domain
20 40 60 80 100 120
-100
-50
0
50
100
Original signal, time domain
20 40 60 80 100 120-150
-100
-50
0
50
100
150Original noisy signal, sorted values
DISCARD DISCARD
MEASUREMENTS
20 40 60 80 100 120-50
0
50Original signal, time domain
MEASUREMENTS
Original data- DESIRED Noisy data - AVAILABLE
Sorted noisy data Denoised data
20 40 60 80 100 120-50
0
50Measurements
We end up with a random
incomplete set of samples
that need to be recovered
20 40 60 80 100 120
-100
-50
0
50
100
ReconstructedReconstructed signal
20 40 60 80 100 120
-100
-50
0
50
100
Original signal, time domainOriginal data- DESIRED
Dealing with a set of noisy data – L-estimation
approach
General deviations-based approach
( )x n -K-sparse in DFT domain
-Navail – positions of the
available samples
- M-number of available samples
2 /{ ( , )} { ( ) ( )}j kn NF e n k F x n e X k
Loss function
2standard form
{ } robust form
general formL
e
F e e
e
An incomplete set of samples causes
random deviations of the DFT outside the
signal frequencies.
The DFT values at the frequencies
corresponding to signal components are
characterized by non-random behavior: the
sum of generalized deviations of the values
at non-signal frequencies is constant and
higher than at the signal components
positions.
1
1
2 / 2 /1
2 / 2 /1
( ) { ( ) ,..., ( ) }
( ) { ( ) ,..., ( ) }
M
m avail
M
m avail
j kn N j kn NM
n N
j kn N j kn NM
n N
X k mean x n e x n e
X k median x n e x n e
Robust statistics
General deviations-based approach
1
:
1( ) { ( , ) ( , ),..., ( , ) }
i avail
i Mn N
Generalized deviations
GD k F e n k mean e n k e n kM
1,
( )( )
1
KL
p ii i p
M N MGD k A
N
1
( )( ) const
1
KL
q ii
M N MGD k A
N
kp – signal frequency
kq – non-signal frequency
1
1
2 / 2 /1
2 / 2 /1
( ) { ( ) ,..., ( ) }
( ) { ( ) ,..., ( ) }
M
m avail
M
m avail
j kn N j kn NM
n N
j kn N j kn NM
n N
X k mean x n e x n e
X k median x n e x n e
2 /( , ) ( ) ( )j kn Nm me n k x n e X k
for all available samples
General deviations-based approach
• If the number of
components/number of iterations is
unknown, the stopping criterion can be
set
• Stopping criterion: Adjusted based on the l2-
norm bounded residue that
remains after removing
previously detected
components
Variances at signal
(marked by red
symbols) and non-
signal positions
Fourier transform of
original signal
General deviations-based approach
Example
in - missing samples positions
( )y n - Available signal samples
- Constant; determines
whether sample should be
decreased or increased
- Constant that affect
algorithm performance
P - Precision
Gradient algorithm
jn - Available samples positions
Form gradient vector G
Estimate the differential of the signal
transform measure
1/1[ ( ( ))] ( ) ,
1
p
pk
T x n X kN
p
M
Gradient algorithm - Example
Signal contains 16 samples
Missing – 10 samples (marked with red)
Signal is iteratively reconstructed using Gradient algorithm
Virtual instrument for
Compressive sensing
EEG signals: QRS complex is sparse in Hermite transform domain,
meaning that it can be represented using just a few Hermite
functions and corresponding coeffs.
CS of QRS complexes in the Hermite transform domain
Choose between CS and FULL dataset
“1"
Input available imageData store
Input keyData store
GradientCS
reconstruction
Impulse noise
Data store
Reconstructed image
“1"
en
en
Image
In1
In2
Out2
Out1
Image
In2
In1
SAMPLER
L-ESTIMATE
In2
In1
Out1
RAND_GEN
Out1 A1
A2
B1
B2
C1
C2
Out2
SelS1 S2 S3
100101
114
105
106
104
108
109
107 110
111
112
113
115116
118120
119
131
122 123
121
130
129
126125
124
127
128
132
N
M
133
134
135
136
“0"
102
start
en
“0"Gaussian noise
117
103
Compressive sensing based image
filtering and reconstruction
Control Logic 0 0 ... 1 ... 0
... ...
POS
POS = p(i)
p(i)-th
DCT2LxL
point
N
y(1), y(2),…,y(N) DCT2LxL
point
N
+
In2 ® Pixels
- 1/N+
_
N
N
N
++
N
N
1
1
_
...
...
POS = p(i)
g(p(i))
g(1)
N
g(N)
p(1) p(2) p(i) p(X) X=N-M=(LxL)-M
N
abs
absN
N
In1Positions LxL
Compare
N=LxL
701702
703
704
705
706707
708
709
710
711712
713
714
715
716
718
719
720
721
722
724
723
725
726
727
728729
730
731
732
735
736
737
733
734
717
CLK2
CLK1
Realization of the adaptive gradient-
based image reconstruction algorithm
THANK YOU
FOR YOUR ATTENTION!