DATA-DRIVEN TIGHT FRAMES OF KRONECKER TYPE AND
APPLICATIONS TO SEISMIC DATA INTERPOLATION AND
DENOISING∗
LINA LIU†, GERLIND PLONKA‡, AND JIANWEI MA§
Abstract. Seismic data interpolation and denoising plays a key role in seismic data processing. These problems can be understood as sparse inverse problems, where the desired data are assumed to be sparsely representable within a suitable dictionary. In this paper, we present a new data-driven tight frame of Kronecker type (KronTF) method that avoids the vectorization step and considers the multidimensional structure of data in a tensor-product way. It takes advantage of the structure contained in all different modes (dimensions) simultaneously. In order to overcome the limitations of a usual tensor-product approach, we also incorporate data-driven directionality (KronTFD). The complete method is formulated as a sparsity-promoting minimization problem. It includes two main steps. In the first step, a hard thresholding algorithm is used to update the frame coefficients of the data in the dictionary. In the second step, an iterative alternating method is used to update the tight frame (dictionary) in each different mode. The dictionary that is learned in this way contains the principal components in each mode. Furthermore, we apply the proposed tight frames of Kronecker type to seismic interpolation and denoising. Examples with synthetic and real seismic data show that the proposed method achieves better results in comparison with the traditional projection onto convex sets (POCS) method based on the Fourier transform and the previous vectorized data-driven tight frame (DDTF) methods. In particular, the simple structure of the new frame construction makes it essentially more efficient.
Key words. Seismic data, Interpolation and denoising, Dictionary learning, Tight frame
AMS subject classifications. 65T60, 68U10, 86A22
1. Introduction. Seismic data processing is the essential bridge connecting seismic data acquisition and interpretation. Many processes benefit from fully sampled seismic volumes; examples are multiple suppression, migration, amplitude versus offset analysis, and shear wave splitting analysis. However, sampling is often limited by the high cost of acquisition and by obstacles in the field, so that the sampled data contain missing traces. Interpolating 3D data on a regular grid, but with an irregular pattern of missing traces, is particularly delicate: in that case undesired artifacts can arise from improper wave field sampling. Because of its practical importance, interpolation and denoising of irregularly missing seismic data have become an important topic for the seismic data-processing community. Various interpolation methods have been proposed to handle the arising aliasing effects. We especially refer to the f-x domain prediction method by Spitz [24] that is based on predictability of linear events without a priori knowledge of the directions of lateral coherence of the events. This approach can also be extended to 3D regular data.
Seismic data interpolation and data denoising can also be regarded as an inverse problem where we employ sparsity constraints. Using the special structure of seismic data, we want to exploit that it can be represented sparsely in a suitable basis or frame, i.e., the original signal can be well approximated using only a small number of significant frame coefficients. In this paper, we restrict ourselves to finite-dimensional
∗Submitted to the editors July 12, 2016.
†Department of Mathematics, Harbin Institute of Technology, Harbin, China; Institute for Numerical and Applied Mathematics, University of Göttingen, Göttingen, Germany (liulina [email protected])
‡Institute for Numerical and Applied Mathematics, University of Göttingen, Göttingen, Germany ([email protected])
§Department of Mathematics, Harbin Institute of Technology, Harbin, China ([email protected])
This manuscript is for review purposes only.
tight frames. Generalizing the idea of orthogonal transforms, a frame transform is determined by a transform matrix D ∈ C^{M1×N1} with M1 ≥ N1, such that for given data y ∈ C^{N1} the vector u = Dy ∈ C^{M1} contains the frame coefficients. For tight frames, D possesses a left inverse of the form (1/c)D^* = (1/c)conj(D)^T ∈ C^{N1×M1}, i.e., D^*D = cI, where I is the identity operator. The constant c is the frame bound; it can be set to 1 by suitable normalization, and in this case the tight frame is called a Parseval frame.
The idea to employ sparsity of the data in a certain basis or frame has already been used in seismic data processing. A fundamental question here is the choice of the transform D. Transforms can be mainly divided into two categories: fixed (basis or frame) transforms and adaptive learned transforms. Transforms in the first category are data independent and include e.g. the Radon transform [27, 14], Fourier transform [22, 19, 30, 1, 26], curvelet transform [12, 23, 13, 21], shearlet transform [11], and others. In recent years, adaptive learning transform methods came up for data processing. The corresponding transforms are data dependent and therefore usually more expensive. On the other hand, they often achieve essentially better results.
Transforms of adaptive learning type include principal component analysis (PCA) [29] and generalized PCA [28], the method of optimal directions (MOD) [8, 4], the K-SVD method (K-means singular value decomposition) [2], and others. In particular, K-SVD is a very popular tool for training the dictionary. It trains a dictionary from patches of the input image. When all of the patches of the input image are used, K-SVD is actually an adaptive frame learning method in the image space, where elements of the learned adaptive frame are translations of atoms in the patch dictionary. In this case, the adaptive frame transform can be understood as a filtering with filters being the atoms of the learned patch dictionary, or as a concatenation of block Toeplitz matrices in matrix form. However, the filters in the K-SVD frame are highly correlated and highly redundant, and in each step each filter is updated by one SVD, which requires a large computational effort. In order to reduce the computational complexity, [6] proposed a new method, named data-driven tight frame (DDTF), which learns a tight frame in the image space with a set of orthogonal filters. Due to the orthogonality, the DDTF method is able to update the complete set of orthonormal filters by one SVD in one iteration step, in contrast to K-SVD, which employs an SVD to update each filter consecutively. Therefore, the DDTF method is significantly faster than K-SVD, while achieving similar results in terms of image quality. The DDTF method has been applied to interpolation and denoising of two-dimensional and higher-dimensional seismic data [18, 31].
Fixed dictionary methods enjoy high efficiency, while adaptive learning dictionary methods can make use of the information of the data itself. By combining the seislets [9] and DDTF, a double sparsity method was proposed for seismic data processing [7], which can benefit from both fixed and learned dictionary methods. However, the vast majority of existing dictionary learning methods vectorize the patches and deal with vectors, and therefore do not fully exploit the data structure, i.e., the spatial correlation of the data pixels. To overcome this problem, Kronecker-based DDTFs have been applied to dynamic computed tomography [25, 32], treating the patches as tensors instead of vectors and applying learned filters with tensor structure. Tensor decompositions have also been used for seismic interpolation [16].
Generalizing these ideas, in this paper we present a data-driven tight frame construction of Kronecker type (KronTF) that inherits the simple tensor-product structure of the filters to ensure fast computation also for 3D tensors. This approach is
further generalized by incorporating adaptive directionality (KronTFD). The separability of the learned filters makes the resulting dictionaries highly scalable. The orthonormality among the filters leads to very efficient sparse coding computation, as each sub-problem encountered during the iterations for solving the resulting optimization problem has a simple closed-form solution. These two characteristics, i.e., computational efficiency and scalability, make the proposed method very suitable for processing tensor data. The main contributions of our work include the following aspects: 1) To avoid the vectorization step of the previous DDTF method, we unfold the seismic data in each mode and use one SVD of small size per iteration to learn the filters of the tight frame. Therefore, the learned tight frame contains structures of the data sets in the different modes. 2) Further, we introduce a simple method to incorporate significant directions of the data structure into the filters. 3) Finally, the proposed new data-driven dictionaries KronTF and KronTFD are applied to seismic data interpolation and denoising.
This paper is organized as follows. We first introduce the notation borrowed from tensor algebra and the definitions pertaining to tensors that are used throughout our paper. In Section 2, we shortly recall the basic idea of constructing a data-driven frame and present an iterative scheme that is based on alternating updates of the frame coefficients for sparse representation of the given training data on the one hand and of the data-dependent filters of the tight frame on the other hand. Section 3 is devoted to the construction of a new data-driven frame that is based on the tensor-product structure, and is therefore essentially cheaper to learn. The employed idea can be simply transferred to the case of 3-tensors. In order to make the construction more flexible, we extend the data-driven Kronecker frame construction by also incorporating directionality in Section 4. The two data-driven tight frames are applied to data interpolation/reconstruction and to denoising in Section 5, and the corresponding numerical tests for synthetic 2D and 3D data and real 2D and 3D data are presented in Section 6. The paper finishes with a short conclusion on the achievements in this paper and further open problems.
1.1. Notations. In this paper, we denote vectors by lowercase letters and matrices by uppercase letters. For x = (x_i)_{i=1}^N ∈ C^N let ‖x‖_2 := (∑_{i=1}^N |x_i|^2)^{1/2} be the Euclidean norm, and for X = (x_{i,j})_{i=1,j=1}^{N1,N2} ∈ C^{N1×N2} we denote by ‖X‖_F = (∑_{i=1}^{N1} ∑_{j=1}^{N2} |x_{i,j}|^2)^{1/2} the Frobenius norm.
For a third order tensor X = (x_{i,j,k})_{i=1,j=1,k=1}^{N1,N2,N3} ∈ C^{N1×N2×N3}, we introduce the three matrix unfoldings that involve the tensor dimensions N1, N2 and N3 in a cyclic way,

X_(1) := ( (x_{i,j,1})_{i=1,j=1}^{N1,N2}, (x_{i,j,2})_{i=1,j=1}^{N1,N2}, ..., (x_{i,j,N3})_{i=1,j=1}^{N1,N2} ) ∈ C^{N1×N2N3},
X_(2) := ( (x_{1,j,k})_{j=1,k=1}^{N2,N3}, (x_{2,j,k})_{j=1,k=1}^{N2,N3}, ..., (x_{N1,j,k})_{j=1,k=1}^{N2,N3} ) ∈ C^{N2×N1N3},
X_(3) := ( (x_{i,1,k})_{k=1,i=1}^{N3,N1}, (x_{i,2,k})_{k=1,i=1}^{N3,N1}, ..., (x_{i,N2,k})_{k=1,i=1}^{N3,N1} ) ∈ C^{N3×N1N2}.
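The three cyclic unfoldings above can be sketched in NumPy as follows; the helper `unfold` is our own name, and the index bookkeeping follows the slice-concatenation convention of the displayed definitions:

```python
import numpy as np

# Cyclic matrix unfoldings X_(1), X_(2), X_(3) of a third order tensor,
# built by placing the corresponding slices side by side.
def unfold(X, mode):
    N1, N2, N3 = X.shape
    if mode == 1:   # frontal slices (x_{i,j,k})_{i,j}, k fixed -> N1 x (N2 N3)
        return X.transpose(0, 2, 1).reshape(N1, N3 * N2)
    if mode == 2:   # horizontal slices (x_{i,j,k})_{j,k}, i fixed -> N2 x (N1 N3)
        return X.transpose(1, 0, 2).reshape(N2, N1 * N3)
    if mode == 3:   # lateral slices (x_{i,j,k})_{k,i}, j fixed -> N3 x (N1 N2)
        return X.transpose(2, 1, 0).reshape(N3, N2 * N1)
    raise ValueError("mode must be 1, 2, or 3")

X = np.arange(2 * 3 * 4).reshape(2, 3, 4)
assert unfold(X, 1).shape == (2, 12)
assert unfold(X, 2).shape == (3, 8)
assert unfold(X, 3).shape == (4, 6)
assert unfold(X, 1)[1, 2 * 3 + 0] == X[1, 0, 2]   # block k=2, entry (i=1, j=0)
```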
The multiplication of a tensor X with a matrix is defined by employing the ν-mode product, ν = 1, 2, 3, as defined in [17]. For X ∈ C^{N1×N2×N3} and matrices V^(1) = (v^(1)_{i',i})_{i'=1,i=1}^{M1,N1} ∈ C^{M1×N1}, V^(2) = (v^(2)_{j',j})_{j'=1,j=1}^{M2,N2} ∈ C^{M2×N2}, and V^(3) = (v^(3)_{k',k})_{k'=1,k=1}^{M3,N3} ∈ C^{M3×N3} we define

X ×_1 V^(1) := ( ∑_{i=1}^{N1} x_{i,j,k} v^(1)_{i',i} )_{i'=1,j=1,k=1}^{M1,N2,N3} ∈ C^{M1×N2×N3},
X ×_2 V^(2) := ( ∑_{j=1}^{N2} x_{i,j,k} v^(2)_{j',j} )_{i=1,j'=1,k=1}^{N1,M2,N3} ∈ C^{N1×M2×N3},
X ×_3 V^(3) := ( ∑_{k=1}^{N3} x_{i,j,k} v^(3)_{k',k} )_{i=1,j=1,k'=1}^{N1,N2,M3} ∈ C^{N1×N2×M3}.
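The ν-mode products displayed above are plain index contractions and can be written compactly with `np.einsum`; `mode_product` is our own helper name:

```python
import numpy as np

# nu-mode product X x_nu V: contract the nu-th index of X against the
# second index of V, exactly as in the defining sums above.
def mode_product(X, V, mode):
    if mode == 1:
        return np.einsum('ijk,ai->ajk', X, V)   # sum over i
    if mode == 2:
        return np.einsum('ijk,bj->ibk', X, V)   # sum over j
    if mode == 3:
        return np.einsum('ijk,ck->ijc', X, V)   # sum over k
    raise ValueError("mode must be 1, 2, or 3")

X = np.random.default_rng(0).standard_normal((4, 5, 6))
V1 = np.random.default_rng(1).standard_normal((3, 4))   # V^(1) in C^{M1 x N1}
Y = mode_product(X, V1, 1)
assert Y.shape == (3, 5, 6)
# entrywise check against the defining sum
assert np.isclose(Y[2, 1, 4], sum(X[i, 1, 4] * V1[2, i] for i in range(4)))
```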
Generalizing the usual singular value decomposition for matrices, it has been shown in [17] that every (N1 × N2 × N3)-tensor X can be written as a product

X = S ×_1 V^(1) ×_2 V^(2) ×_3 V^(3)

with unitary matrices V^(1) ∈ C^{N1×N1}, V^(2) ∈ C^{N2×N2}, and V^(3) ∈ C^{N3×N3} and with a sparse core tensor S ∈ C^{N1×N2×N3}. By unfolding, this factorization can be represented as

(1)  X_(1) = V^(1) S_(1) (V^(3) ⊗ V^(2))^T,
     X_(2) = V^(2) S_(2) (V^(1) ⊗ V^(3))^T,
     X_(3) = V^(3) S_(3) (V^(2) ⊗ V^(1))^T,

where the S_(ν) denote the corresponding ν-mode unfoldings of the tensor S. Further, ⊗ denotes the Kronecker product, see [15], which for two matrices A = (a_{i,j})_{i=1,j=1}^{M,N} ∈ C^{M×N} and B ∈ C^{L×K} is defined as

A ⊗ B = (a_{i,j} B)_{i=1,j=1}^{M,N} ∈ C^{ML×NK}.

In order to compute a generalized SVD for the tensor X one may first consider the SVDs of the unfolded matrices X_(1), X_(2) and X_(3). Interpreting the equations in (1) as SVD decompositions, we obtain the ν-mode orthogonal matrices V^(ν), ν = 1, 2, 3, directly from the left singular matrices of these SVDs, see [17].
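This recipe (factor matrices from the left singular vectors of the unfoldings, core tensor by mode products with the adjoints) can be sketched as follows; the helper names are ours:

```python
import numpy as np

# Higher-order SVD sketch: V^(nu) = left singular vectors of the mode-nu
# unfolding; core S = X x_1 V1^* x_2 V2^* x_3 V3^*, and X is recovered
# as S x_1 V1 x_2 V2 x_3 V3.
def unfold(X, mode):
    perm = {1: (0, 2, 1), 2: (1, 0, 2), 3: (2, 1, 0)}[mode]
    Xp = X.transpose(perm)
    return Xp.reshape(Xp.shape[0], -1)

rng = np.random.default_rng(7)
X = rng.standard_normal((3, 4, 5))
V = [np.linalg.svd(unfold(X, nu), full_matrices=True)[0] for nu in (1, 2, 3)]

# core tensor S = X x_1 V1^* x_2 V2^* x_3 V3^*
S = np.einsum('ijk,ia,jb,kc->abc', X, V[0].conj(), V[1].conj(), V[2].conj())
# reconstruction X = S x_1 V1 x_2 V2 x_3 V3
X_rec = np.einsum('abc,ia,jb,kc->ijk', S, V[0], V[1], V[2])
assert np.allclose(X, X_rec)
```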
2. Data-driven tight frames. In this section, we describe the construction of DDTF for two-dimensional data or images, as proposed in [6, 31].
Generalizing the concept of unitary matrices, we say that a matrix D ∈ C^{M1×N1} represents a (Parseval) tight frame of C^{N1} if D^*D = I_{N1}, i.e., D^* = conj(D)^T is the Moore-Penrose inverse of D. Obviously, each vector a ∈ C^{N1} can be represented as a linear combination of the M1 columns of D^*, i.e., there exists a vector c ∈ C^{M1} with a = D^*c, and for M1 > N1 the coefficient vector is usually no longer uniquely defined. One canonical choice for c is c = Da, since obviously D^*c = D^*Da = a is true. In frame theory, D^* is called the synthesis operator while D is the analysis operator. In recent years, many frame transforms have been developed for signal and image processing, e.g. wavelet framelets, curvelet frames, shearlet frames etc. In [6] a DDTF construction has been proposed for image denoising, where the analysis operator D is composed of filtering with a set of orthogonal filters.
Let X be an input image. We now aim at finding a tight frame transform D under which X has a sparse representation. It is impossible to achieve this goal without imposing additional structure on D, as we have only one input image for training. In [6], D is assumed to be a filtering by a set of 2-D filters D_i for i = 1, ..., K. It is proved in [6] that D is a tight frame if the filters D_i for i = 1, ..., K are orthogonal and K = S^2, where the supports of the filters are all S × S. Then, the DDTF can be constructed by solving

(1)  min_{D_i ∈ C^{S×S}, C ∈ C^{M1×K}} ∑_{i=1}^K ‖D_i * X − C_i‖_F^2 + λ ∑_{i=1}^K ‖C_i‖_?  s.t. ⟨D_i, D_j⟩ = δ_{ij}/K, ∀ 1 ≤ i, j ≤ K,

where ‖C_i‖_? denotes either ‖C_i‖_0, the number of nonzero entries of C_i, or ‖C_i‖_1, the sum of all absolute values of entries of C_i. The parameter λ > 0 balances the approximation term and the sparsity term.
The DDTF construction (1) can be reformulated as orthogonal dictionary learning in the patch space [3]. Let Y_1, ..., Y_K be patches of X that are vectorized to y_1, ..., y_K ∈ C^{N1}. Define D = [d_1, d_2, ..., d_K], where d_i is the vectorization of D_i. Then (1) is equivalent to

(2)  min_{D ∈ C^{M1×N1}, C ∈ C^{M1×K}} ‖DY − C‖_F^2 + λ‖C‖_?  s.t. D^*D = I_{N1}.

Observe that in the above model not only C, the set of sparse vectors c_k representing y_k in the dictionary domain, but also the filter matrix D has to be learned, where the side condition D^*D = I_{N1} ensures that D is a Parseval tight frame. To solve this optimization problem, we employ an alternating iterative scheme that in turn updates the coefficient matrix C and the frame matrix D, where we start with a suitable initial frame D = D_0.
Step 1. For a fixed frame D ∈ C^{M1×N1} we have to solve the problem

(3)  min_{C ∈ C^{M1×K}} ‖DY − C‖_F^2 + λ‖C‖_?.

For ‖·‖_1 we obtain the solution C = S_{λ/2}(DY), applied componentwise, with the soft threshold operator

(4)  S_{λ/2}(x) = { (sign x)(|x| − λ/2)  for |x| > λ/2,
                    0                     for |x| ≤ λ/2.

For ‖·‖_0 we obtain C = T_γ(DY), applied componentwise, with the hard threshold operator

T_γ(x) = { x  for |x| > γ,
           0  for |x| ≤ γ,

where the threshold parameter γ depends on λ and on the data DY.
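The two componentwise threshold operators of Step 1 can be sketched directly (the function names are our own):

```python
import numpy as np

# Soft threshold S_{lam/2} and hard threshold T_gamma, applied componentwise.
# Written as magnitude shrinkage so the same code covers complex input,
# where sign(x) generalizes to x/|x|.
def soft_threshold(x, lam):
    mag = np.abs(x)
    scale = np.where(mag > lam / 2, 1 - (lam / 2) / np.maximum(mag, 1e-300), 0.0)
    return scale * x

def hard_threshold(x, gamma):
    return np.where(np.abs(x) > gamma, x, 0.0)

x = np.array([-2.0, -0.3, 0.0, 0.8, 3.0])
assert np.allclose(soft_threshold(x, 1.0), [-1.5, 0.0, 0.0, 0.3, 2.5])
assert np.allclose(hard_threshold(x, 1.0), [-2.0, 0.0, 0.0, 0.0, 3.0])
```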
Step 2. For given C we now want to update the frame matrix D by solving the optimization problem

(5)  min_{D ∈ C^{M1×N1}} ‖DY − C‖_F^2  s.t. D^*D = I_{N1}.

Similarly as in [33] we can show
Theorem 1. The optimization problem (5) is equivalent to the problem

max_{D ∈ C^{M1×N1}} Re(tr(DYC^*))  s.t. D^*D = I_{N1},

where tr denotes the trace of a matrix, i.e., the sum of its diagonal entries, and Re the real part of a complex number. If YC^* has full rank N1, this optimization problem is solved by D_opt = V_{N1} U^*, where UΛV^* denotes the singular value decomposition of YC^* ∈ C^{N1×M1} with unitary matrices U ∈ C^{N1×N1}, V ∈ C^{M1×M1}, and V_{N1} ∈ C^{M1×N1} is the restriction of V to its first N1 columns.
Proof. We give the short proof for the complex case that is not contained in [33]. The objective function in (5) can be rewritten as

‖DY − C‖_F^2 = tr(((DY)^* − C^*)(DY − C))
             = tr(Y^*D^*DY + C^*C − C^*DY − Y^*D^*C)
             = tr(Y^*Y) + tr(C^*C) − 2 Re(tr(DYC^*)),

where we have used that tr(DYC^*) = tr(C^*DY) and tr(Y^*D^*C) = conj(tr(C^*DY)). Thus (5) is equivalent to

max_{D ∈ C^{M1×N1}} Re(tr(DYC^*))  s.t. D^*D = I_{N1}.

Let now the singular value decomposition of YC^* be given by

YC^* = UΛV^*

with unitary matrices U ∈ C^{N1×N1}, V ∈ C^{M1×M1}, and a diagonal matrix of (real nonnegative) singular values Λ = (diag(λ_1, ..., λ_{N1}), 0) ∈ R^{N1×M1}. Choosing D_opt := V_{N1} U^*, where V_{N1} is the restriction of V to its first N1 columns, we simply observe that

tr(D_opt YC^*) = tr(V_{N1} U^* UΛV^*) = tr(V_{N1} ΛV^*) = tr(V^* V_{N1} Λ) = tr Λ_{N1} ∈ R,

where Λ_{N1} = diag(λ_1, ..., λ_{N1}) is the restriction of Λ to its first N1 columns. Moreover,

(D_opt)^* D_opt = U V_{N1}^* V_{N1} U^* = I_{N1}.

We show now that the value tr Λ_{N1} is indeed maximal. Let D ∈ C^{M1×N1} be an arbitrary matrix satisfying D^*D = I_{N1}. Let U again be the unitary matrix from the SVD YC^* = UΛV^*, and let W_{N1} := DU. Then the constraint implies that W_{N1}^* W_{N1} = U^* D^* D U = I_{N1}. Thus

tr(DYC^*) = tr(W_{N1} U^* UΛV^*) = tr(W_{N1} ΛV^*) = tr(V^* W_{N1} Λ).

Since V is unitary and since W_{N1} contains N1 orthonormal columns of length M1, also V^* W_{N1} consists of N1 orthonormal columns. In particular, each diagonal entry of V^* W_{N1} has modulus smaller than or equal to 1. Thus,

Re(tr(V^* W_{N1} Λ)) ≤ tr Λ_{N1},

and equality only occurs if W_{N1} = V_{N1}, i.e., if D = D_opt.
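The closed-form update of Theorem 1 is straightforward to verify numerically; the following sketch (with our own random test data) builds D_opt from the SVD of YC^* and checks both feasibility and optimality against another feasible candidate:

```python
import numpy as np

# Frame update of Theorem 1: D_opt = V_{N1} U^* from the SVD Y C^* = U Lambda V^*.
rng = np.random.default_rng(3)
N1, M1, K = 5, 8, 40
Y = rng.standard_normal((N1, K))
C = rng.standard_normal((M1, K))

U, _, Vh = np.linalg.svd(Y @ C.conj().T, full_matrices=True)  # Y C^* = U Lam V^*
D_opt = Vh.conj().T[:, :N1] @ U.conj().T                      # V_{N1} U^*
assert np.allclose(D_opt.conj().T @ D_opt, np.eye(N1))        # D^* D = I_{N1}

def objective(D):                                             # ||D Y - C||_F^2
    return np.linalg.norm(D @ Y - C) ** 2

# any other feasible D (here: random orthonormal columns) does no better
Q, _ = np.linalg.qr(rng.standard_normal((M1, N1)))
assert objective(D_opt) <= objective(Q) + 1e-10
```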
3. Data-driven Kronecker tight frame. In the patch space, the DDTF transforms the patches in C^{n1×n2} to vectors of length N1 = n1·n2. However, the vectorization ignores the spatial correlation of the patch pixels and hence does not fully exploit the data structure. Moreover, the number of filters becomes huge as the dimension increases. To overcome these problems, we present the data-driven Kronecker tight frame, where each 2-D filter is a Kronecker product (tensor product) of 1-D vectors.
Similarly to the DDTF construction, which is equivalent to finding an orthogonal transform in the patch space, the data-driven Kronecker tight frame construction can be reformulated as learning a tensor-product structured orthogonal transform from the patches. We omit the details. A usual (tensor-product) two-dimensional orthogonal transform for a given patch Y ∈ C^{n1×n2} is of the form

(6)  C = D_1 Y D_2^T,
where D_1 ∈ C^{m1×n1}, D_2 ∈ C^{m2×n2} are orthogonal matrices. Many existing orthogonal transforms have this structure, such as the DCT, DFT, and discrete wavelet transform. Employing the properties of the Kronecker tensor product, we obtain by vectorization of (6)

c := vec(C) = vec(D_1 Y D_2^T) = (D_2 ⊗ D_1) vec(Y) = (D_2 ⊗ D_1) y,

where the operator vec reshapes the matrix into a vector, and y := vec(Y).
Without vectorizing the patches, and by imposing the tensor-product structure of the orthogonal transform, Eq. (2) in Section 2 is revised to

(7)  min_{D_1, D_2, C_1, ..., C_K} ∑_{k=1}^K ( ‖D_1 Y_k D_2^T − C_k‖_F^2 + λ‖C_k‖_? )  s.t. D_ν^* D_ν = I_{nν}, ν = 1, 2,

where Y_1, ..., Y_K ∈ C^{n1×n2} are patches of the input image X, and C_1, ..., C_K ∈ C^{m1×m2} are the corresponding coefficient matrices.
Equivalently, using the notations y_k := vec(Y_k) ∈ C^{n1·n2}, c_k := vec(C_k) ∈ C^{m1·m2} as well as Y := (y_1, ..., y_K) ∈ C^{n1n2×K} and C := (c_1, ..., c_K) ∈ C^{m1m2×K}, we can write the model in the form

(8)  min_{D_1, D_2, C} ‖(D_2 ⊗ D_1)Y − C‖_F^2 + λ‖C‖_?  s.t. D_ν^* D_ν = I_{nν}, ν = 1, 2.
Here, as before, ‖·‖_? denotes either ‖·‖_0 or ‖·‖_1. Due to the orthogonality, Proposition 3 in [6] implies that the filtering with filters being columns of D_2 ⊗ D_1 forms a tight frame transform in the image space. Observe that, compared to (2), the matrix (D_2 ⊗ D_1) ∈ C^{m1m2×n1n2} of filters now has a Kronecker structure that we did not impose before. Thus, learning the filter matrix (D_2 ⊗ D_1) requires determining only n1m1 + n2m2 instead of n1n2m1m2 components. Of course, we still need to capture the important structures of the images in a sparse manner.
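Both the vec identity behind (8) and the parameter-count comparison can be checked numerically. Note that the paper's vec stacks matrix columns, i.e. column-major ('F') order in NumPy:

```python
import numpy as np

# Check of vec(D1 Y D2^T) = (D2 (x) D1) vec(Y), with vec = column stacking.
rng = np.random.default_rng(5)
n1, n2, m1, m2 = 4, 3, 6, 5
Y = rng.standard_normal((n1, n2))
D1 = rng.standard_normal((m1, n1))
D2 = rng.standard_normal((m2, n2))

lhs = (D1 @ Y @ D2.T).flatten(order='F')
rhs = np.kron(D2, D1) @ Y.flatten(order='F')
assert np.allclose(lhs, rhs)

# the Kronecker factor pair needs only n1*m1 + n2*m2 parameters,
# versus n1*n2*m1*m2 for an unstructured filter matrix
assert D1.size + D2.size == n1 * m1 + n2 * m2
assert np.kron(D2, D1).size == n1 * n2 * m1 * m2
```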
In order to solve the minimization problem (7) resp. (8), we adopt the alternating minimization scheme proposed in the last section. Now, each iteration step consists of three independent steps:
Step 1. First, for fixed matrices D_1, D_2, we minimize only with respect to C by applying either a hard or a soft threshold, analogously as in Step 1 for the model (2).
Step 2. We fix C and D_2, and minimize (7) resp. (8) with respect to D_1. For this purpose, we rewrite (7) as

(9)  min_{D_1 ∈ C^{m1×n1}} ∑_{k=1}^K ‖D_1 Y_k D_2^T − C_k‖_F^2  s.t. D_1^* D_1 = I_{n1}.
As in (5), this problem is equivalent to

max_{D_1 ∈ C^{m1×n1}} Re( tr( ∑_{k=1}^K D_1 (Y_k D_2^T) C_k^* ) )  s.t. D_1^* D_1 = I_{n1}.

We take the singular value decomposition

∑_{k=1}^K Y_k D_2^T C_k^* = U_1 Λ_1 V_1^*

and obtain unitary matrices U_1 ∈ C^{n1×n1}, V_1 ∈ C^{m1×m1} and a diagonal matrix of singular values Λ_1 = (diag(λ^(1)_1, ..., λ^(1)_{n1}), 0) ∈ R^{n1×m1}. Now, similarly as shown in Theorem 1, the optimal dictionary matrix is obtained by D_{1,opt} = V_{1,n1} U_1^*, where V_{1,n1} denotes the restriction of V_1 to its first n1 columns.
Since ∑_{k=1}^K Y_k D_2^T C_k^* is a matrix of size n1 × m1, we only need to apply a singular value decomposition of this size here to obtain an update for D_1.
Step 3. Analogously, in the third step we fix C and D_1 and minimize (7) resp. (8) with respect to D_2. Here, we observe that

min_{D_2 ∈ C^{m2×n2}} ∑_{k=1}^K ‖D_1 Y_k D_2^T − C_k‖_F^2  s.t. D_2^* D_2 = I_{n2}

is equivalent to

min_{D_2 ∈ C^{m2×n2}} ∑_{k=1}^K ‖D_2 Y_k^T D_1^T − C_k^T‖_F^2  s.t. D_2^* D_2 = I_{n2}.

The SVD

∑_{k=1}^K Y_k^T D_1^T C_k = U_2 Λ_2 V_2^*

with unitary matrices U_2 ∈ C^{n2×n2}, V_2 ∈ C^{m2×m2}, and Λ_2 = (diag(λ^(2)_1, ..., λ^(2)_{n2}), 0) ∈ R^{n2×m2} yields the update

D_2 = V_{2,n2} U_2^*,

where V_{2,n2} again denotes the restriction of V_2 to its first n2 columns.
We outline the pseudo code for learning the tight frame with Kronecker structure (KronTF) in the following Algorithm 1.
Algorithm 1 KronTF Algorithm
Input: Image X, number of iterations T
Take patches of X, denoted by Y_1, Y_2, ..., Y_K ∈ C^{n1×n2}.
Initialize the filter matrices D_1 ∈ C^{m1×n1} and D_2 ∈ C^{m2×n2} with D_1^* D_1 = I_{n1}, D_2^* D_2 = I_{n2}.
for t = 1, 2, ..., T do
  Use hard/soft thresholding to update the coefficient matrices C = (C_1, ..., C_K) as given in Step 1
  for ν = 1 to 2 do
    Use the SVD method to update the filter matrix D_ν as given in Steps 2 and 3
  end for
end for
return D_1, D_2

This approach to construct a data-driven tight frame can now easily be extended to third order tensors. Using tensor notation (for 2-tensors), the product in (6) reads

C = Y ×_1 D_1 ×_2 D_2.

Generalizing the concept above to 3-tensors, we want to find matrices D_ν ∈ C^{mν×nν}, ν = 1, 2, 3, with m_ν ≥ n_ν and D_ν^* D_ν = I_{nν}, ν = 1, 2, 3, such that for the tensor patches Y_k ∈ C^{n1×n2×n3}, k = 1, ..., K, of a given big tensor X, the core tensors

S_k = Y_k ×_1 D_1 ×_2 D_2 ×_3 D_3 ∈ C^{m1×m2×m3}
are simultaneously sparse. This is done by solving the minimization problem

(10)  min_{D_1, D_2, D_3, S_1, ..., S_K} ∑_{k=1}^K ( ‖Y_k ×_1 D_1 ×_2 D_2 ×_3 D_3 − S_k‖_F^2 + λ‖S_k‖_? )  s.t. D_ν^* D_ν = I_{nν}, ν = 1, 2, 3.

Here, the Frobenius norm of X is defined by ‖X‖_F := (∑_{i=1}^{n1} ∑_{j=1}^{n2} ∑_{k=1}^{n3} |x_{i,j,k}|^2)^{1/2}, and ‖S_k‖_? denotes in the case ? = 0 the number of nonzero entries of S_k, and for ? = 1 the sum of the moduli of all entries of S_k. In the space of the input tensor X, we are training a tight frame whose filters are tensor products of three sets of orthogonal 1-D filters.
The minimization problem (10) can be solved in four steps at each iteration level. In step 1, for fixed D_1, D_2, D_3, one minimizes with respect to S_k, k = 1, ..., K, by applying a threshold procedure as before. In step 2, for fixed D_2, D_3 and S_k, k = 1, ..., K, we use the unfolding

(S_k)_(1) = D_1 (Y_k)_(1) (D_3 ⊗ D_2)^T

and have to solve

min_{D_1 ∈ C^{m1×n1}} ∑_{k=1}^K ‖(S_k)_(1) − D_1 (Y_k)_(1) (D_3 ⊗ D_2)^T‖_F^2  s.t. D_1^* D_1 = I_{n1}.

This problem has exactly the same structure as (9) and is solved by choosing D_1 = V_{1,n1} U_1^*, where U_1 Λ_1 V_1^* is the singular value decomposition of the (n1 × m1)-matrix

∑_{k=1}^K (Y_k)_(1) (D_3 ⊗ D_2)^T (S_k)_(1)^*.

Analogously, we find in step 3 and step 4 the updates D_2 = V_{2,n2} U_2^* and D_3 = V_{3,n3} U_3^* from the singular value decompositions

∑_{k=1}^K (Y_k)_(2) (D_1 ⊗ D_3)^T (S_k)_(2)^* = U_2 Λ_2 V_2^*
resp.

∑_{k=1}^K (Y_k)_(3) (D_2 ⊗ D_1)^T (S_k)_(3)^* = U_3 Λ_3 V_3^*.
Remark 3.1. 1. This dictionary learning approach requires three SVDs at each iteration step, but only for matrices of moderate size n_ν × m_ν, ν = 1, 2, 3.
2. One may also connect the two approaches considered in Section 2 and in Section 3 by e.g. enforcing a tensor product structure of the filters only in the third direction while employing a more general filter for the first and second direction. In this case we can use the unfolding

(S_k)_(3) = D_3 (Y_k)_(3) D^T,

where the dictionary (D_2 ⊗ D_1) is replaced by a matrix D ∈ C^{m1m2×n1n2} that does not necessarily have the Kronecker product structure. The dictionary learning procedure is then applied as for two-dimensional tensors.
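The 2-D alternating scheme of Algorithm 1 can be sketched compactly as follows. This is a simplified sketch under our own choices (hard thresholding at a fixed level γ, random orthonormal initialization, patches stored as a 3-D array), not the exact implementation of the paper:

```python
import numpy as np

def hard(x, gamma):
    # hard threshold T_gamma, applied componentwise
    return np.where(np.abs(x) > gamma, x, 0.0)

def krontf(patches, m1, m2, gamma=0.5, iters=10, seed=0):
    K, n1, n2 = patches.shape
    rng = np.random.default_rng(seed)
    D1 = np.linalg.qr(rng.standard_normal((m1, n1)))[0]   # D1^* D1 = I_{n1}
    D2 = np.linalg.qr(rng.standard_normal((m2, n2)))[0]   # D2^* D2 = I_{n2}
    for _ in range(iters):
        # Step 1: threshold the coefficients C_k = D1 Y_k D2^T
        C = np.stack([hard(D1 @ Y @ D2.T, gamma) for Y in patches])
        # Step 2: update D1 from the SVD of sum_k Y_k D2^T C_k^*
        M = sum(Y @ D2.T @ Ck.conj().T for Y, Ck in zip(patches, C))
        U, _, Vh = np.linalg.svd(M, full_matrices=True)
        D1 = Vh.conj().T[:, :n1] @ U.conj().T             # V_{1,n1} U_1^*
        # Step 3: update D2 from the SVD of sum_k Y_k^T D1^T C_k
        M = sum(Y.T @ D1.T @ Ck for Y, Ck in zip(patches, C))
        U, _, Vh = np.linalg.svd(M, full_matrices=True)
        D2 = Vh.conj().T[:, :n2] @ U.conj().T             # V_{2,n2} U_2^*
    return D1, D2

patches = np.random.default_rng(1).standard_normal((20, 4, 4))
D1, D2 = krontf(patches, m1=6, m2=6)
assert np.allclose(D1.conj().T @ D1, np.eye(4))
assert np.allclose(D2.conj().T @ D2, np.eye(4))
```

Each SVD here is of size n_ν × m_ν only, which is the computational advantage over the vectorized DDTF update.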
4. Directional DDTF with Kronecker structure. Though the data-driven Kronecker tight frame considered in Section 3 has a simple structure and is therefore fast to implement, the Kronecker structure is limited in representing directional features, as is usual for tensor-product approaches. Therefore, we now propose a data-driven frame construction whose filters contain both a Kronecker structure and a directional structure. We first focus on the 2-D case.
It is well-known that tensor-product filters are especially well suited for representing vertical or horizontal features in an image. Our idea is now to incorporate other favorite directions by mimicking rotation of the image. While an exact rotation is not possible if we want to stay on the original grid, we apply the following simple procedure.
Let V : C^{n1} → C^{n1} be the cyclic shift operator, i.e., for x = (x_j)_{j=0}^{n1−1} we have

V x := (x_{(j+1) mod n1})_{j=0}^{n1−1}.

For a given image X = (x_0, ..., x_{n2−1}) ∈ C^{n1×n2} with columns x_0, ..., x_{n2−1} in C^{n1}, we consider e.g. the new image

X_0 = (x_0, V x_1, V^2 x_2, ..., V^{n2−1} x_{n2−1}),

where the j-th column is cyclically shifted by j steps. This procedure for example yields

X = ( 0 1 2 2        X_0 = ( 0 1 1 1
      3 1 1 2               3 3 3 2
      3 3 1 1               3 3 2 2
      2 3 3 1 ),            2 1 1 1 ),

such that diagonal structures of X turn into horizontal structures of X_0. Vectorizing the image X into x = vec(X), it follows that

x_0 = vec X_0 = diag(V^j)_{j=0}^{n2−1} x = blockdiag(I, V, V^2, ..., V^{n2−1}) x,
Fig. 1. Illustration of different angles in the range [0, π/4].
where diag(V^j)_{j=0}^{n2−1} ∈ C^{n1n2×n1n2} denotes the block diagonal matrix with blocks V^j. Other rotations of X can be mimicked, for example, by multiplying x = vec X with

diag(V^⌊jℓ/n1⌋)_{j=0}^{n2−1},  ℓ = 1, ..., n1,

for capturing the directions in [−π/4, 0], and by

diag(V^{−⌊jℓ/n1⌋})_{j=0}^{n2−1},  ℓ = −n1 + 1, −n1 + 2, ..., 0,

for capturing directions in [0, π/4]. This range is sufficient in order to bring the essential edges into horizontal or vertical form. If a priori information about favorite directions (or other structures) in the images is available, we may suitably adapt the method to transfer this structure into a linear vertical or horizontal structure. Possible column shifts in the data matrices mimicking directions are illustrated in Figure 1.
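The column-shift "pseudo-rotation" above is a one-liner with `np.roll`; the following sketch (with our own helper name `shear`) reproduces the 4 × 4 example X → X_0 given earlier:

```python
import numpy as np

# Pseudo-rotation by cyclic column shifts: column j of X is shifted
# cyclically by floor(j * alpha) positions, so that for alpha = 1 the
# diagonal structures of X become horizontal.
def shear(X, alpha=1.0):
    cols = [np.roll(X[:, j], -int(np.floor(j * alpha))) for j in range(X.shape[1])]
    return np.stack(cols, axis=1)

X = np.array([[0, 1, 2, 2],
              [3, 1, 1, 2],
              [3, 3, 1, 1],
              [2, 3, 3, 1]])
X0 = np.array([[0, 1, 1, 1],
               [3, 3, 3, 2],
               [3, 3, 2, 2],
               [2, 1, 1, 1]])
assert np.array_equal(shear(X, 1.0), X0)   # matches the worked example
```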
Using this idea to incorporate directionality, we generalize the model (7) resp. (8) for filter learning as follows. Instead of considering only one set of orthogonal Kronecker filters D_2 ⊗ D_1, we employ R sets of orthogonal Kronecker filters

(11)  ( (D_2 ⊗ D_1) diag(V^⌊jα_1⌋)_{j=0}^{n2−1}
        ...
        (D_2 ⊗ D_1) diag(V^⌊jα_R⌋)_{j=0}^{n2−1} ) ∈ C^{Rm1m2×n1n2}

with R constants α_1, ..., α_R ∈ {−1 + 1/n1, −1 + 2/n1, ..., (n1−1)/n1, 1} capturing R favorite directions. Then the model reads

(12)  min_{D_1, D_2, C_1, ..., C_R, α_1, ..., α_R} ∑_{r=1}^R ( ‖(D_2 ⊗ D_1) diag(V^⌊jα_r⌋)_{j=0}^{n2−1} Y − C_r‖_F^2 + λ‖C_r‖_? )  s.t. D_ν^* D_ν = I_{nν}, ν = 1, 2,
where C_r ∈ C^{m1m2×K} contains the block of transform coefficients for the r-th direction, and where ‖·‖_? denotes either the 1-norm or the 0-subnorm as before. Similarly to [6, Proposition 3], one can prove that the filtering with filters being rows of (11) forms a tight frame transform for the input image. In other words, through (12), we are finding a data-driven tight frame with filters containing both Kronecker and directional
structures. Compared to (8), the filter matrix is now composed of a Kronecker matrix (D_2 ⊗ D_1) ∈ C^{m1m2×n1n2} and block diagonal matrices diag(V^⌊jα_r⌋)_{j=0}^{n2−1}, r = 1, ..., R, capturing different directions of the image features. In particular, learning the filter matrix (11) requires determining only the n1m1 + n2m2 components of D_1, D_2 and R constants α_r to fix the directions.
The minimization problem in (12) can again be solved by alternating minimization, where we determine the favorite directions already in a preprocessing step.
Preprocessing step. For fixed matrices D_1 and D_2 we solve the problem

min_{C_{αr}} ‖(D_2 ⊗ D_1) diag(V^⌊jα_r⌋)_{j=0}^{n2−1} Y − C_{αr}‖_F^2 + λ‖C_{αr}‖_?

for each direction α_r from a predetermined set of possible directions by applying either a hard or a soft threshold operator to the transformed patches Y = (y_1, ..., y_K). We emphasize that computing diag(V^⌊jα⌋)_{j=0}^{n2−1} Y for some fixed α ∈ (−1, 1] is very cheap; it just means a cyclic shifting of columns in the matrices Y_k, such that (D_2 ⊗ D_1) diag(V^⌊jα⌋)_{j=0}^{n2−1} Y corresponds to applying the two-dimensional dictionary transform to the images that are obtained from Y_k, k = 1, ..., K, by taking cyclic shifts.
We a priori fix the number R of considered directions (in practice often just R = 1 or R = 2) and find the R favorite directions, e.g. by comparing the energies of C_{αr} for fixed thresholds or by comparing the PSNR values of the reconstructions of Y after thresholding.
Once the directions α_1, ..., α_R are fixed, we start the iteration process as before.
Step 1. At each iteration level, we first minimize with respect to C_{αr}, r = 1, ..., R, while the complete set of filters (directions and D_1, D_2) is fixed. This is done by applying the thresholding procedure. This step is not necessary at the first level, since we can use the C_{αr} obtained in the preprocessing step.
Step 2. For fixed directions α1, …, αR, corresponding Cαr, r = 1, …, R, and fixed D2, we consider the minimization problem for D1. We recall that Cαr consists of K columns, where the k-th column is the vector of transform coefficients of Yk. We reshape

Cαr ∈ C^{m1m2×K} ⇒ (Cαr,1, …, Cαr,K) ∈ C^{m1×m2K}

and

(diag V^⌊jαr⌋)_{j=0}^{n2−1} Y ∈ C^{n1n2×K} ⇒ (Yαr,1, …, Yαr,K) ∈ C^{n1×n2K},

where the k-th column of Cαr of length m1m2 is reshaped back to an (m1 × m2)-matrix Cαr,k by inverting the vec operator, and analogously, (diag V^⌊jαr⌋)_{j=0}^{n2−1} yk ∈ C^{n1n2}, the k-th column of the transformed data, is reshaped to a matrix Yαr,k ∈ C^{n1×n2}. Now we have to solve

min_{D1 ∈ C^{m1×n1}} Σ_{r=1}^{R} Σ_{k=1}^{K} ‖D1 Yαr,k D2^T − Cαr,k‖_F^2   s.t.   D1* D1 = I_{n1}

with a similar structure as in (9). As before, we apply the singular value decomposition

Σ_{r=1}^{R} Σ_{k=1}^{K} Yαr,k D2^T Cαr,k* = U1 Λ1 V1*

with unitary matrices U1 ∈ C^{n1×n1}, V1 ∈ C^{m1×m1} and a diagonal matrix of singular values Λ1 = (diag(λ1^{(1)}, …, λ_{n1}^{(1)}), 0) ∈ R^{n1×m1}. Now, as shown in Theorem 1, the optimal dictionary matrix is obtained by D1,opt = V1,n1 U1*, where V1,n1 denotes the restriction of V1 to its first n1 columns.
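The Step 2 update can be sketched as follows, assuming the Procrustes-type solution of Theorem 1; all matrix sizes and the random training data are illustrative:

```python
import numpy as np

# Sketch of the Step 2 filter update: D1 = V_{n1} U* from the SVD of
# sum_{k} Y_k D2^T C_k^*. Sizes and data are illustrative assumptions.
rng = np.random.default_rng(3)
m1, n1, m2, n2, K = 6, 4, 5, 3, 10
D2 = np.linalg.qr(rng.standard_normal((m2, n2)))[0]      # D2* D2 = I_{n2}
Ys = [rng.standard_normal((n1, n2)) for _ in range(K)]    # (sheared) patches
Cs = [rng.standard_normal((m1, m2)) for _ in range(K)]    # thresholded coefficients

M = sum(Y @ D2.T @ C.conj().T for Y, C in zip(Ys, Cs))    # (n1 x m1) matrix
U, s, Vh = np.linalg.svd(M)                               # M = U diag(s) Vh
V_n1 = Vh.conj().T[:, :n1]                                # first n1 columns of V
D1 = V_n1 @ U.conj().T                                    # optimal (m1 x n1) filter matrix
# The update preserves the orthonormality constraint D1* D1 = I_{n1}.
assert np.allclose(D1.conj().T @ D1, np.eye(n1), atol=1e-8)
```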
Step 3. For fixed directions α1, …, αR, corresponding Cαr, r = 1, …, R, and fixed D1, we consider the minimization problem for D2. With the same notations as above, we can write

min_{D2 ∈ C^{m2×n2}} Σ_{r=1}^{R} Σ_{k=1}^{K} ‖D2 Yαr,k^T D1^T − Cαr,k^T‖_F^2   s.t.   D2* D2 = I_{n2}

and obtain the optimal solution D2,opt = V2,n2 U2* from the singular value decomposition

Σ_{r=1}^{R} Σ_{k=1}^{K} Yαr,k^T D1^T (Cαr,k^T)* = U2 Λ2 V2*,
where V2,n2 denotes the restriction of V2 to its first n2 columns. We outline the pseudocode for learning the tight frame with Kronecker structure and one optimal direction in Algorithm 2.
Algorithm 2 KronTFD Algorithm
Input: Training set of data, number of iterations T
Let Y1, Y2, …, YK ∈ C^{n1×n2} be patches of X.
Initialize the filter matrices D1 ∈ C^{m1×n1} and D2 ∈ C^{m2×n2} with D1* D1 = I_{n1}, D2* D2 = I_{n2}.
First: find the optimal angle direction of the patches.
for k = 1, 2, …, K do
  for angle = −45, −40, …, 45 do
    Adjust Yk by cyclic shifting of columns by the angle.
    Apply soft thresholding to update the coefficient matrix C = (C1, …, CK).
    Reconstruct the image by the tight frame synthesis operator and record the achieved SNR value.
  end for
  The largest SNR value yields the optimal angle direction of the training data.
  Adjust Yk by cyclic shifting of columns by the optimal angle.
end for
Second: learn the filters.
for t = 1, 2, …, T do
  Apply hard/soft thresholding to update the coefficient matrix C = (C1, …, CK).
  for n = 1 to 2 do
    Use the SVD to update the filter matrix Dn as given in Steps 2 and 3.
  end for
end for
return D1, D2, optimal angle
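The second stage of Algorithm 2 can be sketched compactly as follows; the direction search is omitted, and the threshold value, patch sizes, and the choice of hard thresholding are illustrative assumptions:

```python
import numpy as np

# Compact sketch of the KronTFD filter-learning loop (second stage of
# Algorithm 2). Threshold, sizes, and random data are illustrative.
def hard_threshold(C, lam):
    return np.where(np.abs(C) > lam, C, 0.0)

def procrustes_factor(M, n):
    """Solve min ||D A - B||_F over D with D* D = I_n via the SVD of M = A B*."""
    U, _, Vh = np.linalg.svd(M)
    return Vh.conj().T[:, :n] @ U.conj().T

rng = np.random.default_rng(4)
m1 = n1 = m2 = n2 = 4                                    # square filters for simplicity
K, T, lam = 20, 5, 0.1
Ys = [rng.standard_normal((n1, n2)) for _ in range(K)]   # (sheared) training patches
D1 = np.linalg.qr(rng.standard_normal((m1, n1)))[0]
D2 = np.linalg.qr(rng.standard_normal((m2, n2)))[0]

for _ in range(T):
    # Step 1: update coefficients by thresholding the analysis transform.
    Cs = [hard_threshold(D1 @ Y @ D2.T, lam) for Y in Ys]
    # Step 2: update D1 for fixed D2 and coefficients.
    D1 = procrustes_factor(sum(Y @ D2.T @ C.conj().T for Y, C in zip(Ys, Cs)), n1)
    # Step 3: update D2 for fixed D1 and coefficients.
    D2 = procrustes_factor(sum(Y.T @ D1.T @ C.conj() for Y, C in zip(Ys, Cs)), n2)

assert np.allclose(D1.conj().T @ D1, np.eye(n1), atol=1e-8)
assert np.allclose(D2.conj().T @ D2, np.eye(n2), atol=1e-8)
```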
5. Application to seismic data reconstruction and denoising. We want to apply the new data-driven tight frame constructions to 2D and 3D data reconstruction (resp. data interpolation) and to data denoising. Let X denote the complete correct data, and let Y be the observed data. We assume that these data are connected by the following relation

(13) Y = A ◦ X + γξ.

Here A ◦ X denotes the pointwise product (Hadamard product) of the two matrices A and X, ξ denotes an array of normalized Gaussian noise with expectation 0, and γ ≥ 0 determines the noise level. The matrix A contains only the entries 1 or 0 and is called the trace sampling operator. If γ = 0, then the above relation models a seismic interpolation problem, and the task is to reconstruct the missing data. If A = I1, where I1 is the matrix containing only ones, and γ > 0, it models a denoising problem. The two problems can be solved by a sparsity-promoting minimization method, see e.g. [6].
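The observation model (13) can be sketched as follows; mask density, data sizes, and the choice γ = 0 (pure interpolation) are illustrative:

```python
import numpy as np

# Sketch of observation model (13): Y = A o X + gamma*xi, where A is a
# 0/1 trace-sampling mask and xi is white Gaussian noise.
rng = np.random.default_rng(5)
M, N, gamma = 64, 32, 0.0                      # gamma = 0: interpolation setting
X = rng.standard_normal((M, N))                # "complete" data
A = np.zeros((M, N))
kept = rng.choice(N, size=N // 2, replace=False)
A[:, kept] = 1.0                               # keep 50% of the traces (columns)
xi = rng.standard_normal((M, N))
Y = A * X + gamma * xi                         # Hadamard product + noise

assert np.allclose(Y[:, kept], X[:, kept])     # observed traces are untouched
assert np.allclose(Y[A == 0], 0.0)             # missing traces are zeroed
```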
We assume that our desired data X can be sparsely represented by the data-driven tight frame that has been learned beforehand, either by KronTF with Kronecker product filters presented in Section 3 or by KronTFD with directional Kronecker product filters presented in Section 4. We use D to denote the analysis operator of the learned tight frame.
In the geophysical community, alternating projection algorithms such as POCS (projection onto convex sets) are very popular. The approach in [1] for Fourier bases (instead of frames) can be interpreted as follows. We may formulate the interpolation problem as a feasibility problem. We look for a solution X of (13) that on the one hand satisfies the interpolation condition A ◦ X = Y, i.e., is contained in the set of all data

M := {Z : A ◦ Z = Y}

possessing the observed data Y. This constraint can be enforced by applying the projection operator onto M,

PM X = (I1 − A) ◦ X + Y,

which leaves the unobserved data unchanged and projects the observed traces to Y.
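A minimal sketch of the projector PM onto the set M (mask and sizes are illustrative):

```python
import numpy as np

# Projector onto M = {Z : A o Z = Y}: overwrite the observed traces with Y
# and keep the current estimate on the unobserved entries.
def project_M(X, A, Y):
    return (1.0 - A) * X + Y           # I1 - A selects the unobserved entries

rng = np.random.default_rng(6)
M, N = 16, 8
X_true = rng.standard_normal((M, N))
A = (rng.random((M, N)) < 0.5).astype(float)   # random 0/1 sampling mask
Y = A * X_true                                  # noiseless observations
X_est = rng.standard_normal((M, N))             # arbitrary current iterate

P = project_M(X_est, A, Y)
assert np.allclose(A * P, Y)                    # interpolation constraint holds
assert np.allclose(P[A == 0], X_est[A == 0])    # unobserved entries unchanged
```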
On the other hand, we want to ensure that the solution X has a sparse representation in the constructed data-driven frame. The sparsity constraint is enforced by applying a (soft) thresholding to the transformed data, i.e., we compute

PDλ X := D* Sλ D X,

where D denotes the tight frame analysis operator, and Sλ is the soft threshold operator as in (4). We observe, however, that PDλ is no longer a projector, and therefore this approach already generalizes the usual alternating projection method.
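A minimal sketch of PDλ, where the learned dictionary is replaced, as an assumption, by a generic tight frame built from two orthonormal bases:

```python
import numpy as np

# Sketch of P_{D,lambda} X = D* S_lambda(D X) for a tight frame analysis
# operator with D* D = I. As an assumption, D stacks two orthonormal bases
# scaled by 1/sqrt(2), which yields a (non-orthonormal) tight frame.
def soft(x, lam):
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

rng = np.random.default_rng(7)
n = 16
Q1 = np.linalg.qr(rng.standard_normal((n, n)))[0]
Q2 = np.linalg.qr(rng.standard_normal((n, n)))[0]
D = np.vstack([Q1.T, Q2.T]) / np.sqrt(2.0)      # analysis operator, D.T @ D = I
assert np.allclose(D.T @ D, np.eye(n), atol=1e-8)

X = rng.standard_normal((n, 5))
PX = D.T @ soft(D @ X, 0.2)                     # threshold in the frame domain
assert PX.shape == X.shape
```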
The complete iteration scheme is obtained by alternating application of PM and PDλk,

(14) Xk+1 = PM(PDλk Xk) = (I1 − A) ◦ (PDλk Xk) + Y,

where λk is a threshold value that can vary at each iteration step. We recall that all elements of the matrix I1 are one. To show convergence of this scheme in the finite-dimensional setting, one can transfer ideas from [20], where an iterative projection
scheme had been applied to a phase retrieval problem incorporating sparsity in a shearlet frame.
In the context of image inpainting, (14) was proposed in [5] with the framelet transform, and convergence was also proved there.
To improve the convergence of this iteration scheme in numerical experiments, we adopt the following exponentially decreasing thresholding parameters, see [10],

λk = λmax e^{b(k−1)}, k = 1, 2, …, iter,

where λ1 = λmax represents the maximum parameter, λiter = λmin the minimum parameter, and b is chosen as b = (−1/(iter − 1)) ln(λmax/λmin), where iter is the fixed number of iterations in the scheme.
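The iteration (14) with this threshold schedule can be sketched as follows; a generic orthonormal transform stands in for the learned KronTF dictionary, and all sizes are illustrative:

```python
import numpy as np

# Sketch of iteration (14) with the exponentially decreasing thresholds
# of [10]; an orthonormal "frame" D stands in for the learned dictionary.
def soft(x, lam):
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

rng = np.random.default_rng(8)
n, iters, lam_max, lam_min = 16, 30, 1.0, 1e-3
b = -np.log(lam_max / lam_min) / (iters - 1)
lams = lam_max * np.exp(b * np.arange(iters))   # lam_1 = lam_max, ..., lam_iter = lam_min
assert np.isclose(lams[0], lam_max) and np.isclose(lams[-1], lam_min)

D = np.linalg.qr(rng.standard_normal((n, n)))[0].T   # analysis operator
X_true = rng.standard_normal((n, n))
A = (rng.random((n, n)) < 0.5).astype(float)
Y = A * X_true
X = Y.copy()
for lam in lams:
    X = (1.0 - A) * (D.T @ soft(D @ X, lam)) + Y     # X_{k+1} = P_M(P_{D,lam_k} X_k)

# By construction every iterate satisfies the interpolation constraint.
assert np.allclose(A * X, Y)
```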
For data denoising, we only apply the simple one-step (non-iterative) soft thresholding procedure given by PDλ X, where λ is the threshold parameter related to the noise level γ.
6. Numerical Simulations. In a first illustration, we show the results of the dictionary learning method, where we have used (64 × 64) blocks of the seismic image in Figure 5(a). In Figure 2(b) we present the learned dictionary using the DDTF method. In comparison, the dictionaries learned by KronTF are shown in Figures 2(c) and (d). For the DDTF method, a spline wavelet is applied as the initial filter, and the filter size is set to 7 × 7. The KronTF method uses the Fourier basis of size 64 × 64 as the initial filter.
Figures 2(e) and (f) show how well the data can be sparsified by the learned filters compared to the initial filters. DDTF performs better here than KronTF, but requires a much higher computational effort for dictionary learning. The comparison of time costs is illustrated in Figure 3. Here, we also show the comparison to K-SVD, which is even more expensive since it incorporates many SVDs, see Remark 2.3.
We use the new data-driven tight frames of Kronecker type (KronTF) and the data-driven directional frames of Kronecker type (KronTFD) for interpolation and denoising of 2D and 3D seismic data, and compare the performance with the results of the POCS algorithm based on the Fourier transform [1], curvelet frames, and the data-driven tight frame (DDTF) method [18].
For the POCS method, seismic data are cropped into overlapping patches to suppress the artifacts resulting from the global Fourier transform. For the DDTF method, again a spline wavelet is applied as the initial dictionary, and the filter size is set to 7 × 7, see [6].
The quality of the reconstructed images is compared using the PSNR (peak signal-to-noise ratio) value, given by

(15) PSNR = 10 log10( (max X − min X)^2 / ( (1/(MN)) Σ_{i,j} (X_{i,j} − X̂_{i,j})^2 ) ),

where X ∈ C^{M×N} denotes the original seismic data and X̂ ∈ C^{M×N} is the recovered seismic data.
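A sketch of this PSNR measure, assuming a base-10 logarithm and the squared dynamic range (max X − min X)^2 in the numerator (the standard convention):

```python
import numpy as np

# Sketch of PSNR (15): dynamic range of the reference data as peak value,
# mean squared error in the denominator, base-10 logarithm assumed.
def psnr(X, X_rec):
    peak = X.max() - X.min()
    mse = np.mean(np.abs(X - X_rec) ** 2)
    return 10.0 * np.log10(peak**2 / mse)

X = np.array([[0.0, 1.0], [1.0, 0.0]])
X_rec = X + 0.1                 # constant error of 0.1, so MSE = 0.01
# peak = 1, so PSNR = 10*log10(1/0.01) = 20 dB.
assert np.isclose(psnr(X, X_rec), 20.0)
```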
In a first test we consider synthetic data, see Figure 4. Figures 4(a) and (b) show the original data and the sub-sampled data with 50% of the traces randomly missing. We compare the reconstructions (interpolation) using the DDTF method in Figure 4(c), the KronTF method in Figure 4(d), and the KronTFD method in Figure 4(e).
In the next example, we consider real seismic data. In Figures 5 and 6 we present the interpolation results using different reconstruction methods. Figures 5(c)-(f) show the interpolation results by the POCS method and the curvelet transform as
well as the difference images. In Figure 6, we show the corresponding interpolation results for DDTF, KronTF, and KronTFD, together with the errors between the recovered and original data.
For a further comparison of the recovery results, we compare a single trace in Figure 7. In Table 1, we list the reconstruction results obtained from incomplete data with different sampling ratios.
Table 1
PSNR comparison of five methods for different sampling ratios.

Sampling  0.2      0.3      0.4      0.5      0.6      0.7      0.8
KronTFD   35.1327  37.3572  38.8862  41.2564  43.0243  44.0132  44.8996
KronTF    34.6198  36.8759  38.3167  40.6312  42.4675  43.8262  44.5342
DDTF      33.2942  35.1486  36.9543  39.0385  40.1064  42.3052  42.5668
Curvelet  31.1674  34.6478  35.6347  37.6377  39.4190  40.8930  41.5727
POCS      27.5122  31.8804  35.5560  38.2300  39.9878  41.0084  41.6816
Fig. 2. (a) Initial filters. (b) Learned filters via DDTF. (c) The matrix D1 learned via KronTF. (d) The matrix D2 learned via KronTF. (e), (f) Absolute values of frame coefficients using KronTF and DDTF, respectively. Solid lines denote absolute values of frame coefficients using the initial filters.
This manuscript is for review purposes only.
18 L. LIU, G. PLONKA, AND J. MA
[Figure 3: (a) training time (s) vs. data size for DDTF and KronTF; (b) training time (s) vs. data size for KSVD and DDTF.]
Fig. 3. Comparison of time costs for filter learning.
Figures 8 and 9 show denoising results for the data in Figure 8(b), where the noise level is 20. We compare the results of POCS, the curvelet transform, DDTF, and our new frames KronTF and KronTFD, respectively. With our new data-driven frames, we achieve better results than with the other methods, because the dictionary learning methods capture the information contained in the special seismic data. For better comparison, we also display single trace comparisons in Figure 10. In Table 2, we list the achieved PSNR values for different noise levels. KronTF and KronTFD achieve competitive results both for interpolation and denoising.
Furthermore, we test the interpolation of 3D data with 50% randomly missing traces. In Figure 11, we applied the KronTF and KronTFD tensor techniques to synthetic 3D seismic data (of size 178 × 178 × 128). Here, 8 × 8 × 8 patches are used for training. The results of the DDTF method for 3D data are shown in Figure 11(c). In Figures 11(d) and (e) we present the results obtained by the KronTF and KronTFD methods, respectively. Single trace comparisons are shown in Figure 11(f).
Finally, Figures 12 and 13 show an interpolation comparison of the DDTF method and the KronTF method for real 3D marine data.
Table 2
PSNR comparison of five methods for different noise levels.

Noise     5        10       15       20       25       30       35
KronTFD   42.4425  39.2846  37.0246  36.1475  35.2537  34.5327  33.8631
KronTF    41.5282  38.5876  36.4967  35.7451  34.7496  33.9365  33.2156
DDTF      41.4918  38.4231  36.2578  35.0778  33.7533  32.5935  31.5993
Curvelet  40.7478  38.0749  35.8603  34.4654  33.2531  32.0124  30.6583
POCS      40.6547  36.7751  34.1781  32.0267  30.8375  28.6220  27.9847
7. Conclusion. This paper aims at exploiting sparse representations of seismic data for interpolation and denoising. We have proposed a new method to construct data-driven tensor-product tight frames, which learns separable dictionaries. In order to enlarge the flexibility of the filters, we have also proposed a simple method to incorporate preferred local directions that are learned from the data. In the numerical experiments, we employed the new dictionaries to improve both seismic data interpolation and denoising.
The main advantage of the proposed tight frame construction is its computational efficiency, which is due to the imposed tensor-product structure. The tight frame learning method is applied here to rather small blocks of data, such that local features are well recovered. However, the tensor-based method has its limitations: the algorithm needs to store each unfolded-mode matrix used to learn the dictionary, so more storage space is required. In the future, we will extend the idea to learn more global features (e.g., by applying a multiscale approach) and incorporate other direction and slope detection methods for the KronTFD. Furthermore, we are interested in applications to five-dimensional seismic data, where the efficiency of the method is of even greater importance.
Fig. 4. Interpolation results using DDTF and the new KronTF methods for synthetic data with an irregular sampling ratio of 0.5. (a) Original seismic data. (b) Seismic data with 50% of traces missing. (c) Interpolation by the DDTF method. (d) Interpolation by the KronTF method. (e) Interpolation by the KronTFD method. (f) Single trace comparisons of the reconstructions with the original data.
Fig. 5. Interpolation results on real data with an irregular sampling ratio of 0.5. (a) Original seismic data. (b) Seismic data with 50% of traces missing. (c) Interpolation using the POCS method. (d) Difference between (c) and (a). (e) Interpolation using the curvelet method. (f) Difference between (e) and (a).
Fig. 6. Interpolation results (continued) for Figure 5(b). (a) Interpolation using DDTF. (b) Difference between (a) and Figure 5(a). (c) Interpolation using KronTF. (d) Difference between (c) and Figure 5(a). (e) Interpolation using KronTFD. (f) Difference between (e) and Figure 5(a).
Fig. 7. (a)-(e) Single trace comparisons of the five interpolation results in Figures 5 and 6 with the original data as shown in Figure 5(a).
Fig. 8. Denoising results on real data with a noise level of 20. (a) Original seismic data. (b) Noisy data. (c) Denoising result by the POCS method. (d) Difference between (c) and (a). (e) Denoising result by the curvelet method. (f) Difference between (e) and (a).
Fig. 9. Denoising results for the data in Figure 8(b). (a) Denoising by the DDTF method. (b) Difference between (a) and 8(a). (c) Denoising by the KronTF method. (d) Difference between (c) and 8(a). (e) Denoising by the KronTFD method. (f) Difference between (e) and 8(a).
Fig. 10. Single trace comparisons for the recovery results in Figures 8 and 9 with the original data in Figure 8(a). Here we have (a) the POCS method, (b) the curvelet method, (c)-(e) the data-driven frames DDTF, KronTF, and KronTFD.
Fig. 11. Interpolation of synthetic 3D seismic data of size 178 × 178 × 128. (a) Original data. (b) Data with 50% randomly missing traces. (c)-(e) Interpolation by DDTF, KronTF, and KronTFD. (f) Single trace comparisons of the reconstructions with the original data.
Fig. 12. Interpolation of real 3D marine data of size 251 × 401 × 50. (a) Original data. (b) Data with 50% randomly missing traces. (c)-(g) Interpolation by the POCS method, the curvelet method, DDTF, KronTF, and KronTFD.
Fig. 13. Single trace comparisons for the recovery results in Figure 12 with the original data in Figure 12(a). Here we have (a) the POCS method, (b) the curvelet method, (c)-(e) the data-driven frames DDTF, KronTF, and KronTFD.
Acknowledgments. The authors thank Jian-Feng Cai from the Hong Kong University of Science and Technology for insightful technical discussions and helpful edits on DDTF. This work is supported by NSFC (grant numbers 91330108, 41374121, 61327013) and the Fundamental Research Funds for the Central Universities (grant number HIT.PIRS.A201501).
REFERENCES

[1] R. Abma and N. Kabir, 3D interpolation of irregular data with a POCS algorithm, Geophysics, 71 (2006), pp. E91–E97, doi:10.1190/1.2148139.
[2] M. Aharon, M. Elad, and A. Bruckstein, The K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation, IEEE Transactions on Signal Processing, 54 (2006), pp. 4311–4322, doi:10.1109/TSP.2006.881199.
[3] C. Bao, J.-F. Cai, and H. Ji, Fast sparsity-based orthogonal dictionary learning for image restoration, in Proceedings of the IEEE International Conference on Computer Vision, 2013, pp. 3384–3391.
[4] S. Beckouche and J. Ma, Simultaneous dictionary learning and denoising for seismic data, Geophysics, 79 (2014), pp. A27–A31, doi:10.1190/geo2013-0382.1.
[5] J.-F. Cai, R. H. Chan, and Z. Shen, A framelet-based image inpainting algorithm, Applied and Computational Harmonic Analysis, 24 (2008), pp. 131–149.
[6] J.-F. Cai, H. Ji, Z. Shen, and G.-B. Ye, Data-driven tight frame construction and image denoising, Applied and Computational Harmonic Analysis, 37 (2014), pp. 89–105, doi:10.1016/j.acha.2013.10.001.
[7] Y. Chen, J. Ma, and S. Fomel, Double sparsity dictionary for seismic noise attenuation, Geophysics, 81 (2016), pp. V103–V116, doi:10.3997/2214-4609.201413416.
[8] K. Engan, S. O. Aase, and J. H. Husøy, Method of optimal directions for frame design, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 5 (1999), pp. 2443–2446, doi:10.1109/ICASSP.1999.760624.
[9] S. Fomel and Y. Liu, Seislet transform and seislet frame, Geophysics, 75 (2010), pp. V25–V38, doi:10.1190/1.3380591.
[10] J. J. Gao, X. H. Chen, J. Y. Li, G. C. Liu, and J. Ma, Irregular seismic data reconstruction based on exponential threshold model of POCS method, Applied Geophysics, 7 (2010), pp. 229–238, doi:10.1007/s11770-010-0246-5.
[11] S. Hauser and J. Ma, Seismic data reconstruction via shearlet-regularized directional inpainting, 2012.
[12] G. Hennenfent and F. J. Herrmann, Simply denoise: Wavefield reconstruction via jittered undersampling, Geophysics, 73 (2008), pp. V19–V28, doi:10.1190/1.2841038.
[13] F. J. Herrmann and G. Hennenfent, Non-parametric seismic data recovery with curvelet frames, Geophysical Journal International, 173 (2008), pp. 233–248, doi:10.1111/j.1365-246X.2007.03698.x.
[14] M. M. N. Kabir and D. J. Verschuur, Restoration of missing offsets by parabolic Radon transform, Geophysical Prospecting, 43 (1995), pp. 347–368, doi:10.1111/j.1365-2478.1995.tb00257.x.
[15] T. G. Kolda and B. W. Bader, Tensor decompositions and applications, SIAM Review, 51 (2009), pp. 455–500, doi:10.1137/07070111X.
[16] N. Kreimer and M. D. Sacchi, A tensor higher-order singular value decomposition for prestack seismic data noise reduction and interpolation, Geophysics, 77 (2012), pp. V113–V122, doi:10.1190/geo2011-0399.1.
[17] L. De Lathauwer, B. De Moor, and J. Vandewalle, A multilinear singular value decomposition, SIAM Journal on Matrix Analysis and Applications, 21 (2000), pp. 1253–1278, doi:10.1137/S0895479896305696.
[18] J. Liang, J. Ma, and X. Zhang, Seismic data restoration via data-driven tight frame, Geophysics, 79 (2014), pp. V65–V74, doi:10.1190/geo2013-0252.1.
[19] B. Liu and M. D. Sacchi, Minimum weighted norm interpolation of seismic records, Geophysics, 69 (2004), pp. 1560–1568, doi:10.1190/1.1836829.
[20] S. Loock and G. Plonka, Phase retrieval for Fresnel measurements using a shearlet sparsity constraint, Inverse Problems, 30 (2014), p. 055005.
[21] M. Naghizadeh and M. D. Sacchi, Beyond alias hierarchical scale curvelet interpolation of regularly and irregularly sampled seismic data, Geophysics, 75 (2010), pp. WB189–WB202, doi:10.1190/1.3509468.
[22] M. D. Sacchi, T. J. Ulrych, and C. J. Walker, Interpolation and extrapolation using a high-resolution discrete Fourier transform, IEEE Transactions on Signal Processing, 46 (1998), pp. 31–38, doi:10.1109/78.651165.
[23] R. Shahidi, G. Tang, J. Ma, and F. J. Herrmann, Applications of randomized sampling schemes to curvelet-based sparsity-promoting seismic data recovery, Geophysical Prospecting, 61 (2013), pp. 973–997, doi:10.1111/1365-2478.12050.
[24] S. Spitz, Seismic trace interpolation in the f-x domain, Geophysics, 56 (1991), pp. 785–794, doi:10.1190/1.3509468.
[25] S. Tan, Y. Zhang, G. Wang, X. Mou, G. Cao, Z. Wu, and H. Yu, Tensor-based dictionary learning for dynamic tomographic reconstruction, Physics in Medicine and Biology, 60 (2015), pp. 2803–2818.
[26] D. Trad, Five-dimensional interpolation: Recovering from acquisition constraints, Geophysics, 74 (2009), pp. V123–V132, doi:10.1190/1.3245216.
[27] D. O. Trad, T. J. Ulrych, and M. D. Sacchi, Accurate interpolation with high-resolution time-variant Radon transforms, Geophysics, 67 (2002), pp. 644–656, doi:10.1190/1.1468626.
[28] R. Vidal, Y. Ma, and S. Sastry, Generalized principal component analysis (GPCA), IEEE Transactions on Pattern Analysis and Machine Intelligence, 27 (2005), pp. 1945–1959, doi:10.1109/CVPR.2003.1211411.
[29] S. Wold, K. Esbensen, and P. Geladi, Principal component analysis, Chemometrics and Intelligent Laboratory Systems, doi:10.1016/0169-7439(87)80084-9.
[30] S. Xu, Y. Zhang, D. Pham, and G. Lambaré, Antileakage Fourier transform for seismic data regularization, Geophysics, 70 (2005), pp. V87–V95, doi:10.1190/1.1993713.
[31] S. Yu, J. Ma, X. Zhang, and M. D. Sacchi, Interpolation and denoising of high-dimensional seismic data by learning a tight frame, Geophysics, 80 (2015), pp. V119–V132, doi:10.1190/geo2014-0396.1.
[32] W. Zhou, J.-F. Cai, and H. Gao, Adaptive tight frame based medical image reconstruction: a proof-of-concept study in computed tomography, Inverse Problems, 29 (2013), pp. 125006:1–18, doi:10.1088/0266-5611/29/12/125006.
[33] H. Zou, T. Hastie, and R. Tibshirani, Sparse principal component analysis, Journal of Computational and Graphical Statistics, 15 (2006), pp. 265–286, doi:10.1198/106186006X113430.