

Physics-Inspired Convolutional Neural Network for Solving Full-Wave Inverse Scattering Problems

Zhun Wei and Xudong Chen

Abstract—In this paper, to bridge the gap between physical knowledge and learning approaches, we propose an induced current learning method (ICLM) that incorporates the merits of traditional iterative algorithms into the architecture of a convolutional neural network (CNN). The main contributions of the proposed method are threefold. Firstly, to the best of our knowledge, it is the first time that the contrast source is learned to solve full-wave inverse scattering problems (ISPs). Secondly, inspired by the basis-expansion strategy in traditional iterative approaches for solving ISPs, a combined loss function with multiple labels is defined in a cascaded end-to-end CNN (CEE-CNN) architecture to decrease the nonlinearity of the objective function, where no additional computational cost is introduced in generating the extra labels. Thirdly, to accelerate convergence and reduce the difficulty of the learning process, the proposed CEE-CNN is designed to focus on learning the minor part of the induced current by introducing several skip connections that bypass the major part of the induced current to the output layers. The proposed method is compared with a state-of-the-art deep learning scheme and a well-known iterative ISP solver, and numerical and experimental tests are conducted to verify the proposed ICLM.

Index Terms—Inverse scattering problems, convolutional neural network, deep learning.

I. INTRODUCTION

ELECTROMAGNETIC inverse scattering problems (ISPs) are aimed at determining the nature of unknown scatterers, such as their shape, position, and electrical properties (permittivity and conductivity), from the measured scattered fields. The imaging techniques based on ISPs have wide applications in the fields of nondestructive evaluation [1], geophysics [2], biomedical imaging and diagnosis [3], through-wall imaging [4], [5], remote sensing [6], security checks [7], and so on. ISPs are challenging to solve due to the fact that they are intrinsically ill-posed and nonlinear. The ill-posed property means that the unknown parameters do not usually depend on the measured data in a stable way, and a small error in measured data may lead to a much larger error in the solution. To deal with the ill-posedness, one needs to restrict the space of admissible unknowns by assuming that they satisfy some a priori conditions, such as smoothness and sparseness [8], [9]. The nonlinearity of ISPs comes from the multiple scattering effect that physically exists in the relationship between the scattered field and the unknown parameters of scatterers. In this paper, we focus on solving exact ISPs, where no linear approximation is made.

Z. Wei and X. Chen are with the Department of Electrical and Computer Engineering, National University of Singapore, 4 Engineering Drive 3, Singapore 117583, Singapore (Z. Wei: [email protected], X. Chen: [email protected]).

Fig. 1. Typical setups for ISPs, where unknown scatterers are located in a domain of interest (D). Transmitting and receiving antennas are labeled as Tx1, Tx2, ..., and Rx1, Rx2, ..., respectively.

To solve exact ISPs is also referred to as a full-wave approach.

ISPs can be solved by either traditional optimization approaches, often referred to as iterative solvers, or learning approaches, where a neural network is first trained and then used for reconstruction. Generally speaking, the traditional optimization approaches for solving ISPs can be classified into deterministic and stochastic types [8], [10]. Some popular algorithms, such as the distorted Born iterative method [11], the contrast source-type inversion (CSI) method [12], and the subspace optimization method (SOM) [13], [14], belong to the deterministic type, whereas stochastic inversion methods, such as genetic algorithms and evolutionary optimization, belong to the stochastic type [15], [16]. In addition to the aforementioned traditional optimization approaches, learning approaches [17] have also been used to solve ISPs, such as artificial neural networks (ANNs) [18], [19] and support vector machines [20]. In early work on ANNs for solving ISPs, the training dataset consists of scattered field information as the input of the network and the ground-truth scatterer as the output [18], [19]. Although the training (learning) process of an ANN takes plenty of time, the trained network is able to solve typical ISPs quickly. Nevertheless, due to the limited number of layers in the architecture of ANNs, previous works [18], [19] mainly use ANNs to extract rather general information that is parameterized by only a few parameters, such as the positions, sizes, shapes, and piecewise constant permittivities of scatterers. Thanks to the rapid evolution in the field of artificial intelligence, ANN has evolved into the field of deep learning, where much deeper and consequently more powerful networks are used.


With deep learning, a more versatile pixel-based learning approach for inverse problems becomes possible, where unknowns are represented by a pixel basis instead of by a few parameters. Neural networks with regression features have provided impressive results on inverse problems, such as signal denoising [21], biomedical imaging [22], and deconvolution [23].

The traditional objective-function approach and the learning approach have their own strengths and weaknesses. The objective-function approach is able to easily incorporate domain knowledge into its model, but its iterative solving process usually takes a long time. In comparison, the learning approach is able to provide a solution in real time, but it is generally difficult to incorporate domain knowledge. Naturally, how to bridge the gap between the objective-function approach and the machine-learning approach forms a fertile ground for future investigation. For many engineering and physical problems, researchers have gained much insightful domain knowledge that has been leveraged to devise approximate direct inversion models or effective iterative solvers. Thus, it is of paramount importance to ask and solve the problem of how best to combine machine learning with knowledge of the underlying physics as well as objective-function approaches. Some research works have proceeded along this line. For example, unrolling iterative algorithms [24], combining an approximate solver with an ANN [25], and a partially learned method [26] have been proposed for solving inverse problems.

As for full-wave pixel-representation ISPs, there are mainly three types of neural-network solvers. The first type is a direct learning method that directly regresses the contrast of the scatterer from the measurement domain. This type of learning method has to spend unnecessary computational cost to learn well-known wave physics and thus it is able to reconstruct only very simple profiles [27]. The second type still solves ISPs in the framework of the traditional objective-function approach, but employs a neural network to learn some components of iterative solvers. For example, a gradient learning scheme has been proposed in [28] to invert transient electromagnetic data. The third type combines an approximate solver (such as back-propagation) with an ANN to deal with ISPs [27], [29], [30], where the learning process is simplified by transferring the inputs of the ANN from the measurement domain to the contrast domain. In [29], the similarities between the iterative method in ISPs and the architecture of an ANN have also been examined, which inspired a cascaded ANN consisting of three modules, in which each module is trained separately.

In the aforementioned third type of ANN, domain knowledge, i.e., the wave physics of ISPs, is utilized only to generate the input of the CNN, and the input and output of the CNN are the approximate and ground-truth contrasts. In this paper, physical principles inspire us to proceed much further along the way of bridging the gap between the learning approach and the objective-function approach. We propose an induced current learning method (ICLM), which consists of several strategies to incorporate physical expertise inspired by traditional iterative algorithms into the neural network architecture. The proposed ICLM differs from our previous work [27], and we summarize our main contributions as follows. Firstly, inspired by contrast-source type iterative inversion solvers [12], [13], [31], we turn our attention from directly regressing the contrast to the problem of estimating the induced current from the major part of itself together with the electrical field. To the best of our knowledge, this concept of input and output for a CNN has been proposed for the first time to solve ISPs. Secondly, inspired by the effects of basis expansion on decreasing the difficulties in solving ISPs [32], we define a combined loss function in a cascaded end-to-end network with multiple labels to guide the learning process gradually and to decrease the nonlinearities of ISPs. Thirdly, inspired by the SOM method, where the minor part of the induced current is chosen as the unknown, skip connections are added to link some specific layers so that the network is able to focus on learning the minor part of the induced current, which is essentially similar to the concept of “residual learning”.

It is worth introducing the notations used throughout the paper. We use $\bar{\bar{X}}$ and $\bar{X}$ to denote the matrix and vector forms of the discretized operator or parameter $X$, respectively. Furthermore, the superscripts $*$, $T$, and $H$ denote the complex conjugate, transpose, and conjugate transpose of a matrix or vector, respectively. Finally, we use $\|\cdot\|_F$ to denote the Frobenius norm of a matrix.

II. INDUCED CURRENT LEARNING METHOD

For convenience, we consider a two-dimensional (2-D) transverse-magnetic (TM) case with the longitudinal direction along $z$ [13]. As presented in Fig. 1, nonmagnetic scatterers are located in a domain of interest (DOI) $D \subset \mathbb{R}^2$ with a free-space background. The unknown scatterers are illuminated by incoming waves generated by $N_i$ line sources located at $r_p^i$ with $p = 1, 2, \ldots, N_i$. For each incidence, the scattered field is measured by $N_r$ antennas located at $r_q^s$ with $q = 1, 2, \ldots, N_r$.

A. Formulation of the problem

As in [27], we use the method of moments (MOM) with the pulse basis function and the delta testing function to discretize the domain $D$ into $M \times M$ subunits [33], where the centers of the subunits are located at $r_n$ with $n = 1, 2, \ldots, M^2$. If we denote the discretized form of the total electrical field in domain $D$ as $\bar{E}^t$, with the $n$th element being $\bar{E}^t(n) = E^t(r_n)$, it satisfies the following discretized Lippmann-Schwinger equation:

$$\bar{E}^t = \bar{E}^i + \bar{\bar{G}}_D \cdot \bar{\bar{\xi}} \cdot \bar{E}^t, \quad (1)$$

where $\bar{E}^i$, $\bar{\bar{G}}_D$, and $\bar{\bar{\xi}}$ are the discretized incident electrical field, 2-D free-space Green's function, and contrast in domain $D$, respectively. Specifically, the vector $\bar{E}^i$ has $M^2$ elements, with the $n$th element being $\bar{E}^i(n) = E^i(r_n)$, and the matrix $\bar{\bar{G}}_D$ has dimension $M^2 \times M^2$. By approximating every square subunit as a small circle of the same area, i.e., with an equivalent radius $a = \sqrt{S/\pi}$, where $S$ is the area of the subunit, $\bar{\bar{G}}_D$ is obtained as:

$$\bar{\bar{G}}_D(n, n') = \frac{i k_0 \pi a}{2} J_1(k_0 a) H_0^{(1)}(k_0 |r_n - r_{n'}|) \quad (2)$$

for $n \neq n'$, and

$$\bar{\bar{G}}_D(n, n') = \frac{i k_0 \pi a}{2} H_1^{(1)}(k_0 a) - 1 \quad (3)$$

for $n = n'$.


Fig. 2. The proposed CEE-CNN architecture for the proposed ICLM, where $L_1$, $L_2$, $L_3$, and $L_c$ represent the loss functions corresponding to the 1st, 2nd, and 3rd stages and the combined loss function, respectively. Specifically, for the expression of $L_1$, $J^{o_1}_e$ and $J^{l_1}_e$ are the $e$th elements of the output induced current $\bar{J}^{o_1}$ and the label $\bar{J}^{l_1}$ at the first stage, respectively (the same applies to $L_2$ and $L_3$). The subscripts $r$ and $i$ denote the real and imaginary parts of the parameters, respectively.

Here, $J_1(\cdot)$ denotes the first-order Bessel function of the first kind, and $H_0^{(1)}(\cdot)$ and $H_1^{(1)}(\cdot)$ denote the zeroth-order and first-order Hankel functions of the first kind, respectively. $k_0$ is the wavenumber of the homogeneous background medium. The contrast $\bar{\bar{\xi}}$ is a diagonal matrix with diagonal elements $\bar{\bar{\xi}}(n, n) = \varepsilon_r(r_n) - 1$, where $\varepsilon_r(r_n)$ is the relative permittivity at $r_n$.
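For concreteness, the assembly of the discretized Green's matrix in (2) and (3) can be sketched in a few lines. This is a minimal illustration under our own naming conventions (`centers`, `k0`, `area`), not the implementation used in this paper.

```python
import numpy as np
from scipy.special import j1, hankel1

def green_matrix_D(centers, k0, area):
    """Sketch of the discretized 2-D free-space Green's function G_D in (2)-(3).

    centers : (M*M, 2) array of subunit centers r_n
    k0      : wavenumber of the homogeneous background medium
    area    : area S of one square subunit
    """
    a = np.sqrt(area / np.pi)                                  # equivalent radius of a subunit
    dist = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    np.fill_diagonal(dist, 1.0)                                # placeholder to keep the Hankel argument finite
    GD = 1j * k0 * np.pi * a / 2 * j1(k0 * a) * hankel1(0, k0 * dist)          # off-diagonal entries, Eq. (2)
    np.fill_diagonal(GD, 1j * k0 * np.pi * a / 2 * hankel1(1, k0 * a) - 1.0)   # diagonal entries, Eq. (3)
    return GD
```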

If we define the induced current $\bar{J}$ as:

$$\bar{J} = \bar{\bar{\xi}} \cdot \bar{E}^t, \quad (4)$$

it is convenient to reformulate (1) as

$$\bar{J} = \bar{\bar{\xi}} \cdot [\bar{E}^i + \bar{\bar{G}}_D \cdot \bar{J}]. \quad (5)$$

The discretized scattered field $\bar{E}^s$ is given by:

$$\bar{E}^s = \bar{\bar{G}}_S \cdot \bar{J}, \quad (6)$$

where $\bar{E}^s$ and $\bar{\bar{G}}_S$ have dimensions $N_r$ and $N_r \times M^2$, respectively, with $\bar{E}^s(q) = E^s(r_q^s)$ and $\bar{\bar{G}}_S$ calculated following (2).

In the forward problem, (5) describes the wave-scatterer interaction in domain $D$ and is usually referred to as the state equation. Equation (6) describes the scattered field as a re-radiation of the induced current and is referred to as the data equation.

The goal of ISPs is to reconstruct the relative permittivities contained in $\bar{\bar{\xi}}$ from the measured scattered field $\bar{E}^s$ based on (5) and (6). If we denote $\Psi$ as the operator of solving the corresponding forward problem, the nonlinear equation [8]:

$$\bar{E}^s_p = \Psi_p(\bar{\bar{\xi}}) \quad (7)$$

does not have an exact solution of $\bar{\bar{\xi}}$ in the presence of noise. Here, the subscript $p$ denotes the $p$th incidence of the electrical field. Instead of directly solving (7), an optimization problem is usually constructed:

$$\arg\min_{\bar{\bar{\xi}}}: \sum_{p=1}^{N_i} f(\Psi_p(\bar{\bar{\xi}}), \bar{E}^s_p) + g(\bar{\bar{\xi}}), \quad (8)$$

where $f$ is an appropriate measure of mismatch, and often the square of the $l_2$ error norm $\|\Psi_p(\bar{\bar{\xi}}) - \bar{E}^s_p\|^2$ is used. $g(\bar{\bar{\xi}})$ is the regularization term used to balance data fitting and stability of the solution. It is well known that (8) is nonlinear and nonconvex, and difficult to solve due to the presence of local minima [8], [34], [35]. Many iterative optimization algorithms have been proposed to solve such nonlinear optimization problems, such as [11], [36]-[38].

In learning approaches, a training set usually contains a number $M_t$ of pairs of ground-truth contrasts and their corresponding measured scattered fields, $\{\bar{\bar{\xi}}_m, \bar{E}^s_m\}_{m=1}^{M_t}$. A parametric reconstruction algorithm $R_l$ is learned by solving [39]:

$$R_l = \arg\min_{R_\theta, \theta}: \sum_{m=1}^{M_t} f(R_\theta(\bar{E}^s_m), \bar{\bar{\xi}}_m) + g(\theta), \quad (9)$$

where $R_\theta$ represents an ANN with a specific architecture and the parameters $\theta$ are the weights to be learned in $R_\theta$. Here, $f$ is the measure of mismatch, such as the Euclidean loss, while $g(\theta)$ is the regularizer on the parameters $\theta$ that avoids overfitting.

In the learning approaches that combine an approximate solver with an ANN [27], [29], such as the back-propagation scheme and the dominant current scheme (DCS) in [27], $\bar{E}^s_m$ is replaced by an approximate solution of the contrast $\bar{\bar{\xi}}^a_m$. Consequently, the corresponding parametric reconstruction algorithm $R_{al}$


becomes

$$R_{al} = \arg\min_{R_\theta, \theta}: \sum_{m=1}^{M_t} f(R_\theta(\bar{\bar{\xi}}^a_m), \bar{\bar{\xi}}_m) + g(\theta). \quad (10)$$

It is seen from [27], [29] that by changing the inputs of $R_\theta$ in (9) from the measurement domain to the contrast domain $D$, the wave physics that describes the radiation from the contrast domain to the measurement domain is partially accounted for; consequently, these methods simplify the learning process and also provide a good start for the training of the ANN.

B. Induced Current Learning

Inspired by contrast-source type iterative inversion solvers [12], [13], [31], we turn our goal from directly regressing the contrast $\bar{\bar{\xi}}$ to regressing the induced current $\bar{J}$, where both the inputs and outputs of $R_\theta$ in (9) are changed. To this end, we first estimate a major part of the induced current $\bar{J}^+$ from the data equation (6) without any approximation.

1) Derivation of inputs: By conducting a singular value decomposition (SVD) on $\bar{\bar{G}}_S$, one obtains $\bar{\bar{G}}_S = \sum_n u_n \sigma_n \nu_n^H$ with $\sigma_1 \geq \sigma_2 \geq \ldots \geq \sigma_{M^2} \geq 0$, where $u_n$, $\nu_n$, and $\sigma_n$ are the left singular vectors, right singular vectors, and singular values of $\bar{\bar{G}}_S$, respectively. Consequently, a major part of the induced current $\bar{J}^+$ corresponding to the first $L$ dominant singular values can be uniquely calculated from the data equation (6) as [13]

$$\bar{J}^+ = \sum_{j=1}^{L} \frac{u_j^H \cdot \bar{E}^s}{\sigma_j} \nu_j. \quad (11)$$

It is noted that only a thin SVD is needed to obtain $\bar{J}^+$, since only the large singular values and the corresponding vectors are needed. In the proposed ICLM, we have chosen as the input of the neural network both the major part of the induced current $\bar{J}^+$ and the updated electrical field

$$\bar{E}^+ = \bar{E}^i + \bar{\bar{G}}_D \cdot \bar{J}^+. \quad (12)$$

As mentioned earlier, the output of the neural network is chosen as the induced current $\bar{J}$. Since both $\bar{J}^+$ and $\bar{E}^+$ are complex valued, we separate their real and imaginary parts into different channels, and consequently we have a total number of $4N_i$ input channels considering that there are $N_i$ incidences. In Fig. 2, we have used subscripts $r$ and $i$ to denote the real and imaginary parts of each parameter, respectively. It is important to note that the input $\bar{E}^+$ provides additional information needed for the regression, such as $\bar{E}^i$ and $\bar{\bar{G}}_D$, which alleviates the learning task of the neural network.
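The computation of the network inputs in (11) and (12), together with the real/imaginary channel splitting, can be sketched as follows for a single incidence. This is a minimal sketch under our own naming conventions, not the authors' code; it relies on the thin SVD of $\bar{\bar{G}}_S$ being cheap, which holds here since $N_r \ll M^2$.

```python
import numpy as np

def network_inputs(GS, GD, Es, Ei, L):
    """Sketch of Eqs. (11)-(12): major part of the induced current J+ and the
    updated field E+ for one incidence, plus the real/imaginary channels.

    GS : (Nr, M*M) mapping from induced current to scattered field
    GD : (M*M, M*M) domain Green's matrix
    Es : (Nr,) measured scattered field for this incidence
    Ei : (M*M,) incident field for this incidence
    L  : number of dominant singular values kept
    """
    U, s, Vh = np.linalg.svd(GS, full_matrices=False)    # thin SVD of G_S
    coeffs = (U[:, :L].conj().T @ Es) / s[:L]            # u_j^H Es / sigma_j
    J_plus = Vh[:L].conj().T @ coeffs                    # Eq. (11)
    E_plus = Ei + GD @ J_plus                            # Eq. (12)
    # real and imaginary parts are fed to the CNN as separate channels
    channels = np.stack([J_plus.real, J_plus.imag, E_plus.real, E_plus.imag])
    return J_plus, E_plus, channels
```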

2) Derivation of multiple labels: To estimate the induced current $\bar{J}$ (or $\bar{J}^-$, which is referred to as the minor part of $\bar{J}$ and is defined as $\bar{J} - \bar{J}^+$) with a CNN, the first thought would be to set $\bar{J}$ (or $\bar{J}^-$) as the label. However, due to the nonlinearity of ISPs, a direct regression of $\bar{J}$ would be difficult. Partially inspired by various multi-resolution schemes that we have used in earlier iterative inversion solvers [32], [40], here we propose to set multiple labels in the intermediate layers of the neural network to gradually guide the learning process, which eventually reduces the nonlinearity of the problem.

Fig. 3. An example of the real (upper row) and imaginary (lower row) parts of the induced current for: the input (1st column), the label of the first stage (2nd column), the label of the second stage (3rd column), and the label of the third stage (4th column) of the proposed CEE-CNN when the DOI is illuminated by the first incidence.

Specifically, a total number of $N_s$ stages are used in ICLM, and the label at the $s$th intermediate layer, $\bar{J}^{l_s}$, is defined as:

$$\bar{J}^{l_s} = \bar{J}^+ + \bar{\bar{F}}^H \cdot [\bar{M}_s \circ (\bar{\bar{F}} \cdot \bar{J}^-)], \quad (13)$$

where $\bar{\bar{F}}$ denotes the Fourier transform matrix and $\circ$ represents the Hadamard product (i.e., element-wise matrix product). $\bar{M}_s$ is a low-frequency mask, of which the elements corresponding to the central $\beta_s \times \beta_s$ block (the low-frequency parts) are equal to one and the others are zero. In the learning process, with the increase of $s$, we gradually increase the value of $\beta_s$, and eventually $\beta_s$ equals $M$ at the last stage, where $\beta_s = M$ means that the label is equal to the true induced current (i.e., $\bar{J}^{l_s} = \bar{J}$). An example of the inputs and labels in a CEE-CNN with three stages is presented in Fig. 3, which corresponds to the induced currents at one incidence.
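A minimal sketch of how a stage label in (13) can be generated with 2-D FFTs is given below; the function and argument names are ours, and the induced current of one incidence is assumed to be arranged on the $M \times M$ grid of the DOI.

```python
import numpy as np

def stage_label(J_plus, J, beta_s, M):
    """Sketch of Eq. (13): label for stage s.

    Only the central beta_s x beta_s block of the (shifted) 2-D spectrum of
    the minor part J- = J - J+ is kept and added back to J+.
    """
    J_minus = (J - J_plus).reshape(M, M)
    spectrum = np.fft.fftshift(np.fft.fft2(J_minus))            # F . J-
    mask = np.zeros((M, M))
    lo = (M - beta_s) // 2
    mask[lo:lo + beta_s, lo:lo + beta_s] = 1.0                  # low-frequency mask M_s
    filtered = np.fft.ifft2(np.fft.ifftshift(mask * spectrum))  # F^H [M_s o (F . J-)]
    return J_plus + filtered.reshape(-1)                        # Eq. (13)
```

With $\beta_1 = 7$, $\beta_2 = 15$, and $\beta_3 = M = 64$ (the values used in Section III-B), the last stage keeps the full spectrum, so the label reduces to the true induced current.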

To sum up, $R_l$ in (9) is changed to the following parametric reconstruction algorithm $R_{Il}$ in ICLM, which is learned by solving:

$$R_{Il} = \arg\min_{R_\theta, \theta}: \sum_{m=1}^{M_t} f(R_\theta(\bar{J}^+_m, \bar{E}^+_m), \bar{J}^{l_1}_m, \bar{J}^{l_2}_m, \ldots, \bar{J}_m) + g(\theta). \quad (14)$$

3) Recovery of contrasts: At the testing stage, the direct output of the CNN is the estimated induced current $\bar{J}^o$ ($\bar{J}^o = \bar{J}^{o_3}$ in Fig. 2), and the corresponding total electrical field $\bar{E}^o$ is given by

$$\bar{E}^o = \bar{E}^i + \bar{\bar{G}}_D \cdot \bar{J}^o. \quad (15)$$

An analytical solution of the contrast $\bar{\bar{\xi}}$ can be directly obtained with $\bar{E}^o$ and $\bar{J}^o$ by combining all incidences following (4):

$$\bar{\bar{\xi}}^o(n, n) = \frac{\sum_{p=1}^{N_i} \bar{J}^o_p(n) \cdot [\bar{E}^o_p(n)]^*}{\sum_{p=1}^{N_i} |\bar{E}^o_p(n)|^2}. \quad (16)$$
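The recovery step in (15) and (16) is a closed-form least-squares combination over all incidences. A minimal sketch (our own naming, not the paper's code) is:

```python
import numpy as np

def recover_contrast(J_out, Ei, GD):
    """Sketch of Eqs. (15)-(16): closed-form contrast from the network output.

    J_out : (Ni, M*M) estimated induced currents, one row per incidence
    Ei    : (Ni, M*M) incident fields, one row per incidence
    GD    : (M*M, M*M) domain Green's matrix
    """
    E_out = Ei + J_out @ GD.T                        # Eq. (15), all incidences at once
    num = np.sum(J_out * np.conj(E_out), axis=0)     # sum_p J_p(n) [E_p(n)]*
    den = np.sum(np.abs(E_out) ** 2, axis=0)         # sum_p |E_p(n)|^2
    xi = num / den                                   # diagonal entries of the contrast, Eq. (16)
    return xi   # for the lossless case of Section III one may keep only the real part
```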

C. Cascaded end-to-end CNN

To effectively learn the induced current from its major part and the electrical field, the network $R_\theta$ in (14) is carefully designed with a CEE-CNN architecture.


Inspired by our earlier work on the traditional objective-function approach to ISPs [13], where the minor part of the induced current is chosen as the unknown, we add skip connections, as shown in Fig. 2, to pass the major part of the induced current directly to the output layers ($\bar{J}^{o_1}$, $\bar{J}^{o_2}$, $\bar{J}^{o_3}$) so that the network focuses on learning the minor part of the induced current. For example, although CNN1 in Fig. 2 is trained to regress the label $\bar{J}^{l_1}$, only $\bar{J}^{l_1} - \bar{J}^+$ is learned from the input $\bar{J}^+$ and $\bar{E}^+$. We mention in passing that skip connections have been used in earlier works, nevertheless with motivations from other perspectives.

Further, a combined loss function is defined in CEE-CNN instead of training each stage separately. As in Fig. 2, the Euclidean loss function $L_s$ for the $s$th stage is defined as:

$$L_s = \sum_{e=1}^{M_o} (J^{o_s}_e - J^{l_s}_e)^2, \quad (17)$$

where $M_o$ denotes the total number of elements in $\bar{J}^{o_s}$ or $\bar{J}^{l_s}$. The combined loss function $L_c$ is defined as

$$L_c = \sum_{s=1}^{N_s} \alpha_s \cdot L_s, \quad (18)$$

with $\alpha_s$ being the weighting coefficient at each stage. It is noted that all sub-networks of the proposed CEE-CNN in Fig. 2 are trained simultaneously instead of separately. In separate training, each sub-network is trained with its own loss function, and the weights in each sub-network are not affected by the labels in other sub-networks. On the contrary, the weights in the proposed CEE-CNN are updated in a dynamic way following the combined loss function, which ensures that the outputs of the early stages are helpful for those of the later stages, since our final goal is to match the labels in the last layer.
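The combined loss in (17) and (18) is simply a weighted sum of per-stage Euclidean losses, as sketched below with the real and imaginary channels already stacked into real-valued arrays; the function is illustrative only, not the MatConvNet implementation used in this paper.

```python
import numpy as np

def combined_loss(outputs, labels, alphas=(0.3, 0.3, 0.4)):
    """Sketch of Eqs. (17)-(18): weighted sum of per-stage Euclidean losses.

    outputs, labels : lists of real-valued arrays (real/imaginary channels of
                      J^{o_s} and J^{l_s} for each stage s)
    alphas          : per-stage weights (0.3, 0.3, 0.4 are the values of Section III-B)
    """
    stage_losses = [np.sum((out - lab) ** 2) for out, lab in zip(outputs, labels)]  # Eq. (17)
    return sum(a * l for a, l in zip(alphas, stage_losses))                         # Eq. (18)
```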

D. Reducing number of incidences with virtual experiment

In this section, we introduce an approach to “virtually” reduce the number of incidences ($N_i$) by virtual experiments (VE) so that the number of input channels ($N_c$) in Fig. 2 is reduced. VE is a technique that utilizes the linearity of the relationship between the incident and scattered fields to offer the possibility of recombining the performed scattering experiments a posteriori. Since everything is done “virtually” via a rearrangement of the scattered fields and no additional measurements are necessary, it is widely used to enforce particular and convenient conditions, and some pioneering works have been done [41], [42]. Please note that VE is a very general concept, and the choice of the coefficients for the linear combination of incident waves depends on users' needs. Here, to reduce the number of channels in the input and output of the proposed network, we purposely design a set of linear combination coefficients that has not been reported before. It is also stressed that VE is not compulsory for the proposed ICLM, but it offers an effective approach to reduce the number of parameters to be regressed by the network.

In this paper, we take advantage of the VE technique to “virtually” reduce $N_i$ in the proposed ICLM when a large number of incidences is available ($N_i$ is large). Specifically, when there are $N_i$ line sources illuminating the DOI ($D$), we place in a column-by-column manner the column vectors of incident fields due to each incidence, which results in an $M^2 \times N_i$ matrix $\bar{\bar{E}}^i$, given by:

$$\bar{\bar{E}}^i = \bar{\bar{G}}_i \cdot \bar{\bar{J}}^i, \quad (19)$$

where $\bar{\bar{G}}_i$ and $\bar{\bar{J}}^i$ are the 2-D Green's function matrix and the magnitudes of the line sources, with sizes $M^2 \times N_i$ and $N_i \times N_i$, respectively. It is easy to see that $\bar{\bar{G}}_i$ is calculated in a similar way as $\bar{\bar{G}}_S$ and is just the transpose of $\bar{\bar{G}}_S$ when the transmitting and receiving antennas are located at the same positions. When $N_i$ unit line sources illuminate the DOI one by one, $\bar{\bar{J}}^i$ is simply an identity matrix. Similar to $\bar{\bar{G}}_S$, $\bar{\bar{G}}_i$ can be decomposed as $\bar{\bar{G}}_i = \sum_n u'_n \sigma'_n \nu'^H_n$. Here, we virtually combine the incidences by selecting the first $L_i$ right singular vectors as the magnitudes of the line sources, i.e., the new magnitude of the line sources $\bar{\bar{J}}'^i$ becomes

$$\bar{\bar{J}}'^i = [\nu'_1, \nu'_2, \ldots, \nu'_{L_i}]. \quad (20)$$

Consequently, the virtually formed incident-field matrix $\bar{\bar{E}}'^i$ becomes:

$$\bar{\bar{E}}'^i = \bar{\bar{E}}^i \cdot [\nu'_1, \nu'_2, \ldots, \nu'_{L_i}]. \quad (21)$$

We easily see that, since the magnitudes of the incident fields in (21) are proportional to the singular values $\sigma'_n$, which are sorted in non-increasing order, keeping only $L_i$ incidences in (21) amounts to removing the other $N_i - L_i$ weak illumination cases. Thus, the number of incidences is reduced from $N_i$ to $L_i$ and the number of input channels is reduced from $4N_i$ to $4L_i$.
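The virtual-experiment reduction in (19)-(21) amounts to keeping the right singular vectors of $\bar{\bar{G}}_i$ associated with the $L_i$ largest singular values as the new source magnitudes. The sketch below also applies the same linear combination to the measured scattered fields, which is our reading of how the recombined data are formed and is not written out as an equation in the paper; all names are our own.

```python
import numpy as np

def reduce_incidences(Gi, Ei, Es, Li):
    """Sketch of Eqs. (19)-(21): keep only the Li strongest virtual illuminations.

    Gi : (M*M, Ni) Green's matrix from the Ni line sources to the DOI
    Ei : (M*M, Ni) incident fields, one column per original incidence
    Es : (Nr, Ni)  measured scattered fields, one column per original incidence
    Li : number of virtual incidences to keep
    """
    _, _, Vh = np.linalg.svd(Gi, full_matrices=False)
    V_keep = Vh[:Li].conj().T        # [nu'_1, ..., nu'_Li] as new source magnitudes, Eq. (20)
    Ei_virtual = Ei @ V_keep         # virtual incident fields, Eq. (21)
    Es_virtual = Es @ V_keep         # same linear combination applied to the measured data
    return Ei_virtual, Es_virtual
```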

E. Computational complexity

Although the proposed ICLM is an approach based on multiple labels, no additional computational cost is introduced in generating the multiple labels in the training stage. In (13), the multiple labels are directly obtained from the output of the forward solver for each scatterer, which is of paramount importance since the dataset usually contains thousands of profiles of scatterers.

In this work, the proposed ICLM has the same computational complexity as the DCS proposed in [27]. Specifically, the computational complexity of the proposed ICLM is dominated by the matrix-vector multiplication $\bar{\bar{G}}_D \cdot \bar{J}^+$ in (12) or $\bar{\bar{G}}_D \cdot \bar{J}^o$ in (15), whose computational complexity is $O(M^2 \log M^2)$ if the fast Fourier transform (FFT) is applied in the matrix-vector multiplication. To obtain $\bar{J}^+$, a thin SVD of $\bar{\bar{G}}_S$ is involved, and the computational cost of the thin SVD is $O(N_r^2 M^2)$ [43]. A computational cost of $O(N_i^2 M^2)$ is also involved if a thin SVD is conducted on $\bar{\bar{G}}_i$ in (19) to virtually reduce the number of incidences by VE.

For the network, the computational cost includes the basic operations in the CNN, such as convolutions, additions, calculations of the ReLU function, and max poolings. Among them, the complexity is dominated by convolutions.


Fig. 4. Four representative examples when testing on the MNIST handwritten digits with the trained network: (a) ground truth, (b) DCS, and (c) ICLM, where 15% Gaussian noise (SNR = 16 dB) is present in the scattered fields.

Specifically, if there are $Q_i$ input feature maps and $Q_o$ output feature maps, and the feature map size and convolution kernel size are $M \times M$ and $K_f \times K_f$ ($K_f = 3$ in this paper), respectively, the computational workload of a convolution layer is on the order of $O(M^2 K_f^2 Q_i Q_o)$ [27], [44]. It is also noted that, in CNN-based inversion methods, an advantage is that each operation in the CNN is ideal for GPU-based parallelization since all operations are simple and local.

III. TESTS WITH NUMERICAL AND EXPERIMENTAL DATA

This section presents results to evaluate the performance of the proposed ICLM in reconstructing relative permittivities from scattered fields. All reconstructions are conducted at a single frequency, i.e., no frequency hopping technique is involved. In all tests, we train the proposed CEE-CNN with handwritten digits from the MNIST dataset [45] but test the trained network on some commonly used profiles, including digits, Latin letters, and some complicated profiles consisting of cylinders. In this section, we also compare the performance of the proposed ICLM with a well-known iterative method (i.e., SOM [13]) and a recently proposed contrast-based deep learning scheme (i.e., DCS [27]).

A. Configuration of the scattering system

In the numerical tests, we consider a DOI $D$ with the size of $2 \times 2$ m$^2$ and discretize the domain into $64 \times 64$ pixels. There are 32 line sources and 32 line receivers equally placed on a circle of radius 3 m centered at (0, 0) m. For ICLM, we use only the data corresponding to the first $L_i = 16$ largest singular values by VE, as done in (20) and (21). The operating frequency is 400 MHz, and the a priori information is that the scatterers are lossless and have nonnegative contrast [13].

Fig. 5. Trajectories of training and validation losses as a function of epoch for (a) DCS and (b) ICLM. It is noted that the loss functions are defined as the mismatches in contrast and induced current for DCS and ICLM, respectively. The circles denote the early-stopping positions, despite the fact that the loss on the training data still decreases.

As shown in the first column of Fig. 4, the scatterers are modeled by handwritten digits from the MNIST database [45] in training, where all images are assigned a dimension of $2 \times 2$ m$^2$ and then discretized into $64 \times 64$ pixels in the scattering problems. To account for an arbitrary distribution of scatterers, the digits are allowed to rotate randomly by an angle between 0° and 360°. Further, the scatterers are set to be dielectric with a relative permittivity of 1.5, and, to represent possible multiple scatterers, a random circle is also generated in domain $D$ to overlap with the digit, where the overlapping region is randomly assigned a relative permittivity between 1.5 and 2.5.

In the forward problem, we generate the scattered fields from the $N_i$ incidences numerically using the method of moments (MOM) and record them into a matrix $\bar{\bar{E}}^s$ of size $N_r \times N_i$. In the training process, we directly use the noiseless scattered field, whereas in the testing process, additive white Gaussian noise $n$ is added to $\bar{\bar{E}}^s$. The resultant noisy matrix $\bar{\bar{E}}^s + n$ is treated as the measured scattered field that is used to reconstruct relative permittivities, and the noise level is quantified as $\|n\|_F / \|\bar{\bar{E}}^s\|_F$. In order to quantitatively evaluate the quality of the reconstructions, the structural similarity index (SSIM) between the true and reconstructed relative permittivity profiles [46] is compared among different reconstruction methods, including an iterative method (SOM), a learning approach (DCS), and the proposed ICLM.
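For reproducibility, adding noise at a prescribed level $\|n\|_F / \|\bar{\bar{E}}^s\|_F$ and computing the SSIM can be sketched as follows; the particular noise generator and the use of scikit-image's `structural_similarity` are our assumptions, not necessarily the exact tools used in the paper.

```python
import numpy as np
from skimage.metrics import structural_similarity

def add_noise(Es, level, seed=0):
    """Add complex white Gaussian noise n so that ||n||_F / ||Es||_F equals
    `level` (e.g. 0.15 for the 15% case). A sketch only."""
    rng = np.random.default_rng(seed)
    n = rng.standard_normal(Es.shape) + 1j * rng.standard_normal(Es.shape)
    n *= level * np.linalg.norm(Es) / np.linalg.norm(n)   # scale to the target Frobenius-norm ratio
    return Es + n

# SSIM between true and reconstructed permittivity maps (M x M real arrays):
# ssim = structural_similarity(eps_true, eps_rec, data_range=eps_true.max() - eps_true.min())
```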


Fig. 6. Tests on Latin letters with the network trained by the MNIST dataset, where 15% Gaussian noise (SNR = 16 dB) is present in the scattered fields. Here, the ground-truth profiles of T1(b), T2(b), and T3(b) have larger relative permittivities than those in T1(a), T2(a), and T3(a), respectively. It is noted that, for ICLM, the displayed relative permittivities are calculated from the input and output of the networks with (15) and (16).

TABLE I
SSIM COMPARISONS AMONG SOM, DCS, AND ICLM, WHERE 15% GAUSSIAN NOISE IS PRESENT (NA MEANS NOT APPLICABLE SINCE 500 TESTS FOR SOM CAUSE A LOT OF COMPUTATIONAL TIME)

SSIM   T1(a)/(b)   T2(a)/(b)   T3(a)/(b)    T4(a)/(b)   T5(a)/(b)   T6(a)/(b)   500 Latin letters   500 circular scatterers
SOM    0.76/0.46   0.72/0.41   0.77/0.51    0.79/0.52   0.78/0.75   0.62/0.36   NA                  NA
DCS    0.77/0.68   0.73/0.67   0.77/0.67    0.76/0.45   0.82/0.69   0.67/0.49   0.65                0.74
ICLM   0.80/0.75   0.76/0.68   0.78/0.735   0.80/0.80   0.81/0.77   0.69/0.57   0.75                0.77


B. Implementation Details

In all the CNN schemes, the MatConvNet toolbox [47] is used for implementation. For the training process, we use a server with an Intel Xeon CPU and 128 GB RAM. It takes about 8 hours to train the ANNs for both DCS and ICLM, and it takes less than 1 second to reconstruct a profile using the trained networks. In contrast, it takes a few minutes for the iterative method (SOM) to reconstruct a single case.

The hyperparameters are as follows: the momentum equals 0.99; the weight decay equals $10^{-6}$; the learning rate decreases logarithmically from $10^{-6}$ to $10^{-8}$. As in Fig. 2, three stages ($N_s = 3$) are used, and throughout this work the corresponding low-frequency coefficients $\beta_1$, $\beta_2$, and $\beta_3$ in (13) are 7, 15, and 64, respectively. We have also tried CEE-CNN with five and even seven stages, where no improvement is observed with the increase of the number of stages. Also, the three-stage structure and the low-frequency coefficients are inspired by the previous work [32]. The weighting coefficients in (18) are 0.3, 0.3, and 0.4, respectively. $L = 15$ is used following the criterion of [27]. For fair comparisons, all the parameters pertinent to the implementation of SOM and DCS are set to their suggested values. It is also noted that a deeper network may help when the dimension of the problem is largely increased (i.e., $M^2$ increased), and we have chosen the simplest structure that does not degrade the performance in this paper. Further, for ICLM, to balance the scales of the induced current and the electrical field, the electrical field is multiplied by a constant coefficient (the ratio between the maxima of the induced current and the electrical field) before it is used as the input of the network in Fig. 2.
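This balancing step can be sketched in two lines; the function name is ours, and recomputing the constant per sample is our assumption.

```python
import numpy as np

def balance_inputs(J_plus, E_plus):
    """Multiply the field channels by the ratio between the maxima of |J+| and |E+|
    so that both inputs have comparable magnitude (sketch; names are ours)."""
    c = np.max(np.abs(J_plus)) / np.max(np.abs(E_plus))
    return J_plus, c * E_plus
```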


Fig. 7. Tests on circular scatterers with the network trained by the MNIST digit dataset, where 15% Gaussian noise (SNR = 16 dB) is present in the scattered fields. Here, the ground-truth profiles of T4(b) and T5(b) have larger spatial sizes than those in T4(a) and T5(a), respectively, and the ground-truth profile of T6(b) has a larger relative permittivity than that in T6(a). It is noted that, for ICLM, the displayed relative permittivities are calculated from the input and output of the networks with (15) and (16).

A maximum of 40 epochs is set in the training process. Further, we empirically applied an “early stopping” strategy to mitigate the effects of possible overfitting. Specifically, the dataset is split into training and validation sets in the training process, where we employ 4750 images for training and 250 images for validation. As in Fig. 5, we stop the training for both DCS and ICLM when there is no obvious change in the validation loss (marked by circles in Figs. 5(a) and (b)), despite the fact that the loss on the training data still decreases.

C. Numerical Results

In the first example, we test the trained network on reconstructing profiles that are within the range of the training dataset. Specifically, 250 profiles from the MNIST digit dataset are reconstructed with 15% Gaussian noise (SNR = 16 dB) present in the scattered fields for both DCS and ICLM, and four representative reconstructions are shown in Fig. 4. It is seen that very satisfying results are obtained for both methods, and the average SSIMs for both of them are larger than 0.85, where the proposed ICLM does not show any improvement over DCS. It is noted that the reconstructed results from SOM are not presented in this example since these tests are within the range of the training dataset, which may be unfair to SOM.

Fig. 8. Comparison of SSIMs of the reconstructed T1(a) profile using SOM, DCS, and ICLM with 15% (SNR = 16 dB), 25% (SNR = 12 dB), 35% (SNR = 9 dB), and 50% (SNR = 6 dB) Gaussian noise.

In the second example, to test the generalization capability of the proposed ICLM, we test the network trained by MNIST digits on different Latin letters, which is more challenging than the first numerical example. It is noted that the relative permittivities in the training data are randomly distributed in the range of 1.5-2.5, whereas the relative permittivities of the testing letters are in the range of 2.5-3. Also, 15% Gaussian noise (SNR = 16 dB) is added to the scattered fields.


Fig. 9. True profile of “FoamDielExt” [48]: the large cylinder (SAITEC SBF 300) has a diameter of 80 mm with relative permittivity $\varepsilon_r = 1.45 \pm 0.15$; the small cylinder (berylon) has a diameter of 31 mm with relative permittivity $\varepsilon_r = 3 \pm 0.3$.

Three representative reconstruction results, labeled as T1, T2, and T3, are shown in Fig. 6. For each of the three letters, a lower-permittivity case is considered and labeled as (a), and a higher-permittivity case is considered and labeled as (b). It is obvious that it is more challenging for the trained network to reconstruct the latter case than the former, since the nonlinearity of the ISPs increases with higher values of permittivity. It is seen from Fig. 6 that the proposed ICLM is able to obtain satisfying profiles for all the cases, whereas SOM fails for the challenging cases in (b) and DCS shows some apparent distortions, especially for the cases in (b). To quantitatively evaluate the performance, the SSIMs for all three cases (T1, T2, T3) are also presented in Table I, where apparent advantages are observed for the proposed ICLM, especially for the more challenging examples.

In the third example, we consider some more challenging profiles for the trained network, where the network trained by MNIST digits is tested on some complicated profiles consisting of cylinders, presented in Fig. 7. In particular, case T6 is the well-known “Austria ring” that is commonly used as a benchmark to test the performance of ISP inversion algorithms. Similar to the previous tests, two cases are considered for each profile, and the level of difficulty is increased for the case in (b) by either increasing the spatial size or the relative permittivity compared with that in (a). We see from the reconstruction results shown in Fig. 7 that DCS exhibits even more apparent distortions and inaccuracies. For example, for case T4(a), DCS fails to reconstruct the small cylinders in the center, and in addition the middle ring with relative permittivity 1.5 is reconstructed with much lower values of permittivity. In contrast, satisfying results are obtained with the proposed ICLM. The SSIMs for all the cases (T4, T5, T6) are presented in Table I, where we observe apparent advantages of the proposed ICLM over DCS and SOM. These results show that the proposed ICLM is able to significantly increase the generalization capability of deep learning in solving ISPs.

To further compare the proposed method with DCS, we also randomly generate 500 Latin letters and 500 circular scatterers with random permittivities to test both DCS and the proposed ICLM.

Fig. 10. Experimental data tests: image reconstruction of “FoamDielExt” using the proposed ICLM at (a) 3 GHz and (b) 4 GHz, where the network is trained with MNIST digits with random relative permittivities in the range of 1.5-2.5. The SSIM between the reconstructed and ground-truth profiles is 0.86 and 0.84 for 3 GHz and 4 GHz, respectively. It is noted that the dashed lines denote the exact boundaries of the cylinders in Fig. 9.

Specifically, these Latin letters have random relative permittivities in the range of 2.5-3. For the circular scatterers, a random number of circles (1-3) is generated in the DOI with random relative permittivities in the range of 2-2.5. We further calculate the average SSIMs over these reconstructions, and the results are presented in Table I. It is found that the proposed ICLM apparently outperforms DCS in terms of SSIMs.

It is known from previous publications that both SOM and DCS are robust to noise contamination. To verify the robustness of the proposed ICLM, further simulations have been conducted on reconstructing T1(a) with 15% (SNR = 16 dB), 25% (SNR = 12 dB), 35% (SNR = 9 dB), and 50% (SNR = 6 dB) Gaussian noise using SOM, DCS, and ICLM. The SSIMs for all of these cases are displayed in Fig. 8. It is seen that the proposed ICLM is robust to noise contamination. The reason is that we regress the induced current from its dominant part, which is calculated from the first $L$ largest singular values with (11) and is stable when the measured scattered field $\bar{E}^s$ is contaminated with noise.

D. Tests with Experimental Data

To further validate the proposed ICLM, tests have also been conducted with experimental data measured at Institut Fresnel [48]. Specifically, the “FoamDielExt” profile in Fig. 9 with the TM case is considered in this section, which is measured with 241 receivers for 8 transmitters. The “FoamDielExt” target consists of two cylinders, where the large cylinder (SAITEC SBF 300) has a diameter of 80 mm with relative permittivity $\varepsilon_r = 1.45 \pm 0.15$, and the small cylinder (berylon) has a diameter of 31 mm with relative permittivity $\varepsilon_r = 3 \pm 0.3$. It is important to highlight that the $\pm$ sign here denotes the inaccuracy of the experimental measurement of the real value of the relative permittivity.

We use the same profiles, including the range of relative permittivities, as those in Section III-C, except that the DOI is changed from 2 m × 2 m to 20 cm × 20 cm due to the change of operating frequency (from 400 MHz to 3 GHz). In other words, we still train the network on the MNIST dataset with random relative permittivities in the range of 1.5-2.5.


In Fig. 10, we present the reconstructed relative permittivity profiles from the proposed ICLM, where it takes less than 1 second to finish the reconstruction. It is found that the proposed ICLM is able to obtain satisfying results with the network trained by MNIST digits. Specifically, the SSIM between the reconstructed image in Fig. 10(a) and the ground-truth profile in Fig. 9 is 0.86, which quantitatively validates the proposed method. In Fig. 10(b), we also present the reconstructed result with the proposed ICLM when the operating frequency is changed to 4 GHz, where the SSIM between the reconstructed image in Fig. 10(b) and the ground-truth profile in Fig. 9 is 0.84.

IV. CONCLUSION

In this paper, we propose a physics-inspired learning approach to solve full-wave nonlinear inverse scattering problems. Deep learning approaches have not yet had the profound impact on nonlinear imaging problems that they have had for object classification and recognition. One of the crucial steps in enhancing machine learning's ability to solve ISPs is to incorporate knowledge of the underlying physics as well as insights originating from objective-function approaches into the architecture of the neural network. It is known that an apparent advantage of traditional optimization approaches is that domain or physical knowledge can be explicitly incorporated in the optimization process, whereas learning approaches tend to be more obscure and offer very little control. In this paper, we introduce several strategies to incorporate physical expertise inspired by traditional iterative algorithms into the neural network architecture, and the learning model in the proposed ICLM (described by (14)) differs from that in DCS (described by (10)). The proposed induced current learning method (ICLM) consists of three major components: a first-ever concept of input and output where the induced current is estimated from the major part of itself and the electrical field, a cascaded end-to-end CNN, and the introduction of skip connections for learning only the minor part of the induced current. Roughly speaking, the three components greatly reduce the level of complexity of the nonlinear relationship between the input and output of the CNN, which has been verified by numerous numerical and experimental tests.

It is found that, although the network is trained with the MNIST dataset, it is able to solve general ISPs and outperforms SOM and DCS in reconstructing some challenging profiles. Namely, it is demonstrated that the proposed ICLM has a much better generalization capability than the recently proposed contrast-based deep learning scheme (DCS). Another advantage of the proposed method is that it does not introduce additional computational cost in generating the multiple labels, which is important since there are usually thousands of samples in the training dataset. For typical reconstructions in ISPs, the proposed method is able to finish within 1 second, which is promising for providing quantitative images in real time. It is important to highlight that, although this paper discusses ISPs, the proposed methodology provides a promising new direction for utilizing physical principles and the long-standing developments of traditional objective-function methods to design learning machines, which can be extended or generalized to solve other physical and engineering inverse problems.

Although the proposed method quantitatively achieves significantly better results than SOM and DCS in dealing with highly nonlinear cases, there is still room for improvement considering the advantages that have been exhibited by some iterative algorithms that are specially designed to tackle highly nonlinear ISPs, such as the iterative methods based on new integral equations [14], [49]. A future direction would be to incorporate the physical information in these algorithms into the architectures of ANNs to further improve the ability of ANNs to deal with highly nonlinear ISPs.

ACKNOWLEDGMENT

This research was supported by the National Research Foundation, Prime Minister's Office, Singapore under its Competitive Research Program (CRP Award No. NRF-CRP15-2015-03).

REFERENCES

[1] R. Zoughi, Microwave non-destructive testing and evaluation principles. Springer Science & Business Media, 2012.
[2] R. Persico, Introduction to ground penetrating radar: inverse scattering and data processing. John Wiley & Sons, 2014.
[3] A. Quarteroni, L. Formaggia, and A. Veneziani, Complex systems in biomedicine. New York, Milan: Springer, 2006.
[4] L.-P. Song, C. Yu, and Q.-H. Liu, “Through-wall imaging (TWI) by radar: 2-D tomographic results and analyses,” IEEE Transactions on Geoscience and Remote Sensing, vol. 43, no. 12, pp. 2793–2798, 2005.
[5] R. Chen, Z. Wei, and X. Chen, “Three dimensional through-wall imaging: Inverse scattering problems with an inhomogeneous background medium,” in 2015 IEEE 4th Asia-Pacific Conference on Antennas and Propagation (APCAP), 2015, pp. 505–506.
[6] H. Kagiwada, R. Kalaba, S. Timko, and S. Ueno, “Associate memories for system identification: Inverse problems in remote sensing,” Mathematical and Computer Modelling, vol. 14, pp. 200–202, 1990.
[7] X. Zhuge and A. G. Yarovoy, “A sparse aperture MIMO-SAR-based UWB imaging system for concealed weapon detection,” IEEE Transactions on Geoscience and Remote Sensing, vol. 49, no. 1, pp. 509–518, 2011.
[8] X. Chen, Computational Methods for Electromagnetic Inverse Scattering. Wiley, 2018.
[9] G. Oliveri, M. Salucci, N. Anselmi, and A. Massa, “Compressive sensing as applied to inverse problems for imaging: Theory, applications, current trends, and open challenges,” IEEE Antennas and Propagation Magazine, vol. 59, no. 5, pp. 34–46, 2017.
[10] M. Pastorino, Microwave imaging. John Wiley & Sons, 2010.
[11] W. C. Chew and Y. M. Wang, “Reconstruction of two-dimensional permittivity distribution using the distorted Born iterative method,” IEEE Transactions on Medical Imaging, vol. 9, no. 2, pp. 218–225, 1990.
[12] P. M. van den Berg and R. E. Kleinman, “A contrast source inversion method,” Inverse Problems, vol. 13, no. 6, p. 1607, 1997.
[13] X. Chen, “Subspace-based optimization method for solving inverse-scattering problems,” IEEE Transactions on Geoscience and Remote Sensing, vol. 48, no. 1, pp. 42–49, 2010.
[14] Y. Zhong, M. Lambert, D. Lesselier, and X. Chen, “A new integral equation method to solve highly nonlinear inverse scattering problems,” IEEE Transactions on Antennas and Propagation, vol. 64, no. 5, pp. 1788–1799, 2016.
[15] M. Pastorino, “Stochastic optimization methods applied to microwave imaging: A review,” IEEE Transactions on Antennas and Propagation, vol. 55, no. 3, pp. 538–548, 2007.
[16] P. Rocca, M. Benedetti, M. Donelli, D. Franceschini, and A. Massa, “Evolutionary optimization as applied to inverse scattering problems,” Inverse Problems, vol. 25, no. 12, 2009.
[17] A. Massa, G. Oliveri, M. Salucci, N. Anselmi, and P. Rocca, “Learning-by-examples techniques as applied to electromagnetics,” Journal of Electromagnetic Waves and Applications, vol. 32, no. 4, pp. 516–541, 2018.
[18] S. Caorsi and P. Gamba, “Electromagnetic detection of dielectric cylinders by a neural network approach,” IEEE Transactions on Geoscience and Remote Sensing, vol. 37, no. 2, pp. 820–827, 1999.


[19] I. T. Rekanos, “Neural-network-based inverse-scattering technique for online microwave medical imaging,” IEEE Transactions on Magnetics, vol. 38, no. 2, pp. 1061–1064, 2002.
[20] E. Bermani, A. Boni, S. Caorsi, and A. Massa, “An innovative real-time technique for buried object detection,” IEEE Transactions on Geoscience and Remote Sensing, vol. 41, no. 4, pp. 927–931, 2003.
[21] J. Xie, L. Xu, and E. Chen, “Image denoising and inpainting with deep neural networks,” in Proceedings of the 27th International Conference on Neural Information Processing Systems, 2012, pp. 341–349.
[22] Z. Wei, D. Liu, and X. Chen, “Dominant-current deep learning scheme for electrical impedance tomography,” IEEE Transactions on Biomedical Engineering, accepted, 2019.
[23] L. Xu, J. S. J. Ren, C. Liu, and J. Jia, “Deep convolutional neural network for image deconvolution,” in Proceedings of the 27th International Conference on Neural Information Processing Systems, 2014, pp. 1790–1798.
[24] Y. Yang, J. Sun, H. Li, and Z. Xu, “Deep ADMM-Net for compressive sensing MRI,” in Advances in Neural Information Processing Systems 29. Curran Associates, Inc., 2016, pp. 10–18.
[25] K. H. Jin, M. T. McCann, E. Froustey, and M. Unser, “Deep convolutional neural network for inverse problems in imaging,” IEEE Transactions on Image Processing, vol. 26, no. 9, pp. 4509–4522, 2017.
[26] J. Adler and O. Oktem, “Solving ill-posed inverse problems using iterative deep neural networks,” Inverse Problems, vol. 33, no. 12, p. 124007, 2017.
[27] Z. Wei and X. Chen, “Deep-learning schemes for full-wave nonlinear inverse scattering problems,” IEEE Transactions on Geoscience and Remote Sensing, accepted, 2018.
[28] R. Guo, M. Li, F. Yang, S. Xu, G. Fang, and A. Abubakar, “Application of gradient learning scheme to pixel-based inversion for transient EM data,” in 2018 IEEE International Conference on Computational Electromagnetics (ICCEM), 2018, pp. 1–3.
[29] L. Li, L. G. Wang, F. L. Teixeira, C. Liu, A. Nehorai, and T. J. Cui, “DeepNIS: Deep neural network for nonlinear electromagnetic inverse scattering,” IEEE Transactions on Antennas and Propagation, vol. 67, no. 3, pp. 1819–1825, 2019.
[30] Y. Sun, Z. Xia, and U. S. Kamilov, “Efficient and accurate inversion of multiple scattering with deep learning,” Optics Express, vol. 26, no. 11, pp. 14678–14688, 2018.
[31] T. M. Habashy, M. L. Oristaglio, and A. T. de Hoop, “Simultaneous nonlinear reconstruction of two-dimensional permittivity and conductivity,” Radio Science, vol. 29, no. 4, pp. 1101–1118, 1994.
[32] Y. Zhong and X. Chen, “An FFT twofold subspace-based optimization method for solving electromagnetic inverse scattering problems,” IEEE Transactions on Antennas and Propagation, vol. 59, no. 3, pp. 914–927, 2011.
[33] A. F. Peterson, S. L. Ray, and R. Mittra, Computational methods for electromagnetics. Wiley-IEEE Press, New York, 1998.
[34] D. Colton and R. Kress, Inverse Acoustic and Electromagnetic Scattering Theory. Springer-Verlag, Berlin, Germany, 2nd ed., 1998.
[35] A. Kirsch, An introduction to the mathematical theory of inverse problems. Springer, New York, 1996.
[36] Y. M. Wang and W. C. Chew, “An iterative solution of the two-dimensional electromagnetic inverse scattering problem,” International Journal of Imaging Systems and Technology, vol. 1, no. 1, pp. 100–108, 1989.
[37] T. M. Habashy, M. L. Oristaglio, and A. T. de Hoop, “Simultaneous nonlinear reconstruction of two-dimensional permittivity and conductivity,” Radio Science, vol. 29, no. 4, pp. 1101–1118, 1994.
[38] K. Xu, Y. Zhong, and G. Wang, “A hybrid regularization technique for solving highly nonlinear inverse scattering problems,” IEEE Transactions on Microwave Theory and Techniques, vol. 66, no. 1, pp. 11–21, 2018.
[39] M. T. McCann, K. H. Jin, and M. Unser, “Convolutional neural networks for inverse problems in imaging: A review,” IEEE Signal Processing Magazine, vol. 34, no. 6, pp. 85–95, 2017.
[40] Z. Wei, R. Chen, H. Zhao, and X. Chen, “Two FFT subspace-based optimization methods for electrical impedance tomography,” Progress In Electromagnetics Research, vol. 157, pp. 111–120, 2016.
[41] L. Crocco, I. Catapano, L. D. Donato, and T. Isernia, “The linear sampling method as a way to quantitative inverse scattering,” IEEE Transactions on Antennas and Propagation, vol. 60, no. 4, pp. 1844–1853, 2012.
[42] L. Crocco, L. D. Donato, I. Catapano, and T. Isernia, “The factorization method for virtual experiments based quantitative inverse scattering,” Progress In Electromagnetics Research, vol. 157, pp. 121–131, 2016.
[43] G. W. Stewart, Matrix algorithms. Philadelphia: Society for Industrial and Applied Mathematics, 1998.
[44] J. Cong and B. Xiao, “Minimizing computation in convolutional neural networks,” in 24th International Conference on Artificial Neural Networks, Proceedings, 2014, pp. 281–290.
[45] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
[46] W. Zhou, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.
[47] A. Vedaldi and K. Lenc, “MatConvNet: Convolutional neural networks for MATLAB,” in Proc. ACM Int. Conf. Multimedia, 2015, pp. 689–692.
[48] J.-M. Geffrin, P. Sabouroux, and C. Eyraud, “Free space experimental scattering database continuation: experimental set-up and measurement precision,” Inverse Problems, vol. 21, no. 6, p. S117, 2005.
[49] K. Xu, Y. Zhong, X. Chen, and D. Lesselier, “A fast integral equation-based method for solving electromagnetic inverse scattering problems with inhomogeneous background,” IEEE Transactions on Antennas and Propagation, vol. 66, no. 8, pp. 4228–4239, 2018.

Zhun Wei was born in 1990. He received the B.S. degree from the School of Physical Electronics, University of Electronic Science and Technology of China, in 2012, and the Ph.D. degree from the Department of Electrical and Computer Engineering (ECE), National University of Singapore (NUS), in 2016. He was a Visiting Postdoc at the Department of Applied Physics, Stanford University, from November 2018 to April 2019. He is a recipient of the 2016 Best Student Paper Competition Award, IEEE Singapore MTT/AP Joint Chapter, and also a recipient of the “Ulrich L. Rohde Innovative Conference Paper Award” at ICCEM 2019. He is currently a research fellow in the Department of ECE at NUS. His research interests include microwave imaging, deep learning, biomedical imaging, non-invasive testing, and phase retrieval.

Xudong Chen (M'09-SM'14) received the B.S. and M.S. degrees in electrical engineering from Zhejiang University, China, in 1999 and 2001, respectively, and the Ph.D. degree from the Massachusetts Institute of Technology, Cambridge, MA, USA, in 2005. Since then, he has been with the National University of Singapore, Singapore, where he is currently an Associate Professor. His research interests are mainly electromagnetic wave theories and applications, in particular with a focus on inverse problems and imaging. He has published 140 journal papers on inverse scattering problems, material parameter retrieval, microscopy, and optical encryption. He has authored the book Computational Methods for Electromagnetic Inverse Scattering (Wiley, 2018). Dr. Chen was a recipient of the Young Scientist Award by the Union Radio-Scientifique Internationale in 2010, and also a recipient of the “Ulrich L. Rohde Innovative Conference Paper Award” at ICCEM 2019. He has been an Associate Editor of the IEEE Transactions on Microwave Theory and Techniques since 2015. He has organized twenty sessions on the topic of inverse scattering and imaging in various conferences.