

Ensemble p-Laplacian Regularization for Scene Image Recognition

    Xueqi Ma1 & Weifeng Liu1 & Dapeng Tao2 & Yicong Zhou3

Received: 28 March 2018 / Accepted: 10 March 2019. © Springer Science+Business Media, LLC, part of Springer Nature 2019

Abstract. Recently, manifold regularized semi-supervised learning (MRSSL) has received considerable attention because it successfully exploits the geometry of the intrinsic data probability distribution to improve the performance of a learning model. As a natural nonlinear generalization of the graph Laplacian, the p-Laplacian has been proved to have rich theoretical foundations for better preserving local structure. However, it is difficult to determine the fitting graph p-Laplacian, i.e., the parameter p, which is a critical factor for the performance of the graph p-Laplacian. Therefore, we develop an ensemble p-Laplacian regularization (EpLapR) to fully approximate the intrinsic manifold of the data distribution. EpLapR incorporates multiple graphs into a regularization term in order to sufficiently explore the complementarity of graph p-Laplacians. Specifically, we construct a fused graph by introducing an optimization approach that assigns suitable weights to graphs with different p values, and we then conduct a semi-supervised learning framework on the fused graph. Extensive experiments on the UC-Merced dataset and the Scene 15 dataset demonstrate the effectiveness and efficiency of the proposed method.

Keywords: Manifold regularization · Semi-supervised learning · Ensemble p-Laplacian regularization

    Introduction

With rapid advances in storage devices and mobile networks, large-scale multimedia data have become available to ordinary users. However, in practical applications, e.g., human action recognition [19, 20], scene classification [21], text categorization [1], and video annotation and retrieval [2], labeled samples are often insufficient, although vast amounts of unlabeled samples are readily accessible and provide auxiliary information. Semi-supervised learning (SSL), which aims to exploit both labeled and unlabeled data, is designed to address this problem. In SSL [3–8], it is assumed that nearby samples are likely to share the same label. Manifold regularization [9] is one of the most representative works; it assumes that the geometry of the intrinsic data probability distribution is supported on a low-dimensional manifold.

To build better classifiers, typical MRSSL algorithms exploit the intrinsic geometry of the labeled and unlabeled samples, which is naturally captured by a graph. Therefore, researchers have paid attention to building a good graph that captures the essential data structure. Laplacian regularization (LapR) [9, 17] is one prominent manifold-regularization-based SSL algorithm, which explores the geometry of the probability distribution by using the graph Laplacian. LapR-based SSL algorithms have been widely used in many applications. Luo et al. [23] employed manifold regularization to smooth functions along the data manifold for multitask learning. Hu et al. [22] introduced graph Laplacian regularization for joint denoising and super-resolution of generalized piecewise smooth images. Jiang et al. [24] presented a multi-manifold method for recognition by exploring the local geometric structure of samples. Another relatively new prior is Hessian regularization (HesR), which has been shown empirically to perform well in a wide range of inverse problems [10–12]. In comparison with LapR, HesR steers the values of the function to vary linearly with respect to geodesic distance. As a result, HesR can describe the underlying manifold of the data more accurately. However, the Hessian estimate will be inaccurate when the quality of the local fit at each data point is poor [14]. The p-Laplacian [15, 16] is a nonlinear generalization of the graph Laplacian and has a tighter isoperimetric inequality. In particular, Bühler et al. [13] provided a rigorous proof of the approximation of the second eigenvector of the p-Laplacian to the Cheeger cut, which indicates the superiority of the graph p-Laplacian in local geometry

* Corresponding author: Weifeng Liu, [email protected]

1 China University of Petroleum (East China), Qingdao 266580, China
2 Yunnan University, Kunming 650091, China
3 Faculty of Science and Technology, University of Macau, Macau, China

Cognitive Computation (2019) 11:841–854. https://doi.org/10.1007/s12559-019-09637-z. Published online: 22 March 2019


exploiting. An efficient approximation method of the graph p-Laplacian has been proposed in [30]. However, the parameter p of the graph p-Laplacian is difficult to determine.

In manifold regularized SSL, the manifold is determined by a graph with predefined hyperparameters. Unfortunately, one cannot define an objective function for choosing the graph hyperparameters that estimate the intrinsic manifold. In general, cross-validation [25] has been widely utilized for parameter selection. But, to the best of our knowledge, this method, which selects parameters in a discrete and limited parameter space, lacks the ability to approximate the optimal solution. Furthermore, the performance of the classification model is only weakly related to the difference between the intrinsic and approximated manifolds. Thus, pure cross-validation-based parameter selection cannot perform well in model learning. An automatic and full approximation of the intrinsic manifold of the data distribution would be valuable for SSL methods.

In this paper, we propose an ensemble p-Laplacian regularized method, which combines a series of graph p-Laplacians. In a conditionally optimal way, the proposed method learns to assign suitable weights to the graphs and finally constructs an optimal fused graph. The fused graph can sufficiently approximate the intrinsic manifold through the complementarity of the graph p-Laplacians. We build graph-regularized classifiers, including support vector machines (SVM) and kernel least squares (KLS), as special cases for scene image recognition. Experiments on the UC-Merced dataset [27] and the Scene 15 dataset [34] validate the effectiveness of the proposed method compared with popular algorithms, including Laplacian regularization (LapR), Hessian regularization (HesR), and p-Laplacian regularization (pLapR).

The rest of the paper is organized as follows. "Related Work" briefly reviews related work on manifold regularization and p-Laplacian learning. "Method" proposes EpLapR. "Example Algorithms" presents EpLapR for KLS and SVM. "Experiments" provides the experimental results and analysis on the UC-Merced and Scene 15 datasets. Finally, "Conclusion" gives the conclusions.

    Related Work

The proposed EpLapR is motivated by MRSSL and p-Laplacian learning. This section briefly describes the related work for better understanding.

    Manifold Regularization

Suppose the labeled samples are (x, y) pairs drawn from a probability distribution P, and unlabeled samples x are drawn according to the marginal distribution P_x of P. It is assumed that the probability distribution of the data is supported on a submanifold of the ambient space. In general, manifold regularization defines a similarity graph over labeled and unlabeled examples and incorporates it as an additional regularization term. Hence, the manifold regularization framework has two regularization terms: one controlling the complexity measured in an appropriately chosen Reproducing Kernel Hilbert Space (RKHS), and the other encoding additional information about the geometric structure of the marginal distribution. The objective function of the framework is defined as:

    Fig. 1 The framework of EpLapR for remote sensing image classification

Table 1 Iterative solution method for EpLapR

Step 1: Initialize the graph weights μ_k.
Step 2: Update α according to Eq. (21).
Step 3: Based on the updated α, recalculate μ_k according to Eq. (22).
Step 4: Repeat from Step 2 until convergence.


$$f^{*} = \underset{f \in \mathcal{H}_K}{\arg\min}\; \frac{1}{l}\sum_{i=1}^{l} V(x_i, y_i, f) + \gamma_A \|f\|_K^2 + \gamma_I \|f\|_I^2 \quad (1)$$

where V is a loss function, such as the hinge loss max[0, 1 − y_i f(x_i)] for SVM. The term $\|f\|_K^2$ controls the complexity of the classification model, while $\|f\|_I^2$ is an appropriate penalty term, approximated from a graph matrix (e.g., the graph Laplacian L = D − W, where W_ij is the weight matrix and the diagonal matrix D is given by $D_{ii} = \sum_{j=1}^{n} W_{ij}$) and the function prediction. The parameters γ_A and γ_I control the complexity of the function in the ambient space and in the intrinsic geometry, respectively.
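To make the graph construction used in Eq. (1) concrete, the following minimal NumPy sketch builds L = D − W from a Gaussian-weighted k-NN graph; the weighting scheme and its parameters (k, sigma) are illustrative assumptions, since the paper does not fix them here:

```python
import numpy as np

def graph_laplacian(X, k=5, sigma=1.0):
    """Unnormalized graph Laplacian L = D - W from a Gaussian-weighted
    k-NN graph (a common choice; the paper does not fix the weighting)."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]                 # k nearest, skipping self
        W[i, nbrs] = np.exp(-d2[i, nbrs] / (2 * sigma ** 2))
    W = np.maximum(W, W.T)                                # symmetrize the k-NN graph
    D = np.diag(W.sum(axis=1))                            # D_ii = sum_j W_ij
    return D - W

L = graph_laplacian(np.random.RandomState(0).randn(20, 3))
```

The resulting L is symmetric and positive semi-definite, so the penalty $\mathbf{f}^{\mathsf T} L \mathbf{f}$ is always nonnegative.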

    p-Laplacian Regularization

As a nonlinear generalization of the standard graph Laplacian, the graph p-Laplacian is superior at preserving local structure.

As is well known, the standard graph Laplacian Δ_2 can be defined as the operator inducing the following quadratic form for a function f:

$$\langle f, \Delta_2 f \rangle = \frac{1}{2}\sum_{i,j \in V} w_{ij}\,(f_i - f_j)^2 \quad (2)$$

Fig. 2 Some example images of the UC-Merced dataset. The dataset contains 21 remote sensing categories in total

Fig. 3 Some example images of the Scene 15 dataset. The dataset contains 15 scene classes in total


where the vertices V represent the points in the feature space, and the positive edge weights w_ij encode the similarity of points (i, j).

Similar to the graph Laplacian [13], the unnormalized p-Laplacian Δ_p (for p > 1) can be defined by:

$$\langle f, \Delta_p f \rangle = \frac{1}{2}\sum_{i,j \in V} w_{ij}\,|f_i - f_j|^p \quad (3)$$

or

$$(\Delta_p f)_i = \sum_{j \in V} w_{ij}\,\phi_p(f_i - f_j) \quad (4)$$

where φ_p is defined as φ_p(x) = |x|^{p−1} sign(x). With p = 2, the p-Laplacian becomes the standard graph Laplacian.

Bühler and Hein [13] used the graph p-Laplacian for spectral clustering and demonstrated the relationship between the second eigenvalue of the graph p-Laplacian and the optimal Cheeger cut as follows: for p > 1,

$$\mathrm{RCC} \le \mathrm{RCC}^{*} \le p\left(\max_{i \in V} d_i\right)^{\frac{p-1}{p}} \mathrm{RCC}^{\frac{1}{p}} \quad (5)$$

or

$$\mathrm{NCC} \le \mathrm{NCC}^{*} \le p\,\mathrm{NCC}^{\frac{1}{p}} \quad (6)$$

where RCC* and NCC* are the ratio/normalized Cheeger cut values obtained by thresholding the second eigenvector of the unnormalized/normalized p-Laplacian; d_i is the degree of vertex i; and RCC and NCC are the optimal ratio/normalized Cheeger cut values. This proves that, in the limit as p → 1, the cut found by thresholding the second eigenvector of the graph p-Laplacian converges to the optimal Cheeger cut.
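The forms in Eqs. (2)–(4) can be checked numerically; a small NumPy sketch follows, with a random symmetric weight matrix as a stand-in for a real graph:

```python
import numpy as np

def p_dirichlet(W, f, p):
    """Eq. (3): <f, Delta_p f> = 1/2 * sum_ij w_ij |f_i - f_j|^p."""
    return 0.5 * (W * np.abs(f[:, None] - f[None, :]) ** p).sum()

rng = np.random.RandomState(0)
W = rng.rand(6, 6); W = (W + W.T) / 2; np.fill_diagonal(W, 0)
f = rng.randn(6)
L = np.diag(W.sum(1)) - W
# with p = 2 this coincides with the Laplacian quadratic form of Eq. (2)
assert np.isclose(p_dirichlet(W, f, 2), f @ L @ f)
```

At p = 2 the p-Dirichlet form coincides with the Laplacian quadratic form, matching the reduction noted above.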

In the mathematics community, the discrete p-Laplacian has been studied in a general regularization framework. In [28], the objective function of a general discrete p-Laplacian regularization framework can be written as follows:

$$f^{*} = \underset{f \in H(V)}{\arg\min}\; \left\{ S_p(f) + \mu \|f - y\|^2 \right\} \quad (7)$$

where $S_p(f) := \frac{1}{2}\sum_{v \in V} \|\nabla_v f\|^p$ is the p-Dirichlet form of the function f; μ is a parameter balancing the two competing terms; and y ∈ {−1, 0, 1} depends on the label of vertex v.

In [26], Luo et al. proposed full eigenvector analysis of the p-Laplacian and obtained a natural global embedding for multi-class clustering problems. The full eigenvector analysis of the p-Laplacian was achieved by an efficient gradient descent optimization approach:

$$\min_{F}\; J_E(F) = \sum_{k} \frac{\sum_{ij} w_{ij}\,|f_i^k - f_j^k|^p}{\|f^k\|_p^p} \quad \text{s.t.}\; F^{\mathsf T} F = I \quad (8)$$

where w_ij is the edge weight, f^k is an eigenvector of the p-Laplacian, and F = (f^1, f^2, ⋯, f^n) collects all eigenvectors.

Fig. 4 Performance of mAP with different p values on the validation set of the UC-Merced dataset

Fig. 5 Performance of mAP with different p values on the validation set of the Scene 15 dataset


Liu et al. [29] proposed p-Laplacian regularized sparse coding for human activity recognition.

Liu et al. [30] illustrated the differences of semi-supervised regression using LapR, HesR, and pLapR by fitting two points on a 1-D spiral. The results show that pLapR fits the data exactly and extrapolates smoothly to unseen data with the geodesic distance. Specifically, they proposed an efficient approximation method of the graph p-Laplacian and built a p-Laplacian regularization framework.

Ma et al. [31] presented an efficient and effective approximation algorithm of the hypergraph p-Laplacian and then proposed hypergraph p-Laplacian regularization (HpLapR) to preserve the geometry of the probability distribution.

    Method

The proposed EpLapR approximates the manifold of the data distribution by fusing a set of graph p-Laplacians. First, we show the full approximation of the graph p-Laplacian. Then, we propose EpLapR.

    Approximation of Graph p-Laplacian

We approximate the graph p-Laplacian (L_p) through a full analysis of its eigenvalues and eigenvectors. In [13], the computation of an eigenvalue and its corresponding eigenvector of the nonlinear operator $\Delta_p^w$ can be solved via the following theorem:

The functional F_p has a critical point at f if and only if f is an eigenvector of $\Delta_p^w$; the corresponding eigenvalue λ_p is given by λ_p = F_p(f). The definition of F_p is:

$$F_p(f) = \frac{\sum_{ij} w_{ij}\,|f_i - f_j|^p}{2\,\|f\|_p^p} \quad (9)$$

where $\|f\|_p^p = \sum_i |f_i|^p$.

The above theorem serves as the foundation for the analysis of eigenvectors and eigenvalues. Moreover, we have F_p(αf) = F_p(f) for every real α ≠ 0.
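The functional F_p and its scale invariance can be verified directly; a minimal sketch with a random stand-in weight matrix:

```python
import numpy as np

def F_p(W, f, p):
    """Eq. (9): F_p(f) = sum_ij w_ij |f_i - f_j|^p / (2 ||f||_p^p)."""
    num = (W * np.abs(f[:, None] - f[None, :]) ** p).sum()
    return num / (2 * (np.abs(f) ** p).sum())

rng = np.random.RandomState(0)
W = rng.rand(5, 5); W = (W + W.T) / 2; np.fill_diagonal(W, 0)
f = rng.randn(5)
# scale invariance: F_p(alpha * f) = F_p(f) for any nonzero real alpha,
# because numerator and denominator both scale by |alpha|^p
assert np.isclose(F_p(W, -3.7 * f, 1.5), F_p(W, f, 1.5))
```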

Suppose that the graph p-Laplacian has n eigenvectors (f^{*1}, f^{*2}, ⋯, f^{*n}) associated with unique eigenvalues (λ^{*1}, λ^{*2}, ⋯, λ^{*n}). According to the above theorem, if we want

Fig. 6 mAP performance of different algorithms with the KLS method on the UC-Merced dataset


to get all eigenvectors and eigenvalues of the graph p-Laplacian, we have to find all critical points of the functional F_p. Therefore, we exploit the full eigenvector space by finding local solutions of the following minimization problem:

$$\min_{F}\; J(F) = \sum_{k} F_p(f^k) \quad \text{s.t.}\; \sum_i \phi_p(f_i^k)\,\phi_p(f_i^l) = 0,\; k \ne l \quad (10)$$

where F = (f^1, f^2, ⋯, f^n).

Rewriting the optimization problem (10), we analyze the full set of eigenvectors by solving the following graph p-Laplacian embedding problem:

$$\min_{F}\; J_E(F) = \sum_{k} \frac{\sum_{ij} w_{ij}\,|f_i^k - f_j^k|^p}{\|f^k\|_p^p} \quad \text{s.t.}\; F^{\mathsf T} F = I \quad (11)$$

The gradient of J_E with respect to $f_i^k$ yields the following equation:

$$\frac{\partial J_E}{\partial f_i^k} = \frac{1}{\|f^k\|_p^p}\left[\sum_{j} w_{ij}\,\phi_p(f_i^k - f_j^k) - \phi_p(f_i^k)\,\frac{\sum_{i,j} w_{ij}\,|f_i^k - f_j^k|^p}{2\,\|f^k\|_p^p}\right] \quad (12)$$

Problem (11) can be solved by gradient descent optimization. However, if we simply use the gradient descent approach, the solution f^k might not be orthogonal [26]. Therefore, the gradient is modified in Eq. (13) to enforce orthogonality:

$$G = \frac{\partial J_E}{\partial F} - F\left(\frac{\partial J_E}{\partial F}\right)^{\mathsf T} F \quad (13)$$

Meanwhile, the full set of eigenvalues λ = (λ_1, λ_2, ⋯, λ_n) can be computed by $\lambda_k = \frac{\sum_{ij} w_{ij}\,|f_i^k - f_j^k|^p}{2\,\|f^k\|_p^p}$. Finally, the approximate L_p can be computed as $L_p = F\,\mathrm{diag}(\lambda)\,F^{\mathsf T}$. We summarize the approximation of the graph p-Laplacian in Algorithm 1. In the algorithm, the step length α is set to $\alpha = 0.01\,\frac{\sum_{ik}|F_{ik}|}{\sum_{ik}|G_{ik}|}$.

We can show that if $F^{\mathsf T}F = I$, then the simple gradient descent approach is guaranteed to give a feasible solution. Since the Laplacian L is symmetric, we have $F^{\mathsf T}F = I$ at initialization, and from Algorithm 1 we have $F^{t+1} = F^t - \alpha G$; thus, to first order in α (neglecting the $\alpha^2 G^{\mathsf T}G$ term), $(F^{t+1})^{\mathsf T}F^{t+1} = (F^t - \alpha G)^{\mathsf T}(F^t - \alpha G) \approx (F^t)^{\mathsf T}F^t - \alpha\left[G^{\mathsf T}F^t + (F^t)^{\mathsf T}G\right]$. Here,

Fig. 7 mAP performance of different algorithms with the SVM method on the UC-Merced dataset


$$G^{\mathsf T}F^t + (F^t)^{\mathsf T}G = \left(\frac{\partial J_E}{\partial F} - F\left(\frac{\partial J_E}{\partial F}\right)^{\mathsf T}F\right)^{\mathsf T}F^t + (F^t)^{\mathsf T}\left(\frac{\partial J_E}{\partial F} - F\left(\frac{\partial J_E}{\partial F}\right)^{\mathsf T}F\right)$$

$$= \left(\frac{\partial J_E}{\partial F}\right)^{\mathsf T}F^t - (F^t)^{\mathsf T}\frac{\partial J_E}{\partial F} - \left(\frac{\partial J_E}{\partial F}\right)^{\mathsf T}F^t + (F^t)^{\mathsf T}\frac{\partial J_E}{\partial F} = 0$$

So $(F^{t+1})^{\mathsf T}F^{t+1} = (F^t)^{\mathsf T}F^t = I$. The orthogonality of the solution F is thus enforced by G, and F remains a feasible solution.

Consider the objective at the t-th iteration, $J_E(F^t)$, and at the (t−1)-th iteration, $J_E(F^{t-1})$. Since each step moves along the descent direction, we have $J_E(F^t) = J_E(F^{t-1} - \alpha G) \le J_E(F^{t-1})$. Since $J_E(F) \ge 0$, our algorithm is guaranteed to converge.

    EpLapR

Consider the MRSSL setting, where two sets of samples X are available: l labeled samples $\{(x_i, y_i)\}_{i=1}^{l}$ and u unlabeled samples $\{x_j\}_{j=l+1}^{l+u}$, for a total of n = l + u samples. Class labels are given in $Y = \{y_i\}_{i=1}^{l}$, where y_i ∈ {±1}. Typically, l ≪ u, and we focus on predicting the labels of unseen examples.

According to the manifold regularization framework, the proposed EpLapR can be written as the following optimization problem:

$$f^{*} = \underset{f \in \mathcal{H}_K}{\arg\min}\; \frac{1}{l}\sum_{i=1}^{l} V(x_i, y_i, f) + \gamma_A \|f\|_K^2 + \frac{\gamma_I}{n^2}\, \mathbf{f}^{\mathsf T} L\, \mathbf{f} \quad (14)$$

Here, $\mathbf{f} = [f(x_1), f(x_2), \cdots, f(x_{l+u})]^{\mathsf T}$, and L is the optimal fused graph, $L = \sum_{k=1}^{m} \mu_k L^{p_k}$, subject to $\sum_{k=1}^{m} \mu_k = 1$ and $\mu_k \ge 0$ for k = 1, ⋯, m. We define a set of candidate graph p-Laplacians $C = \{L^{p_1}, \cdots, L^{p_m}\}$ and denote the convex hull of a set A = {x_1, ⋯, x_m} by conv A = {ψ_1 x_1 + ⋯ + ψ_m x_m | ψ_1 + ⋯ + ψ_m = 1, x_i ∈ A, ψ_i ≥ 0, i = 1, ⋯, m}. Therefore, we have L ∈ conv C, where conv C = {μ_1 L^{p_1} + ⋯ + μ_m L^{p_m} | μ_1 + ⋯ + μ_m = 1, L^{p_i} ∈ C, μ_i ≥ 0, i = 1, ⋯, m}.

Fig. 8 AP performance of different KLS methods on several classes of the UC-Merced dataset


To avoid the parameters μ_k overfitting to one graph [18], we apply a relaxation by replacing μ_k with $\mu_k^{\gamma}$ and obtain the optimization problem:

$$f^{*} = \underset{f \in \mathcal{H}_K}{\arg\min}\; \frac{1}{l}\sum_{i=1}^{l} V(x_i, y_i, f) + \gamma_A \|f\|_K^2 + \frac{\gamma_I}{n^2}\, \mathbf{f}^{\mathsf T}\left(\sum_{k=1}^{m} \mu_k^{\gamma} L^{p_k}\right)\mathbf{f} \quad \text{s.t.}\; \sum_{k=1}^{m} \mu_k = 1,\; \mu_k \ge 0,\; k = 1, \cdots, m \quad (15)$$

Next, we present a theoretical analysis of EpLapR.

Theorem 1: For L ∈ conv C, the solution of problem (15) exists and admits the following representation:

$$f^{*}(x) = \sum_{i=1}^{l+u} \alpha_i^{*}\, K(x_i, x) \quad (16)$$

which is an expansion in terms of the labeled and unlabeled examples, where the kernel matrix K with K_ij = K(x_i, x_j) is symmetric positive definite and α* is the coefficient vector.

This representer theorem shows that the solution of Eq. (15) exists and has the general form of Eq. (16) for a fixed μ.

So, we rewrite the objective function as:

$$f^{*} = \underset{f \in \mathcal{H}_K}{\arg\min}\; \frac{1}{l}\sum_{i=1}^{l} V(x_i, y_i, f) + \gamma_A \|f\|_K^2 + \frac{\gamma_I}{n^2}\, \alpha^{\mathsf T} K \left(\sum_{k=1}^{m} \mu_k^{\gamma} L^{p_k}\right) K \alpha \quad \text{s.t.}\; \sum_{k=1}^{m} \mu_k = 1,\; \mu_k \ge 0,\; k = 1, \cdots, m \quad (17)$$

    Example Algorithms

Generally, the proposed EpLapR can be applied to various MRSSL-based applications with different choices of the loss function V(x_i, y_i, f). In this section, we apply EpLapR to KLS and SVM.

    EpLapR Kernel Least Squares (EpLapKLS)

By employing the least-squares loss in optimization problem (14), we can present the EpLapKLS model defined in Eq. (18) as follows:

Fig. 9 AP performance of different SVM methods on several classes of the UC-Merced dataset


$$f^{*} = \underset{f \in \mathcal{H}_K}{\arg\min}\; \frac{1}{l}\sum_{i=1}^{l} \left(y_i - f(x_i)\right)^2 + \gamma_A\, \alpha^{\mathsf T} K \alpha + \frac{\gamma_I}{n^2}\, \alpha^{\mathsf T} K \left(\sum_{k=1}^{m} \mu_k^{\gamma} L^{p_k}\right) K \alpha \quad \text{s.t.}\; \sum_{k=1}^{m} \mu_k = 1,\; \mu_k \ge 0,\; k = 1, \cdots, m \quad (18)$$

Then, we obtain the partial derivatives of the objective function with respect to μ_k and α as follows:

$$\frac{\partial f}{\partial \mu_k} = \frac{\gamma_I}{n^2}\, \alpha^{\mathsf T} K \left(\gamma \mu_k^{\gamma - 1}\right) L^{p_k} K \alpha \quad (19)$$

$$\frac{\partial f}{\partial \alpha} = \frac{1}{l}\left(Y - JK\alpha\right)^{\mathsf T}\left(-JK\right) + \left(\gamma_A K + \frac{\gamma_I}{n^2} K L K\right)\alpha \quad (20)$$

Here, we adopt a process that iteratively updates α and μ_k to minimize f*. First, when μ_k is fixed, Eq. (20) leads to $\arg\min_{\alpha} f$, from which we can derive:

$$\alpha^{*} = \left(JK + \gamma_A l\, I + \frac{\gamma_I l}{n^2} LK\right)^{-1} Y \quad (21)$$

where I is the n × n identity matrix; J is the n × n diagonal matrix whose diagonal entries are 1 for the labeled points and 0 elsewhere; and Y is the n-dimensional label vector Y = [y_1, …, y_l, 0, ⋯, 0]^T.

Given a fixed α, we obtain the solution of μ_k, subject to $\sum_{k=1}^{m} \mu_k = 1$, as:

$$\mu_k = \frac{\left(\dfrac{n^2}{\gamma_I\, \alpha^{\mathsf T} K L^{p_k} K \alpha}\right)^{\frac{1}{\gamma - 1}}}{\sum_{k=1}^{m}\left(\dfrac{n^2}{\gamma_I\, \alpha^{\mathsf T} K L^{p_k} K \alpha}\right)^{\frac{1}{\gamma - 1}}} \quad (22)$$

The iterative solution procedure of EpLapR is described in Table 1.

Now, we prove the convergence of this iterative solution. Intuitively, the update criterion in Eq. (22) tends to assign a larger value to the μ_k with smaller $\frac{\gamma_I}{n^2}\alpha^{\mathsf T} K L^{p_k} K \alpha$. Denoting by α^t and μ^t the values of α and μ in the t-th iteration, we have f(α^t, μ^{t+1}) ≤ f(α^t, μ^t); and according to Eq. (21), the solution of α is updated to minimize f, so f(α^{t+1}, μ^{t+1}) ≤ f(α^t, μ^{t+1}). Therefore, f(α^{t+1}, μ^{t+1}) ≤ f(α^t, μ^{t+1}) ≤ f(α^t, μ^t).

Fig. 10 mAP performance of different algorithms with the KLS method on the Scene 15 dataset


EpLapR Support Vector Machines (EpLapSVM)

EpLapSVM solves the optimization problem (17) with the hinge loss function:

$$f^{*} = \underset{f \in \mathcal{H}_K}{\arg\min}\; \frac{1}{l}\sum_{i=1}^{l}\left(1 - y_i f(x_i)\right)_{+} + \gamma_A\, \alpha^{\mathsf T} K \alpha + \frac{\gamma_I}{n^2}\, \alpha^{\mathsf T} K \left(\sum_{k=1}^{m} \mu_k^{\gamma} L^{p_k}\right) K \alpha \quad (23)$$

The partial derivative of the objective function with respect to μ_k is the same as in Eq. (19).

Introducing Lagrange multipliers β_i and η_i and adding an unregularized bias term b, we arrive at a convex differentiable objective function:

$$\mathcal{L}(\alpha, \xi, b, \beta, \eta) = \frac{1}{l}\sum_{i=1}^{l} \xi_i + \frac{1}{2}\alpha^{\mathsf T}\left(2\gamma_A K + \frac{2\gamma_I}{n^2} K\left(\sum_{k=1}^{m} \mu_k^{\gamma} L^{p_k}\right)K\right)\alpha - \sum_{i=1}^{l} \beta_i\left(y_i\left(\sum_{j=1}^{l+u} \alpha_j K(x_i, x_j) + b\right) - 1 + \xi_i\right) - \sum_{i=1}^{l} \eta_i \xi_i$$

We reduce the Lagrangian using $\frac{\partial \mathcal{L}}{\partial b} = 0$ and $\frac{\partial \mathcal{L}}{\partial \xi_i} = 0$, and take the partial derivative with respect to α:

$$\alpha^{*} = \left(2\gamma_A I + \frac{2\gamma_I}{n^2} LK\right)^{-1} J^{\mathsf T} Y \beta^{*} \quad (24)$$

where I is the n × n identity matrix; J = [I_l, 0] is the l × n matrix selecting the labeled points; Y = diag(y_1, …, y_l); and β* is the l-dimensional variable given by:

$$\beta^{*} = \underset{\beta \in \mathbb{R}^l}{\max}\; \sum_{i=1}^{l} \beta_i - \frac{1}{2}\beta^{\mathsf T} Q \beta \quad \text{subject to:}\; \sum_{i=1}^{l} \beta_i y_i = 0,\; 0 \le \beta_i \le \frac{1}{l},\; i = 1, \cdots, l$$

where

$$Q = YJK\left(2\gamma_A I + \frac{2\gamma_I}{n^2} LK\right)^{-1} J^{\mathsf T} Y$$

Then, problem (23) can also be solved by the iterative solution process in Table 1.
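For a fixed fused graph, the dual pieces above can be assembled as follows; the box-and-equality constrained maximization over β can then be handed to any off-the-shelf QP solver. This is a sketch under our reading of Eq. (24), and the helper name `eplap_svm_system` is our own:

```python
import numpy as np

def eplap_svm_system(K, L, y_labeled, gamma_A, gamma_I):
    """Form the EpLapSVM dual pieces for a fixed fused graph L:
    the dual matrix Q and the beta -> alpha map of Eq. (24)."""
    n = K.shape[0]
    l = y_labeled.shape[0]
    J = np.zeros((l, n)); J[:, :l] = np.eye(l)    # J = [I, 0], l x n
    Y = np.diag(y_labeled)
    M = 2 * gamma_A * np.eye(n) + (2 * gamma_I / n ** 2) * (L @ K)
    Minv_JtY = np.linalg.solve(M, J.T @ Y)        # M^{-1} J^T Y, an n x l matrix
    Q = Y @ J @ K @ Minv_JtY                      # dual quadratic term
    recover_alpha = lambda beta: Minv_JtY @ beta  # Eq. (24)
    return Q, recover_alpha
```

With γ_I = 0 the map reduces to the standard SVM relation α = J^T Y β / (2γ_A), which is a quick sanity check on the construction.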

Fig. 11 mAP performance of different algorithms with the SVM method on the Scene 15 dataset


Experiments

In this section, to evaluate the effectiveness of the proposed EpLapR, we compare EpLapR with other local-structure-preserving algorithms, including LapR, HesR, and pLapR, on the UC-Merced dataset and the Scene 15 dataset. We apply SVM and KLS for image classification. Figure 1 illustrates the framework of EpLapR for the UC-Merced dataset.

The UC-Merced dataset [27] consists of 2100 remote sensing images collected from aerial orthoimagery with a pixel resolution of 1 ft. The original images were downloaded from the United States Geological Survey National Map of different U.S. regions. There are 21 classes in total: chaparral, dense residential, medium residential, sparse residential, forest, freeway, agricultural, airplane, baseball diamond, mobile home park, overpass, parking lot, river, runway, beach, buildings, golf course, harbor, intersection, storage tanks, and tennis courts (see Fig. 2). It is worth noting that this dataset has some highly overlapping classes, e.g., sparse residential, medium residential, and dense residential, so successful classification is difficult.

The Scene 15 dataset is composed of 15 scene categories with 4485 images in total. Each class has 200 to 400 images. The images contain indoor and outdoor scenes, such as living room, kitchen, forest, mountain, and tall building (see Fig. 3).

In our experiments, we extract high-level visual features using a deep convolutional neural network (CNN) [32] for the UC-Merced dataset and extract SIFT features for the Scene 15 dataset. For the UC-Merced dataset, we randomly choose 50 images per class as training samples and the rest as testing samples. For the Scene 15 dataset, 100 images per class are randomly selected as training data and the rest for testing. In the semi-supervised classification experiments, in particular, we select 10, 20, 30, and 50% of the training data as labeled data and the rest as unlabeled data. To avoid any bias introduced by the random partition of samples, the process is repeated five times independently.

We conduct experiments on each dataset to choose suitable model parameters. The regularization parameters γ_A and γ_I are selected from the candidate set {10^i | i = −10, −9, −8, ⋯, 10} through cross-validation. For pLapR, the parameter p is chosen from {1, 1.1, 1.2, ⋯, 3} through cross-validation with 10% labeled samples on the training data. Figures 4

Fig. 12 AP performance of different KLS methods on several classes of the Scene 15 dataset


and 5 illustrate the mAP performance of pLapR on the validation set as p varies; the x axis is the parameter p, and the y axis is the mAP performance measure. From Fig. 4, we can see that, for the UC-Merced dataset, the best mAP performance for pLapR is obtained at p = 2.8. Figure 5 shows the performance on the Scene 15 dataset, where the best performance is achieved at p = 1. For EpLapR, we created two graph p-Laplacian sets on the UC-Merced dataset: for the first set (EpLapR-3G), we chose p = {2.5, 2.7, 2.8}, which led to three graphs; the other (EpLapR-5G) has five graphs with p = {2.4, 2.5, 2.6, 2.7, 2.8}. For the Scene 15 dataset, the p values of EpLapR-3G are {1, 1.4, 2.1}, and those of EpLapR-5G are {1, 1.1, 1.4, 2, 2.1}. We measure classification performance by the average precision (AP) for a single class and the mean average precision (mAP) [33] over all classes. The AP is defined as the mean precision at a set of 11 equally spaced recall levels:

$$\mathrm{AP} = \frac{1}{11}\sum_{t}\; \max_{r \ge t}\; \mathrm{pre}(r), \quad t \in \{0, 0.1, 0.2, \cdots, 1.0\}$$

where pre(r) is the precision at recall r. The mAP is the mean AP over all image classes:

$$\mathrm{mAP} = \frac{\sum_{i=1}^{c} \mathrm{AP}_i}{c}$$

where c is the number of image classes.
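The 11-point AP and the mAP can be computed directly from a precision-recall curve; a minimal sketch with a made-up toy curve:

```python
import numpy as np

def ap_11point(precision, recall):
    """11-point interpolated AP: mean over t in {0, 0.1, ..., 1.0}
    of the maximum precision among points with recall >= t."""
    ts = np.arange(11) / 10.0
    return float(np.mean([max((p for p, r in zip(precision, recall) if r >= t),
                              default=0.0) for t in ts]))

def mean_ap(aps):
    """mAP = mean AP over all c classes."""
    return float(np.mean(aps))

# a toy precision-recall curve (illustrative numbers only)
prec = [1.0, 1.0, 0.67, 0.75, 0.6]
rec = [0.25, 0.45, 0.45, 0.65, 0.65]
ap = ap_11point(prec, rec)
assert 0.0 <= ap <= 1.0
```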

    where c is the number of image classes.We compare our proposed EpLapR with the representative

    LapR, HesR and pLapR. Figures 6 and 7 demonstrate themAP results of different algorithms on KLS methods andSVM methods on UC-Merced data set, respectively.Figures 8 and 9 show the mAP performance on Scene 15dataset. We can see that in most cases, the EpLapR outper-forms LapR, HesR, and pLapR, which shows the advantagesof EpLapR in local structure of preserving.

    To evaluate the effectiveness of EpLapR for single class,Figs. 10 and 11 show the AP results of different methods onseveral selected remote sensing classes, including mediumresidential, parking lot, sparse residential, and tennis court ofUC-Merced dataset. Figure 8 reveals the KLS method, whileFig. 9 represents the SVM method. Figures 12 and 13 showthe AP results of Scene 15 dataset about several classes, in-cluding mountain, open country, tall building, and industrial.We can find that EpLapR performs better than LapR, HesR,and pLapR by sufficiently explore the complementation of

Fig. 13 AP performance of different SVM methods on several classes of the Scene 15 dataset


graph p-Laplacians. Moreover, we note that the classification results are not always improved by a larger proportion of labels or by fusing many more graph p-Laplacians. These observations suggest that parameter selection is critical for our proposed method.

    Conclusion

As a nonlinear generalization of the graph Laplacian, p-Laplacian regularization precisely exploits the geometry of the probability distribution to improve learning performance. However, in practice it is difficult to determine the optimal graph p-Laplacian, because the parameter p is usually chosen by cross-validation, which lacks the ability to approximate the optimal solution. Therefore, we propose an ensemble p-Laplacian regularization (EpLapR) to better approximate the geometry of the data distribution. EpLapR incorporates multiple graphs into a fused graph through an optimization approach that assigns suitable weights to graphs with different p values. We then introduce the optimal fused graph as a regularizer for SSL. Finally, we construct ensemble graph p-Laplacian regularized classifiers, including EpLapKLS and EpLapSVM, for scene image recognition. Experimental results on the UC-Merced and Scene 15 datasets show that the proposed EpLapR learner generalizes better than traditional ones.
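The fused-graph construction described above can be sketched as a convex combination of candidate graph p-Laplacian matrices, with simplex-constrained weights. The sketch below follows the ensemble manifold regularization idea [17] with an ℓ2 penalty on the weights, which admits a closed-form solution via simplex projection; the penalty weight `gamma`, the cost c_k = fᵀL_k f, and the function names are illustrative assumptions, not the paper's exact optimization.

```python
import numpy as np

def simplex_projection(v):
    """Euclidean projection of v onto the probability simplex
    {mu : mu >= 0, sum(mu) = 1} (Duchi et al.'s sorting method)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u + (1.0 - css) / (np.arange(len(v)) + 1) > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1)
    return np.maximum(v + theta, 0.0)

def fuse_graphs(laplacians, f, gamma=1.0):
    """Fuse candidate graph p-Laplacians L_k into L = sum_k mu_k * L_k.
    The weights solve  min_mu  sum_k mu_k * (f^T L_k f) + gamma * ||mu||^2
    over the simplex, whose closed form is mu = proj_simplex(-c / (2*gamma))
    with cost c_k = f^T L_k f."""
    c = np.array([f @ L @ f for L in laplacians])
    mu = simplex_projection(-c / (2.0 * gamma))
    return mu, sum(m * L for m, L in zip(mu, laplacians))
```

A small `gamma` concentrates the weight on the single smoothest graph, while a large `gamma` spreads the weights toward uniform, mirroring the trade-off between a single best p and a full ensemble.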

Funding Information This work was supported in part by the National Natural Science Foundation of China under Grant 61671480, in part by the Foundation of Shandong Province under Grant ZR2018MF017, the Fundamental Research Funds for the Central Universities, China University of Petroleum (East China) under Grant 18CX07011A and Grant YCX2017059, the National Natural Science Foundation of China under Grant 61772455 and Grant U1713213, the Yunnan Natural Science Funds under Grant 2016FB105, the Program for Excellent Young Talents of Yunnan University under Grant WX069051, the Macau Science and Technology Development Fund under Grant FDCT/189/2017/A3, and the Research Committee at University of Macau under Grant MYRG2016-00123-FST and Grant MYRG2018-00136-FST.

    Compliance with Ethical Standards

Conflict of Interest The authors declare that they have no conflict of interest.

    References

1. Subramanya A, Bilmes J. Soft-supervised learning for text classification. EMNLP. 2008:1090–9.

2. Scardapane S, Uncini A. Semi-supervised echo state networks for audio classification. Cogn Comput. 2017;9(1):125–35.

3. Zhou D, Bousquet O, Lal TN, et al. Learning with local and global consistency. Adv Neural Inf Proces Syst. 2004:321–8.

4. Zhao M, Zhang Z, Chow TWS. Trace ratio criterion based generalized discriminative learning for semi-supervised dimensionality reduction. Pattern Recogn. 2012;45(4):1482–99.

5. Oneto L, Bisio F, Cambria E, Anguita D. Semi-supervised learning for affective common-sense reasoning. Cogn Comput. 2017;9(1):18–42.

6. Ding S, Xi X, Liu Z, Qiao H, Zhang B. A novel manifold regularized online semi-supervised learning model. Cogn Comput. 2018;10(1):49–61.

7. Zhao M, Chow TWS, Zhang Z, Li B. Automatic image annotation via compact graph based semi-supervised learning. Knowl-Based Syst. 2015;76:148–65.

8. Khan FH, Qamar U, Bashir S. Multi-objective model selection (MOMS)-based semi-supervised framework for sentiment analysis. Cogn Comput. 2016;8(4):614–28.

9. Belkin M, Niyogi P, Sindhwani V. Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. J Mach Learn Res. 2006;7(Nov):2399–434.

10. Liu W, Tao D. Multiview Hessian regularization for image annotation. IEEE Trans Image Process. 2013;22(7):2676–87.

11. Liu X, Shi J, Wang C. Hessian regularization based non-negative matrix factorization for gene expression data clustering. Engineering in Medicine and Biology Society (EMBC), 2015 37th Annual International Conference of the IEEE, IEEE, 2015: 4130–4133.

12. Zhu J, Shi J. Hessian regularization based semi-supervised dimensionality reduction for neuroimaging data of Alzheimer's disease. Biomedical Imaging (ISBI), 2014 IEEE 11th International Symposium, IEEE, 2014: 165–168.

13. Bühler T, Hein M. Spectral clustering based on the graph p-Laplacian. In: Proceedings of the 26th Annual International Conference on Machine Learning: ACM; 2009. p. 81–8.

14. Kim KI, Steinke F, Hein M. Semi-supervised regression using Hessian energy with an application to semi-supervised dimensionality reduction. Adv Neural Inf Proces Syst. 2009:979–87.

15. Takeuchi H. The spectrum of the p-Laplacian and p-harmonic morphisms on graphs. Ill J Math. 2003;47(3):939–55.

16. Allegretto W, Xi HY. A Picone's identity for the p-Laplacian and applications. Nonlinear Anal: Theory, Methods Appl. 1998;32(7):819–30.

17. Geng B, Tao D, Xu C, et al. Ensemble manifold regularization. IEEE Trans Pattern Anal Mach Intell. 2012;34(6):1227–33.

18. Wang M, Hua XS, Hong R, et al. Unified video annotation via multigraph learning. IEEE Trans Circuits Syst Video Technol. 2009;19(5):733–46.

19. Hong C, Yu J, Tao D, Wang M. Image-based three-dimensional human pose recovery by multiview locality-sensitive sparse retrieval. IEEE Trans Ind Electron. 2015;62(6):3742–51.

20. Zhang J, Han Y, Tang J, Hu Q, Jiang J. Semi-supervised image-to-video adaptation for video action recognition. IEEE Trans Cybern. 2017;47(4):960–73.

21. Bian X, Zhang T, Zhang X, Yan LX, Li B. Clustering-based extraction of near border data samples for remote sensing image classification. Cogn Comput. 2013;5(1):19–31.

22. Hu W, Cheung G, Li X, et al. Graph-based joint denoising and super-resolution of generalized piecewise smooth images. Image Processing (ICIP), 2014 IEEE International Conference. IEEE, 2014: 2056–2060.

23. Luo Y, Tao D, Geng B, Xu C, Maybank SJ. Manifold regularized multitask learning for semi-supervised multilabel image classification. IEEE Trans Image Process. 2013;22(2):523–36.

24. Jiang J, Hu R, Wang Z, Cai Z. CDMMA: coupled discriminant multi-manifold analysis for matching low-resolution face images. Signal Process. 2016;124:162–72.

25. Kohavi R. A study of cross-validation and bootstrap for accuracy estimation and model selection. IJCAI. 1995;14(2):1137–45.

26. Luo D, Huang H, Ding C, Nie F. On the eigenvectors of p-Laplacian. Mach Learn. 2010;81(1):37–51.

27. Yang Y, Newsam S. Bag-of-visual-words and spatial extensions for land-use classification. Proceedings of the 18th SIGSPATIAL International Conference on Advances in Geographic Information Systems. ACM, 2010: 270–279.

28. Zhou D, Schölkopf B. Regularization on discrete spaces. Joint Pattern Recognition Symposium. Berlin, Heidelberg: Springer; 2005. p. 361–8.

29. Liu W, Zha ZJ, Wang Y, et al. p-Laplacian regularized sparse coding for human activity recognition. IEEE Trans Ind Electron. 2016;63(8):5120–9.

30. Liu W, Ma X, Zhou Y, Tao D, Cheng J. p-Laplacian regularization for scene recognition. IEEE Trans Cybern, to be published. https://doi.org/10.1109/TCYB.2018.2833843

31. Ma X, Liu W, Li S, Tao D, Zhou Y. Hypergraph p-Laplacian regularization for remotely sensed image recognition. IEEE Trans Geosci Remote Sens, to be published. https://doi.org/10.1109/TGRS.2018.2867570

32. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556. 2014.

33. Everingham M, Van Gool L, Williams CKI, et al. The Pascal visual object classes (VOC) challenge. Int J Comput Vis. 2010;88(2):303–38.

34. Lazebnik S, Schmid C, Ponce J. Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), New York, NY, USA; 2006. p. 2169–78.

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
