
Research Article
Semisupervised Particle Swarm Optimization for Classification

Xiangrong Zhang,1 Licheng Jiao,1 Anand Paul,2 Yongfu Yuan,1 Zhengli Wei,1 and Qiang Song1

1 Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education of China, Xidian University, Xi'an 710071, China

2 School of Computer Science Engineering, Kyungpook National University, Daegu 702-701, Republic of Korea

Correspondence should be addressed to Xiangrong Zhang; xrzhang@mail.xidian.edu.cn and Anand Paul; anand@knu.ac.kr

Received 14 February 2014; Accepted 29 April 2014; Published 28 May 2014

Academic Editor: Albert Victoire

Copyright © 2014 Xiangrong Zhang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

A semisupervised classification method based on particle swarm optimization (PSO) is proposed. The semisupervised PSO simultaneously uses limited labeled samples and large amounts of unlabeled samples to find a collection of prototypes (or centroids) that are considered to precisely represent the patterns of the whole data, and then, on the principle of the "nearest neighborhood," the unlabeled data can be classified with the obtained prototypes. In order to validate the performance of the proposed method, we compare the classification accuracy of the PSO classifier, the k-nearest neighbor algorithm, and the support vector machine on six UCI datasets, four typical artificial datasets, and the USPS handwritten dataset. Experimental results demonstrate that the proposed method has good performance even with very limited labeled samples, due to the usage of both the discriminant information provided by labeled samples and the structure information provided by unlabeled samples.

1. Introduction

The particle swarm optimization (PSO) algorithm [1, 2], originally proposed by Eberhart and Kennedy, is a population-based stochastic search process. It is inspired by the social interaction behavior of birds flocking and fish schooling. In the context of PSO, a swarm refers to a number of potential solutions to the optimization problem, where each particle represents a potential solution. The particles fly through the search space with a velocity that is dynamically adjusted according to local information (the cognitive component) and neighbor information (the social component), and tend to fly toward better and better search areas [3]. PSO has been widely applied to obtain solutions to machine learning problems in various fields [4, 5]. In the area of machine learning, traditional learning methods can be divided into two categories: supervised learning and unsupervised learning. In many real-world applications of machine learning, a certain number of labeled samples are usually needed to train a classifier so as to perform the learning process, which is usually called supervised learning; examples include the decision tree [6],

support vector machines (SVMs) [7-10], neural network classifiers [11], and PSO-based classifiers [12-17]. As a matter of fact, labeled instances are difficult, expensive, and time consuming to obtain. In cases where only unlabeled samples are available, learning can be achieved in an unsupervised way, such as K-means clustering and fuzzy C-means clustering. It is often the case that both labeled and unlabeled samples are available, but the labeled samples are too limited to obtain a favorable performance in a supervised way, while abundant unlabeled samples are easy to obtain. Therefore, semisupervised learning [18] and reinforcement learning [19] have been introduced and have proved to be quite promising. Semisupervised learning tries to improve the performance by combining limited labeled samples with large amounts of unlabeled ones to perform the classification. It has recently become more and more popular for a variety of problems, such as text classification [20], mail categorization [21], and human action recognition [22], for which the labeled samples are highly limited.

The nearest neighbor (NN) classification is one of the popular classification methods. It is a "lazy" learning method because it does not train the classifier using labeled training data in advance [23]. The nearest neighbor decision rule assigns an unknown input sample vector to the class label of its nearest neighbor [24], which is measured in terms of a distance defined in the feature space. In this space, each class defines a region, which is called the Voronoi region [25]. When the distance is defined as the classical Euclidean distance, the Voronoi regions are delimited by linear borders. This method can be extended to the K-nearest neighbors when more than one nearest neighbor is considered. In addition, distance measures other than the Euclidean distance can also be used.

A further improvement of the NN method replaces the original training data by a set of prototypes that correctly "represent" the original data [23]. Namely, the classifier assigns class labels by calculating distances to the prototypes rather than to the original training data. As the number of prototypes is much smaller than the total number of original training samples, classification of a new sample is performed much faster due to the reduced computational complexity of the solution (measured by the number of prototypes). These nearest prototype algorithms are able to achieve better accuracy than the basic NN classifiers. An evolutionary approach to the prototype selection problem can be found in [26].
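To make the nearest-prototype rule concrete, a minimal sketch (illustrative only; names are ours, not from the paper):

```python
import numpy as np

def nearest_prototype(prototypes, proto_labels, x):
    """Assign x the label of its closest prototype (Euclidean distance)."""
    d = np.linalg.norm(prototypes - x, axis=1)
    return proto_labels[d.argmin()]
```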

PSO has been used for classification in many studies. Most of the PSO-based classification methods combine PSO with an existing machine learning or classification algorithm, such as NN [12, 27], neural networks [13], and rough set theory [28]. In [12], an unsupervised learning algorithm is proposed by minimizing the distances within clusters. In [13], an evolutionary approach-based nearest prototype classifier is introduced. In [27], PSO is applied to find the optimal positions of class centroids in the feature space of the dataset, using the examples contained in the training set.

PSO has shown competitive performance in classification problems. However, it usually needs many labeled data points to obtain the optimal positions of class centroids. Semisupervised learning provides a better solution by making full use of the abundant unlabeled data along with the limited labeled samples to improve the accuracy and robustness of class predictions [18, 29-31]. In this paper, we propose a semisupervised classification method based on the standard PSO, namely, semisupervised PSO (SSPSO), in which the available supervised information and the wealth of unlabeled data points are simultaneously used to search for the optimal positions of class centroids. The key point in SSPSO is to introduce the unlabeled information into the fitness function of PSO naturally. The advantages of SSPSO can be summarized as follows: firstly, it is a semisupervised learning method, which can be applied with limited labeled samples; secondly, with a smaller number of prototypes, the classification of new patterns is performed faster; thirdly, SSPSO is able to achieve competitive or even better accuracy than the basic NN classifier and SVM.

The rest of this paper is organized as follows. In Section 2, the theory of nearest neighbor classification and PSO is presented. The classification method based on PSO and our proposed SSPSO are described in Section 3. Experimental results and analysis on UCI datasets, some typical artificial datasets, and the USPS handwritten dataset are shown and discussed in Section 4. Finally, Section 5 concludes this paper.

2. Review of Related Methods

2.1. K-Nearest Neighbor Algorithm. In pattern recognition, the k-nearest neighbor algorithm (KNN) is a simple method for classification. KNN is a type of lazy learning, where the function is only approximated locally and all computation is deferred until classification. The simplest 1-NN algorithm assigns an unknown input sample to the class of its nearest neighbor from a stored labeled reference set. Instead of looking only at the closest labeled sample, the KNN algorithm seeks the k samples in the labeled reference set that are closest to the unknown sample and applies a voting mechanism to make a decision for label prediction.

Suppose $T = \{(\mathbf{x}_i, y_i)\}$ is the training set, where $\mathbf{x}_i \in \mathbb{R}^d$ denotes a training example in a continuous multidimensional feature space and $y_i \in \mathbb{R}$ is the class label of $\mathbf{x}_i$. For 1-NN classification, the class label of a test sample $\mathbf{x} \in \mathbb{R}^d$ can be obtained by finding the training example that is the nearest to $\mathbf{x}$ according to some distance metric, such as the Euclidean distance in (1), and assigning the class label of this training sample to it. For KNN classification, the class label of the test sample can be obtained with a method of majority voting. Consider

$$\|\mathbf{x} - \mathbf{x}_i\| = \sqrt{\sum_{j} \left(x_j - x_{ij}\right)^2}. \quad (1)$$
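For concreteness, a minimal NumPy sketch of the 1-NN and majority-vote KNN rules described above (function and variable names are our own illustration, not from the paper):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=1):
    """Classify sample x by majority vote among its k nearest
    training examples under the Euclidean distance of (1)."""
    # Euclidean distances from x to every training example
    dists = np.sqrt(((X_train - x) ** 2).sum(axis=1))
    # Indices of the k closest labeled samples
    nearest = np.argsort(dists)[:k]
    # k = 1 reduces to the plain nearest neighbor rule
    return Counter(y_train[nearest]).most_common(1)[0][0]
```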

2.2. Particle Swarm Optimization. PSO is based on a swarm of $N$ individuals called particles, each representing a solution to the problem with $D$ dimensions. Its genotype consists of $2D$ parameters, with the first $D$ parameters representing the coordinates of the particle's position and the latter $D$ parameters being its velocity components in the $D$-dimensional problem space, respectively. Besides the two basic properties, the following properties exist: a personal best position $\mathbf{pbest}_i$ of each particle in the search space and the global best position $\mathbf{gbest}$ of the whole swarm. A fitness function corresponding to the problem is used to evaluate the "goodness" of each particle. Given a randomly initialized position and velocity, the particles are updated with the following:

$$\mathbf{v}_i^{(t+1)} = \omega \mathbf{v}_i^{(t)} + c_1 r_1 \left(\mathbf{pbest}_i^{(t)} - \mathbf{p}_i^{(t)}\right) + c_2 r_2 \left(\mathbf{gbest}^{(t)} - \mathbf{p}_i^{(t)}\right), \quad (2)$$

$$\mathbf{p}_i^{(t+1)} = \mathbf{p}_i^{(t)} + \mathbf{v}_i^{(t+1)}, \quad (3)$$

where $\mathbf{p}_i^{(t)}$ and $\mathbf{v}_i^{(t)}$ are the position and velocity of the $i$th particle at the $t$th iteration, respectively. The two positive factors $c_1$ and $c_2$, known as the cognitive and social coefficients, control the contributions of the best local solution $\mathbf{pbest}_i$ (cognitive component) and the global best solution $\mathbf{gbest}$ (social component), respectively. $r_1$ and $r_2$ are two independent random variables within $[0, 1]$.

[Figure 1: The distributions of the artificial datasets used in the experiment. (a) long1; (b) sizes5; (c) square1; (d) square4.]

The inertia weight factor $\omega$ is used to control the convergence of the swarm. In this paper, a linearly decreasing inertia factor for PSO and SSPSO is used as in [32], which is shown in the following:

$$\omega = \omega_{\max} - \frac{\left(\omega_{\max} - \omega_{\min}\right) t}{T_{\max}}, \quad (4)$$

where $T_{\max}$ is the maximum number of iterations and $t$ is the current iteration number. Note that during the iteration, every dimension of the velocity is confined to the range $[-V_{\max}, V_{\max}]$ to limit the maximum distance that a particle will move.
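As an illustrative sketch, the updates (2)-(3) with the linearly decreasing inertia weight (4) and velocity clamping could be coded as follows (a schematic fragment under our own naming, not the authors' implementation):

```python
import numpy as np

def pso_step(p, v, pbest, gbest, t, T_max,
             c1=2.0, c2=2.0, w_max=0.9, w_min=0.4, v_max=0.05):
    """One PSO iteration: update velocities and positions of all
    particles. p, v, pbest are (N, D*C) arrays; gbest is (D*C,)."""
    # Linearly decreasing inertia weight, Eq. (4)
    w = w_max - (w_max - w_min) * t / T_max
    r1, r2 = np.random.rand(*p.shape), np.random.rand(*p.shape)
    # Velocity update, Eq. (2), clamped to [-v_max, v_max]
    v = w * v + c1 * r1 * (pbest - p) + c2 * r2 * (gbest - p)
    v = np.clip(v, -v_max, v_max)
    # Position update, Eq. (3)
    return p + v, v
```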

3. Semisupervised Particle Swarm Optimization for Classification

In the context of PSO-based classification on a dataset $\mathbf{X}$ with $C$ classes and $D$ attributes, the classification problem can be seen as that of searching for the optimal positions of the $C$ centroids of data clusters in a $D$-dimensional space with the labeled samples [23]. Then the NN method is applied as the classifier that assigns class labels by calculating distances to the centroids, to classify the unlabeled instances.

Data to be classified are a set of samples defined by continuous attributes, and the corresponding class is defined by a scalar value. Different attributes may take values in different ranges. To avoid an attribute with large values dominating the distance measure, all the attributes are normalized to the range $[0, 1]$ before classification.
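A one-function sketch of the min-max scaling described here (assuming a samples-by-attributes NumPy array; our own naming):

```python
import numpy as np

def normalize(X):
    """Rescale every attribute (column) of X to the range [0, 1]."""
    mn, mx = X.min(axis=0), X.max(axis=0)
    # Guard against constant attributes to avoid division by zero
    return (X - mn) / np.where(mx > mn, mx - mn, 1.0)
```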

As a particle denotes a full solution to the classification of data with $D$ attributes and $C$ classes, $C$ centroids are encoded in each particle. A centroid corresponds to a class, so it is defined by $D$ continuous values for the attributes. Table 1 describes the structure of a single particle's position. Centroids are encoded sequentially in the particle, and a separate array determines the class of each centroid. Namely, the class for each centroid is defined by its position inside the particle. The total dimension of the position vector is $D \times C$, and similarly the velocity of the particle is made up of $D \times C$ real numbers representing its $D \times C$ velocity components in the problem space. To simplify the notation, we denote $\mathbf{p}_{k,i}$ as the $k$th class centroid vector $\mathbf{p}((k-1)D+1 : kD)$ encoded in the $i$th particle.

Table 1: Encoding of a set of centroids in a particle for PSO.

            Centroid 1    ...   Centroid k           ...   Centroid C
Position    p(1 : D)      ...   p((k-1)D+1 : kD)     ...   p((C-1)D+1 : CD)
Class       1             ...   k                    ...   C
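In code, the encoding of Table 1 amounts to viewing a flat position vector of length $D \times C$ as $C$ stacked class centroids; a minimal sketch (our own naming):

```python
import numpy as np

def centroids_of(particle, C, D):
    """Decode a flat position vector into a (C, D) matrix whose
    k-th row is the centroid of class k (classes are implicit
    in the position inside the particle, as in Table 1)."""
    return particle.reshape(C, D)
```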

[Figure 2: Classification results on the artificial datasets with different algorithms (SSPSO, PSO, NN, SVM). (a) long1; (b) sizes5; (c) square1; (d) square4.]

[Figure 3: Classification results on the UCI datasets with different algorithms (SSPSO, PSO, NN, SVM). (a) Heart (2 classes); (b) Wine (3 classes); (c) Thyroid (3 classes); (d) Tae (3 classes); (e) SPECT (2 classes); (f) Wdbc (3 classes).]

[Figure 4: Typical behavior of best and average fitness of SSPSO as a function of the number of iterations.]

Fitness function plays an important role in PSO. A good fitness function can quickly find the optimal positions of the particles. In [26], the fitness function $\psi$ of the classical PSO classification method is computed as the sum of the Euclidean distances between all the training samples and the class centroids encoded in the particle they belong to. Then the sum is divided by $l$, the total number of training samples. The fitness of the $i$th particle is defined as

$$\psi(\mathbf{p}_i) = \frac{1}{l} \sum_{j=1}^{l} d\left(\mathbf{x}_j, \mathbf{p}_{CL(j),i}\right), \quad (5)$$

where $CL(j)$ denotes the class label of the training sample $\mathbf{x}_j$, $\mathbf{p}_{CL(j),i}$ denotes the centroid vector of the class $CL(j)$ encoded in the $i$th particle, and $d(\mathbf{x}_j, \mathbf{p}_{CL(j),i})$ is the Euclidean distance between the training sample $\mathbf{x}_j$ and the class centroid $\mathbf{p}_{CL(j),i}$.

Equation (5) only considers the labeled samples, which are used to provide the discriminant information. However, when the labeled samples are limited, they are too few to represent the real distribution of the dataset, while abundant unlabeled samples are often available and may be helpful to capture the real geometrical structure of the whole dataset. To take full advantage of the existing unlabeled samples, we modify the fitness function by introducing the structure information of unlabeled samples into the original fitness function of the PSO classifier. With the assumption of the NN method that neighboring samples should have the same labels, we propose to use a new fitness function in our proposed SSPSO as follows:

$$\psi(\mathbf{p}_i) = \beta \left( \frac{1}{l} \sum_{j=1}^{l} d\left(\mathbf{x}_j, \mathbf{p}_{CL(j),i}\right) \right) + (1 - \beta) \left( \frac{1}{u} \sum_{k=1}^{u} \min\left\{ d\left(\mathbf{x}_k, \mathbf{p}_{1,i}\right), d\left(\mathbf{x}_k, \mathbf{p}_{2,i}\right), \ldots, d\left(\mathbf{x}_k, \mathbf{p}_{C,i}\right) \right\} \right), \quad (6)$$

where $\psi(\mathbf{p}_i)$ is the fitness value of the $i$th particle; $\beta$ is a weight factor in the range $[0, 1]$, which controls the ratio of information obtained from the labeled and unlabeled samples; $u$ is the number of unlabeled samples $\mathbf{X}_U$; and $l$ is the number of labeled samples $\mathbf{X}_L$. The first term of the fitness function is the discriminant constraint, which means that a good classifier should have a better result on the labeled samples. The second term is the structure constraint, which is helpful to find the real distribution of the whole dataset so as to improve the classification performance. When $\beta = 1$ we obtain the standard PSO algorithm, and when $\beta = 0$ we obtain an unsupervised PSO clustering method.
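A direct NumPy transcription of fitness (6) might look as follows (a sketch under our own naming; it assumes integer labels 0..C-1 and reuses the `centroids_of` decoder sketched after Table 1):

```python
import numpy as np

def fitness(particle, X_L, y_L, X_U, C, D, beta=0.5):
    """Semisupervised fitness of Eq. (6): labeled samples are pulled
    toward the centroid of their own class, unlabeled samples toward
    their nearest centroid."""
    cents = centroids_of(particle, C, D)              # (C, D)
    # Supervised term: distance of each labeled sample to the
    # centroid of its class (labels assumed to be 0..C-1 here)
    d_lab = np.linalg.norm(X_L - cents[y_L], axis=1).mean()
    # Unsupervised term: distance of each unlabeled sample to its
    # nearest centroid, min over classes as in Eq. (6)
    d_all = np.linalg.norm(X_U[:, None, :] - cents[None, :, :], axis=2)
    d_unl = d_all.min(axis=1).mean()
    return beta * d_lab + (1 - beta) * d_unl
```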

The detailed process of the proposed SSPSO is as follows.

Input: The labeled dataset $\mathbf{X}_L = \{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_l\}$ and the corresponding label set $\mathbf{Y}_L = (y_1, y_2, \ldots, y_l)^T$; the unlabeled dataset $\mathbf{X}_U = \{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_u\}$.

Output: The labels of the unlabeled samples.

Step 1. Load the training dataset and the unlabeled samples.

Step 2. Normalize the dataset.

Step 3. Initialize the swarm with $N$ particles by randomly generating both the position and velocity vectors for each particle, with entry values in the range $[0, 1]$. Note that the dimension of each particle equals the product of the number of attributes $D$ and the number of classes $C$.

Step 4. Iterate until the maximum number of iterations is reached.

Substep 1. Calculate the fitness value $\psi(\mathbf{p}_i^{(t)})$ for each particle with (6).

Substep 2. Update the best fitness value $\psi(\mathbf{pbest}_i^{(t)})$ and the best position $\mathbf{pbest}_i^{(t)}$ of the $i$th particle; that is, if $\psi(\mathbf{p}_i^{(t)}) < \psi(\mathbf{pbest}_i^{(t)})$, then $\psi(\mathbf{pbest}_i^{(t)}) = \psi(\mathbf{p}_i^{(t)})$ and $\mathbf{pbest}_i^{(t)} = \mathbf{p}_i^{(t)}$.

Substep 3. If necessary, update the global best particle $\mathbf{gbest}^{(t)}$; that is, $\mathbf{b}^{(t)} = \arg\min \{\psi(\mathbf{p}_1^{(t)}), \psi(\mathbf{p}_2^{(t)}), \ldots, \psi(\mathbf{p}_N^{(t)})\}$; if $\psi(\mathbf{b}^{(t)}) < \psi(\mathbf{gbest}^{(t)})$, then $\psi(\mathbf{gbest}^{(t)}) = \psi(\mathbf{b}^{(t)})$ and $\mathbf{gbest}^{(t)} = \mathbf{b}^{(t)}$.

Substep 4. Update the particles' velocities with (2).

Substep 5. Update the particles' positions with (3).

Substep 6. Update the inertia factor $\omega$ with (4).

Step 5. Use the NN method to obtain the labels of the unlabeled samples $\mathbf{X}_U$ with the optimum centroids represented by $\mathbf{gbest}$.
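Putting Steps 1-5 together, the main loop can be sketched as below (schematic only, not the authors' code; it reuses the `fitness` and `pso_step` fragments above and assumes integer labels 0..C-1 and data already normalized to $[0, 1]$):

```python
import numpy as np

def sspso(X_L, y_L, X_U, C, D, N=20, T_max=1000, beta=0.5):
    """Schematic SSPSO: evolve N particles, each encoding C centroids,
    then label X_U by the nearest centroid of the global best."""
    p = np.random.rand(N, C * D)                  # Step 3: positions in [0, 1]
    v = np.random.rand(N, C * D)
    pbest = p.copy()
    pbest_fit = np.array([fitness(pi, X_L, y_L, X_U, C, D, beta) for pi in p])
    gbest = pbest[pbest_fit.argmin()].copy()
    for t in range(T_max):                        # Step 4
        fit = np.array([fitness(pi, X_L, y_L, X_U, C, D, beta) for pi in p])
        improved = fit < pbest_fit                # Substep 2
        pbest[improved], pbest_fit[improved] = p[improved], fit[improved]
        if pbest_fit.min() < fitness(gbest, X_L, y_L, X_U, C, D, beta):
            gbest = pbest[pbest_fit.argmin()].copy()   # Substep 3
        p, v = pso_step(p, v, pbest, gbest, t, T_max)  # Substeps 4-6
    cents = gbest.reshape(C, D)                   # Step 5: NN labeling
    return np.linalg.norm(X_U[:, None, :] - cents[None, :, :],
                          axis=2).argmin(axis=1)
```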

[Figure 5: Digit recognition on the USPS dataset with different algorithms (SSPSO, PSO, NN, SVM). (a) Digits "0" and "8"; (b) digits "3", "5", and "8"; (c) digits "3", "8", and "9"; (d) digits "1", "2", "3", and "4".]

4. Experimental Results and Analysis

In this section, we assess our proposed method SSPSO on six UCI datasets, four artificial datasets, and the USPS handwritten dataset. The datasets have different numbers of attributes and classes, involving different problems, including balanced and unbalanced ones.

To evaluate the performance of SSPSO, we compare its classification results with the PSO-based classifier, the traditional NN classifier, and the classical SVM classifier. In order to compare the algorithms fairly, all the parameters of PSO and SSPSO are selected to make them obtain the best results. The parameter settings are as follows. The inertia weight factor $\omega$ used in PSO and SSPSO decreases linearly from 0.9 to 0.4. Both $c_1$ and $c_2$ are set to 2. The velocity is defined in the range $[-0.05, 0.05]$. The swarm scale $N$ is set to 20, and the maximum number of iterations $T_{\max}$ is 1000. The parameters of SVM with a Gaussian kernel function are selected by using grid search on the training dataset.

[Figure 6: Recognition of digits 3, 5, and 8 on the USPS dataset by SSPSO with different numbers of unlabeled samples (10, 50, 100, 200, 400, and 600).]

In addition, we analyze the effect of the number of unlabeled samples on the classification accuracy on the USPS dataset. In order to test the robustness of the classification performance with respect to the parameter $\beta$ in the fitness function, we conduct experiments on UCI datasets with different values of $\beta$ and analyze the effect of $\beta$ on the classification performance.

4.1. Artificial Two-Dimensional Problems. To test the feasibility of SSPSO for classification, the proposed method is first evaluated on four artificial two-dimensional datasets, that is, long1, sizes5, square1, and square4. The details are given in Table 2, and the distributions of the four datasets are shown in Figure 1.

In the experiments, for the first two datasets we randomly select 1-30 labeled samples per class as the training data, and for the last two datasets we randomly select 5-40 labeled samples per class as the training set; the rest are used as the test set. Figure 2 plots the curves of accuracy with respect to the number of labeled samples, showing the average results over 100 runs of the proposed SSPSO compared with PSO, NN, and SVM on the four datasets. The weight factor $\beta$ in the fitness function of SSPSO (in (6)) is selected as 0.5. From Figure 2, it can be observed that SSPSO obtains favorable classification results on the four datasets, which means that SSPSO is feasible for the classification problem. Among the four datasets, long1 is the easiest to classify, on which all four methods achieve 100% classification accuracy when the number of labeled samples per class exceeds 10. But when the labeled instances are few, for example, only 3 instances per class, PSO, NN, and SVM cannot classify all the test data correctly, while SSPSO can still obtain 100% classification accuracy. In Figure 2(b), the performance difference among SSPSO, NN, and SVM is not noticeable once the number of labeled samples per class reaches 15, but when the number of labeled instances is small, for example, less than 10, SSPSO obtains obviously better accuracy than the other methods. This is because SSPSO utilizes the information of unlabeled instances, which is helpful to capture the global structure.

Table 2: Artificial datasets used in experiments.

Data      Class number   Attribute number   Instance number     Normalization
long1     2              2                  500/500             Yes
sizes5    3              2                  77/154/769          Yes
square1   4              2                  250/250/250/250     Yes
square4   4              2                  250/250/250/250     Yes

Table 3: UCI datasets used in experiments.

Data      Classes   Attributes   Instances      Normalization
Heart     2         13           150/120        Yes
Wine      3         13           59/71/48       Yes
Thyroid   3         5            150/35/30      Yes
Tae       3         5            49/50/52       Yes
SPECT     2         22           267            Yes
Wdbc      3         30           357/212        Yes

For the square1 and square4 datasets, the superiority of SSPSO is more apparent; that is, for all the training scenarios, the best performance is achieved by the proposed method.

4.2. UCI Datasets. To further investigate the effectiveness of SSPSO for classification, we also conduct experiments on six real-life datasets with different numbers of attributes and classes from the UCI machine learning repository [33]. The description of the datasets used in the experiments is given in Table 3.

For datasets with 2 classes, we randomly select 1-15 labeled samples per class as the training data, and for datasets with 3 classes, we randomly select 1-10 labeled samples per class as the training data; the rest are used as the test set. The results are averaged over 100 runs. The weight factor $\beta$ in the fitness function of SSPSO (in (6)) is selected as 0.5. Figure 3 shows the classification accuracy with different numbers of training samples on the 6 datasets.

From Figures 3(a), 3(b), 3(c), and 3(e), it can be observed that the proposed SSPSO method outperforms the other three methods on the Heart, Wine, Thyroid, and SPECT datasets, especially when the number of labeled samples per class is small. This is because SSPSO uses the information of available unlabeled data, which benefits the classification. With the increase of the labeled samples, the superiority becomes weak. From Figure 3(d), it is seen that SSPSO obtains accuracy comparable with the other three methods. From Figure 3(f), SSPSO is slightly better than SVM, but it is much better than the PSO and NN methods. Therefore, it can be concluded that SSPSO works well for some real-life classification tasks, especially in the case that the labeled samples are highly limited.

From an evolutionary point of view, in Figure 4 we report the behavior of a typical run of SSPSO in terms of the best individual fitness and average fitness in the population as a function of the number of iterations, carried out on the Thyroid dataset. As can be seen, SSPSO shows two phases.

[Figure 7: Classification accuracy as a function of $\beta$ in SSPSO on (a) Thyroid and (b) Heart.]

In the first phase, of about 50 iterations, the fitness value decreases sharply, starting from 0.6712 for the best and 0.9192 for the average and reaching about 0.2305 for the best and 0.2356 for the average. Then the second phase follows, lasting about 50 iterations, in which both the best and the average fitness values decrease slowly and tend to become closer and closer until they reach 0.2231 and 0.2247, respectively. After that, the average and the best fitness values become more and more similar. Finally, both values reach 0.2230.

4.3. USPS Digit Recognition. We also conduct experiments on the USPS handwritten digits dataset to test the performance of SSPSO. This dataset consists of 9298 samples with 10 classes, and each sample has a size of 16×16 pixels. Firstly, we apply principal component analysis to the dataset for feature extraction and select the first 10 principal components as the new features.

We consider four subsets of the dataset in the experiment, that is, the images of digits 0 and 8 with 2261 examples in total, the images of digits 3, 5, and 8 with 2248 examples in total, the images of digits 3, 8, and 9 with a total number of 2429 examples, and the images of digits 1, 2, 3, and 4 with a total number of 3874 examples. We randomly select 1-10 samples per class, respectively, as the training data and randomly select 200 unlabeled samples to construct the unlabeled sample set $\mathbf{X}_U$, which is used for semisupervised learning. The weight factor $\beta$ in the fitness function of SSPSO (in (6)) is selected as 0.7.

The recognition results averaged over 100 independent trials are summarized in Figure 5, where the horizontal axis represents the number of randomly labeled digit images per class in the subset and the vertical axis represents the classification accuracy. From Figure 5(a), it is shown that when the number of labeled samples per class is below 14, SSPSO obtains performance comparable with SVM and KNN and is better than PSO. In particular, SSPSO outperforms the other methods when the labeled samples are few. For the results on the USPS subset of digits 3, 5, and 8 and the subset of digits 3, 8, and 9, shown in Figures 5(b) and 5(c), respectively, one can clearly see that the SSPSO method outperforms SVM and is much better than the KNN and PSO methods when the number of labeled samples is small. In Figure 5(d), SSPSO still works better than the other methods, but its superiority over the other methods decreases with the increase of labeled samples.

4.4. Sensitivity Analysis of the Number of Unlabeled Samples. In this section, we validate the effect of the number of unlabeled samples on the classification accuracy. This experiment is carried out on the subset of the USPS dataset with the digit images of 3, 5, and 8. We vary the size of the unlabeled set $\mathbf{X}_U$ over 10, 50, 100, 200, 400, and 600. Figure 6 illustrates the classification accuracy as a function of the size of the unlabeled set and the number of labeled samples. From Figure 6, one can see that the number of unlabeled samples affects the accuracy only slightly when the number of labeled samples is small. The plot with 10 unlabeled samples gets much lower accuracy than the other plots, which indicates that the number of unlabeled samples used should not be too small. With the increase of the size of $\mathbf{X}_U$, SSPSO obtains better classification accuracy, because the proposed method can capture the real structure of the whole dataset more precisely with more unlabeled samples,

but the gaps between the plots of SSPSO with different sizes of $\mathbf{X}_U$ become very small. It is noted from the above experiment that, for an unlabeled dataset $\mathbf{X}_U$ of a certain scale, when more unlabeled data are added, the classification accuracy of SSPSO may increase a bit, but this also brings higher computation cost. So, in the experiments, 100 to 200 unlabeled samples are appropriate for SSPSO.

4.5. Sensitivity Analysis of the Parameter $\beta$. The parameter $\beta$ in the fitness function is an important parameter in our proposed SSPSO, which controls the contributions of information obtained from the labeled and unlabeled samples to the classification. In this section, we analyze the sensitivity of $\beta$ in SSPSO. The experiments are conducted on two UCI datasets, that is, Thyroid and Heart. We randomly select 5 samples per class to form the labeled dataset, and the rest are used for test. The mean results over 100 runs with randomly selected training datasets and different values of $\beta$ are shown in Figure 7.

From Figures 7(a) and 7(b), it can be observed that with different values of $\beta$, SSPSO is not always better than the NN and SVM methods. When $\beta$ is small, SSPSO may obtain a bad performance; that is, the accuracy is much lower than NN and SVM. When the value of $\beta$ is small, the effect of the labeled samples is weakened while the effect of the unlabeled samples is strengthened, which is much more like unsupervised learning. With the increase of $\beta$, the accuracy rises sharply. After $\beta$ gets to 0.4 for the Thyroid dataset and 0.5 for the Heart dataset, the performance keeps stable and even decreases more or less. To balance the effects of the labeled instances and the available unlabeled instances, $\beta$ is set to 0.5 in our experiments.

5. Conclusions

In this paper, a semisupervised PSO method for classification has been proposed. PSO is used to find the centroids of the classes. In order to take advantage of the large number of unlabeled instances, a semisupervised classification method is proposed based on the assumption that nearby instances in the feature space should have the same labels. Since the discriminative information provided by the labeled samples and the global distribution information provided by the large number of unlabeled samples are both used to find the collection of centroids, SSPSO obtains better performance than the traditional PSO classification method. In the experiments, four artificial datasets, six real-life datasets from the UCI machine learning repository, and the USPS handwritten dataset are used to evaluate the effectiveness of the method. The experimental results demonstrate that our proposed SSPSO method performs well and obtains higher accuracy than the traditional PSO classification method, the NN method, and SVM when only a few labeled samples are available.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (nos. 61272282, 61203303, 61272279, 61377011, and 61373111), the Program for New Century Excellent Talents in University (NCET-13-0948), and the Program for New Scientific and Technological Star of Shaanxi Province (no. 2014KJXX-45).

References

[1] R. Eberhart and J. Kennedy, "A new optimizer using particle swarm theory," in Proceedings of the 6th International Symposium on Micro Machine and Human Science, pp. 39-43, October 1995.
[2] J. Kennedy and R. Eberhart, "Particle swarm optimization," in Proceedings of the IEEE International Conference on Neural Networks (ICNN '95), pp. 1942-1948, December 1995.
[3] R. C. Eberhart and Y. Shi, "Particle swarm optimization: developments, applications and resources," in Proceedings of the Congress on Evolutionary Computation (CEC '01), pp. 81-86, May 2001.
[4] A. Paul, A. A. Victoire, and A. E. Jeyakumar, "Particle swarm approach for retiming in VLSI," in Proceedings of the 46th IEEE Mid-West Symposium on Circuits and Systems, pp. 1532-1535, 2003.
[5] X. Zhang, W. Wang, Y. Li, and L. Jiao, "PSO-based automatic relevance determination and feature selection system for hyperspectral image classification," IET Electronics Letters, vol. 48, no. 20, pp. 1263-1265, 2012.
[6] R. J. Lewis, "An introduction to classification and regression tree (CART) analysis," in Annual Meeting of the Society for Academic Emergency Medicine, San Francisco, Calif, USA, pp. 1-14, May 2000.
[7] K. R. Muller, S. Mika, G. Ratsch, K. Tsuda, and B. Scholkopf, "An introduction to kernel-based learning algorithms," IEEE Transactions on Neural Networks, vol. 12, pp. 181-201, 2001.
[8] C. J. C. Burges, "A tutorial on support vector machines for pattern recognition," Data Mining and Knowledge Discovery, vol. 2, no. 2, pp. 121-167, 1998.
[9] L. P. Wang, Ed., Support Vector Machines: Theory and Application, Springer, Berlin, Germany, 2005.
[10] L. P. Wang, B. Liu, and C. R. Wan, "Classification using support vector machines with graded resolution," in Proceedings of the IEEE International Conference on Granular Computing, vol. 2, pp. 666-670, July 2005.
[11] G. P. Zhang, "Neural networks for classification: a survey," IEEE Transactions on Systems, Man, and Cybernetics C: Applications and Reviews, vol. 30, no. 4, pp. 451-462, 2000.
[12] J. Trejos-Zelaya and M. Villalobos-Arias, "Partitioning by particle swarm optimization," in Selected Contributions in Data Analysis and Classification, Studies in Classification, Data Analysis, and Knowledge Organization, pp. 235-244, 2007.
[13] M. Sugisaka and X. Fan, "An effective search method for neural network based face detection using particle swarm optimization," IEICE Transactions on Information and Systems, vol. E88-D, no. 2, pp. 214-222, 2005.
[14] Z. Q. Wang, X. Sun, and D. X. Zhang, "A PSO-based classification rule mining algorithm," in Proceedings of the 3rd International Conference on Intelligent Computing: Advanced Intelligent Computing Theories and Applications, pp. 377-384, 2007.
[15] T. Sousa, A. Silva, and A. Neves, "Particle swarm based data mining algorithms for classification tasks," Parallel Computing, vol. 30, no. 5-6, pp. 767-783, 2004.
[16] Y. J. Zheng, H. F. Ling, J. Y. Xue, et al., "Population classification in fire evacuation: a multiobjective particle swarm optimization approach," IEEE Transactions on Evolutionary Computation, vol. 18, no. 1, pp. 70-81, 2014.
[17] N. P. Holden and A. A. Freitas, "A hybrid PSO/ACO algorithm for classification," in Proceedings of the 9th Annual Genetic and Evolutionary Computation Conference (GECCO '07), pp. 2745-2750, July 2007.
[18] X. Zhu, "Semi-supervised learning literature survey," Computer Sciences Technical Report 1530, University of Wisconsin-Madison, 2008.
[19] A. Paul, "Dynamic power management for ubiquitous network devices," Advanced Science Letters, vol. 19, no. 7, pp. 2046-2049, 2013.
[20] J. Wu, "A framework for learning comprehensible theories in XML document classification," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 1, pp. 1-14, 2012.
[21] I. Koprinska, J. Poon, J. Clark, and J. Chan, "Learning to classify e-mail," Information Sciences, vol. 177, no. 10, pp. 2167-2187, 2007.
[22] X. Zhao, X. Li, Y. P. Chao, and S. Wang, "Human action recognition based on semi-supervised discriminant analysis with global constraint," Neurocomputing, vol. 105, pp. 45-50, 2013.
[23] A. Cervantes, I. M. Galvan, and P. Isasi, "AMPSO: a new particle swarm method for nearest neighborhood classification," IEEE Transactions on Systems, Man, and Cybernetics B: Cybernetics, vol. 39, no. 5, pp. 1082-1091, 2009.
[24] T. M. Cover and P. E. Hart, "Nearest neighbor pattern classification," IEEE Transactions on Information Theory, vol. 13, no. 1, pp. 21-27, 1967.
[25] F. Aurenhammer, "Voronoi diagrams: a survey of a fundamental geometric data structure," ACM Computing Surveys, vol. 23, no. 3, pp. 345-405, 1991.
[26] F. Fernandez and P. Isasi, "Evolutionary design of nearest prototype classifiers," Journal of Heuristics, vol. 10, no. 4, pp. 431-454, 2004.
[27] I. De Falco, A. Della Cioppa, and E. Tarantino, "Evaluation of particle swarm optimization effectiveness in classification," Lecture Notes in Computer Science, vol. 3849, pp. 164-171, 2006.
[28] K. Y. Huang, "A hybrid particle swarm optimization approach for clustering and classification of datasets," Knowledge-Based Systems, vol. 24, no. 3, pp. 420-426, 2011.
[29] L. Zhang, L. Wang, and W. Lin, "Semisupervised biased maximum margin analysis for interactive image retrieval," IEEE Transactions on Image Processing, vol. 21, no. 4, pp. 2294-2308, 2012.
[30] B. Liu, C. Wan, and L. Wang, "An efficient semi-unsupervised gene selection method via spectral biclustering," IEEE Transactions on Nanobioscience, vol. 5, no. 2, pp. 110-114, 2006.
[31] X. Zhang, Y. He, N. Zhou, and Y. Zheng, "Semisupervised dimensionality reduction of hyperspectral images via local scaling cut criterion," IEEE Geoscience and Remote Sensing Letters, vol. 10, no. 6, pp. 1547-1551, 2013.
[32] Y. Shi and R. Eberhart, "A modified particle swarm optimizer," in Proceedings of the IEEE International Conference on Evolutionary Computation, pp. 69-73, 1998.
[33] D. J. Newman, S. Hettich, C. L. Blake, and C. J. Merz, "UCI repository of machine learning databases," Department of Information and Computer Science, University of California, Irvine, Calif, USA, 1998, http://www.ics.uci.edu/~mlearn/MLRepository.html.


Page 2: Research Article Semisupervised Particle Swarm …downloads.hindawi.com/journals/mpe/2014/832135.pdfSemisupervised Particle Swarm Optimization for Classification XiangrongZhang, 1

2 Mathematical Problems in Engineering

data in advance [23] The nearest neighbor decision ruleassigns an unknown input sample vector to the class labelof its nearest neighbor [24] which is measured in termsof a distance defined in the feature space In this spaceeach class defines a region which is called the Voronoiregion [25] When the distance is defined as the classicalEuclidean distance the Voronoi regions are delimited bylinear borders This method can be extended to the K-nearest neighbors when more than one nearest neighbor isconsidered In addition some other distance measures otherthan the Euclidean distance also can be used

A further improvement of NN method replaces theoriginal training data by a set of prototypes that correctlyldquorepresentrdquo the original data [23] Namely the classifierassigns class labels by calculating distances to the prototypesrather than to the original training data As the number ofprototypes is much smaller than the total number of originaltraining data sets classification of new sample is performedmuch faster due to the reduced computational complexityof the solution (measured by the number of prototypes)These nearest prototype algorithms are able to achieve betteraccuracy of the solution than the basic NN classifiers Anevolutionary approach to the prototype selection problemcan be found in [26]

PSO has been used in classification in many literaturesMost of the PSO-based classification methods combine PSOwith an existing machine learning or classification algorithmsuch as NN [12 27] neural network [13] and rough settheory [28] In [12] an unsupervised learning algorithm isproposed byminimizing the distances within clusters In [13]an evolutionary approach-based nearest prototype classifieris introduced In [27] PSO is applied to find the optimalpositions of class centroids in the feature space of datasetusing the examples contained in the training set

PSO has shown competitive performance in classificationproblem However it usually needs many labeled data pointsto obtain the optimal positions of class centroids Semisu-pervised learning provides a better solution by making fulluse of the abundant unlabeled data along with the limitedlabeled samples to improve the accuracy and robustness ofclass predictions [18] and [29ndash31] In this paper we propose asemisupervised classification method based on the standardPSO namely semisupervised PSO (SSPSO) in which someavailable supervised information and the wealth of unlabeleddata points are simultaneously used to search for the optimalpositions of class centroids The key point in SSPSO is tointroduce the unlabeled information to the fitness functionof PSO naturally The advantages of SSPSO can be concludedas follows firstly it is a semisupervised learning methodwhich can be applied with limited labeled samples secondlywith less number of prototypes the classification of newpatterns will be performed faster thirdly SSPSO is able toachieve competitive or even better accuracy than the basicNN classifier and SVM

The rest part of this paper is organized as followsIn Section 2 the theory of nearest neighbor classificationand PSO is presented The classification method based onPSO and our proposed SSPSO are described in Section 3Experimental results and analysis on UCI datasets some

typical artificial datasets and the USPS handwritten datasetare shown and discussed in Section 4 Finally Section 5concludes this paper

2 Review of Related Methods

21 K-Nearest Neighbor Algorithm In pattern recognitionthe k-nearest neighbor algorithm (KNN) is a simple methodfor classification KNN is a type of lazy learning where thefunction is only approximated locally and all computationis deferred until classification The simplest 1-NN algorithmassigns an unknown input sample to the class of its nearestneighbor from a stored labeled reference set Instead oflooking at the closest labeled sample the KNN algorithmseeks k samples in the labeled reference set that are closestto the unknown sample and applies a voting mechanism tomake a decision for label prediction

Suppose T = (x119894 119910119894) is the training set where x

119894isin R119889

denotes the training example in a continuous multidimen-sional feature space and 119910

119894isin R is class label of x

119894 For 1-NN

classification the class label of a test sample x isin R119889 can beobtained by finding the training example that is the nearest tox according to some distance metrics such as the Euclideandistance in (1) and assigning the class label of this trainingsample to it For KNN classification the class label of the testsample can be obtained with a method of majority votingConsider

1003817100381710038171003817

x minus x119894

1003817100381710038171003817

=radicsum

119895

(119909119895minus 119909119894119895)

2

(1)

22 Particle Swarm Optimization PSO is based on a swarmof119873 individuals called particles each representing a solutionto the problem with 119863 dimensions Its genotype consists of2119863 parameters with the first 119863 parameters representing thecoordinates of particlersquos position and the latter 119863 parametersbeing its velocity components in the119863-dimensional problemspace respectively Besides the two basic properties the fol-lowing properties exit a personal best position pbest

119894of each

particle in the searching space and the global best positiongbest of the whole swarm A fitness function correspondingto the problem is used to evaluate the ldquogoodnessrdquo of eachparticle Given a randomly initial position and velocity theparticles can be updated with the following

k119894

(119905+1)= 120596k(119905)119894+ 11988811199031(pbest(119905)

119894minus p(119905)119894)

+ 11988821199032(gbest(119905) minus p(119905)

119894)

(2)

p(119905+1)119894

= p(119905)119894+ k(119905+1)119894 (3)

where p(119905)119894

and k(119905)119894

are the position and velocity of the 119894thparticle at the 119905th iteration respectively The two positivefactors 119888

1and 1198882 known as the cognitive and social coef-

ficients control the contributions of the best local solutionpbest

119894(cognitive component) and the global best solution

gbest (social component) respectively 1199031and 119903

2are two

independent random variables within [0 1] The inertia

Mathematical Problems in Engineering 3

0 1 2 3 4 5

0

02

04

06

08

1

12

14

minus02

minus04minus4 minus3 minus2 minus1

(a)

0 5 10 15 20

0

5

10

15

20

minus10minus10

minus5

minus5

(b)

0 5 10 15 20

0

5

10

15

minus10minus10

minus5

minus5

(c)

0 5 10 15

0

2

4

6

8

10

12

minus5

minus2

minus4

minus6minus10

(d)

Figure 1 The distributions of the artificial datasets used in the experiment (a) long1 (b) sizes5 (c) square1 and (d) square4

weight factor 120596 is used to control the convergence of theswarm In this paper a nonlinear changing inertia factorfor PSO and SSPSO is used as [32] which is shown in thefollowing

120596 = 120596max minus(120596max minus 120596min) 119905

119879max (4)

where 119879max is the maximum number of iterations and 119905is the current number of iteration Note that during theiteration every dimension of the velocity is defined in therange [minus119881max 119881max] to limit the maximum distance that aparticle will move

3 Semisupervised Particle SwarmOptimization for Classification

In the context of PSO-based classification on the dataset Xwith 119862 classes and 119863 attributes classification problem canbe seen as that of searching for the optimal positions for the119862 centroids of data clusters in a 119863-dimensional space withthe labeled samples [23] Then NN method is applied as the

classifier that assigns class labels by calculating distances tothe centroids to classify the unlabeled instances

Data to be classified are a set of samples which aredefined by continuous attributes and the corresponding classis defined by a scalar value Different attributes may takevalues in different ranges To avoid one of attributes with largevalue dominating the distance measure all the attributes arenormalized to the range [0 1] before classification

As a particle demotes a full solution to the classificationof data with 119863 attributes and 119862 classes 119862 centroids areencoded in each particle A centroid corresponds to a classso it is defined by 119863 continuous values for the attributesTable 1 describes the structure of a single particlersquos positionCentroids are encoded sequentially in the particle and aseparate array determines the class of each centroid Namelythe class for each centroid is defined by its position insidethe particle The total dimension of the position vector is119863 lowast 119862 and similarly the velocity of the particle is madeup of 119863 lowast 119862 real numbers representing its 119863 lowast 119862 velocitycomponents in the problem space To simplify the notationfor representation we denote p

119896119894as the 119896th class centroid

vector p((119896 minus 1) lowast 119863 + 1 119896 lowast 119863) encoded in the 119894th particle

4 Mathematical Problems in Engineering

Table 1 Encoding of a set of centroids in a particle for PSO

Centroid 1 sdot sdot sdot Centroid k sdot sdot sdot Centroid CPosition 119901(1 119863) sdot sdot sdot 119901((119896 minus 1) lowast 119863 + 1 119896 lowast 119863) sdot sdot sdot 119901((119862 minus 1) lowast 119863 + 1 119862 lowast 119863)

Class 1 sdot sdot sdot k sdot sdot sdot C

Figure 2: Classification results (accuracy versus number of labeled samples) of SSPSO, PSO, NN, and SVM on the artificial datasets: (a) long1, (b) sizes5, (c) square1, and (d) square4.

Figure 3: Classification results (accuracy versus number of labeled samples) of SSPSO, PSO, NN, and SVM on the UCI datasets: (a) Heart (2 classes), (b) Wine (3 classes), (c) Thyroid (3 classes), (d) Tae (3 classes), (e) SPECT (2 classes), and (f) Wdbc (2 classes).

Figure 4: Typical behavior of the best and average fitness of SSPSO as a function of the number of iterations.

The fitness function plays an important role in PSO: a good fitness function enables the optimal positions of the particles to be found quickly. In [26], the fitness function ψ of the classical PSO classification method is computed as the sum of the Euclidean distances between all the training samples and the class centroids, encoded in the particle, to which they belong; the sum is then divided by l, the total number of training samples. The fitness of the ith particle is defined as

ψ(p_i) = (1/l) · Σ_{j=1}^{l} d(x_j, p_i^{CL(j)}),    (5)

where CL(j) denotes the class label of the training sample x_j, p_i^{CL(j)} denotes the centroid vector of the class CL(j) encoded in the ith particle, and d(x_j, p_i^{CL(j)}) is the Euclidean distance between the training sample x_j and the class centroid p_i^{CL(j)}.

Equation (5) considers only the labeled samples, which provide the discriminant information. However, when the labeled samples are limited, they are too few to represent the real distribution of the dataset, while abundant unlabeled samples are often available and may help to capture the real geometric structure of the whole dataset. To take full advantage of the available unlabeled samples, we modify the fitness function by introducing the structure information of the unlabeled samples into the original fitness function of the PSO classifier. Following the assumption of the NN method that neighboring samples should have the same labels, we propose the new fitness function of SSPSO as follows:

ψ(p_i) = β · (1/l) Σ_{j=1}^{l} d(x_j, p_i^{CL(j)}) + (1 − β) · (1/u) Σ_{k=1}^{u} min{d(x_k, p_i^1), d(x_k, p_i^2), …, d(x_k, p_i^C)},    (6)

where ψ(p_i) is the fitness value of the ith particle; β is a weight factor in the range [0, 1] that controls the ratio of the information obtained from the labeled and the unlabeled samples; u is the number of unlabeled samples in X_U; and l is the number of labeled samples in X_L. The first term of the fitness function is the discriminant constraint, which expresses that a good classifier should perform well on the labeled samples. The second term is the structure constraint, which helps to find the real distribution of the whole dataset and thus to improve the classification performance. When β = 1, we obtain the standard PSO classification algorithm, and when β = 0, we obtain an unsupervised PSO clustering method.

The detailed process of the proposed SSPSO is as follows.

Input. The labeled dataset X_L = {x_1, x_2, …, x_l} with the corresponding label set Y_L = (y_1, y_2, …, y_l)^T, and the unlabeled dataset X_U = {x_1, x_2, …, x_u}.

Output. The labels of the unlabeled samples.

Step 1. Load the training dataset and the unlabeled samples.

Step 2. Normalize the dataset.

Step 3. Initialize the swarm with N particles by randomly generating both the position and velocity vectors of each particle, with entry values in the range [0, 1]. Note that the dimension of each particle equals the product of the number of attributes D and the number of classes C.

Step 4. Iterate until the maximum number of iterations is reached.

Substep 1. Calculate the fitness value ψ(p_i(t)) of each particle with (6).

Substep 2. Update the best fitness value ψ(pbest_i(t)) and the personal best pbest_i(t) of the ith particle; that is, if ψ(p_i(t)) < ψ(pbest_i(t)), then set ψ(pbest_i(t)) = ψ(p_i(t)) and pbest_i(t) = p_i(t).

Substep 3. If necessary, update the global best particle gbest(t); that is, with b(t) = arg min {ψ(p_1(t)), ψ(p_2(t)), …, ψ(p_N(t))}, if ψ(b(t)) < ψ(gbest(t)), then set ψ(gbest(t)) = ψ(b(t)) and gbest(t) = b(t).

Substep 4. Update the particles' velocities with (2).

Substep 5. Update the particles' positions with (3).

Substep 6. Update the inertia factor ω with (4).

Step 5. Use the NN method to obtain the labels of the unlabeled samples X_U with the optimal centroids represented by gbest.
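Putting Steps 3 to 5 together, a compact gbest-style loop could look like the sketch below. It reuses the `inertia`, `clamp_velocity`, `fitness`, and `distances_to_centroids` helpers sketched above, uses the standard PSO velocity and position updates referred to as (2) and (3), and takes its constants from the settings reported in Section 4; the function name is ours.

```python
import numpy as np

def sspso_classify(X_l, y_l, X_u, n_classes, n_particles=20, t_max=1000,
                   beta=0.5, c1=2.0, c2=2.0, seed=0):
    """Sketch of SSPSO Steps 3-5: evolve centroid sets, then label X_u
    by the nearest centroid encoded in the global best particle."""
    rng = np.random.default_rng(seed)
    dim = n_classes * X_l.shape[1]                       # D * C position entries
    pos = rng.random((n_particles, dim))                 # Step 3: positions in [0, 1]
    vel = rng.uniform(-0.05, 0.05, (n_particles, dim))
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p, X_l, y_l, X_u, n_classes, beta) for p in pos])
    gbest = pbest[pbest_fit.argmin()].copy()             # initial global best
    for t in range(t_max):                               # Step 4
        w = inertia(t, t_max)                            # Substep 6 (eq. (4))
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        vel = clamp_velocity(w * vel + c1 * r1 * (pbest - pos)
                             + c2 * r2 * (gbest - pos))  # Substep 4 (eq. (2))
        pos = pos + vel                                  # Substep 5 (eq. (3))
        fit = np.array([fitness(p, X_l, y_l, X_u, n_classes, beta)
                        for p in pos])                   # Substep 1 (eq. (6))
        better = fit < pbest_fit                         # Substep 2
        pbest[better], pbest_fit[better] = pos[better], fit[better]
        gbest = pbest[pbest_fit.argmin()].copy()         # Substep 3
    d = distances_to_centroids(X_u, gbest, n_classes)    # Step 5: NN labeling
    return d.argmin(axis=1)
```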

Figure 5: Digit recognition on the USPS dataset with different algorithms (accuracy versus number of labeled samples for SSPSO, PSO, NN, and SVM): (a) digits "0" and "8"; (b) digits "3", "5", and "8"; (c) digits "3", "8", and "9"; and (d) digits "1", "2", "3", and "4".

4. Experimental Results and Analysis

In this section, we assess the proposed SSPSO method on six UCI datasets, four artificial datasets, and the USPS handwritten dataset. The datasets differ in their attributes and classes and involve different problems, including balanced and unbalanced ones.

To evaluate the performance of SSPSO, we compare its classification results with those of the PSO-based classifier, the traditional NN classifier, and the classical SVM classifier. In order to compare the algorithms fairly, all the parameters of PSO and SSPSO are selected to give the best results. The parameter settings are as follows. The inertia weight factor ω used in PSO and SSPSO decreases linearly from 0.9 to 0.4; both c_1 and c_2 are set to 2; the velocity is confined to the range [−0.05, 0.05]; the swarm size N is set to 20; and the maximum number of iterations T_max is 1000. The parameters of SVM with the Gaussian kernel function are selected by a grid search on the training dataset.

Figure 6: Recognition of digits 3, 5, and 8 on the USPS dataset by SSPSO with 10, 50, 100, 200, 400, and 600 unlabeled samples (accuracy versus number of labeled samples).

In addition, we analyze the effect of the number of unlabeled samples on the classification accuracy on the USPS dataset. To test the robustness of the classification performance with respect to the parameter β in the fitness function, we conduct experiments on the UCI datasets with different values of β and analyze its effect on the classification performance.

4.1. Artificial Two-Dimensional Problems. To test the feasibility of SSPSO for classification, the proposed method is first applied to four artificial two-dimensional datasets, namely, long1, sizes5, square1, and square4. The details are given in Table 2, and the distributions of the four datasets are shown in Figure 1.

In the experiments, for the first two datasets we randomly select 1 to 30 labeled samples per class as the training data, and for the last two datasets we randomly select 5 to 40 labeled samples per class as the training set; the rest are used as the test set. Figure 2 plots the accuracy against the number of labeled samples, showing the average results over 100 runs of the proposed SSPSO compared with PSO, NN, and SVM on the four datasets. The weight factor β in the fitness function of SSPSO (in (6)) is set to 0.5. From Figure 2 it can be observed that SSPSO obtains favorable classification results on all four datasets, which means that SSPSO is feasible for the classification problem. Among the four datasets, long1 is the easiest to classify: all four methods achieve 100% classification accuracy once the number of labeled samples per class exceeds 10. But when the labeled instances are few, for example, only 3 instances per class, PSO, NN, and SVM cannot classify all the test data correctly, while SSPSO still obtains 100% classification accuracy. In Figure 2(b), the performance difference among SSPSO, NN, and SVM is not noticeable once the number of labeled samples per class reaches 15, but when the number of labeled instances is small, for example, fewer than 10, SSPSO obtains clearly better accuracy than the other methods, because it utilizes the information of the unlabeled instances, which helps to capture the global structure.

Table 2: Artificial datasets used in the experiments.

Data      Class number   Attribute number   Instance number     Normalization
long1     2              2                  500/500             Yes
sizes5    3              2                  77/154/769          Yes
square1   4              2                  250/250/250/250     Yes
square4   4              2                  250/250/250/250     Yes

Table 3: UCI datasets used in the experiments.

Data      Classes   Attributes   Instances     Normalization
Heart     2         13           150/120       Yes
Wine      3         13           59/71/48      Yes
Thyroid   3         5            150/35/30     Yes
Tae       3         5            49/50/52      Yes
SPECT     2         22           267           Yes
Wdbc      2         30           357/212       Yes

For the square1 and square4 datasets, the superiority of SSPSO is more apparent; that is, for all the training scenarios, the best performance is achieved by the proposed method.

4.2. UCI Datasets. To further investigate the effectiveness of SSPSO for classification, we also conduct experiments on six real-life datasets with different numbers of attributes and classes from the UCI machine learning repository [33]. The datasets used in the experiments are described in Table 3.

For the datasets with 2 classes, we randomly select 1 to 15 labeled samples per class as the training data, and for the datasets with 3 classes, we randomly select 1 to 10 labeled samples per class; the rest are used as the test set. The results are averaged over 100 runs. The weight factor β in the fitness function of SSPSO (in (6)) is set to 0.5. Figure 3 shows the classification accuracy with different numbers of training samples on the six datasets.
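This repeated random-split protocol can be sketched as follows (here the held-out samples double as both the unlabeled set and the test set, which matches the UCI setup; `sspso_classify` is the sketch from Section 3, and X is assumed to be already normalized):

```python
import numpy as np

def average_accuracy(X, y, n_labeled_per_class, n_runs=100, seed=0):
    """Mean SSPSO accuracy over repeated random labeled/test splits."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    accs = []
    for _ in range(n_runs):
        labeled = np.concatenate([
            rng.choice(np.flatnonzero(y == c), n_labeled_per_class, replace=False)
            for c in classes])                       # n labeled samples per class
        test = np.setdiff1d(np.arange(len(y)), labeled)
        y_pred = sspso_classify(X[labeled], y[labeled], X[test], len(classes))
        accs.append(float(np.mean(y_pred == y[test])))
    return float(np.mean(accs))
```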

From Figures 3(a), 3(b), 3(c), and 3(e), it can be observed that the proposed SSPSO method outperforms the other three methods on the Heart, Wine, Thyroid, and SPECT datasets, especially when the number of labeled samples per class is small. This is because SSPSO uses the information of the available unlabeled data, which benefits the classification. As the number of labeled samples increases, this superiority weakens. From Figure 3(d), it is seen that SSPSO obtains accuracy comparable to that of the other three methods. From Figure 3(f), SSPSO is slightly better than SVM but much better than the PSO and NN methods. Therefore, it can be concluded that SSPSO works well for some real-life classification tasks, especially when the labeled samples are highly limited.

From an evolutionary point of view, Figure 4 reports the behavior of a typical run of SSPSO in terms of the best individual fitness and the average fitness of the population as a function of the number of iterations; the run is carried out on the Thyroid dataset. As can be seen, SSPSO shows two phases. In the first phase, of about 50 iterations, the fitness value

Figure 7: Classification accuracy of SSPSO as a function of β, compared with NN and SVM, on (a) Thyroid and (b) Heart.

decreases sharply, starting from 0.6712 for the best and 0.9192 for the average and reaching about 0.2305 for the best and 0.2356 for the average. A second phase follows, lasting about 50 iterations, in which both the best and the average fitness values decrease slowly and draw closer and closer, until they reach 0.2231 and 0.2247, respectively. After that, the average and the best fitness values become more and more similar, and finally both converge to 0.2230.

4.3. USPS Digit Recognition. We also conduct experiments on the USPS handwritten digits dataset to test the performance of SSPSO. This dataset consists of 9298 samples in 10 classes, and each sample is an image of 16 × 16 pixels. First, we apply principal component analysis to the dataset for feature extraction and keep the first 10 principal components as the new features.
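This preprocessing step can be reproduced along the following lines (a sketch using scikit-learn, which the paper does not name; `usps_images` is a placeholder for the loaded digit images):

```python
from sklearn.decomposition import PCA

# usps_images: array of shape (9298, 16, 16) holding the raw digit images
# (loading the dataset is outside the paper's scope).
X = usps_images.reshape(len(usps_images), -1)      # flatten to 256-dim vectors
features = PCA(n_components=10).fit_transform(X)   # keep the first 10 components
```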

We consider four subsets of the dataset in the experiments: the images of digits 0 and 8, with 2261 examples in total; the images of digits 3, 5, and 8, with 2248 examples in total; the images of digits 3, 8, and 9, with 2429 examples in total; and the images of digits 1, 2, 3, and 4, with 3874 examples in total. We randomly select 1 to 10 samples per class as the training data and randomly select 200 unlabeled samples to construct the unlabeled sample set X_U used for semisupervised learning. The weight factor β in the fitness function of SSPSO (in (6)) is set to 0.7.

The recognition results, averaged over 100 independent trials, are summarized in Figure 5, where the horizontal axis represents the number of randomly labeled digit images per class in the subset and the vertical axis represents the classification accuracy. Figure 5(a) shows that when the number of labeled samples per class is below 14, SSPSO obtains performance comparable to SVM and NN and better than PSO; in particular, SSPSO outperforms the other methods when the labeled samples are few. For the results on the USPS subset of digits 3, 5, and 8 and on the subset of digits 3, 8, and 9, shown in Figures 5(b) and 5(c), respectively, one can clearly see that SSPSO outperforms SVM and is much better than the NN and PSO methods when the number of labeled samples is small. In Figure 5(d), SSPSO still works better than the other methods, but its superiority decreases as the number of labeled samples increases.

4.4. The Sensitivity Analysis of the Number of Unlabeled Samples. In this section, we validate the effect of the number of unlabeled samples on the classification accuracy. The experiment is carried out on the subset of the USPS dataset with the digit images of 3, 5, and 8, varying the size of the unlabeled set X_U over 10, 50, 100, 200, 400, and 600. Figure 6 illustrates the classification accuracy as a function of the size of the unlabeled set and the number of labeled samples. From Figure 6, one can see that the number of unlabeled samples affects the accuracy only slightly when the number of labeled samples is small. The plot with 10 unlabeled samples shows much lower accuracy than the other plots, which indicates that the number of unlabeled samples should not be too small. As the size of X_U increases, SSPSO obtains better classification accuracy, because the proposed method captures the real structure of the whole dataset more precisely with more unlabeled samples; however, the gaps between the plots of SSPSO with different sizes of X_U become very small. It follows from this experiment that, for an unlabeled dataset X_U of a certain scale, adding more unlabeled data may increase the classification accuracy of SSPSO slightly, but it also brings higher computation cost. Therefore, in the experiments, 100 to 200 unlabeled samples are a proper choice for SSPSO.

4.5. The Sensitivity Analysis of the Parameter β. The parameter β in the fitness function is important in the proposed SSPSO, since it controls the contributions of the information obtained from the labeled and the unlabeled samples to the classification. In this section, we analyze the sensitivity of SSPSO to β. The experiments are conducted on two UCI datasets, Thyroid and Heart. We randomly select 5 samples per class to form the labeled dataset, and the rest are used for testing. The mean results over 100 randomly selected training sets with different values of β are shown in Figure 7.

From Figures 7(a) and 7(b), it can be observed that SSPSO is not always better than the NN and SVM methods for all values of β. When β is small, SSPSO may perform badly, with accuracy much lower than that of NN and SVM: a small β weakens the effect of the labeled samples and strengthens that of the unlabeled samples, making the method behave much like unsupervised learning. As β increases, the accuracy rises sharply. After β reaches 0.4 for the Thyroid dataset and 0.5 for the Heart dataset, the performance remains stable or even decreases somewhat. To balance the effects of the labeled instances and the available unlabeled instances, β is set to 0.5 in our experiments.

5. Conclusions

In this paper, a semisupervised PSO method for classification has been proposed. PSO is used to find the centroids of the classes. To take advantage of the large amount of unlabeled instances, a semisupervised classification method is proposed based on the assumption that nearby instances in the feature space should share the same labels. Since both the discriminative information provided by the labeled samples and the global distribution information provided by the large number of unlabeled samples are used to find the collection of centroids, SSPSO obtains better performance than the traditional PSO classification method. In the experiments, four artificial datasets, six real-life datasets from the UCI machine learning repository, and the USPS handwritten dataset are used to evaluate the effectiveness of the method. The experimental results demonstrate that the proposed SSPSO method performs well and obtains higher accuracy than the traditional PSO classification method, the NN method, and SVM when only a few labeled samples are available.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (nos. 61272282, 61203303, 61272279, 61377011, and 61373111), the Program for New Century Excellent Talents in University (NCET-13-0948), and the Program for New Scientific and Technological Star of Shaanxi Province (no. 2014KJXX-45).

References

[1] R. Eberhart and J. Kennedy, "A new optimizer using particle swarm theory," in Proceedings of the 6th International Symposium on Micro Machine and Human Science, pp. 39-43, October 1995.
[2] J. Kennedy and R. Eberhart, "Particle swarm optimization," in Proceedings of the IEEE International Conference on Neural Networks (ICNN '95), pp. 1942-1948, December 1995.
[3] R. C. Eberhart and Y. Shi, "Particle swarm optimization: developments, applications and resources," in Proceedings of the Congress on Evolutionary Computation (CEC '01), pp. 81-86, May 2001.
[4] A. Paul, A. A. Victoire, and A. E. Jeyakumar, "Particle swarm approach for retiming in VLSI," in Proceedings of the 46th IEEE Mid-West Symposium on Circuits and Systems, pp. 1532-1535, 2003.
[5] X. Zhang, W. Wang, Y. Li, and L. Jiao, "PSO-based automatic relevance determination and feature selection system for hyperspectral image classification," IET Electronics Letters, vol. 48, no. 20, pp. 1263-1265, 2012.
[6] R. J. Lewis, "An introduction to classification and regression tree (CART) analysis," in Annual Meeting of the Society for Academic Emergency Medicine, San Francisco, Calif, USA, pp. 1-14, May 2000.
[7] K. R. Muller, S. Mika, G. Ratsch, K. Tsuda, and B. Scholkopf, "An introduction to kernel-based learning algorithms," IEEE Transactions on Neural Networks, vol. 12, pp. 181-201, 2001.
[8] C. J. C. Burges, "A tutorial on support vector machines for pattern recognition," Data Mining and Knowledge Discovery, vol. 2, no. 2, pp. 121-167, 1998.
[9] L. P. Wang, Ed., Support Vector Machines: Theory and Applications, Springer, Berlin, Germany, 2005.
[10] L. P. Wang, B. Liu, and C. R. Wan, "Classification using support vector machines with graded resolution," in Proceedings of the IEEE International Conference on Granular Computing, vol. 2, pp. 666-670, July 2005.
[11] G. P. Zhang, "Neural networks for classification: a survey," IEEE Transactions on Systems, Man, and Cybernetics C: Applications and Reviews, vol. 30, no. 4, pp. 451-462, 2000.
[12] J. Trejos-Zelaya and M. Villalobos-Arias, "Partitioning by particle swarm optimization," in Selected Contributions in Data Analysis and Classification, Studies in Classification, Data Analysis, and Knowledge Organization, pp. 235-244, 2007.
[13] M. Sugisaka and X. Fan, "An effective search method for neural network based face detection using particle swarm optimization," IEICE Transactions on Information and Systems, vol. E88-D, no. 2, pp. 214-222, 2005.
[14] Z. Q. Wang, X. Sun, and D. X. Zhang, "A PSO-based classification rule mining algorithm," in Proceedings of the 3rd International Conference on Intelligent Computing: Advanced Intelligent Computing Theories and Applications, pp. 377-384, 2007.
[15] T. Sousa, A. Silva, and A. Neves, "Particle swarm based data mining algorithms for classification tasks," Parallel Computing, vol. 30, no. 5-6, pp. 767-783, 2004.
[16] Y. J. Zheng, H. F. Ling, J. Y. Xue, et al., "Population classification in fire evacuation: a multiobjective particle swarm optimization approach," IEEE Transactions on Evolutionary Computation, vol. 18, no. 1, pp. 70-81, 2014.
[17] N. P. Holden and A. A. Freitas, "A hybrid PSO/ACO algorithm for classification," in Proceedings of the 9th Annual Genetic and Evolutionary Computation Conference (GECCO '07), pp. 2745-2750, July 2007.
[18] X. Zhu, "Semi-supervised learning literature survey," Computer Sciences Technical Report 1530, University of Wisconsin-Madison, 2008.
[19] A. Paul, "Dynamic power management for ubiquitous network devices," Advanced Science Letters, vol. 19, no. 7, pp. 2046-2049, 2013.
[20] J. Wu, "A framework for learning comprehensible theories in XML document classification," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 1, pp. 1-14, 2012.
[21] I. Koprinska, J. Poon, J. Clark, and J. Chan, "Learning to classify e-mail," Information Sciences, vol. 177, no. 10, pp. 2167-2187, 2007.
[22] X. Zhao, X. Li, Y. P. Chao, and S. Wang, "Human action recognition based on semi-supervised discriminant analysis with global constraint," Neurocomputing, vol. 105, pp. 45-50, 2013.
[23] A. Cervantes, I. M. Galvan, and P. Isasi, "AMPSO: a new particle swarm method for nearest neighborhood classification," IEEE Transactions on Systems, Man, and Cybernetics B: Cybernetics, vol. 39, no. 5, pp. 1082-1091, 2009.
[24] T. M. Cover and P. E. Hart, "Nearest neighbor pattern classification," IEEE Transactions on Information Theory, vol. 13, no. 1, pp. 21-27, 1967.
[25] F. Aurenhammer, "Voronoi diagrams: a survey of a fundamental geometric data structure," ACM Computing Surveys, vol. 23, no. 3, pp. 345-405, 1991.
[26] F. Fernandez and P. Isasi, "Evolutionary design of nearest prototype classifiers," Journal of Heuristics, vol. 10, no. 4, pp. 431-454, 2004.
[27] I. de Falco, A. Della Cioppa, and E. Tarantino, "Evaluation of particle swarm optimization effectiveness in classification," Lecture Notes in Computer Science, vol. 3849, pp. 164-171, 2006.
[28] K. Y. Huang, "A hybrid particle swarm optimization approach for clustering and classification of datasets," Knowledge-Based Systems, vol. 24, no. 3, pp. 420-426, 2011.
[29] L. Zhang, L. Wang, and W. Lin, "Semisupervised biased maximum margin analysis for interactive image retrieval," IEEE Transactions on Image Processing, vol. 21, no. 4, pp. 2294-2308, 2012.
[30] B. Liu, C. Wan, and L. Wang, "An efficient semi-unsupervised gene selection method via spectral biclustering," IEEE Transactions on Nanobioscience, vol. 5, no. 2, pp. 110-114, 2006.
[31] X. Zhang, Y. He, N. Zhou, and Y. Zheng, "Semisupervised dimensionality reduction of hyperspectral images via local scaling cut criterion," IEEE Geoscience and Remote Sensing Letters, vol. 10, no. 6, pp. 1547-1551, 2013.
[32] Y. Shi and R. Eberhart, "A modified particle swarm optimizer," in Proceedings of the IEEE International Conference on Evolutionary Computation, pp. 69-73, 1998.
[33] D. J. Newman, S. Hettich, C. L. Blake, and C. J. Merz, "UCI repository of machine learning databases," Department of Information and Computer Science, University of California, Irvine, Calif, USA, 1998, http://www.ics.uci.edu/~mlearn/MLRepository.html.


Page 3: Research Article Semisupervised Particle Swarm …downloads.hindawi.com/journals/mpe/2014/832135.pdfSemisupervised Particle Swarm Optimization for Classification XiangrongZhang, 1

Mathematical Problems in Engineering 3

0 1 2 3 4 5

0

02

04

06

08

1

12

14

minus02

minus04minus4 minus3 minus2 minus1

(a)

0 5 10 15 20

0

5

10

15

20

minus10minus10

minus5

minus5

(b)

0 5 10 15 20

0

5

10

15

minus10minus10

minus5

minus5

(c)

0 5 10 15

0

2

4

6

8

10

12

minus5

minus2

minus4

minus6minus10

(d)

Figure 1 The distributions of the artificial datasets used in the experiment (a) long1 (b) sizes5 (c) square1 and (d) square4

weight factor 120596 is used to control the convergence of theswarm In this paper a nonlinear changing inertia factorfor PSO and SSPSO is used as [32] which is shown in thefollowing

120596 = 120596max minus(120596max minus 120596min) 119905

119879max (4)

where 119879max is the maximum number of iterations and 119905is the current number of iteration Note that during theiteration every dimension of the velocity is defined in therange [minus119881max 119881max] to limit the maximum distance that aparticle will move

3 Semisupervised Particle SwarmOptimization for Classification

In the context of PSO-based classification on the dataset Xwith 119862 classes and 119863 attributes classification problem canbe seen as that of searching for the optimal positions for the119862 centroids of data clusters in a 119863-dimensional space withthe labeled samples [23] Then NN method is applied as the

classifier that assigns class labels by calculating distances tothe centroids to classify the unlabeled instances

Data to be classified are a set of samples which aredefined by continuous attributes and the corresponding classis defined by a scalar value Different attributes may takevalues in different ranges To avoid one of attributes with largevalue dominating the distance measure all the attributes arenormalized to the range [0 1] before classification

As a particle demotes a full solution to the classificationof data with 119863 attributes and 119862 classes 119862 centroids areencoded in each particle A centroid corresponds to a classso it is defined by 119863 continuous values for the attributesTable 1 describes the structure of a single particlersquos positionCentroids are encoded sequentially in the particle and aseparate array determines the class of each centroid Namelythe class for each centroid is defined by its position insidethe particle The total dimension of the position vector is119863 lowast 119862 and similarly the velocity of the particle is madeup of 119863 lowast 119862 real numbers representing its 119863 lowast 119862 velocitycomponents in the problem space To simplify the notationfor representation we denote p

119896119894as the 119896th class centroid

vector p((119896 minus 1) lowast 119863 + 1 119896 lowast 119863) encoded in the 119894th particle

4 Mathematical Problems in Engineering

Table 1 Encoding of a set of centroids in a particle for PSO

Centroid 1 sdot sdot sdot Centroid k sdot sdot sdot Centroid CPosition 119901(1 119863) sdot sdot sdot 119901((119896 minus 1) lowast 119863 + 1 119896 lowast 119863) sdot sdot sdot 119901((119862 minus 1) lowast 119863 + 1 119862 lowast 119863)

Class 1 sdot sdot sdot k sdot sdot sdot C

5 10 15 20 25 3009992

09993

09994

09995

09996

09997

09998

09999

1

Number of labeled samples

Accu

racy

SSPSOPSO

NNSVM

(a)

5 10 15 20 25 30092

093

094

095

096

097

098

Number of labeled samples

Accu

racy

SSPSOPSO

NNSVM

(b)

5 10 15 20 25 30 35 40094

0945

095

0955

096

0965

097

0975

098

0985

Number of labeled samples

Accu

racy

SSPSOPSO

NNSVM

(c)

5 10 15 20 25 30 35 40082

084

086

088

09

092

094

Number of labeled samples

Accu

racy

SSPSOPSO

NNSVM

(d)

Figure 2 Classification results on the artificial dataset with different algorithms (a) long1 (b) sizes5 (c) square1 and (d) square4

Mathematical Problems in Engineering 5

0 5 10 15 20 25 30Number of labeled samples

066

068

07

072

074

076

078

08

082

084

Accu

racy

(a)

5 10 15 20 25 30Number of labeled samples

08

082

084

086

088

09

092

094

096

098

Accu

racy

(b)

5 10 15 20 25 30Number of labeled samples

08

082

084

086

088

09

092

094

096

Accu

racy

(c)

5 10 15 20 25 30Number of labeled samples

037

038

039

04

041

042

043

044

045

046

047

Accu

racy

(d)

5 10 15 20 25 30Number of labeled samples

08

082

084

086

088

09

092

094

Accu

racy

SSPSOPSO

NNSVM

(e)

5 10 15 20 25 30Number of labeled samples

045

05

055

06

065

07

075

08

Accu

racy

SSPSOPSO

NNSVM

(f)

Figure 3 Classification results on the UCI datasets with different algorithms (a) Heart (2 classes) (b)Wine (3 classes) (c)Thyroid (3 classes)(d) Tae (3 classes) (e) SPECT (2 classes) (f) Wdbc (3 classes)

6 Mathematical Problems in Engineering

100 200 300 400 500 600 700 800 900 100002

03

04

05

06

07

08

09

1

Iterations

Fitn

ess v

alue

0 200 400 600 800 1000022

0225

023

0235

Average fitnessBest fitness

Figure 4 Typical behavior of best and average fitness of SSPSO as afunction of the number of iterations

Fitness function plays an important role in PSO A goodfitness function can quickly find the optimization positionsof the particles In [26] the fitness function 120595 of the classicalPSO classification method is computed as the sum of theEuclidean distances between all the training samples and theclass centroids encoded in the particle they belong to Thenthe sum is divided by 119897 which is the total number of trainingsamples The fitness of the 119894th particle is defined as

120595 (p119894) =

1

119897

sdot

119897

sum

119895=1

119889 (x119895 p119862119871(119895)119894

) (5)

where 119862119871(119895) denotes the class label of the training sample x119895

p119862119871(119895)119894

denotes the centroid vector of the class119862119871(119895) encodedin the 119894th particle and 119889(x

119895 p119862119871(119895)119894

) is the Euclidean distancebetween the training sample x

119895and the class centroid p

119862119871(119895)119894

Equation (5) only considers the labeled samples which areused to provide the discriminant information However inthe case that the labeled samples are limited the labeledsamples are too few to represent the real distribution ofdataset while the abundant unlabeled samples are oftenavailable and may be helpful to capture the real geometricalstructure of the whole dataset To take full advantage of theexisting unlabeled samples wemodify the fitness function byintroducing the structure information of unlabeled samplesto the original fitness function of PSO classifier With theassumption of NN method that the neighborhood samplesshould have the same labels we propose to use a new fitnessfunction in our proposed SSPSO as follows

120595 (p119894) = 120573 (

1

119897

119897

sum

119895=1

119889 (x119895 p119862119871(119895)119894

)) + (1 minus 120573)

times (

1

119906

119906

sum

119896=1

min 119889 (x119896 p1119894)

119889 (x119896 p2119894) 119889 (x

119896 p119862119894))

(6)

where 120595(p119894) is the fitness value of the 119894th particle 120573 is a

weight factor in the range between [0 1] which controls theratio of information obtained from the labeled and unlabeledsamples 119906 is the number of the unlabeled samples X

119880 and

119897 is the number of labeled samples X119871 The first term on the

left side of the fitness function is the discriminate constraintwhich means that a good classifier should have a better resulton the labeled samples The second term is the structureconstraint which is helpful to find the real distribution of thewhole dataset so as to improve the classification performanceWhen 120573 = 1 we obtain the standard PSO algorithm andwhen 120573 = 0 we can obtain an unsupervised PSO clusteringmethod

The detailed process of the proposed SSPSO is as follows

Input The labeled dataset is X119871= x

1 x2 x

119897 and

the corresponding labels set is Y119871= (1199101 1199101 119910

119897)

119879 theunlabeled dataset is X

119880= x1 x2 x

119906

Output The labels of the unlabeled samples

Step 1 Load training dataset and unlabeled samples

Step 2 Normalize the dataset

Step 3 Initialize the swarm with 119873 particles by randomlygenerating both the position and velocity vectors for eachparticle with the entry value between the range [0 1] It isnoticed that the dimension of each particle equals the productof the number of attributes119863 and the number of classes 119862

Step 4 Iterate until the maximum number of iterations isreached

Substep 1 Calculate the fitness value 120595(p(119905)119894) for each particle

with (6)

Substep 2 Update the best fitness value 120595(pbest(119905)119894) and the

best particle of 119894th particle pbest(119905)119894 that is if 120595(p(119905)

119894) lt

120595(pbest(119905)119894) then 120595(pbest(119905)

119894) = 120595(p(119905)

119894) and pbest(119905)

119894= p(119905)119894

Substep 3 If necessary update the global best particlegbest(119905) that is b(119905) = arg minp(119905)120595(p

(119905)

1) 120595(p(119905)

2) 120595(p(119905)

119873))

if 120595(b(119905)) lt 120595(gbest(119905)) then 120595(gbest(119905)) = 120595(b(119905)) andgbest(119905) = b(119905)

Substep 4 Update particlesrsquo velocity with (2)

Substep 5 Update particlesrsquo position with (3)

Substep 6 Update the inertia factor 120596 with (4)

Step 5 Use the NN method to obtain the labels of the unla-beled samplesX

119880with the optimum centroids represented by

gbest

Mathematical Problems in Engineering 7

0 5 10 15 20076

078

08

082

084

086

088

09

092

094

096

Number of labeled samples

Accu

racy

SSPSOPSO

NNSVM

(a)

0 5 10 15 20 25 30

065

07

075

08

085

09

095

Number of labeled samples

Accu

racy

SSPSOPSO

NNSVM

(b)

0 5 10 15 20 25 30065

07

075

08

085

09

Number of labeled samples

Accu

racy

SSPSOPSO

NNSVM

(c)

0 5 10 15 20 25 30 35 4007

075

08

085

09

095

Number of labeled samples

Accu

racy

SSPSOPSO

NNSVM

(d)

Figure 5 Digit recognition on the USPS dataset with different algorithms (a) Digits ldquo0rdquo and 8rdquo (b) digits ldquo3rdquo ldquo5rdquo and ldquo8rdquo (c) digits ldquo3rdquo ldquo8rdquoand ldquo9rdquo and (d) digits ldquo1rdquo ldquo2rdquo ldquo3rdquo and ldquo4rdquo

4 Experimental Results and Analysis

In this section we assess our proposed method SSPSOon six UCI datasets four artificial datasets and the USPShandwritten dataset The datasets have different attributesand classes involving different problems including balancedand unbalanced ones

To evaluate the performance of SSPSO we make com-parisons of the classification results with the PSO-basedclassifier the traditional NN classifier and the classical SVM

classifier In order to compare the algorithms reasonably allthe parameters of PSO and SSPSO are selected to make themobtain the best results The parameter settings are as followsThe inertia weight factor120596 used in PSO and SSPSO decreaseslinearly from09 to 04 Both 119888

1and 1198882are set to 2The velocity

is defined in the range [minus005 005] The swarm scale119873 is setto 20 and the maximum number of iterations 119879max is 1000The parameters of SVM with Gaussian kernel function areselected by using the gridding search method on the trainingdataset

8 Mathematical Problems in Engineering

06507

07508

08509

095

Accu

racy

5 10 15 20 25 30 35 40 45 50 55 60Number of labeled samples

SSPSO-10SSPSO-50SSPSO-100

SSPSO-200SSPSO-400SSPSO-600

Figure 6 3 5 and 8 digit recognition on the USPS dataset bySSPSO with different numbers of unlabeled samples

In addition we analyze the effect of the number ofunlabeled samples on the classification accuracy on USPSdataset In order to test the robustness of the parameter120573 in the fitness function to the classification performancewe conduct experiments on UCI datasets with differentvalues of 120573 and analyze the effect of 120573 on the classificationperformance

41 Artificial Two-Dimension Problems To test the feasibilityof SSPSO for classification the proposed method is firstconducted on four artificial two-dimension datasets that islong1 sizes5 square1 and square4 The details are shown inTable 2 and the distributions of the four datasets are shownin Figure 1

In the experiments for the first two datasets we randomlyselect 1sim30 labeled samples per class as the training data andfor the last two datasets we randomly select 5sim40 labeledsamples per class as the training set and the rest are usedas the test set Figure 2 plots the curves of accuracy withrespect to the number of labeled samples which showsthe average results over 100 runs of the proposed SSPSOcomparing with PSO NN and SVM on the four datasetsThe weight factor 120573 in the fitness function of SSPSO (in(6)) is selected as 05 From Figure 2 it can be observedthat SSPSO can obtain favorable classification results on thefour datasets which means that SSPSO is feasible for theclassification problem Among the four datasets long1 is theeasiest to classify on which all the four methods acquire100 classification accuracy when the number of labeledsamples per class exceeds 10 But when the labeled instancesare few for example only 3 instances per class are labeledPSO NN and SVM cannot classify all the test data correctlywhile SSPSO can still obtain 100 classification accuracyIn Figure 2(b) the performance difference among SSPSONN and SVM is not noticeable when the number of labeledsamples per class is up to 15 but when the number of labeledinstances is small for example less than 10 SSPSO can obtainobvious better accuracy than the other methods It is becauseSSPSO utilizes the information of unlabeled instances whichis helpful to capture the global structure For square1 and

Table 2 Artificial datasets used in experiments

Data Classnumber

Attributenumber Instance number Normalization

long1 2 2 500500 Yessizes5 3 2 77154769 Yessquare1 4 2 250250250250 Yessquare4 4 2 250250250250 Yes

Table 3 UCI datasets used in experiments

Data Class Attributes Instances NormalizationHeart 2 13 150120 YesWine 3 13 597148 YesThyroid 3 5 1503530 YesTae 3 5 495052 YesSPECT 2 22 267 YesWdbc 3 30 357212 Yes

square4 datasets the superiority of SSPSO is more apparentthat is for all the training scenarios the best performance isachieved by the proposed method

42 UCI Dataset To further investigate the effectiveness ofSSPSO for classification we also conduct the experiments onsix real-life datasets with different numbers of attributes andclasses from the UCI machine learning repository [33] Thedescription of the datasets used in experiments is given inTable 3

For datasets with 2 classes we randomly select 1sim15labeled samples per class as the training data and for datasetswith 3 classes we randomly select 1sim10 labeled samples perclass as the training data and the rest are used as the testset The results are averaged over 100 runs The weight factor120573 in the fitness function of SSPSO (in (6)) is selected as05 Figure 3 shows the classification accuracy with differentnumbers of training samples on the 6 datasets

From Figures 3(a) 3(b) 3(c) and 3(e) it can be observedthat the proposed SSPSOmethodoutperforms the other threemethods on the Heart Wine Thyroid and SPECT datasetsespecially when the number of the labeled samples per classis small It is because that SSPSO uses the information ofavailable unlabeled data which is of benefit to the classifica-tionWith the increase of the labeled samples the superioritybecomes weak From Figure 3(d) it is seen that SSPSO canobtain comparative accuracy with the other three methodsFrom Figure 3(f) SSPSO is slightly better than SVM butit is much better than PSO and NN methods Thereforeit can be concluded that SSPSO works well for some real-life classification tasks especially in the case that the labeledsamples are highly limited

From an evolutionary point of view in Figure 4 we reportthe behavior of a typical run of SSPSO in terms of the bestindividual fitness and average fitness in the population as afunction of the number of iterations It is carried out on theThyroid database As can be seen SSPSO shows two phasesIn the first phase with about 50 iterations the fitness value

Mathematical Problems in Engineering 9

0201 03 04 05 06 07 08 0904

05

06

07

08

09

1Ac

cura

cy

SSPSONNSVM

120573

(a)

SSPSONNSVM

01 02 03 04 05 06 07 08 09055

06

065

07

075

08

Accu

racy

120573

(b)

Figure 7 Classification accuracy as a function of 120573 in SSPSO on (a) Thyroid and (b) Heart

decreases sharply starting from 06712 for the best and 09192for the average and reaching about 02305 for the best and02356 for the averageThen the second phase follows lastingabout 50 iterations in which both the best and the averagefitness values decrease slowly and tend to become closer andcloser until they reach 02231 and 02247 respectively Andthen the average and the best fitness values becomemore andmore similar Finally both the two values get to 02230

43 USPS Digital Recognition We also conduct experimentson the USPS handwritten digits dataset to test the perfor-mance of SSPSO This dataset consists of 9298 samples with10 classes and each sample has the size of 16times16 pixels Firstlywe apply the principal component analysis on the dataset forfeature extraction and select the first 10 principle componentsas the new features

We consider four subsets of the dataset in the experimentthat is the images of digits 0 and 8 with 2261 examples intotal the images of digits 3 5 and 8 with 2248 examples intotal the images of digits 3 8 and 9 with a total numberof 2429 examples and the images of digits 1 2 3 and 4with a total number of 3874 examples We randomly select1sim10 samples per class respectively as the training dataand randomly select 200 unlabeled samples to construct theunlabeled sample set X

119880 which is used for semisupervised

learningThe weight factor 120573 in the fitness function of SSPSO(in (6)) is selected as 07

The recognition results averaged over 100 independenttrials are summarized in Figure 5 where the horizontal axisrepresents the number of randomly labeled digital imagesper class in the subset and the vertical axis represents

the classification accuracy From Figure 5(a) it is shown thatwhen the number of labeled samples per class is below 14SSPSO can obtain comparable performance with SVM andKNN and be better than PSO In particular SSPSO canoutperform the other methods when the labeled samples arefew For the results on the USPS subset of digits 3 5 and 8and the subset of digits 3 8 and 9 shown in Figures 5(b)and 5(c) respectively one can clearly see that SSPSOmethodoutperforms SVM and is much better than KNN and PSOmethods when the number of labeled samples is small InFigure 5(d) SSPSO still works better than the other methodsbut the superiority of the proposed SSPSO over the othermethods decrease with the increase of labeled samples

44 The Sensitivity Analysis of the Number of UnlabeledSamples In this section we validate the effect of the numberof the unlabeled samples on the classification accuracy Thisexperiment is carried on the subset of the USPS datasetwith the digit images of 3 5 and 8 We vary the size ofthe unlabeled set X

119880to be 10 50 100 200 400 and 600

Figure 6 illustrates the classification accuracy as the functionof the size of the unlabeled set and the number of labeledsamples From Figure 6 one can see that the number ofthe unlabeled samples affects the accuracy slightly when thenumber of the labeled samples is small The plot with 10unlabeled samples gets much lower accuracy than the otherplots which indicates that the number of unlabeled samplesused should not be too small With the increase of the size ofX119880 SSPSO can obtain better classification accuracy because

the proposed method can capture the real structure of thewhole dataset more precisely with more unlabeled samples

10 Mathematical Problems in Engineering

but the gaps between the plots of SSPSOwith different sizes ofX119880become very small It is noted from the above experiment

that for an unlabeled dataset X119880with a certain scale when

more unlabeled data are added the classification accuracyof SSPSO may increase a bit but it will also bring highercomputation cost So in the experiments 100 to 200 unlabeledsamples are proper to use for SSPSO

45 The Sensitivity Analysis of the Parameter 120573 120573 in fitnessfunction is an important parameter in our proposed SSPSOwhich controls the contributions of information obtainedfrom the labeled and unlabeled samples to the classificationIn this section we will analyze the sensitivity of 120573 in SSPSOThe experiments are conducted on two UCI datasets that isThyroid andHeart We randomly select 5 samples per class toform the labeled dataset and the rest are used for test Themean results over 100 times of randomly selected trainingdatasets with different values of 120573 are shown in Figure 7

From Figures 7(a) and 7(b) it can be observed that withdifferent values of 120573 SSPSO is not always better than NN andSVM methods When 120573 is small SSPSO may obtain a badperformance that is the accuracy ismuch lower thanNNandSVM When the value of 120573 is small the effect of the labeledsamples is weakened while the effect of the unlabeled samplesis strengthened which is much more like the unsupervisedlearning With the increase of 120573 the accuracy raises sharplyAfter120573 gets to 04 for theThyroid dataset and 05 for theHeartdataset the performance keeps stable and even decreasesmore or less To balance the effects of the labeled instancesand the available unlabeled instances 120573 is set to 05 in ourexperiments

5 Conclusions

In this paper a semisupervised PSOmethod for classificationhas been proposed PSO is used to find the centroids ofthe classes In order to take advantage of the amount ofunlabeled instances a semisupervised classification methodis proposed based on the assumption that near instancesin feature space should have the same labels Since thediscriminative information provided by the labeled samplesand the global distribution information provided by the largenumber of unlabeled samples is used to find a collection ofcentroids SSPSO obtains better performance than traditionalPSO classification method In the experiments four artificialdatasets six real-life datasets from the UCI machine learningrepository and the USPS handwritten dataset are applied toevaluate the effectiveness of the method The experimentalresults demonstrated that our proposed SSPSO method hasa good performance and can obtain higher accuracy incomparison to the traditional PSO classificationmethod NNmethod and SVM when there are only few labeled samplesavailable

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work was supported by the National Natural ScienceFoundation of China (nos 61272282 61203303 6127227961377011 and 61373111) the Program for New Century Excel-lent Talents in University (NCET-13-0948) and the ProgramforNew Scientific andTechnological Star of Shaanxi Province(No 2014KJXX-45)

References

[1] R Eberhart and J Kennedy ldquoNew optimizer using particleswarm theoryrdquo in Proceedings of the 6th International Sympo-sium onMicroMachine and Human Science pp 39ndash43 October1995

[2] J Kennedy and R Eberhart ldquoParticle swarm optimizationrdquoin Proceedings of the IEEE International Conference on NeuralNetworks (ICNN rsquo95) pp 1942ndash1948 December 1995

[3] R C Eberhart and Y Shi ldquoParticle swarm optimizationdevelopments applications and resourcesrdquo in Proceedings of theCongress on Evolutionary Computation (CEC rsquo01) pp 81ndash86May 2001

[4] A Paul A A Victoire and A E Jeyakumar ldquoParticle swarmapproach for retiming in VLSIrdquo in Proceedings of the 46th IEEEMid-West Symposium on Circuits and Systems pp 1532ndash15352003

[5] X Zhang W Wang Y Li and L Jiao ldquoPSO-based automaticrelevance determination and feature selection system for hyper-spectral image classificationrdquo IET Electronics Letters vol 48 no20 pp 1263ndash1265 2012

[6] R J Lewis ldquoAn introduction to classification and regression tree(CART) analysisrdquo inAnnual Meeting of the Society for AcademicEmergency Medicine in San Francisco Calif USA pp 1ndash14 May2000

[7] K R Muller S Mika G Ratsch K Tsuda and B ScholkopfldquoAn introduction to kernel-based learning algorithmsrdquo IEEETransactions on Neural Network vol 12 pp 181ndash201 2001

[8] C J C Burges ldquoA tutorial on support vector machines forpattern recognitionrdquo Data Mining and Knowledge Discoveryvol 2 no 2 pp 121ndash167 1998

[9] L P Wang Ed Support Vector Machines Theory and Applica-tion Springer Berlin Germany 2005

[10] L P Wang B Liu and C R Wan ldquoClassification using supportvector machines with graded resolutionrdquo in Proceedings of theIEEE International Conference on Granular Computing vol 2pp 666ndash670 July 2005

[11] G P Zhang ldquoNeural networks for classification a surveyrdquo IEEETransactions on Systems Man and Cybernetics C Applicationsand Reviews vol 30 no 4 pp 451ndash462 2000

[12] J Trejos-Zelaya andM Villalobos-Arias ldquoPartitioning by parti-cle swarmoptimizationrdquo in Selected Contributions inDataAnal-ysis and Classification Studies in Classification Data Analysisand Knowledge Organization pp 235ndash244 2007

[13] M Sugisaka and X Fan ldquoAn effective search method forneural network based face detection using particle swarmoptimizationrdquo IEICE Transactions on Information and Systemsvol E88-D no 2 pp 214ndash222 2005

[14] Z Q Wang X Sun and D X Zhang ldquoA PSO-based clas-sification rule mining algorithmrdquo in Proceedings of the 3rdInternational Conference on Intelligent Computing Advanced

Mathematical Problems in Engineering 11

Intelligent Computing Theories and Applications pp 377ndash3842007

[15] T Sousa A Silva and A Neves ldquoParticle swarm based datamining algorithms for classification tasksrdquo Parallel Computingvol 30 no 5-6 pp 767ndash783 2004

[16] Y J Zheng H F Ling J Y Xue et al ldquoPopulation classificationin fire evacuation a multiobjective particle swarm optimizationapproachrdquo IEEETransactions onEvolutionaryComputation vol18 no 1 pp 70ndash81 2014

[17] N P Holden and A A Freitas ldquoA hybrid PSOACO algorithmfor classificationrdquo in Proceedings of the 9th Annual Genetic andEvolutionary Computation Conference (GECCO rsquo07) pp 2745ndash2750 July 2007

[18] X Zhu ldquoSemi-supervised learning literature surveyrdquo Com-puter Sciences Technical Report 1530 University of Wisconsin-Madison 2008

[19] A Paul ldquoDynamic power management for ubiquitous networkdevicesrdquo Advance Science Letters vol 19 no 7 pp 2046ndash20492013

[20] J Wu ldquoA framework for learning comprehensible theories inXMLdocument classificationrdquo IEEETransactions on Knowledgeand Data Engineering vol 24 no 1 pp 1ndash14 2012

[21] I Koprinska J Poon J Clark and J Chan ldquoLearning to classifye-mailrdquo Information Sciences vol 177 no 10 pp 2167ndash2187 2007

[22] X Zhao X Li Y P Chao and S Wang ldquoHuman actionrecognition based on semi-supervised discriminant analysiswith global constraintrdquo Neurocomputing vol 105 pp 45ndash502013

[23] A Cervantes I M Galvan and P Isasi ldquoAMPSO a new particleswarm method for nearest neighborhood classificationrdquo IEEETransactions on Systems Man and Cybernetics B Cyberneticsvol 39 no 5 pp 1082ndash1091 2009

[24] T M Cover and P E Hart ldquoNearest neighbor pattern classifi-cationrdquo IEEE Transactions on Information Theory vol 13 no 1pp 21ndash27 1967

[25] F Aurenhammer and V Diagrams ldquoA survey of a fundamentalgeometric data structurerdquo ACMComputing Surveys vol 23 no3 pp 345ndash405 1991

[26] F Fernandez and P Isasi ldquoEvolutionary design of nearestprototype classifiersrdquo Journal of Heuristics vol 10 no 4 pp 431ndash454 2004

[27] I de Falco A Della Cioppa and E Tarantino ldquoEvaluationof particle swarm optimization effectiveness in classificationrdquoLecture Notes in Computer Science vol 3849 pp 164ndash171 2006

[28] K Y Huang ldquoA hybrid particle swarm optimization approachfor clustering and classification of datasetsrdquo Knowledge-BasedSystems vol 24 no 3 pp 420ndash426 2011

[29] L Zhang L Wang and W Lin ldquoSemisupervised biased max-imum margin analysis for interactive image retrievalrdquo IEEETransactions on Image Processing vol 21 no 4 pp 2294ndash23082012

[30] B Liu C Wan and L Wang ldquoAn efficient semi-unsupervisedgene selection method via spectral biclusteringrdquo IEEE Transac-tions on Nanobioscience vol 5 no 2 pp 110ndash114 2006

[31] X Zhang Y He N Zhou and Y Zheng ldquoSemisuperviseddimensionality reduction of hyperspectral images via localscaling cut criterionrdquo IEEE Geoscience Remote Sensing Lettersvol 10 no 6 pp 1547ndash1551 2013

[32] Y Shi and R Eberhart ldquoA modified particle swarm optimizerrdquoin Proceedings of the IEEE International Conference on Evolu-tionary Computation pp 69ndash73 1998

[33] D J Newman S Hettich C L Blake and C J MerzldquoUCI Repository of machine learning databasesrdquo Depart-ment of Information and Computer Science University ofCalifornia Irvine Calif USA 1998 httpwwwicsuciedusimmlearnMLRepositoryhtml


Table 1: Encoding of a set of centroids in a particle for PSO.

            Centroid 1        ...   Centroid k                  ...   Centroid C
Position    p(1, ..., D)      ...   p((k-1)D + 1, ..., kD)      ...   p((C-1)D + 1, ..., CD)
Class       1                 ...   k                           ...   C
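To make this encoding concrete, the sketch below (our illustration, not the authors' code; `decode_centroids` is a hypothetical helper) reshapes a particle's flat position vector into a C x D matrix whose k-th row is the prototype of class k:

```python
import numpy as np

def decode_centroids(position, C, D):
    """View a flat position vector of length C*D as C class centroids of dimension D.

    Block ((k-1)*D + 1, ..., k*D) of the vector holds the centroid of class k,
    matching the encoding of Table 1.
    """
    return np.asarray(position).reshape(C, D)

# Example: C = 3 classes with D = 2 attributes gives a 6-dimensional particle.
p = np.random.rand(3 * 2)
centroids = decode_centroids(p, C=3, D=2)  # centroids[k] is the prototype of class k+1
```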

Figure 2: Classification results on the artificial datasets with different algorithms: (a) long1, (b) sizes5, (c) square1, and (d) square4. Each panel plots classification accuracy against the number of labeled samples for SSPSO, PSO, NN, and SVM.

Figure 3: Classification results on the UCI datasets with different algorithms: (a) Heart (2 classes), (b) Wine (3 classes), (c) Thyroid (3 classes), (d) Tae (3 classes), (e) SPECT (2 classes), and (f) Wdbc (2 classes). Each panel plots classification accuracy against the number of labeled samples for SSPSO, PSO, NN, and SVM.

Figure 4: Typical behavior of the best and average fitness of SSPSO as a function of the number of iterations (the inset enlarges the final fitness range, roughly 0.22 to 0.235, over iterations 0 to 1000).

The fitness function plays an important role in PSO: a good fitness function allows the particles to find optimal positions quickly. In [26], the fitness function $\psi$ of the classical PSO classification method is computed as the sum of the Euclidean distances between all the training samples and the class centroids, encoded in the particle, to which they belong; the sum is then divided by $l$, the total number of training samples. The fitness of the $i$-th particle is defined as

$$\psi(\mathbf{p}_i) = \frac{1}{l} \sum_{j=1}^{l} d\big(\mathbf{x}_j, \mathbf{p}_i^{CL(j)}\big), \qquad (5)$$

where $CL(j)$ denotes the class label of the training sample $\mathbf{x}_j$, $\mathbf{p}_i^{CL(j)}$ denotes the centroid vector of the class $CL(j)$ encoded in the $i$-th particle, and $d(\mathbf{x}_j, \mathbf{p}_i^{CL(j)})$ is the Euclidean distance between the training sample $\mathbf{x}_j$ and the class centroid $\mathbf{p}_i^{CL(j)}$.

Equation (5) considers only the labeled samples, which provide the discriminant information. However, when the labeled samples are limited, they are too few to represent the real distribution of the dataset, while abundant unlabeled samples are often available and may help capture the real geometrical structure of the whole dataset. To take full advantage of the existing unlabeled samples, we modify the fitness function by introducing the structure information of the unlabeled samples into the original fitness function of the PSO classifier. Following the assumption of the NN method that neighboring samples should share the same label, we propose the following fitness function for SSPSO:

$$\psi(\mathbf{p}_i) = \beta \left( \frac{1}{l} \sum_{j=1}^{l} d\big(\mathbf{x}_j, \mathbf{p}_i^{CL(j)}\big) \right) + (1-\beta) \left( \frac{1}{u} \sum_{k=1}^{u} \min\big\{ d\big(\mathbf{x}_k, \mathbf{p}_i^{1}\big), d\big(\mathbf{x}_k, \mathbf{p}_i^{2}\big), \ldots, d\big(\mathbf{x}_k, \mathbf{p}_i^{C}\big) \big\} \right), \qquad (6)$$

where $\psi(\mathbf{p}_i)$ is the fitness value of the $i$-th particle and $\beta$ is a weight factor in the range $[0, 1]$ that controls the ratio of information obtained from the labeled and unlabeled samples; $u$ is the number of unlabeled samples in $\mathbf{X}_U$, and $l$ is the number of labeled samples in $\mathbf{X}_L$. The first term of the fitness function is the discriminant constraint, which requires a good classifier to perform well on the labeled samples. The second term is the structure constraint, which helps find the real distribution of the whole dataset and thereby improves classification performance. When $\beta = 1$ we recover the standard PSO algorithm, and when $\beta = 0$ we obtain an unsupervised PSO clustering method.
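A minimal NumPy sketch of this fitness, under our own naming conventions (labels assumed encoded as integers 0, ..., C-1), is given below. It is an illustration of (6), not the authors' implementation; setting beta = 1 recovers the supervised fitness (5):

```python
import numpy as np

def fitness(position, X_L, y_L, X_U, C, beta=0.5):
    """Semisupervised fitness of one particle, following (6).

    X_L: (l, D) labeled samples; y_L: integer labels in {0, ..., C-1};
    X_U: (u, D) unlabeled samples; beta weighs the two terms.
    With beta = 1 this reduces to the supervised fitness (5).
    """
    centroids = position.reshape(C, -1)  # one D-dimensional centroid per class
    # Discriminant term: mean distance from each labeled sample to its class centroid.
    d_lab = np.linalg.norm(X_L - centroids[y_L], axis=1).mean()
    # Structure term: mean distance from each unlabeled sample to its nearest centroid.
    d_all = np.linalg.norm(X_U[:, None, :] - centroids[None, :, :], axis=2)  # (u, C)
    d_unl = d_all.min(axis=1).mean()
    return beta * d_lab + (1.0 - beta) * d_unl
```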

The detailed process of the proposed SSPSO is as follows.

Input: the labeled dataset $\mathbf{X}_L = \{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_l\}$ with the corresponding label set $\mathbf{Y}_L = (y_1, y_2, \ldots, y_l)^T$, and the unlabeled dataset $\mathbf{X}_U = \{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_u\}$.

Output: the labels of the unlabeled samples.

Step 1. Load the training dataset and the unlabeled samples.

Step 2. Normalize the dataset.

Step 3. Initialize the swarm with $N$ particles by randomly generating both the position and velocity vectors of each particle, with entry values in the range $[0, 1]$. Note that the dimension of each particle equals the product of the number of attributes $D$ and the number of classes $C$.

Step 4. Iterate until the maximum number of iterations is reached:

Substep 1. Calculate the fitness value $\psi(\mathbf{p}_i^{(t)})$ of each particle with (6).

Substep 2. Update the best fitness value $\psi(\text{pbest}_i^{(t)})$ and the personal best particle $\text{pbest}_i^{(t)}$ of the $i$-th particle; that is, if $\psi(\mathbf{p}_i^{(t)}) < \psi(\text{pbest}_i^{(t)})$, then $\psi(\text{pbest}_i^{(t)}) = \psi(\mathbf{p}_i^{(t)})$ and $\text{pbest}_i^{(t)} = \mathbf{p}_i^{(t)}$.

Substep 3. If necessary, update the global best particle $\text{gbest}^{(t)}$; that is, let $\mathbf{b}^{(t)} = \arg\min \{\psi(\mathbf{p}_1^{(t)}), \psi(\mathbf{p}_2^{(t)}), \ldots, \psi(\mathbf{p}_N^{(t)})\}$; if $\psi(\mathbf{b}^{(t)}) < \psi(\text{gbest}^{(t)})$, then $\psi(\text{gbest}^{(t)}) = \psi(\mathbf{b}^{(t)})$ and $\text{gbest}^{(t)} = \mathbf{b}^{(t)}$.

Substep 4. Update the particles' velocities with (2).

Substep 5. Update the particles' positions with (3).

Substep 6. Update the inertia factor $\omega$ with (4).

Step 5. Use the NN method to obtain the labels of the unlabeled samples $\mathbf{X}_U$ with the optimum centroids represented by $\text{gbest}$.
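The loop below sketches Steps 1-5 in Python, reusing the `fitness` function sketched above. It assumes the standard PSO updates for (2)-(4), namely v <- w*v + c1*r1*(pbest - p) + c2*r2*(gbest - p) and p <- p + v, with a linearly decreasing inertia weight as in [32]; all names and defaults here are illustrative, not the authors' code:

```python
import numpy as np

def sspso(X_L, y_L, X_U, C, beta=0.5, N=20, T_max=1000, c1=2.0, c2=2.0, v_max=0.05):
    D = X_L.shape[1]
    dim = C * D
    rng = np.random.default_rng(0)
    P = rng.random((N, dim))                           # positions in [0, 1] (Step 3)
    V = rng.uniform(-v_max, v_max, (N, dim))           # velocities
    pbest = P.copy()
    pbest_val = np.array([fitness(p, X_L, y_L, X_U, C, beta) for p in P])
    gbest = pbest[pbest_val.argmin()].copy()
    gbest_val = pbest_val.min()
    for t in range(T_max):                             # Step 4
        w = 0.9 - (0.9 - 0.4) * t / T_max              # linearly decreasing inertia (4)
        for i in range(N):
            f = fitness(P[i], X_L, y_L, X_U, C, beta)  # Substep 1
            if f < pbest_val[i]:                       # Substep 2
                pbest_val[i], pbest[i] = f, P[i].copy()
            if f < gbest_val:                          # Substep 3
                gbest_val, gbest = f, P[i].copy()
        r1, r2 = rng.random((N, dim)), rng.random((N, dim))
        V = np.clip(w * V + c1 * r1 * (pbest - P) + c2 * r2 * (gbest - P),
                    -v_max, v_max)                     # Substep 4, clipped velocity
        P = P + V                                      # Substep 5
    # Step 5: nearest-centroid (NN) labeling of the unlabeled samples.
    centroids = gbest.reshape(C, D)
    d = np.linalg.norm(X_U[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)
```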

Figure 5: Digit recognition on the USPS dataset with different algorithms: (a) digits "0" and "8", (b) digits "3", "5", and "8", (c) digits "3", "8", and "9", and (d) digits "1", "2", "3", and "4". Each panel plots classification accuracy against the number of labeled samples for SSPSO, PSO, NN, and SVM.

4. Experimental Results and Analysis

In this section, we assess the proposed SSPSO on six UCI datasets, four artificial datasets, and the USPS handwritten dataset. The datasets differ in their numbers of attributes and classes and cover different problems, including balanced and unbalanced ones.

To evaluate the performance of SSPSO, we compare its classification results with those of the PSO-based classifier, the traditional NN classifier, and the classical SVM classifier. For a fair comparison, all the parameters of PSO and SSPSO are selected so that each obtains its best results. The parameter settings are as follows. The inertia weight factor $\omega$ used in PSO and SSPSO decreases linearly from 0.9 to 0.4; both $c_1$ and $c_2$ are set to 2; the velocity is confined to the range $[-0.05, 0.05]$; the swarm size $N$ is set to 20; and the maximum number of iterations $T_{\max}$ is 1000. The parameters of SVM with a Gaussian kernel are selected by grid search on the training dataset.
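As a toy usage example of the `sspso` sketch above with these settings (synthetic two-class data standing in for the benchmark datasets; every name here is ours):

```python
import numpy as np

rng = np.random.default_rng(1)
# Two Gaussian blobs in [0, 1]^2: 5 labeled samples per class, 200 unlabeled.
X_L = np.vstack([rng.normal(0.3, 0.05, (5, 2)), rng.normal(0.7, 0.05, (5, 2))])
y_L = np.array([0] * 5 + [1] * 5)
X_U = np.vstack([rng.normal(0.3, 0.05, (100, 2)), rng.normal(0.7, 0.05, (100, 2))])

# Paper settings: N = 20 particles, T_max = 1000 iterations, beta = 0.5.
labels = sspso(X_L, y_L, X_U, C=2, beta=0.5, N=20, T_max=1000)
print(labels[:10], labels[-10:])  # one blob should map to each class
```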

Figure 6: Recognition of digits 3, 5, and 8 on the USPS dataset by SSPSO with different numbers of unlabeled samples (10, 50, 100, 200, 400, and 600), plotted as classification accuracy against the number of labeled samples.

In addition, we analyze the effect of the number of unlabeled samples on the classification accuracy on the USPS dataset. To test the robustness of the classification performance to the parameter $\beta$ in the fitness function, we conduct experiments on the UCI datasets with different values of $\beta$ and analyze its effect.

4.1. Artificial Two-Dimensional Problems. To test the feasibility of SSPSO for classification, the proposed method is first run on four artificial two-dimensional datasets, namely long1, sizes5, square1, and square4. Their details are given in Table 2, and their distributions are shown in Figure 1.

In the experiments, for the first two datasets we randomly select 1~30 labeled samples per class as the training data, and for the last two datasets we randomly select 5~40 labeled samples per class as the training set; the rest are used as the test set. Figure 2 plots the accuracy curves with respect to the number of labeled samples, showing the average results over 100 runs of the proposed SSPSO compared with PSO, NN, and SVM on the four datasets. The weight factor $\beta$ in the fitness function of SSPSO (in (6)) is set to 0.5. From Figure 2, it can be observed that SSPSO obtains favorable classification results on all four datasets, which means that SSPSO is feasible for the classification problem. Among the four datasets, long1 is the easiest to classify: all four methods achieve 100% classification accuracy once the number of labeled samples per class exceeds 10. But when the labeled instances are few, for example, only 3 instances per class, PSO, NN, and SVM cannot classify all the test data correctly, while SSPSO still obtains 100% classification accuracy. In Figure 2(b), the performance difference among SSPSO, NN, and SVM is not noticeable once the number of labeled samples per class reaches 15, but when the number of labeled instances is small, for example, fewer than 10, SSPSO obtains clearly better accuracy than the other methods. This is because SSPSO utilizes the information of the unlabeled instances, which helps capture the global structure. For the square1 and square4 datasets, the superiority of SSPSO is more apparent; that is, for all the training scenarios, the best performance is achieved by the proposed method.

Table 2: Artificial datasets used in the experiments.

Data      Class number   Attribute number   Instance number     Normalization
long1     2              2                  500/500             Yes
sizes5    3              2                  77/154/769          Yes
square1   4              2                  250/250/250/250     Yes
square4   4              2                  250/250/250/250     Yes

Table 3: UCI datasets used in the experiments.

Data      Classes   Attributes   Instances      Normalization
Heart     2         13           150/120        Yes
Wine      3         13           59/71/48       Yes
Thyroid   3         5            150/35/30      Yes
Tae       3         5            49/50/52       Yes
SPECT     2         22           267            Yes
Wdbc      2         30           357/212        Yes


4.2. UCI Datasets. To further investigate the effectiveness of SSPSO for classification, we also conduct experiments on six real-life datasets with different numbers of attributes and classes from the UCI machine learning repository [33]. The datasets are described in Table 3.

For the datasets with 2 classes, we randomly select 1~15 labeled samples per class as the training data, and for the datasets with 3 classes, we randomly select 1~10 labeled samples per class; the rest are used as the test set. The results are averaged over 100 runs. The weight factor $\beta$ in the fitness function of SSPSO (in (6)) is set to 0.5. Figure 3 shows the classification accuracy with different numbers of training samples on the six datasets.

From Figures 3(a), 3(b), 3(c), and 3(e), it can be observed that the proposed SSPSO method outperforms the other three methods on the Heart, Wine, Thyroid, and SPECT datasets, especially when the number of labeled samples per class is small. This is because SSPSO uses the information of the available unlabeled data, which benefits the classification. As the number of labeled samples increases, this superiority weakens. From Figure 3(d), SSPSO obtains accuracy comparable to the other three methods. From Figure 3(f), SSPSO is slightly better than SVM but much better than the PSO and NN methods. Therefore, it can be concluded that SSPSO works well for some real-life classification tasks, especially when the labeled samples are highly limited.

From an evolutionary point of view, Figure 4 reports the behavior of a typical run of SSPSO in terms of the best individual fitness and the average fitness of the population as a function of the number of iterations, carried out on the Thyroid dataset. As can be seen, SSPSO shows two phases. In the first phase, lasting about 50 iterations, the fitness values decrease sharply, starting from 0.6712 for the best and 0.9192 for the average and reaching about 0.2305 for the best and 0.2356 for the average. A second phase follows, lasting about another 50 iterations, in which both the best and the average fitness values decrease slowly and draw closer and closer, until they reach 0.2231 and 0.2247, respectively. Thereafter the average and best fitness values become more and more similar; finally, both values reach 0.2230.

Figure 7: Classification accuracy as a function of $\beta$ in SSPSO on (a) Thyroid and (b) Heart, compared with the NN and SVM methods.

4.3. USPS Digit Recognition. We also conduct experiments on the USPS handwritten digits dataset to test the performance of SSPSO. This dataset consists of 9298 samples from 10 classes, each of size 16x16 pixels. We first apply principal component analysis to the dataset for feature extraction and select the first 10 principal components as the new features.
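A sketch of this preprocessing step with scikit-learn (our choice of tooling; `usps_X` is assumed to already hold the flattened 256-dimensional USPS images):

```python
from sklearn.decomposition import PCA

# Project the flattened 16x16 images onto their first 10 principal components.
pca = PCA(n_components=10)
usps_features = pca.fit_transform(usps_X)  # shape: (n_samples, 10)
```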

We consider four subsets of the dataset in the experiments: the images of digits 0 and 8, with 2261 examples in total; the images of digits 3, 5, and 8, with 2248 examples in total; the images of digits 3, 8, and 9, with 2429 examples in total; and the images of digits 1, 2, 3, and 4, with 3874 examples in total. We randomly select 1~10 samples per class as the training data and randomly select 200 unlabeled samples to construct the unlabeled sample set $\mathbf{X}_U$ used for semisupervised learning. The weight factor $\beta$ in the fitness function of SSPSO (in (6)) is set to 0.7.

The recognition results, averaged over 100 independent trials, are summarized in Figure 5, where the horizontal axis represents the number of randomly labeled digit images per class in the subset and the vertical axis represents the classification accuracy. Figure 5(a) shows that when the number of labeled samples per class is below 14, SSPSO obtains performance comparable to SVM and NN and better than PSO; in particular, SSPSO outperforms the other methods when the labeled samples are few. For the results on the USPS subset of digits 3, 5, and 8 and the subset of digits 3, 8, and 9, shown in Figures 5(b) and 5(c), respectively, one can clearly see that SSPSO outperforms SVM and is much better than the NN and PSO methods when the number of labeled samples is small. In Figure 5(d), SSPSO still works better than the other methods, but its superiority decreases as the number of labeled samples increases.

4.4. Sensitivity Analysis of the Number of Unlabeled Samples. In this section, we validate the effect of the number of unlabeled samples on the classification accuracy. This experiment is carried out on the subset of the USPS dataset containing the digit images 3, 5, and 8. We vary the size of the unlabeled set $\mathbf{X}_U$ over 10, 50, 100, 200, 400, and 600. Figure 6 illustrates the classification accuracy as a function of the size of the unlabeled set and the number of labeled samples. From Figure 6, one can see that the number of unlabeled samples affects the accuracy only slightly when the number of labeled samples is small. The plot with 10 unlabeled samples yields much lower accuracy than the other plots, which indicates that the number of unlabeled samples should not be too small. As the size of $\mathbf{X}_U$ increases, SSPSO obtains better classification accuracy, because the proposed method can capture the real structure of the whole dataset more precisely with more unlabeled samples, but the gaps between the plots of SSPSO with different sizes of $\mathbf{X}_U$ become very small. It is noted from the above experiment that, for an unlabeled dataset $\mathbf{X}_U$ of a certain scale, adding more unlabeled data may increase the classification accuracy of SSPSO a little, but it also brings higher computational cost. Hence, in the experiments, 100 to 200 unlabeled samples are adequate for SSPSO.

4.5. Sensitivity Analysis of the Parameter $\beta$. The parameter $\beta$ in the fitness function is an important parameter in the proposed SSPSO: it controls the contributions of the information obtained from the labeled and unlabeled samples to the classification. In this section, we analyze the sensitivity of SSPSO to $\beta$. The experiments are conducted on two UCI datasets, Thyroid and Heart. We randomly select 5 samples per class to form the labeled dataset, and the rest are used for testing. The mean results over 100 randomly selected training datasets with different values of $\beta$ are shown in Figure 7.

From Figures 7(a) and 7(b), it can be observed that, for different values of $\beta$, SSPSO is not always better than the NN and SVM methods. When $\beta$ is small, SSPSO may perform badly; that is, its accuracy is much lower than that of NN and SVM. A small value of $\beta$ weakens the effect of the labeled samples and strengthens that of the unlabeled samples, making the method much more like unsupervised learning. As $\beta$ increases, the accuracy rises sharply. After $\beta$ reaches 0.4 for the Thyroid dataset and 0.5 for the Heart dataset, the performance stays stable or even decreases slightly. To balance the effects of the labeled instances and the available unlabeled instances, $\beta$ is set to 0.5 in our experiments.

5. Conclusions

In this paper, a semisupervised PSO method for classification has been proposed. PSO is used to find the centroids of the classes. To take advantage of the abundance of unlabeled instances, a semisupervised classification method is proposed based on the assumption that instances that are near each other in feature space should share the same label. Since both the discriminative information provided by the labeled samples and the global distribution information provided by the large number of unlabeled samples are used to find the collection of centroids, SSPSO obtains better performance than the traditional PSO classification method. In the experiments, four artificial datasets, six real-life datasets from the UCI machine learning repository, and the USPS handwritten dataset are used to evaluate the effectiveness of the method. The experimental results demonstrate that the proposed SSPSO method performs well and obtains higher accuracy than the traditional PSO classification method, the NN method, and SVM when only a few labeled samples are available.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (nos. 61272282, 61203303, 61272279, 61377011, and 61373111), the Program for New Century Excellent Talents in University (NCET-13-0948), and the Program for New Scientific and Technological Star of Shaanxi Province (no. 2014KJXX-45).

References

[1] R. Eberhart and J. Kennedy, "A new optimizer using particle swarm theory," in Proceedings of the 6th International Symposium on Micro Machine and Human Science, pp. 39–43, October 1995.
[2] J. Kennedy and R. Eberhart, "Particle swarm optimization," in Proceedings of the IEEE International Conference on Neural Networks (ICNN '95), pp. 1942–1948, December 1995.
[3] R. C. Eberhart and Y. Shi, "Particle swarm optimization: developments, applications and resources," in Proceedings of the Congress on Evolutionary Computation (CEC '01), pp. 81–86, May 2001.
[4] A. Paul, A. A. Victoire, and A. E. Jeyakumar, "Particle swarm approach for retiming in VLSI," in Proceedings of the 46th IEEE Mid-West Symposium on Circuits and Systems, pp. 1532–1535, 2003.
[5] X. Zhang, W. Wang, Y. Li, and L. Jiao, "PSO-based automatic relevance determination and feature selection system for hyperspectral image classification," IET Electronics Letters, vol. 48, no. 20, pp. 1263–1265, 2012.
[6] R. J. Lewis, "An introduction to classification and regression tree (CART) analysis," in Annual Meeting of the Society for Academic Emergency Medicine, San Francisco, Calif, USA, pp. 1–14, May 2000.
[7] K. R. Muller, S. Mika, G. Ratsch, K. Tsuda, and B. Scholkopf, "An introduction to kernel-based learning algorithms," IEEE Transactions on Neural Networks, vol. 12, pp. 181–201, 2001.
[8] C. J. C. Burges, "A tutorial on support vector machines for pattern recognition," Data Mining and Knowledge Discovery, vol. 2, no. 2, pp. 121–167, 1998.
[9] L. P. Wang, Ed., Support Vector Machines: Theory and Application, Springer, Berlin, Germany, 2005.
[10] L. P. Wang, B. Liu, and C. R. Wan, "Classification using support vector machines with graded resolution," in Proceedings of the IEEE International Conference on Granular Computing, vol. 2, pp. 666–670, July 2005.
[11] G. P. Zhang, "Neural networks for classification: a survey," IEEE Transactions on Systems, Man, and Cybernetics C: Applications and Reviews, vol. 30, no. 4, pp. 451–462, 2000.
[12] J. Trejos-Zelaya and M. Villalobos-Arias, "Partitioning by particle swarm optimization," in Selected Contributions in Data Analysis and Classification, Studies in Classification, Data Analysis, and Knowledge Organization, pp. 235–244, 2007.
[13] M. Sugisaka and X. Fan, "An effective search method for neural network based face detection using particle swarm optimization," IEICE Transactions on Information and Systems, vol. E88-D, no. 2, pp. 214–222, 2005.
[14] Z. Q. Wang, X. Sun, and D. X. Zhang, "A PSO-based classification rule mining algorithm," in Proceedings of the 3rd International Conference on Intelligent Computing: Advanced Intelligent Computing Theories and Applications, pp. 377–384, 2007.
[15] T. Sousa, A. Silva, and A. Neves, "Particle swarm based data mining algorithms for classification tasks," Parallel Computing, vol. 30, no. 5-6, pp. 767–783, 2004.
[16] Y. J. Zheng, H. F. Ling, J. Y. Xue, et al., "Population classification in fire evacuation: a multiobjective particle swarm optimization approach," IEEE Transactions on Evolutionary Computation, vol. 18, no. 1, pp. 70–81, 2014.
[17] N. P. Holden and A. A. Freitas, "A hybrid PSO/ACO algorithm for classification," in Proceedings of the 9th Annual Genetic and Evolutionary Computation Conference (GECCO '07), pp. 2745–2750, July 2007.
[18] X. Zhu, "Semi-supervised learning literature survey," Computer Sciences Technical Report 1530, University of Wisconsin-Madison, 2008.
[19] A. Paul, "Dynamic power management for ubiquitous network devices," Advanced Science Letters, vol. 19, no. 7, pp. 2046–2049, 2013.
[20] J. Wu, "A framework for learning comprehensible theories in XML document classification," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 1, pp. 1–14, 2012.
[21] I. Koprinska, J. Poon, J. Clark, and J. Chan, "Learning to classify e-mail," Information Sciences, vol. 177, no. 10, pp. 2167–2187, 2007.
[22] X. Zhao, X. Li, Y. P. Chao, and S. Wang, "Human action recognition based on semi-supervised discriminant analysis with global constraint," Neurocomputing, vol. 105, pp. 45–50, 2013.
[23] A. Cervantes, I. M. Galvan, and P. Isasi, "AMPSO: a new particle swarm method for nearest neighborhood classification," IEEE Transactions on Systems, Man, and Cybernetics B: Cybernetics, vol. 39, no. 5, pp. 1082–1091, 2009.
[24] T. M. Cover and P. E. Hart, "Nearest neighbor pattern classification," IEEE Transactions on Information Theory, vol. 13, no. 1, pp. 21–27, 1967.
[25] F. Aurenhammer, "Voronoi diagrams: a survey of a fundamental geometric data structure," ACM Computing Surveys, vol. 23, no. 3, pp. 345–405, 1991.
[26] F. Fernandez and P. Isasi, "Evolutionary design of nearest prototype classifiers," Journal of Heuristics, vol. 10, no. 4, pp. 431–454, 2004.
[27] I. de Falco, A. Della Cioppa, and E. Tarantino, "Evaluation of particle swarm optimization effectiveness in classification," Lecture Notes in Computer Science, vol. 3849, pp. 164–171, 2006.
[28] K. Y. Huang, "A hybrid particle swarm optimization approach for clustering and classification of datasets," Knowledge-Based Systems, vol. 24, no. 3, pp. 420–426, 2011.
[29] L. Zhang, L. Wang, and W. Lin, "Semisupervised biased maximum margin analysis for interactive image retrieval," IEEE Transactions on Image Processing, vol. 21, no. 4, pp. 2294–2308, 2012.
[30] B. Liu, C. Wan, and L. Wang, "An efficient semi-unsupervised gene selection method via spectral biclustering," IEEE Transactions on Nanobioscience, vol. 5, no. 2, pp. 110–114, 2006.
[31] X. Zhang, Y. He, N. Zhou, and Y. Zheng, "Semisupervised dimensionality reduction of hyperspectral images via local scaling cut criterion," IEEE Geoscience and Remote Sensing Letters, vol. 10, no. 6, pp. 1547–1551, 2013.
[32] Y. Shi and R. Eberhart, "A modified particle swarm optimizer," in Proceedings of the IEEE International Conference on Evolutionary Computation, pp. 69–73, 1998.
[33] D. J. Newman, S. Hettich, C. L. Blake, and C. J. Merz, "UCI repository of machine learning databases," Department of Information and Computer Science, University of California, Irvine, Calif, USA, 1998, http://www.ics.uci.edu/~mlearn/MLRepository.html.


Page 5: Research Article Semisupervised Particle Swarm …downloads.hindawi.com/journals/mpe/2014/832135.pdfSemisupervised Particle Swarm Optimization for Classification XiangrongZhang, 1

Mathematical Problems in Engineering 5

0 5 10 15 20 25 30Number of labeled samples

066

068

07

072

074

076

078

08

082

084

Accu

racy

(a)

5 10 15 20 25 30Number of labeled samples

08

082

084

086

088

09

092

094

096

098

Accu

racy

(b)

5 10 15 20 25 30Number of labeled samples

08

082

084

086

088

09

092

094

096

Accu

racy

(c)

5 10 15 20 25 30Number of labeled samples

037

038

039

04

041

042

043

044

045

046

047

Accu

racy

(d)

5 10 15 20 25 30Number of labeled samples

08

082

084

086

088

09

092

094

Accu

racy

SSPSOPSO

NNSVM

(e)

5 10 15 20 25 30Number of labeled samples

045

05

055

06

065

07

075

08

Accu

racy

SSPSOPSO

NNSVM

(f)

Figure 3 Classification results on the UCI datasets with different algorithms (a) Heart (2 classes) (b)Wine (3 classes) (c)Thyroid (3 classes)(d) Tae (3 classes) (e) SPECT (2 classes) (f) Wdbc (3 classes)

6 Mathematical Problems in Engineering

100 200 300 400 500 600 700 800 900 100002

03

04

05

06

07

08

09

1

Iterations

Fitn

ess v

alue

0 200 400 600 800 1000022

0225

023

0235

Average fitnessBest fitness

Figure 4 Typical behavior of best and average fitness of SSPSO as afunction of the number of iterations

Fitness function plays an important role in PSO A goodfitness function can quickly find the optimization positionsof the particles In [26] the fitness function 120595 of the classicalPSO classification method is computed as the sum of theEuclidean distances between all the training samples and theclass centroids encoded in the particle they belong to Thenthe sum is divided by 119897 which is the total number of trainingsamples The fitness of the 119894th particle is defined as

120595 (p119894) =

1

119897

sdot

119897

sum

119895=1

119889 (x119895 p119862119871(119895)119894

) (5)

where 119862119871(119895) denotes the class label of the training sample x119895

p119862119871(119895)119894

denotes the centroid vector of the class119862119871(119895) encodedin the 119894th particle and 119889(x

119895 p119862119871(119895)119894

) is the Euclidean distancebetween the training sample x

119895and the class centroid p

119862119871(119895)119894

Equation (5) only considers the labeled samples which areused to provide the discriminant information However inthe case that the labeled samples are limited the labeledsamples are too few to represent the real distribution ofdataset while the abundant unlabeled samples are oftenavailable and may be helpful to capture the real geometricalstructure of the whole dataset To take full advantage of theexisting unlabeled samples wemodify the fitness function byintroducing the structure information of unlabeled samplesto the original fitness function of PSO classifier With theassumption of NN method that the neighborhood samplesshould have the same labels we propose to use a new fitnessfunction in our proposed SSPSO as follows

120595 (p119894) = 120573 (

1

119897

119897

sum

119895=1

119889 (x119895 p119862119871(119895)119894

)) + (1 minus 120573)

times (

1

119906

119906

sum

119896=1

min 119889 (x119896 p1119894)

119889 (x119896 p2119894) 119889 (x

119896 p119862119894))

(6)

where 120595(p119894) is the fitness value of the 119894th particle 120573 is a

weight factor in the range between [0 1] which controls theratio of information obtained from the labeled and unlabeledsamples 119906 is the number of the unlabeled samples X

119880 and

119897 is the number of labeled samples X119871 The first term on the

left side of the fitness function is the discriminate constraintwhich means that a good classifier should have a better resulton the labeled samples The second term is the structureconstraint which is helpful to find the real distribution of thewhole dataset so as to improve the classification performanceWhen 120573 = 1 we obtain the standard PSO algorithm andwhen 120573 = 0 we can obtain an unsupervised PSO clusteringmethod

The detailed process of the proposed SSPSO is as follows

Input The labeled dataset is X119871= x

1 x2 x

119897 and

the corresponding labels set is Y119871= (1199101 1199101 119910

119897)

119879 theunlabeled dataset is X

119880= x1 x2 x

119906

Output The labels of the unlabeled samples

Step 1 Load training dataset and unlabeled samples

Step 2 Normalize the dataset

Step 3 Initialize the swarm with 119873 particles by randomlygenerating both the position and velocity vectors for eachparticle with the entry value between the range [0 1] It isnoticed that the dimension of each particle equals the productof the number of attributes119863 and the number of classes 119862

Step 4 Iterate until the maximum number of iterations isreached

Substep 1 Calculate the fitness value 120595(p(119905)119894) for each particle

with (6)

Substep 2 Update the best fitness value 120595(pbest(119905)119894) and the

best particle of 119894th particle pbest(119905)119894 that is if 120595(p(119905)

119894) lt

120595(pbest(119905)119894) then 120595(pbest(119905)

119894) = 120595(p(119905)

119894) and pbest(119905)

119894= p(119905)119894

Substep 3 If necessary update the global best particlegbest(119905) that is b(119905) = arg minp(119905)120595(p

(119905)

1) 120595(p(119905)

2) 120595(p(119905)

119873))

if 120595(b(119905)) lt 120595(gbest(119905)) then 120595(gbest(119905)) = 120595(b(119905)) andgbest(119905) = b(119905)

Substep 4 Update particlesrsquo velocity with (2)

Substep 5 Update particlesrsquo position with (3)

Substep 6 Update the inertia factor 120596 with (4)

Step 5 Use the NN method to obtain the labels of the unla-beled samplesX

119880with the optimum centroids represented by

gbest

Mathematical Problems in Engineering 7

0 5 10 15 20076

078

08

082

084

086

088

09

092

094

096

Number of labeled samples

Accu

racy

SSPSOPSO

NNSVM

(a)

0 5 10 15 20 25 30

065

07

075

08

085

09

095

Number of labeled samples

Accu

racy

SSPSOPSO

NNSVM

(b)

0 5 10 15 20 25 30065

07

075

08

085

09

Number of labeled samples

Accu

racy

SSPSOPSO

NNSVM

(c)

0 5 10 15 20 25 30 35 4007

075

08

085

09

095

Number of labeled samples

Accu

racy

SSPSOPSO

NNSVM

(d)

Figure 5 Digit recognition on the USPS dataset with different algorithms (a) Digits ldquo0rdquo and 8rdquo (b) digits ldquo3rdquo ldquo5rdquo and ldquo8rdquo (c) digits ldquo3rdquo ldquo8rdquoand ldquo9rdquo and (d) digits ldquo1rdquo ldquo2rdquo ldquo3rdquo and ldquo4rdquo

4 Experimental Results and Analysis

In this section we assess our proposed method SSPSOon six UCI datasets four artificial datasets and the USPShandwritten dataset The datasets have different attributesand classes involving different problems including balancedand unbalanced ones

To evaluate the performance of SSPSO we make com-parisons of the classification results with the PSO-basedclassifier the traditional NN classifier and the classical SVM

classifier In order to compare the algorithms reasonably allthe parameters of PSO and SSPSO are selected to make themobtain the best results The parameter settings are as followsThe inertia weight factor120596 used in PSO and SSPSO decreaseslinearly from09 to 04 Both 119888

1and 1198882are set to 2The velocity

is defined in the range [minus005 005] The swarm scale119873 is setto 20 and the maximum number of iterations 119879max is 1000The parameters of SVM with Gaussian kernel function areselected by using the gridding search method on the trainingdataset

8 Mathematical Problems in Engineering

06507

07508

08509

095

Accu

racy

5 10 15 20 25 30 35 40 45 50 55 60Number of labeled samples

SSPSO-10SSPSO-50SSPSO-100

SSPSO-200SSPSO-400SSPSO-600

Figure 6 3 5 and 8 digit recognition on the USPS dataset bySSPSO with different numbers of unlabeled samples

In addition we analyze the effect of the number ofunlabeled samples on the classification accuracy on USPSdataset In order to test the robustness of the parameter120573 in the fitness function to the classification performancewe conduct experiments on UCI datasets with differentvalues of 120573 and analyze the effect of 120573 on the classificationperformance

41 Artificial Two-Dimension Problems To test the feasibilityof SSPSO for classification the proposed method is firstconducted on four artificial two-dimension datasets that islong1 sizes5 square1 and square4 The details are shown inTable 2 and the distributions of the four datasets are shownin Figure 1

In the experiments for the first two datasets we randomlyselect 1sim30 labeled samples per class as the training data andfor the last two datasets we randomly select 5sim40 labeledsamples per class as the training set and the rest are usedas the test set Figure 2 plots the curves of accuracy withrespect to the number of labeled samples which showsthe average results over 100 runs of the proposed SSPSOcomparing with PSO NN and SVM on the four datasetsThe weight factor 120573 in the fitness function of SSPSO (in(6)) is selected as 05 From Figure 2 it can be observedthat SSPSO can obtain favorable classification results on thefour datasets which means that SSPSO is feasible for theclassification problem Among the four datasets long1 is theeasiest to classify on which all the four methods acquire100 classification accuracy when the number of labeledsamples per class exceeds 10 But when the labeled instancesare few for example only 3 instances per class are labeledPSO NN and SVM cannot classify all the test data correctlywhile SSPSO can still obtain 100 classification accuracyIn Figure 2(b) the performance difference among SSPSONN and SVM is not noticeable when the number of labeledsamples per class is up to 15 but when the number of labeledinstances is small for example less than 10 SSPSO can obtainobvious better accuracy than the other methods It is becauseSSPSO utilizes the information of unlabeled instances whichis helpful to capture the global structure For square1 and

Table 2 Artificial datasets used in experiments

Data Classnumber

Attributenumber Instance number Normalization

long1 2 2 500500 Yessizes5 3 2 77154769 Yessquare1 4 2 250250250250 Yessquare4 4 2 250250250250 Yes

Table 3 UCI datasets used in experiments

Data Class Attributes Instances NormalizationHeart 2 13 150120 YesWine 3 13 597148 YesThyroid 3 5 1503530 YesTae 3 5 495052 YesSPECT 2 22 267 YesWdbc 3 30 357212 Yes

square4 datasets the superiority of SSPSO is more apparentthat is for all the training scenarios the best performance isachieved by the proposed method

42 UCI Dataset To further investigate the effectiveness ofSSPSO for classification we also conduct the experiments onsix real-life datasets with different numbers of attributes andclasses from the UCI machine learning repository [33] Thedescription of the datasets used in experiments is given inTable 3

For datasets with 2 classes we randomly select 1sim15labeled samples per class as the training data and for datasetswith 3 classes we randomly select 1sim10 labeled samples perclass as the training data and the rest are used as the testset The results are averaged over 100 runs The weight factor120573 in the fitness function of SSPSO (in (6)) is selected as05 Figure 3 shows the classification accuracy with differentnumbers of training samples on the 6 datasets

From Figures 3(a) 3(b) 3(c) and 3(e) it can be observedthat the proposed SSPSOmethodoutperforms the other threemethods on the Heart Wine Thyroid and SPECT datasetsespecially when the number of the labeled samples per classis small It is because that SSPSO uses the information ofavailable unlabeled data which is of benefit to the classifica-tionWith the increase of the labeled samples the superioritybecomes weak From Figure 3(d) it is seen that SSPSO canobtain comparative accuracy with the other three methodsFrom Figure 3(f) SSPSO is slightly better than SVM butit is much better than PSO and NN methods Thereforeit can be concluded that SSPSO works well for some real-life classification tasks especially in the case that the labeledsamples are highly limited

From an evolutionary point of view in Figure 4 we reportthe behavior of a typical run of SSPSO in terms of the bestindividual fitness and average fitness in the population as afunction of the number of iterations It is carried out on theThyroid database As can be seen SSPSO shows two phasesIn the first phase with about 50 iterations the fitness value

Mathematical Problems in Engineering 9

0201 03 04 05 06 07 08 0904

05

06

07

08

09

1Ac

cura

cy

SSPSONNSVM

120573

(a)

SSPSONNSVM

01 02 03 04 05 06 07 08 09055

06

065

07

075

08

Accu

racy

120573

(b)

Figure 7 Classification accuracy as a function of 120573 in SSPSO on (a) Thyroid and (b) Heart

decreases sharply starting from 06712 for the best and 09192for the average and reaching about 02305 for the best and02356 for the averageThen the second phase follows lastingabout 50 iterations in which both the best and the averagefitness values decrease slowly and tend to become closer andcloser until they reach 02231 and 02247 respectively Andthen the average and the best fitness values becomemore andmore similar Finally both the two values get to 02230

43 USPS Digital Recognition We also conduct experimentson the USPS handwritten digits dataset to test the perfor-mance of SSPSO This dataset consists of 9298 samples with10 classes and each sample has the size of 16times16 pixels Firstlywe apply the principal component analysis on the dataset forfeature extraction and select the first 10 principle componentsas the new features

We consider four subsets of the dataset in the experimentthat is the images of digits 0 and 8 with 2261 examples intotal the images of digits 3 5 and 8 with 2248 examples intotal the images of digits 3 8 and 9 with a total numberof 2429 examples and the images of digits 1 2 3 and 4with a total number of 3874 examples We randomly select1sim10 samples per class respectively as the training dataand randomly select 200 unlabeled samples to construct theunlabeled sample set X

119880 which is used for semisupervised

learningThe weight factor 120573 in the fitness function of SSPSO(in (6)) is selected as 07

The recognition results averaged over 100 independenttrials are summarized in Figure 5 where the horizontal axisrepresents the number of randomly labeled digital imagesper class in the subset and the vertical axis represents

the classification accuracy From Figure 5(a) it is shown thatwhen the number of labeled samples per class is below 14SSPSO can obtain comparable performance with SVM andKNN and be better than PSO In particular SSPSO canoutperform the other methods when the labeled samples arefew For the results on the USPS subset of digits 3 5 and 8and the subset of digits 3 8 and 9 shown in Figures 5(b)and 5(c) respectively one can clearly see that SSPSOmethodoutperforms SVM and is much better than KNN and PSOmethods when the number of labeled samples is small InFigure 5(d) SSPSO still works better than the other methodsbut the superiority of the proposed SSPSO over the othermethods decrease with the increase of labeled samples

44 The Sensitivity Analysis of the Number of UnlabeledSamples In this section we validate the effect of the numberof the unlabeled samples on the classification accuracy Thisexperiment is carried on the subset of the USPS datasetwith the digit images of 3 5 and 8 We vary the size ofthe unlabeled set X

119880to be 10 50 100 200 400 and 600

Figure 6 illustrates the classification accuracy as the functionof the size of the unlabeled set and the number of labeledsamples From Figure 6 one can see that the number ofthe unlabeled samples affects the accuracy slightly when thenumber of the labeled samples is small The plot with 10unlabeled samples gets much lower accuracy than the otherplots which indicates that the number of unlabeled samplesused should not be too small With the increase of the size ofX119880 SSPSO can obtain better classification accuracy because

the proposed method can capture the real structure of thewhole dataset more precisely with more unlabeled samples

10 Mathematical Problems in Engineering

but the gaps between the plots of SSPSOwith different sizes ofX119880become very small It is noted from the above experiment

that for an unlabeled dataset X119880with a certain scale when

more unlabeled data are added the classification accuracyof SSPSO may increase a bit but it will also bring highercomputation cost So in the experiments 100 to 200 unlabeledsamples are proper to use for SSPSO

45 The Sensitivity Analysis of the Parameter 120573 120573 in fitnessfunction is an important parameter in our proposed SSPSOwhich controls the contributions of information obtainedfrom the labeled and unlabeled samples to the classificationIn this section we will analyze the sensitivity of 120573 in SSPSOThe experiments are conducted on two UCI datasets that isThyroid andHeart We randomly select 5 samples per class toform the labeled dataset and the rest are used for test Themean results over 100 times of randomly selected trainingdatasets with different values of 120573 are shown in Figure 7

From Figures 7(a) and 7(b) it can be observed that withdifferent values of 120573 SSPSO is not always better than NN andSVM methods When 120573 is small SSPSO may obtain a badperformance that is the accuracy ismuch lower thanNNandSVM When the value of 120573 is small the effect of the labeledsamples is weakened while the effect of the unlabeled samplesis strengthened which is much more like the unsupervisedlearning With the increase of 120573 the accuracy raises sharplyAfter120573 gets to 04 for theThyroid dataset and 05 for theHeartdataset the performance keeps stable and even decreasesmore or less To balance the effects of the labeled instancesand the available unlabeled instances 120573 is set to 05 in ourexperiments

5 Conclusions

In this paper a semisupervised PSOmethod for classificationhas been proposed PSO is used to find the centroids ofthe classes In order to take advantage of the amount ofunlabeled instances a semisupervised classification methodis proposed based on the assumption that near instancesin feature space should have the same labels Since thediscriminative information provided by the labeled samplesand the global distribution information provided by the largenumber of unlabeled samples is used to find a collection ofcentroids SSPSO obtains better performance than traditionalPSO classification method In the experiments four artificialdatasets six real-life datasets from the UCI machine learningrepository and the USPS handwritten dataset are applied toevaluate the effectiveness of the method The experimentalresults demonstrated that our proposed SSPSO method hasa good performance and can obtain higher accuracy incomparison to the traditional PSO classificationmethod NNmethod and SVM when there are only few labeled samplesavailable

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work was supported by the National Natural ScienceFoundation of China (nos 61272282 61203303 6127227961377011 and 61373111) the Program for New Century Excel-lent Talents in University (NCET-13-0948) and the ProgramforNew Scientific andTechnological Star of Shaanxi Province(No 2014KJXX-45)

References

[1] R Eberhart and J Kennedy ldquoNew optimizer using particleswarm theoryrdquo in Proceedings of the 6th International Sympo-sium onMicroMachine and Human Science pp 39ndash43 October1995

[2] J Kennedy and R Eberhart ldquoParticle swarm optimizationrdquoin Proceedings of the IEEE International Conference on NeuralNetworks (ICNN rsquo95) pp 1942ndash1948 December 1995

[3] R C Eberhart and Y Shi ldquoParticle swarm optimizationdevelopments applications and resourcesrdquo in Proceedings of theCongress on Evolutionary Computation (CEC rsquo01) pp 81ndash86May 2001

[4] A Paul A A Victoire and A E Jeyakumar ldquoParticle swarmapproach for retiming in VLSIrdquo in Proceedings of the 46th IEEEMid-West Symposium on Circuits and Systems pp 1532ndash15352003

[5] X Zhang W Wang Y Li and L Jiao ldquoPSO-based automaticrelevance determination and feature selection system for hyper-spectral image classificationrdquo IET Electronics Letters vol 48 no20 pp 1263ndash1265 2012

[6] R J Lewis ldquoAn introduction to classification and regression tree(CART) analysisrdquo inAnnual Meeting of the Society for AcademicEmergency Medicine in San Francisco Calif USA pp 1ndash14 May2000

[7] K R Muller S Mika G Ratsch K Tsuda and B ScholkopfldquoAn introduction to kernel-based learning algorithmsrdquo IEEETransactions on Neural Network vol 12 pp 181ndash201 2001

[8] C J C Burges ldquoA tutorial on support vector machines forpattern recognitionrdquo Data Mining and Knowledge Discoveryvol 2 no 2 pp 121ndash167 1998

[9] L P Wang Ed Support Vector Machines Theory and Applica-tion Springer Berlin Germany 2005

[10] L P Wang B Liu and C R Wan ldquoClassification using supportvector machines with graded resolutionrdquo in Proceedings of theIEEE International Conference on Granular Computing vol 2pp 666ndash670 July 2005

[11] G P Zhang ldquoNeural networks for classification a surveyrdquo IEEETransactions on Systems Man and Cybernetics C Applicationsand Reviews vol 30 no 4 pp 451ndash462 2000

[12] J Trejos-Zelaya andM Villalobos-Arias ldquoPartitioning by parti-cle swarmoptimizationrdquo in Selected Contributions inDataAnal-ysis and Classification Studies in Classification Data Analysisand Knowledge Organization pp 235ndash244 2007

[13] M Sugisaka and X Fan ldquoAn effective search method forneural network based face detection using particle swarmoptimizationrdquo IEICE Transactions on Information and Systemsvol E88-D no 2 pp 214ndash222 2005

[14] Z Q Wang X Sun and D X Zhang ldquoA PSO-based clas-sification rule mining algorithmrdquo in Proceedings of the 3rdInternational Conference on Intelligent Computing Advanced

Mathematical Problems in Engineering 11

Intelligent Computing Theories and Applications pp 377ndash3842007

[15] T Sousa A Silva and A Neves ldquoParticle swarm based datamining algorithms for classification tasksrdquo Parallel Computingvol 30 no 5-6 pp 767ndash783 2004

[16] Y J Zheng H F Ling J Y Xue et al ldquoPopulation classificationin fire evacuation a multiobjective particle swarm optimizationapproachrdquo IEEETransactions onEvolutionaryComputation vol18 no 1 pp 70ndash81 2014

[17] N P Holden and A A Freitas ldquoA hybrid PSOACO algorithmfor classificationrdquo in Proceedings of the 9th Annual Genetic andEvolutionary Computation Conference (GECCO rsquo07) pp 2745ndash2750 July 2007

[18] X Zhu ldquoSemi-supervised learning literature surveyrdquo Com-puter Sciences Technical Report 1530 University of Wisconsin-Madison 2008

[19] A Paul ldquoDynamic power management for ubiquitous networkdevicesrdquo Advance Science Letters vol 19 no 7 pp 2046ndash20492013

[20] J Wu ldquoA framework for learning comprehensible theories inXMLdocument classificationrdquo IEEETransactions on Knowledgeand Data Engineering vol 24 no 1 pp 1ndash14 2012

[21] I Koprinska J Poon J Clark and J Chan ldquoLearning to classifye-mailrdquo Information Sciences vol 177 no 10 pp 2167ndash2187 2007

[22] X Zhao X Li Y P Chao and S Wang ldquoHuman actionrecognition based on semi-supervised discriminant analysiswith global constraintrdquo Neurocomputing vol 105 pp 45ndash502013

[23] A Cervantes I M Galvan and P Isasi ldquoAMPSO a new particleswarm method for nearest neighborhood classificationrdquo IEEETransactions on Systems Man and Cybernetics B Cyberneticsvol 39 no 5 pp 1082ndash1091 2009

[24] T M Cover and P E Hart ldquoNearest neighbor pattern classifi-cationrdquo IEEE Transactions on Information Theory vol 13 no 1pp 21ndash27 1967

[25] F Aurenhammer and V Diagrams ldquoA survey of a fundamentalgeometric data structurerdquo ACMComputing Surveys vol 23 no3 pp 345ndash405 1991

[26] F Fernandez and P Isasi ldquoEvolutionary design of nearestprototype classifiersrdquo Journal of Heuristics vol 10 no 4 pp 431ndash454 2004

[27] I de Falco A Della Cioppa and E Tarantino ldquoEvaluationof particle swarm optimization effectiveness in classificationrdquoLecture Notes in Computer Science vol 3849 pp 164ndash171 2006

[28] K Y Huang ldquoA hybrid particle swarm optimization approachfor clustering and classification of datasetsrdquo Knowledge-BasedSystems vol 24 no 3 pp 420ndash426 2011

[29] L Zhang L Wang and W Lin ldquoSemisupervised biased max-imum margin analysis for interactive image retrievalrdquo IEEETransactions on Image Processing vol 21 no 4 pp 2294ndash23082012

[30] B Liu C Wan and L Wang ldquoAn efficient semi-unsupervisedgene selection method via spectral biclusteringrdquo IEEE Transac-tions on Nanobioscience vol 5 no 2 pp 110ndash114 2006

[31] X Zhang Y He N Zhou and Y Zheng ldquoSemisuperviseddimensionality reduction of hyperspectral images via localscaling cut criterionrdquo IEEE Geoscience Remote Sensing Lettersvol 10 no 6 pp 1547ndash1551 2013

[32] Y Shi and R Eberhart ldquoA modified particle swarm optimizerrdquoin Proceedings of the IEEE International Conference on Evolu-tionary Computation pp 69ndash73 1998

[33] D J Newman S Hettich C L Blake and C J MerzldquoUCI Repository of machine learning databasesrdquo Depart-ment of Information and Computer Science University ofCalifornia Irvine Calif USA 1998 httpwwwicsuciedusimmlearnMLRepositoryhtml



[Figure 4: Typical behavior of the best and average fitness of SSPSO as a function of the number of iterations (legend: average fitness, best fitness; the inset enlarges the late iterations).]

The fitness function plays an important role in PSO: a good fitness function allows the particles to find good positions quickly. In [26], the fitness function $\psi$ of the classical PSO classification method is computed as the sum of the Euclidean distances between each training sample and the centroid, encoded in the particle, of the class it belongs to; the sum is then divided by $l$, the total number of training samples. The fitness of the $i$th particle is defined as

$$\psi(\mathbf{p}_i) = \frac{1}{l}\sum_{j=1}^{l} d\left(\mathbf{x}_j, \mathbf{p}_i^{CL(j)}\right), \quad (5)$$

where $CL(j)$ denotes the class label of the training sample $\mathbf{x}_j$, $\mathbf{p}_i^{CL(j)}$ denotes the centroid vector of the class $CL(j)$ encoded in the $i$th particle, and $d(\mathbf{x}_j, \mathbf{p}_i^{CL(j)})$ is the Euclidean distance between the training sample $\mathbf{x}_j$ and the class centroid $\mathbf{p}_i^{CL(j)}$.

Equation (5) considers only the labeled samples, which provide the discriminant information. However, when the labeled samples are limited, they are too few to represent the real distribution of the dataset, while abundant unlabeled samples are often available and may help to capture the real geometrical structure of the whole dataset. To take full advantage of the existing unlabeled samples, we modify the fitness function by introducing the structure information of the unlabeled samples into the original fitness function of the PSO classifier. Under the assumption of the NN method that neighboring samples should have the same labels, we propose the following new fitness function for SSPSO:

$$\psi(\mathbf{p}_i) = \beta\left(\frac{1}{l}\sum_{j=1}^{l} d\left(\mathbf{x}_j, \mathbf{p}_i^{CL(j)}\right)\right) + (1-\beta)\left(\frac{1}{u}\sum_{k=1}^{u}\min\left\{d\left(\mathbf{x}_k, \mathbf{p}_i^{1}\right), d\left(\mathbf{x}_k, \mathbf{p}_i^{2}\right), \ldots, d\left(\mathbf{x}_k, \mathbf{p}_i^{C}\right)\right\}\right), \quad (6)$$

where $\psi(\mathbf{p}_i)$ is the fitness value of the $i$th particle; $\beta$ is a weight factor in the range $[0, 1]$, which controls the ratio of information obtained from the labeled and the unlabeled samples; $u$ is the number of unlabeled samples $\mathbf{X}_U$; and $l$ is the number of labeled samples $\mathbf{X}_L$. The first term of the fitness function is the discriminant constraint, which requires that a good classifier perform well on the labeled samples. The second term is the structure constraint, which helps to find the real distribution of the whole dataset and thus improves the classification performance. When $\beta = 1$, we obtain the standard PSO algorithm, and when $\beta = 0$, we obtain an unsupervised PSO clustering method.

The detailed process of the proposed SSPSO is as follows.

Input. The labeled dataset $\mathbf{X}_L = \{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_l\}$ with the corresponding label set $\mathbf{Y}_L = (y_1, y_2, \ldots, y_l)^{T}$, and the unlabeled dataset $\mathbf{X}_U = \{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_u\}$.

Output. The labels of the unlabeled samples.

Step 1. Load the training dataset and the unlabeled samples.

Step 2. Normalize the dataset.

Step 3. Initialize the swarm with $N$ particles by randomly generating both the position and the velocity vector for each particle, with entries in the range $[0, 1]$. Note that the dimension of each particle equals the product of the number of attributes $D$ and the number of classes $C$.

Step 4. Iterate until the maximum number of iterations is reached.

Substep 1. Calculate the fitness value $\psi(\mathbf{p}_i^{(t)})$ for each particle with (6).

Substep 2. Update the best fitness value $\psi(\mathbf{pbest}_i^{(t)})$ and the best position $\mathbf{pbest}_i^{(t)}$ of the $i$th particle; that is, if $\psi(\mathbf{p}_i^{(t)}) < \psi(\mathbf{pbest}_i^{(t)})$, then $\psi(\mathbf{pbest}_i^{(t)}) = \psi(\mathbf{p}_i^{(t)})$ and $\mathbf{pbest}_i^{(t)} = \mathbf{p}_i^{(t)}$.

Substep 3. If necessary, update the global best particle $\mathbf{gbest}^{(t)}$; that is, let $\mathbf{b}^{(t)} = \arg\min \{\psi(\mathbf{p}_1^{(t)}), \psi(\mathbf{p}_2^{(t)}), \ldots, \psi(\mathbf{p}_N^{(t)})\}$; if $\psi(\mathbf{b}^{(t)}) < \psi(\mathbf{gbest}^{(t)})$, then $\psi(\mathbf{gbest}^{(t)}) = \psi(\mathbf{b}^{(t)})$ and $\mathbf{gbest}^{(t)} = \mathbf{b}^{(t)}$.

Substep 4. Update the particles' velocities with (2).

Substep 5. Update the particles' positions with (3).

Substep 6. Update the inertia factor $\omega$ with (4).

Step 5. Use the NN method to obtain the labels of the unlabeled samples $\mathbf{X}_U$ with the optimal centroids represented by $\mathbf{gbest}$.
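Putting Steps 3 to 5 together, a compact sketch of the whole procedure might look as follows. It reuses the `sspso_fitness` sketch above and assumes the standard global-best PSO velocity and position updates for (2) and (3) and the linearly decreasing inertia weight for (4), as described in the parameter settings of the next section; none of this code is from the paper itself.

```python
import numpy as np

def sspso_train(X_L, y_L, X_U, C, beta=0.5, N=20, T_max=1000,
                c1=2.0, c2=2.0, w_start=0.9, w_end=0.4, v_clip=0.05,
                seed=0):
    """Global-best PSO over flat centroid encodings (Steps 3 and 4)."""
    rng = np.random.default_rng(seed)
    D = X_L.shape[1]
    dim = C * D                                  # particle length = C * D

    # Step 3: random positions and velocities with entries in [0, 1]
    # (the data are assumed normalized to the unit hypercube).
    pos = rng.random((N, dim))
    vel = rng.random((N, dim))

    fit = np.array([sspso_fitness(p, X_L, y_L, X_U, beta) for p in pos])
    pbest, pbest_fit = pos.copy(), fit.copy()
    gbest, gbest_fit = pos[fit.argmin()].copy(), fit.min()

    for t in range(T_max):                       # Step 4
        w = w_start - (w_start - w_end) * t / T_max  # inertia schedule (4)
        r1, r2 = rng.random((N, dim)), rng.random((N, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        vel = np.clip(vel, -v_clip, v_clip)      # velocity limits
        pos = pos + vel                          # position update (3)

        fit = np.array([sspso_fitness(p, X_L, y_L, X_U, beta) for p in pos])
        better = fit < pbest_fit                 # Substep 2
        pbest[better], pbest_fit[better] = pos[better], fit[better]
        if fit.min() < gbest_fit:                # Substep 3
            gbest, gbest_fit = pos[fit.argmin()].copy(), fit.min()

    return gbest.reshape(C, D)                   # optimal centroids

def sspso_predict(centroids, X):
    """Step 5: nearest-centroid labeling of the unlabeled samples."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)
```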

[Figure 5: Digit recognition on the USPS dataset with different algorithms (accuracy versus number of labeled samples for SSPSO, PSO, NN, and SVM). (a) Digits "0" and "8"; (b) digits "3", "5", and "8"; (c) digits "3", "8", and "9"; (d) digits "1", "2", "3", and "4".]

4. Experimental Results and Analysis

In this section, we assess our proposed method, SSPSO, on six UCI datasets, four artificial datasets, and the USPS handwritten dataset. The datasets have different numbers of attributes and classes and involve different problems, including balanced and unbalanced ones.

To evaluate the performance of SSPSO, we compare its classification results with those of the PSO-based classifier, the traditional NN classifier, and the classical SVM classifier. In order to compare the algorithms fairly, all the parameters of PSO and SSPSO are selected to make them obtain their best results. The parameter settings are as follows. The inertia weight factor $\omega$ used in PSO and SSPSO decreases linearly from 0.9 to 0.4. Both $c_1$ and $c_2$ are set to 2. The velocity is restricted to the range $[-0.05, 0.05]$. The swarm size $N$ is set to 20, and the maximum number of iterations $T_{\max}$ is 1000. The parameters of SVM with the Gaussian kernel function are selected by a grid search on the training dataset.
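Under these settings, a driver for the sketches given earlier could be wired up as follows; `load_normalized_data` is a hypothetical placeholder for dataset loading and normalization, not a function from the paper.

```python
# Hypothetical driver using the experimental settings quoted above:
# N = 20 particles, T_max = 1000 iterations, c1 = c2 = 2, inertia
# decreasing linearly from 0.9 to 0.4, velocities clipped to [-0.05, 0.05].
X_L, y_L, X_U, X_test, y_test = load_normalized_data()  # made-up helper

centroids = sspso_train(X_L, y_L, X_U, C=3, beta=0.5,
                        N=20, T_max=1000, c1=2.0, c2=2.0,
                        w_start=0.9, w_end=0.4, v_clip=0.05)

accuracy = (sspso_predict(centroids, X_test) == y_test).mean()
print(f"test accuracy: {accuracy:.3f}")
```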

[Figure 6: Recognition of digits "3", "5", and "8" on the USPS dataset by SSPSO with different numbers of unlabeled samples (10, 50, 100, 200, 400, and 600); accuracy versus the number of labeled samples.]

In addition, we analyze the effect of the number of unlabeled samples on the classification accuracy on the USPS dataset. In order to test the robustness of the classification performance to the parameter $\beta$ in the fitness function, we conduct experiments on the UCI datasets with different values of $\beta$ and analyze its effect on the classification performance.

4.1. Artificial Two-Dimensional Problems. To test the feasibility of SSPSO for classification, the proposed method is first evaluated on four artificial two-dimensional datasets, namely, long1, sizes5, square1, and square4. The details are given in Table 2, and the distributions of the four datasets are shown in Figure 1.

In the experiments, for the first two datasets we randomly select 1~30 labeled samples per class as the training data, and for the last two datasets we randomly select 5~40 labeled samples per class as the training set; the rest are used as the test set. Figure 2 plots the curves of accuracy with respect to the number of labeled samples, showing the average results over 100 runs of the proposed SSPSO compared with PSO, NN, and SVM on the four datasets. The weight factor $\beta$ in the fitness function of SSPSO (in (6)) is set to 0.5. From Figure 2, it can be observed that SSPSO obtains favorable classification results on all four datasets, which means that SSPSO is feasible for the classification problem. Among the four datasets, long1 is the easiest to classify: all four methods acquire 100% classification accuracy once the number of labeled samples per class exceeds 10. But when the labeled instances are few, for example, only 3 instances per class, PSO, NN, and SVM cannot classify all the test data correctly, while SSPSO still obtains 100% classification accuracy. In Figure 2(b), the performance difference among SSPSO, NN, and SVM is not noticeable when the number of labeled samples per class reaches 15, but when the number of labeled instances is small, for example, less than 10, SSPSO obtains obviously better accuracy than the other methods, because it utilizes the information of the unlabeled instances, which helps to capture the global structure.

Table 2: Artificial datasets used in experiments.

Data     Class number  Attribute number  Instances (per class)  Normalization
long1    2             2                 500/500                Yes
sizes5   3             2                 77/154/769             Yes
square1  4             2                 250/250/250/250        Yes
square4  4             2                 250/250/250/250        Yes

Table 3: UCI datasets used in experiments.

Data     Classes  Attributes  Instances (per class)  Normalization
Heart    2        13          150/120                Yes
Wine     3        13          59/71/48               Yes
Thyroid  3        5           150/35/30              Yes
Tae      3        5           49/50/52               Yes
SPECT    2        22          267                    Yes
Wdbc     2        30          357/212                Yes

For the square1 and square4 datasets, the superiority of SSPSO is even more apparent: for all the training scenarios, the best performance is achieved by the proposed method.

4.2. UCI Datasets. To further investigate the effectiveness of SSPSO for classification, we also conduct experiments on six real-life datasets from the UCI machine learning repository [33], with different numbers of attributes and classes. The datasets used in the experiments are described in Table 3.

For the datasets with 2 classes, we randomly select 1~15 labeled samples per class as the training data, and for the datasets with 3 classes, we randomly select 1~10 labeled samples per class; the rest are used as the test set. The results are averaged over 100 runs. The weight factor $\beta$ in the fitness function of SSPSO (in (6)) is set to 0.5. Figure 3 shows the classification accuracy with different numbers of training samples on the six datasets.
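This sampling protocol is easy to state in code; below is a hedged sketch of one random draw, with helper and variable names of our own choosing. Repeating such draws 100 times and averaging the resulting accuracies mirrors the reporting protocol described above.

```python
import numpy as np

def sample_labeled_per_class(X, y, n_per_class, rng):
    """Draw n labeled samples per class at random; the rest form the test pool."""
    train_idx = []
    for c in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == c))
        train_idx.extend(idx[:n_per_class])
    train_idx = np.asarray(train_idx)
    test_mask = np.ones(len(y), dtype=bool)
    test_mask[train_idx] = False
    return X[train_idx], y[train_idx], X[test_mask], y[test_mask]

# One trial on a normalized UCI dataset (loading of X, y omitted):
# rng = np.random.default_rng(0)
# X_lab, y_lab, X_rest, y_rest = sample_labeled_per_class(X, y, 5, rng)
```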

From Figures 3(a), 3(b), 3(c), and 3(e), it can be observed that the proposed SSPSO method outperforms the other three methods on the Heart, Wine, Thyroid, and SPECT datasets, especially when the number of labeled samples per class is small. This is because SSPSO uses the information of the available unlabeled data, which benefits the classification. As the number of labeled samples increases, this superiority weakens. From Figure 3(d), it is seen that SSPSO obtains accuracy comparable to the other three methods. From Figure 3(f), SSPSO is slightly better than SVM but much better than the PSO and NN methods. Therefore, it can be concluded that SSPSO works well for some real-life classification tasks, especially when the labeled samples are highly limited.

From an evolutionary point of view, Figure 4 reports the behavior of a typical run of SSPSO, carried out on the Thyroid dataset, in terms of the best individual fitness and the average fitness of the population as a function of the number of iterations. As can be seen, SSPSO shows two phases. In the first phase, lasting about 50 iterations, the fitness value decreases sharply, starting from 0.6712 for the best and 0.9192 for the average and reaching about 0.2305 for the best and 0.2356 for the average. A second phase follows, lasting about another 50 iterations, in which both the best and the average fitness values decrease slowly and draw closer together until they reach 0.2231 and 0.2247, respectively. After that, the average and the best fitness values become more and more similar; finally, both reach 0.2230.

[Figure 7: Classification accuracy as a function of $\beta$ in SSPSO on (a) Thyroid and (b) Heart, compared with NN and SVM.]

4.3. USPS Digit Recognition. We also conduct experiments on the USPS handwritten digits dataset to test the performance of SSPSO. This dataset consists of 9298 samples from 10 classes, each sample being a 16×16 pixel image. First, we apply principal component analysis to the dataset for feature extraction and select the first 10 principal components as the new features.
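A sketch of this preprocessing step, assuming scikit-learn's PCA as the implementation (the paper does not name a library) and that the USPS images are already loaded into an array:

```python
import numpy as np
from sklearn.decomposition import PCA

# usps_images: a (9298, 256) array of flattened 16x16 grayscale digits
# (loading the USPS data itself is omitted here).
features = PCA(n_components=10).fit_transform(usps_images)

# Rescale each principal component to [0, 1] so that particle entries
# drawn from [0, 1] can cover the feature range after normalization.
features = (features - features.min(axis=0)) / np.ptp(features, axis=0)
```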

We consider four subsets of the dataset in the experiments: the images of digits 0 and 8, with 2261 examples in total; the images of digits 3, 5, and 8, with 2248 examples in total; the images of digits 3, 8, and 9, with a total of 2429 examples; and the images of digits 1, 2, 3, and 4, with a total of 3874 examples. We randomly select 1~10 samples per class as the training data and randomly select 200 unlabeled samples to construct the unlabeled sample set $\mathbf{X}_U$, which is used for semisupervised learning. The weight factor $\beta$ in the fitness function of SSPSO (in (6)) is set to 0.7.

The recognition results, averaged over 100 independent trials, are summarized in Figure 5, where the horizontal axis represents the number of randomly labeled digit images per class in the subset and the vertical axis represents the classification accuracy. Figure 5(a) shows that when the number of labeled samples per class is below 14, SSPSO obtains performance comparable to SVM and KNN and better than PSO; in particular, SSPSO outperforms the other methods when the labeled samples are few. For the results on the USPS subsets of digits 3, 5, and 8 and of digits 3, 8, and 9, shown in Figures 5(b) and 5(c), respectively, one can clearly see that the SSPSO method outperforms SVM and is much better than the KNN and PSO methods when the number of labeled samples is small. In Figure 5(d), SSPSO still works better than the other methods, but its superiority decreases as the number of labeled samples increases.

4.4. Sensitivity Analysis of the Number of Unlabeled Samples. In this section, we examine the effect of the number of unlabeled samples on the classification accuracy. The experiment is carried out on the subset of the USPS dataset with the digit images of 3, 5, and 8. We vary the size of the unlabeled set $\mathbf{X}_U$ over 10, 50, 100, 200, 400, and 600. Figure 6 plots the classification accuracy as a function of the size of the unlabeled set and the number of labeled samples. From Figure 6, one can see that the number of unlabeled samples affects the accuracy only slightly when the number of labeled samples is small. The plot with 10 unlabeled samples shows much lower accuracy than the others, which indicates that the number of unlabeled samples should not be too small. As the size of $\mathbf{X}_U$ increases, SSPSO obtains better classification accuracy, because the proposed method can capture the real structure of the whole dataset more precisely with more unlabeled samples; however, the gaps between the curves of SSPSO with different sizes of $\mathbf{X}_U$ become very small. It follows from this experiment that, once the unlabeled dataset $\mathbf{X}_U$ reaches a certain scale, adding more unlabeled data may increase the classification accuracy of SSPSO slightly, but it also brings higher computation cost. Therefore, in the experiments, 100 to 200 unlabeled samples are appropriate for SSPSO.

4.5. Sensitivity Analysis of the Parameter $\beta$. The parameter $\beta$ in the fitness function is important in the proposed SSPSO, as it controls the relative contributions of the information obtained from the labeled and the unlabeled samples. In this section, we analyze the sensitivity of SSPSO to $\beta$. The experiments are conducted on two UCI datasets, Thyroid and Heart. We randomly select 5 samples per class to form the labeled dataset, and the rest are used for testing. The mean results over 100 randomly selected training datasets with different values of $\beta$ are shown in Figure 7.

From Figures 7(a) and 7(b), it can be observed that, for some values of $\beta$, SSPSO is not better than the NN and SVM methods. When $\beta$ is small, SSPSO may perform badly, with accuracy much lower than that of NN and SVM: a small $\beta$ weakens the effect of the labeled samples and strengthens that of the unlabeled samples, making the method behave much like unsupervised learning. As $\beta$ increases, the accuracy rises sharply. After $\beta$ reaches 0.4 for the Thyroid dataset and 0.5 for the Heart dataset, the performance remains stable and even decreases slightly. To balance the effects of the labeled instances and the available unlabeled instances, $\beta$ is set to 0.5 in our experiments.
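The sweep itself is mechanical; a sketch reusing the helpers from the earlier snippets is shown below. Here the held-out pool serves as both the unlabeled set and the test set, which is our assumption rather than a detail stated in the paper.

```python
import numpy as np

# Sweep beta over a grid, averaging accuracy over repeated random draws
# of 5 labeled samples per class. X, y hold a normalized UCI dataset
# (loading omitted); helpers are the sketches defined earlier.
rng = np.random.default_rng(0)
C = len(np.unique(y))
for beta in np.arange(0.1, 1.0, 0.1):
    accs = []
    for _ in range(100):
        X_lab, y_lab, X_rest, y_rest = sample_labeled_per_class(X, y, 5, rng)
        centroids = sspso_train(X_lab, y_lab, X_rest, C, beta=beta)
        accs.append((sspso_predict(centroids, X_rest) == y_rest).mean())
    print(f"beta={beta:.1f}  mean accuracy={np.mean(accs):.3f}")
```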

5. Conclusions

In this paper, a semisupervised PSO method for classification, SSPSO, has been proposed. PSO is used to find the centroids of the classes. In order to take advantage of the large number of unlabeled instances, the method builds on the assumption that nearby instances in feature space should share the same label. Since both the discriminative information provided by the labeled samples and the global distribution information provided by the large number of unlabeled samples are used to find the collection of centroids, SSPSO obtains better performance than the traditional PSO classification method. In the experiments, four artificial datasets, six real-life datasets from the UCI machine learning repository, and the USPS handwritten dataset are used to evaluate the effectiveness of the method. The experimental results demonstrate that the proposed SSPSO method performs well and obtains higher accuracy than the traditional PSO classification method, the NN method, and SVM when only a few labeled samples are available.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (nos. 61272282, 61203303, 61272279, 61377011, and 61373111), the Program for New Century Excellent Talents in University (NCET-13-0948), and the Program for New Scientific and Technological Star of Shaanxi Province (no. 2014KJXX-45).

References

[1] R. Eberhart and J. Kennedy, "A new optimizer using particle swarm theory," in Proceedings of the 6th International Symposium on Micro Machine and Human Science, pp. 39–43, October 1995.

[2] J. Kennedy and R. Eberhart, "Particle swarm optimization," in Proceedings of the IEEE International Conference on Neural Networks (ICNN '95), pp. 1942–1948, December 1995.

[3] R. C. Eberhart and Y. Shi, "Particle swarm optimization: developments, applications and resources," in Proceedings of the Congress on Evolutionary Computation (CEC '01), pp. 81–86, May 2001.

[4] A. Paul, A. A. Victoire, and A. E. Jeyakumar, "Particle swarm approach for retiming in VLSI," in Proceedings of the 46th IEEE Mid-West Symposium on Circuits and Systems, pp. 1532–1535, 2003.

[5] X. Zhang, W. Wang, Y. Li, and L. Jiao, "PSO-based automatic relevance determination and feature selection system for hyperspectral image classification," IET Electronics Letters, vol. 48, no. 20, pp. 1263–1265, 2012.

[6] R. J. Lewis, "An introduction to classification and regression tree (CART) analysis," in Annual Meeting of the Society for Academic Emergency Medicine, San Francisco, Calif, USA, pp. 1–14, May 2000.

[7] K. R. Müller, S. Mika, G. Rätsch, K. Tsuda, and B. Schölkopf, "An introduction to kernel-based learning algorithms," IEEE Transactions on Neural Networks, vol. 12, pp. 181–201, 2001.

[8] C. J. C. Burges, "A tutorial on support vector machines for pattern recognition," Data Mining and Knowledge Discovery, vol. 2, no. 2, pp. 121–167, 1998.

[9] L. P. Wang, Ed., Support Vector Machines: Theory and Applications, Springer, Berlin, Germany, 2005.

[10] L. P. Wang, B. Liu, and C. R. Wan, "Classification using support vector machines with graded resolution," in Proceedings of the IEEE International Conference on Granular Computing, vol. 2, pp. 666–670, July 2005.

[11] G. P. Zhang, "Neural networks for classification: a survey," IEEE Transactions on Systems, Man, and Cybernetics C: Applications and Reviews, vol. 30, no. 4, pp. 451–462, 2000.

[12] J. Trejos-Zelaya and M. Villalobos-Arias, "Partitioning by particle swarm optimization," in Selected Contributions in Data Analysis and Classification, Studies in Classification, Data Analysis, and Knowledge Organization, pp. 235–244, 2007.

[13] M. Sugisaka and X. Fan, "An effective search method for neural network based face detection using particle swarm optimization," IEICE Transactions on Information and Systems, vol. E88-D, no. 2, pp. 214–222, 2005.

[14] Z. Q. Wang, X. Sun, and D. X. Zhang, "A PSO-based classification rule mining algorithm," in Proceedings of the 3rd International Conference on Intelligent Computing: Advanced Intelligent Computing Theories and Applications, pp. 377–384, 2007.

[15] T. Sousa, A. Silva, and A. Neves, "Particle swarm based data mining algorithms for classification tasks," Parallel Computing, vol. 30, no. 5-6, pp. 767–783, 2004.

[16] Y. J. Zheng, H. F. Ling, J. Y. Xue, et al., "Population classification in fire evacuation: a multiobjective particle swarm optimization approach," IEEE Transactions on Evolutionary Computation, vol. 18, no. 1, pp. 70–81, 2014.

[17] N. P. Holden and A. A. Freitas, "A hybrid PSO/ACO algorithm for classification," in Proceedings of the 9th Annual Genetic and Evolutionary Computation Conference (GECCO '07), pp. 2745–2750, July 2007.

[18] X. Zhu, "Semi-supervised learning literature survey," Computer Sciences Technical Report 1530, University of Wisconsin-Madison, 2008.

[19] A. Paul, "Dynamic power management for ubiquitous network devices," Advanced Science Letters, vol. 19, no. 7, pp. 2046–2049, 2013.

[20] J. Wu, "A framework for learning comprehensible theories in XML document classification," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 1, pp. 1–14, 2012.

[21] I. Koprinska, J. Poon, J. Clark, and J. Chan, "Learning to classify e-mail," Information Sciences, vol. 177, no. 10, pp. 2167–2187, 2007.

[22] X. Zhao, X. Li, Y. P. Chao, and S. Wang, "Human action recognition based on semi-supervised discriminant analysis with global constraint," Neurocomputing, vol. 105, pp. 45–50, 2013.

[23] A. Cervantes, I. M. Galvan, and P. Isasi, "AMPSO: a new particle swarm method for nearest neighborhood classification," IEEE Transactions on Systems, Man, and Cybernetics B: Cybernetics, vol. 39, no. 5, pp. 1082–1091, 2009.

[24] T. M. Cover and P. E. Hart, "Nearest neighbor pattern classification," IEEE Transactions on Information Theory, vol. 13, no. 1, pp. 21–27, 1967.

[25] F. Aurenhammer, "Voronoi diagrams: a survey of a fundamental geometric data structure," ACM Computing Surveys, vol. 23, no. 3, pp. 345–405, 1991.

[26] F. Fernandez and P. Isasi, "Evolutionary design of nearest prototype classifiers," Journal of Heuristics, vol. 10, no. 4, pp. 431–454, 2004.

[27] I. De Falco, A. Della Cioppa, and E. Tarantino, "Evaluation of particle swarm optimization effectiveness in classification," Lecture Notes in Computer Science, vol. 3849, pp. 164–171, 2006.

[28] K. Y. Huang, "A hybrid particle swarm optimization approach for clustering and classification of datasets," Knowledge-Based Systems, vol. 24, no. 3, pp. 420–426, 2011.

[29] L. Zhang, L. Wang, and W. Lin, "Semisupervised biased maximum margin analysis for interactive image retrieval," IEEE Transactions on Image Processing, vol. 21, no. 4, pp. 2294–2308, 2012.

[30] B. Liu, C. Wan, and L. Wang, "An efficient semi-unsupervised gene selection method via spectral biclustering," IEEE Transactions on Nanobioscience, vol. 5, no. 2, pp. 110–114, 2006.

[31] X. Zhang, Y. He, N. Zhou, and Y. Zheng, "Semisupervised dimensionality reduction of hyperspectral images via local scaling cut criterion," IEEE Geoscience and Remote Sensing Letters, vol. 10, no. 6, pp. 1547–1551, 2013.

[32] Y. Shi and R. Eberhart, "A modified particle swarm optimizer," in Proceedings of the IEEE International Conference on Evolutionary Computation, pp. 69–73, 1998.

[33] D. J. Newman, S. Hettich, C. L. Blake, and C. J. Merz, "UCI Repository of machine learning databases," Department of Information and Computer Science, University of California, Irvine, Calif, USA, 1998, http://www.ics.uci.edu/~mlearn/MLRepository.html.

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 7: Research Article Semisupervised Particle Swarm …downloads.hindawi.com/journals/mpe/2014/832135.pdfSemisupervised Particle Swarm Optimization for Classification XiangrongZhang, 1

Mathematical Problems in Engineering 7

0 5 10 15 20076

078

08

082

084

086

088

09

092

094

096

Number of labeled samples

Accu

racy

SSPSOPSO

NNSVM

(a)

0 5 10 15 20 25 30

065

07

075

08

085

09

095

Number of labeled samples

Accu

racy

SSPSOPSO

NNSVM

(b)

0 5 10 15 20 25 30065

07

075

08

085

09

Number of labeled samples

Accu

racy

SSPSOPSO

NNSVM

(c)

0 5 10 15 20 25 30 35 4007

075

08

085

09

095

Number of labeled samples

Accu

racy

SSPSOPSO

NNSVM

(d)

Figure 5 Digit recognition on the USPS dataset with different algorithms (a) Digits ldquo0rdquo and 8rdquo (b) digits ldquo3rdquo ldquo5rdquo and ldquo8rdquo (c) digits ldquo3rdquo ldquo8rdquoand ldquo9rdquo and (d) digits ldquo1rdquo ldquo2rdquo ldquo3rdquo and ldquo4rdquo

4 Experimental Results and Analysis

In this section we assess our proposed method SSPSOon six UCI datasets four artificial datasets and the USPShandwritten dataset The datasets have different attributesand classes involving different problems including balancedand unbalanced ones

To evaluate the performance of SSPSO we make com-parisons of the classification results with the PSO-basedclassifier the traditional NN classifier and the classical SVM

classifier In order to compare the algorithms reasonably allthe parameters of PSO and SSPSO are selected to make themobtain the best results The parameter settings are as followsThe inertia weight factor120596 used in PSO and SSPSO decreaseslinearly from09 to 04 Both 119888

1and 1198882are set to 2The velocity

is defined in the range [minus005 005] The swarm scale119873 is setto 20 and the maximum number of iterations 119879max is 1000The parameters of SVM with Gaussian kernel function areselected by using the gridding search method on the trainingdataset

8 Mathematical Problems in Engineering

06507

07508

08509

095

Accu

racy

5 10 15 20 25 30 35 40 45 50 55 60Number of labeled samples

SSPSO-10SSPSO-50SSPSO-100

SSPSO-200SSPSO-400SSPSO-600

Figure 6 3 5 and 8 digit recognition on the USPS dataset bySSPSO with different numbers of unlabeled samples

In addition we analyze the effect of the number ofunlabeled samples on the classification accuracy on USPSdataset In order to test the robustness of the parameter120573 in the fitness function to the classification performancewe conduct experiments on UCI datasets with differentvalues of 120573 and analyze the effect of 120573 on the classificationperformance

41 Artificial Two-Dimension Problems To test the feasibilityof SSPSO for classification the proposed method is firstconducted on four artificial two-dimension datasets that islong1 sizes5 square1 and square4 The details are shown inTable 2 and the distributions of the four datasets are shownin Figure 1

In the experiments for the first two datasets we randomlyselect 1sim30 labeled samples per class as the training data andfor the last two datasets we randomly select 5sim40 labeledsamples per class as the training set and the rest are usedas the test set Figure 2 plots the curves of accuracy withrespect to the number of labeled samples which showsthe average results over 100 runs of the proposed SSPSOcomparing with PSO NN and SVM on the four datasetsThe weight factor 120573 in the fitness function of SSPSO (in(6)) is selected as 05 From Figure 2 it can be observedthat SSPSO can obtain favorable classification results on thefour datasets which means that SSPSO is feasible for theclassification problem Among the four datasets long1 is theeasiest to classify on which all the four methods acquire100 classification accuracy when the number of labeledsamples per class exceeds 10 But when the labeled instancesare few for example only 3 instances per class are labeledPSO NN and SVM cannot classify all the test data correctlywhile SSPSO can still obtain 100 classification accuracyIn Figure 2(b) the performance difference among SSPSONN and SVM is not noticeable when the number of labeledsamples per class is up to 15 but when the number of labeledinstances is small for example less than 10 SSPSO can obtainobvious better accuracy than the other methods It is becauseSSPSO utilizes the information of unlabeled instances whichis helpful to capture the global structure For square1 and

Table 2 Artificial datasets used in experiments

Data Classnumber

Attributenumber Instance number Normalization

long1 2 2 500500 Yessizes5 3 2 77154769 Yessquare1 4 2 250250250250 Yessquare4 4 2 250250250250 Yes

Table 3 UCI datasets used in experiments

Data Class Attributes Instances NormalizationHeart 2 13 150120 YesWine 3 13 597148 YesThyroid 3 5 1503530 YesTae 3 5 495052 YesSPECT 2 22 267 YesWdbc 3 30 357212 Yes

square4 datasets the superiority of SSPSO is more apparentthat is for all the training scenarios the best performance isachieved by the proposed method

42 UCI Dataset To further investigate the effectiveness ofSSPSO for classification we also conduct the experiments onsix real-life datasets with different numbers of attributes andclasses from the UCI machine learning repository [33] Thedescription of the datasets used in experiments is given inTable 3

For datasets with 2 classes we randomly select 1sim15labeled samples per class as the training data and for datasetswith 3 classes we randomly select 1sim10 labeled samples perclass as the training data and the rest are used as the testset The results are averaged over 100 runs The weight factor120573 in the fitness function of SSPSO (in (6)) is selected as05 Figure 3 shows the classification accuracy with differentnumbers of training samples on the 6 datasets

From Figures 3(a) 3(b) 3(c) and 3(e) it can be observedthat the proposed SSPSOmethodoutperforms the other threemethods on the Heart Wine Thyroid and SPECT datasetsespecially when the number of the labeled samples per classis small It is because that SSPSO uses the information ofavailable unlabeled data which is of benefit to the classifica-tionWith the increase of the labeled samples the superioritybecomes weak From Figure 3(d) it is seen that SSPSO canobtain comparative accuracy with the other three methodsFrom Figure 3(f) SSPSO is slightly better than SVM butit is much better than PSO and NN methods Thereforeit can be concluded that SSPSO works well for some real-life classification tasks especially in the case that the labeledsamples are highly limited

From an evolutionary point of view in Figure 4 we reportthe behavior of a typical run of SSPSO in terms of the bestindividual fitness and average fitness in the population as afunction of the number of iterations It is carried out on theThyroid database As can be seen SSPSO shows two phasesIn the first phase with about 50 iterations the fitness value

Mathematical Problems in Engineering 9

0201 03 04 05 06 07 08 0904

05

06

07

08

09

1Ac

cura

cy

SSPSONNSVM

120573

(a)

SSPSONNSVM

01 02 03 04 05 06 07 08 09055

06

065

07

075

08

Accu

racy

120573

(b)

Figure 7 Classification accuracy as a function of 120573 in SSPSO on (a) Thyroid and (b) Heart

decreases sharply starting from 06712 for the best and 09192for the average and reaching about 02305 for the best and02356 for the averageThen the second phase follows lastingabout 50 iterations in which both the best and the averagefitness values decrease slowly and tend to become closer andcloser until they reach 02231 and 02247 respectively Andthen the average and the best fitness values becomemore andmore similar Finally both the two values get to 02230

43 USPS Digital Recognition We also conduct experimentson the USPS handwritten digits dataset to test the perfor-mance of SSPSO This dataset consists of 9298 samples with10 classes and each sample has the size of 16times16 pixels Firstlywe apply the principal component analysis on the dataset forfeature extraction and select the first 10 principle componentsas the new features

We consider four subsets of the dataset in the experimentthat is the images of digits 0 and 8 with 2261 examples intotal the images of digits 3 5 and 8 with 2248 examples intotal the images of digits 3 8 and 9 with a total numberof 2429 examples and the images of digits 1 2 3 and 4with a total number of 3874 examples We randomly select1sim10 samples per class respectively as the training dataand randomly select 200 unlabeled samples to construct theunlabeled sample set X

119880 which is used for semisupervised

learningThe weight factor 120573 in the fitness function of SSPSO(in (6)) is selected as 07

The recognition results averaged over 100 independenttrials are summarized in Figure 5 where the horizontal axisrepresents the number of randomly labeled digital imagesper class in the subset and the vertical axis represents

the classification accuracy From Figure 5(a) it is shown thatwhen the number of labeled samples per class is below 14SSPSO can obtain comparable performance with SVM andKNN and be better than PSO In particular SSPSO canoutperform the other methods when the labeled samples arefew For the results on the USPS subset of digits 3 5 and 8and the subset of digits 3 8 and 9 shown in Figures 5(b)and 5(c) respectively one can clearly see that SSPSOmethodoutperforms SVM and is much better than KNN and PSOmethods when the number of labeled samples is small InFigure 5(d) SSPSO still works better than the other methodsbut the superiority of the proposed SSPSO over the othermethods decrease with the increase of labeled samples

44 The Sensitivity Analysis of the Number of UnlabeledSamples In this section we validate the effect of the numberof the unlabeled samples on the classification accuracy Thisexperiment is carried on the subset of the USPS datasetwith the digit images of 3 5 and 8 We vary the size ofthe unlabeled set X

119880to be 10 50 100 200 400 and 600

Figure 6 illustrates the classification accuracy as the functionof the size of the unlabeled set and the number of labeledsamples From Figure 6 one can see that the number ofthe unlabeled samples affects the accuracy slightly when thenumber of the labeled samples is small The plot with 10unlabeled samples gets much lower accuracy than the otherplots which indicates that the number of unlabeled samplesused should not be too small With the increase of the size ofX119880 SSPSO can obtain better classification accuracy because

the proposed method can capture the real structure of thewhole dataset more precisely with more unlabeled samples

10 Mathematical Problems in Engineering

but the gaps between the plots of SSPSOwith different sizes ofX119880become very small It is noted from the above experiment

that for an unlabeled dataset X119880with a certain scale when

more unlabeled data are added the classification accuracyof SSPSO may increase a bit but it will also bring highercomputation cost So in the experiments 100 to 200 unlabeledsamples are proper to use for SSPSO

45 The Sensitivity Analysis of the Parameter 120573 120573 in fitnessfunction is an important parameter in our proposed SSPSOwhich controls the contributions of information obtainedfrom the labeled and unlabeled samples to the classificationIn this section we will analyze the sensitivity of 120573 in SSPSOThe experiments are conducted on two UCI datasets that isThyroid andHeart We randomly select 5 samples per class toform the labeled dataset and the rest are used for test Themean results over 100 times of randomly selected trainingdatasets with different values of 120573 are shown in Figure 7

From Figures 7(a) and 7(b) it can be observed that withdifferent values of 120573 SSPSO is not always better than NN andSVM methods When 120573 is small SSPSO may obtain a badperformance that is the accuracy ismuch lower thanNNandSVM When the value of 120573 is small the effect of the labeledsamples is weakened while the effect of the unlabeled samplesis strengthened which is much more like the unsupervisedlearning With the increase of 120573 the accuracy raises sharplyAfter120573 gets to 04 for theThyroid dataset and 05 for theHeartdataset the performance keeps stable and even decreasesmore or less To balance the effects of the labeled instancesand the available unlabeled instances 120573 is set to 05 in ourexperiments

5 Conclusions

In this paper a semisupervised PSOmethod for classificationhas been proposed PSO is used to find the centroids ofthe classes In order to take advantage of the amount ofunlabeled instances a semisupervised classification methodis proposed based on the assumption that near instancesin feature space should have the same labels Since thediscriminative information provided by the labeled samplesand the global distribution information provided by the largenumber of unlabeled samples is used to find a collection ofcentroids SSPSO obtains better performance than traditionalPSO classification method In the experiments four artificialdatasets six real-life datasets from the UCI machine learningrepository and the USPS handwritten dataset are applied toevaluate the effectiveness of the method The experimentalresults demonstrated that our proposed SSPSO method hasa good performance and can obtain higher accuracy incomparison to the traditional PSO classificationmethod NNmethod and SVM when there are only few labeled samplesavailable

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work was supported by the National Natural ScienceFoundation of China (nos 61272282 61203303 6127227961377011 and 61373111) the Program for New Century Excel-lent Talents in University (NCET-13-0948) and the ProgramforNew Scientific andTechnological Star of Shaanxi Province(No 2014KJXX-45)

References

[1] R Eberhart and J Kennedy ldquoNew optimizer using particleswarm theoryrdquo in Proceedings of the 6th International Sympo-sium onMicroMachine and Human Science pp 39ndash43 October1995

[2] J Kennedy and R Eberhart ldquoParticle swarm optimizationrdquoin Proceedings of the IEEE International Conference on NeuralNetworks (ICNN rsquo95) pp 1942ndash1948 December 1995

[3] R C Eberhart and Y Shi ldquoParticle swarm optimizationdevelopments applications and resourcesrdquo in Proceedings of theCongress on Evolutionary Computation (CEC rsquo01) pp 81ndash86May 2001

[4] A Paul A A Victoire and A E Jeyakumar ldquoParticle swarmapproach for retiming in VLSIrdquo in Proceedings of the 46th IEEEMid-West Symposium on Circuits and Systems pp 1532ndash15352003

[5] X Zhang W Wang Y Li and L Jiao ldquoPSO-based automaticrelevance determination and feature selection system for hyper-spectral image classificationrdquo IET Electronics Letters vol 48 no20 pp 1263ndash1265 2012

[6] R J Lewis ldquoAn introduction to classification and regression tree(CART) analysisrdquo inAnnual Meeting of the Society for AcademicEmergency Medicine in San Francisco Calif USA pp 1ndash14 May2000

[7] K R Muller S Mika G Ratsch K Tsuda and B ScholkopfldquoAn introduction to kernel-based learning algorithmsrdquo IEEETransactions on Neural Network vol 12 pp 181ndash201 2001

[8] C J C Burges ldquoA tutorial on support vector machines forpattern recognitionrdquo Data Mining and Knowledge Discoveryvol 2 no 2 pp 121ndash167 1998

[9] L P Wang Ed Support Vector Machines Theory and Applica-tion Springer Berlin Germany 2005

[10] L P Wang B Liu and C R Wan ldquoClassification using supportvector machines with graded resolutionrdquo in Proceedings of theIEEE International Conference on Granular Computing vol 2pp 666ndash670 July 2005

[11] G P Zhang ldquoNeural networks for classification a surveyrdquo IEEETransactions on Systems Man and Cybernetics C Applicationsand Reviews vol 30 no 4 pp 451ndash462 2000

[12] J Trejos-Zelaya andM Villalobos-Arias ldquoPartitioning by parti-cle swarmoptimizationrdquo in Selected Contributions inDataAnal-ysis and Classification Studies in Classification Data Analysisand Knowledge Organization pp 235ndash244 2007

[13] M Sugisaka and X Fan ldquoAn effective search method forneural network based face detection using particle swarmoptimizationrdquo IEICE Transactions on Information and Systemsvol E88-D no 2 pp 214ndash222 2005

[14] Z Q Wang X Sun and D X Zhang ldquoA PSO-based clas-sification rule mining algorithmrdquo in Proceedings of the 3rdInternational Conference on Intelligent Computing Advanced

Mathematical Problems in Engineering 11

Intelligent Computing Theories and Applications pp 377ndash3842007

[15] T Sousa A Silva and A Neves ldquoParticle swarm based datamining algorithms for classification tasksrdquo Parallel Computingvol 30 no 5-6 pp 767ndash783 2004

[16] Y J Zheng H F Ling J Y Xue et al ldquoPopulation classificationin fire evacuation a multiobjective particle swarm optimizationapproachrdquo IEEETransactions onEvolutionaryComputation vol18 no 1 pp 70ndash81 2014

[17] N P Holden and A A Freitas ldquoA hybrid PSOACO algorithmfor classificationrdquo in Proceedings of the 9th Annual Genetic andEvolutionary Computation Conference (GECCO rsquo07) pp 2745ndash2750 July 2007

[18] X Zhu ldquoSemi-supervised learning literature surveyrdquo Com-puter Sciences Technical Report 1530 University of Wisconsin-Madison 2008

[19] A Paul ldquoDynamic power management for ubiquitous networkdevicesrdquo Advance Science Letters vol 19 no 7 pp 2046ndash20492013

[20] J Wu ldquoA framework for learning comprehensible theories inXMLdocument classificationrdquo IEEETransactions on Knowledgeand Data Engineering vol 24 no 1 pp 1ndash14 2012

[21] I Koprinska J Poon J Clark and J Chan ldquoLearning to classifye-mailrdquo Information Sciences vol 177 no 10 pp 2167ndash2187 2007

[22] X Zhao X Li Y P Chao and S Wang ldquoHuman actionrecognition based on semi-supervised discriminant analysiswith global constraintrdquo Neurocomputing vol 105 pp 45ndash502013

[23] A Cervantes I M Galvan and P Isasi ldquoAMPSO a new particleswarm method for nearest neighborhood classificationrdquo IEEETransactions on Systems Man and Cybernetics B Cyberneticsvol 39 no 5 pp 1082ndash1091 2009

[24] T M Cover and P E Hart ldquoNearest neighbor pattern classifi-cationrdquo IEEE Transactions on Information Theory vol 13 no 1pp 21ndash27 1967

[25] F Aurenhammer and V Diagrams ldquoA survey of a fundamentalgeometric data structurerdquo ACMComputing Surveys vol 23 no3 pp 345ndash405 1991

[26] F Fernandez and P Isasi ldquoEvolutionary design of nearestprototype classifiersrdquo Journal of Heuristics vol 10 no 4 pp 431ndash454 2004

[27] I de Falco A Della Cioppa and E Tarantino ldquoEvaluationof particle swarm optimization effectiveness in classificationrdquoLecture Notes in Computer Science vol 3849 pp 164ndash171 2006

[28] K Y Huang ldquoA hybrid particle swarm optimization approachfor clustering and classification of datasetsrdquo Knowledge-BasedSystems vol 24 no 3 pp 420ndash426 2011

[29] L Zhang L Wang and W Lin ldquoSemisupervised biased max-imum margin analysis for interactive image retrievalrdquo IEEETransactions on Image Processing vol 21 no 4 pp 2294ndash23082012

[30] B Liu C Wan and L Wang ldquoAn efficient semi-unsupervisedgene selection method via spectral biclusteringrdquo IEEE Transac-tions on Nanobioscience vol 5 no 2 pp 110ndash114 2006

[31] X Zhang Y He N Zhou and Y Zheng ldquoSemisuperviseddimensionality reduction of hyperspectral images via localscaling cut criterionrdquo IEEE Geoscience Remote Sensing Lettersvol 10 no 6 pp 1547ndash1551 2013

[32] Y Shi and R Eberhart ldquoA modified particle swarm optimizerrdquoin Proceedings of the IEEE International Conference on Evolu-tionary Computation pp 69ndash73 1998

[33] D J Newman S Hettich C L Blake and C J MerzldquoUCI Repository of machine learning databasesrdquo Depart-ment of Information and Computer Science University ofCalifornia Irvine Calif USA 1998 httpwwwicsuciedusimmlearnMLRepositoryhtml

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 8: Research Article Semisupervised Particle Swarm …downloads.hindawi.com/journals/mpe/2014/832135.pdfSemisupervised Particle Swarm Optimization for Classification XiangrongZhang, 1

8 Mathematical Problems in Engineering

06507

07508

08509

095

Accu

racy

5 10 15 20 25 30 35 40 45 50 55 60Number of labeled samples

SSPSO-10SSPSO-50SSPSO-100

SSPSO-200SSPSO-400SSPSO-600

Figure 6 3 5 and 8 digit recognition on the USPS dataset bySSPSO with different numbers of unlabeled samples

In addition we analyze the effect of the number ofunlabeled samples on the classification accuracy on USPSdataset In order to test the robustness of the parameter120573 in the fitness function to the classification performancewe conduct experiments on UCI datasets with differentvalues of 120573 and analyze the effect of 120573 on the classificationperformance

41 Artificial Two-Dimension Problems To test the feasibilityof SSPSO for classification the proposed method is firstconducted on four artificial two-dimension datasets that islong1 sizes5 square1 and square4 The details are shown inTable 2 and the distributions of the four datasets are shownin Figure 1

In the experiments for the first two datasets we randomlyselect 1sim30 labeled samples per class as the training data andfor the last two datasets we randomly select 5sim40 labeledsamples per class as the training set and the rest are usedas the test set Figure 2 plots the curves of accuracy withrespect to the number of labeled samples which showsthe average results over 100 runs of the proposed SSPSOcomparing with PSO NN and SVM on the four datasetsThe weight factor 120573 in the fitness function of SSPSO (in(6)) is selected as 05 From Figure 2 it can be observedthat SSPSO can obtain favorable classification results on thefour datasets which means that SSPSO is feasible for theclassification problem Among the four datasets long1 is theeasiest to classify on which all the four methods acquire100 classification accuracy when the number of labeledsamples per class exceeds 10 But when the labeled instancesare few for example only 3 instances per class are labeledPSO NN and SVM cannot classify all the test data correctlywhile SSPSO can still obtain 100 classification accuracyIn Figure 2(b) the performance difference among SSPSONN and SVM is not noticeable when the number of labeledsamples per class is up to 15 but when the number of labeledinstances is small for example less than 10 SSPSO can obtainobvious better accuracy than the other methods It is becauseSSPSO utilizes the information of unlabeled instances whichis helpful to capture the global structure For square1 and

Table 2 Artificial datasets used in experiments

Data Classnumber

Attributenumber Instance number Normalization

long1 2 2 500500 Yessizes5 3 2 77154769 Yessquare1 4 2 250250250250 Yessquare4 4 2 250250250250 Yes

Table 3 UCI datasets used in experiments

Data Class Attributes Instances NormalizationHeart 2 13 150120 YesWine 3 13 597148 YesThyroid 3 5 1503530 YesTae 3 5 495052 YesSPECT 2 22 267 YesWdbc 3 30 357212 Yes

square4 datasets the superiority of SSPSO is more apparentthat is for all the training scenarios the best performance isachieved by the proposed method

42 UCI Dataset To further investigate the effectiveness ofSSPSO for classification we also conduct the experiments onsix real-life datasets with different numbers of attributes andclasses from the UCI machine learning repository [33] Thedescription of the datasets used in experiments is given inTable 3

For datasets with 2 classes we randomly select 1sim15labeled samples per class as the training data and for datasetswith 3 classes we randomly select 1sim10 labeled samples perclass as the training data and the rest are used as the testset The results are averaged over 100 runs The weight factor120573 in the fitness function of SSPSO (in (6)) is selected as05 Figure 3 shows the classification accuracy with differentnumbers of training samples on the 6 datasets

From Figures 3(a) 3(b) 3(c) and 3(e) it can be observedthat the proposed SSPSOmethodoutperforms the other threemethods on the Heart Wine Thyroid and SPECT datasetsespecially when the number of the labeled samples per classis small It is because that SSPSO uses the information ofavailable unlabeled data which is of benefit to the classifica-tionWith the increase of the labeled samples the superioritybecomes weak From Figure 3(d) it is seen that SSPSO canobtain comparative accuracy with the other three methodsFrom Figure 3(f) SSPSO is slightly better than SVM butit is much better than PSO and NN methods Thereforeit can be concluded that SSPSO works well for some real-life classification tasks especially in the case that the labeledsamples are highly limited

From an evolutionary point of view in Figure 4 we reportthe behavior of a typical run of SSPSO in terms of the bestindividual fitness and average fitness in the population as afunction of the number of iterations It is carried out on theThyroid database As can be seen SSPSO shows two phasesIn the first phase with about 50 iterations the fitness value

Mathematical Problems in Engineering 9

0201 03 04 05 06 07 08 0904

05

06

07

08

09

1Ac

cura

cy

SSPSONNSVM

120573

(a)

SSPSONNSVM

01 02 03 04 05 06 07 08 09055

06

065

07

075

08

Accu

racy

120573

(b)

Figure 7 Classification accuracy as a function of 120573 in SSPSO on (a) Thyroid and (b) Heart

decreases sharply starting from 06712 for the best and 09192for the average and reaching about 02305 for the best and02356 for the averageThen the second phase follows lastingabout 50 iterations in which both the best and the averagefitness values decrease slowly and tend to become closer andcloser until they reach 02231 and 02247 respectively Andthen the average and the best fitness values becomemore andmore similar Finally both the two values get to 02230

43 USPS Digital Recognition We also conduct experimentson the USPS handwritten digits dataset to test the perfor-mance of SSPSO This dataset consists of 9298 samples with10 classes and each sample has the size of 16times16 pixels Firstlywe apply the principal component analysis on the dataset forfeature extraction and select the first 10 principle componentsas the new features

We consider four subsets of the dataset in the experimentthat is the images of digits 0 and 8 with 2261 examples intotal the images of digits 3 5 and 8 with 2248 examples intotal the images of digits 3 8 and 9 with a total numberof 2429 examples and the images of digits 1 2 3 and 4with a total number of 3874 examples We randomly select1sim10 samples per class respectively as the training dataand randomly select 200 unlabeled samples to construct theunlabeled sample set X

119880 which is used for semisupervised

learningThe weight factor 120573 in the fitness function of SSPSO(in (6)) is selected as 07

The recognition results averaged over 100 independenttrials are summarized in Figure 5 where the horizontal axisrepresents the number of randomly labeled digital imagesper class in the subset and the vertical axis represents

the classification accuracy From Figure 5(a) it is shown thatwhen the number of labeled samples per class is below 14SSPSO can obtain comparable performance with SVM andKNN and be better than PSO In particular SSPSO canoutperform the other methods when the labeled samples arefew For the results on the USPS subset of digits 3 5 and 8and the subset of digits 3 8 and 9 shown in Figures 5(b)and 5(c) respectively one can clearly see that SSPSOmethodoutperforms SVM and is much better than KNN and PSOmethods when the number of labeled samples is small InFigure 5(d) SSPSO still works better than the other methodsbut the superiority of the proposed SSPSO over the othermethods decrease with the increase of labeled samples

4.4. The Sensitivity Analysis of the Number of Unlabeled Samples. In this section, we examine the effect of the number of unlabeled samples on the classification accuracy. This experiment is carried out on the subset of the USPS dataset with the digit images 3, 5, and 8. We vary the size of the unlabeled set X_U over 10, 50, 100, 200, 400, and 600. Figure 6 illustrates the classification accuracy as a function of the size of the unlabeled set and the number of labeled samples. From Figure 6, one can see that the number of unlabeled samples affects the accuracy only slightly when the number of labeled samples is small. The plot with 10 unlabeled samples attains much lower accuracy than the other plots, which indicates that the number of unlabeled samples should not be too small. As the size of X_U increases, SSPSO obtains better classification accuracy, because the proposed method captures the real structure of the whole dataset more precisely with more unlabeled samples,


but the gaps between the plots of SSPSO with different sizes of X_U become very small. It is noted from the above experiment that, for an unlabeled dataset X_U of a certain scale, adding more unlabeled data may increase the classification accuracy of SSPSO a little, but it also brings higher computation cost. Therefore, in the experiments, 100 to 200 unlabeled samples are a proper choice for SSPSO.

4.5. The Sensitivity Analysis of the Parameter β. The parameter β in the fitness function is an important parameter of the proposed SSPSO: it controls the relative contributions of the information obtained from the labeled and the unlabeled samples to the classification. In this section, we analyze the sensitivity of SSPSO to β. The experiments are conducted on two UCI datasets, Thyroid and Heart. We randomly select 5 samples per class to form the labeled dataset, and the rest are used for testing. The mean results over 100 randomly selected training datasets, for different values of β, are shown in Figure 7.

From Figures 7(a) and 7(b), it can be observed that, with different values of β, SSPSO is not always better than the NN and SVM methods. When β is small, SSPSO may perform badly, that is, its accuracy is much lower than that of NN and SVM: a small β weakens the effect of the labeled samples and strengthens that of the unlabeled samples, so the method behaves much like unsupervised learning. As β increases, the accuracy rises sharply. After β reaches 0.4 for the Thyroid dataset and 0.5 for the Heart dataset, the performance stays stable or even decreases slightly. To balance the effects of the labeled instances and the available unlabeled instances, β is set to 0.5 in our experiments.
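To make the role of β concrete, here is a hedged sketch of a fitness of the general form described above: a β-weighted combination of a labeled-error term and an unlabeled-dispersion term, evaluated for one particle's candidate centroids and minimized by the swarm. The exact objective is given by (6) in the paper; this version is an illustrative assumption, not a transcription of it, and it presumes that row i of centroids is the prototype of class i.

```python
import numpy as np

def fitness(centroids, X_L, y_L, X_U, beta=0.5):
    """Assumed beta-weighted SSPSO-style fitness (to be minimized).

    centroids: (n_classes, n_features) candidate prototypes, one per class.
    """
    # Labeled term: error rate of nearest-centroid classification.
    d_L = np.linalg.norm(X_L[:, None, :] - centroids[None, :, :], axis=2)
    labeled_error = np.mean(np.argmin(d_L, axis=1) != y_L)

    # Unlabeled term: mean distance of each unlabeled sample to its
    # nearest centroid, i.e., how well the prototypes cover the data.
    d_U = np.linalg.norm(X_U[:, None, :] - centroids[None, :, :], axis=2)
    dispersion = np.mean(np.min(d_U, axis=1))

    return beta * labeled_error + (1 - beta) * dispersion
```

Under this assumed form, a small β downweights the labeled-error term and lets the dispersion term dominate, which matches the nearly unsupervised behavior observed above for small β.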

5. Conclusions

In this paper, a semisupervised PSO method (SSPSO) for classification has been proposed. PSO is used to find the centroids of the classes. In order to take advantage of the large amount of unlabeled instances, the semisupervised classification method is built on the assumption that instances that are near each other in feature space should share the same label. Since both the discriminative information provided by the labeled samples and the global distribution information provided by the large number of unlabeled samples are used to find the collection of centroids, SSPSO obtains better performance than the traditional PSO classification method. In the experiments, four artificial datasets, six real-life datasets from the UCI machine learning repository, and the USPS handwritten dataset are used to evaluate the effectiveness of the method. The experimental results demonstrate that the proposed SSPSO method performs well and obtains higher accuracy than the traditional PSO classification method, the NN method, and SVM when only a few labeled samples are available.
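Once the centroids are found, the decision rule is plain nearest-prototype assignment; a minimal sketch follows, again assuming that the row order of centroids encodes the class labels.

```python
import numpy as np

def predict(centroids, X):
    """Assign each sample the label of its nearest class centroid."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return np.argmin(d, axis=1)
```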

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (nos. 61272282, 61203303, 61272279, 61377011, and 61373111), the Program for New Century Excellent Talents in University (NCET-13-0948), and the Program for New Scientific and Technological Star of Shaanxi Province (no. 2014KJXX-45).

References

[1] R. Eberhart and J. Kennedy, "New optimizer using particle swarm theory," in Proceedings of the 6th International Symposium on Micro Machine and Human Science, pp. 39–43, October 1995.
[2] J. Kennedy and R. Eberhart, "Particle swarm optimization," in Proceedings of the IEEE International Conference on Neural Networks (ICNN '95), pp. 1942–1948, December 1995.
[3] R. C. Eberhart and Y. Shi, "Particle swarm optimization: developments, applications and resources," in Proceedings of the Congress on Evolutionary Computation (CEC '01), pp. 81–86, May 2001.
[4] A. Paul, A. A. Victoire, and A. E. Jeyakumar, "Particle swarm approach for retiming in VLSI," in Proceedings of the 46th IEEE Mid-West Symposium on Circuits and Systems, pp. 1532–1535, 2003.
[5] X. Zhang, W. Wang, Y. Li, and L. Jiao, "PSO-based automatic relevance determination and feature selection system for hyperspectral image classification," IET Electronics Letters, vol. 48, no. 20, pp. 1263–1265, 2012.
[6] R. J. Lewis, "An introduction to classification and regression tree (CART) analysis," in Annual Meeting of the Society for Academic Emergency Medicine, San Francisco, Calif, USA, pp. 1–14, May 2000.
[7] K. R. Muller, S. Mika, G. Ratsch, K. Tsuda, and B. Scholkopf, "An introduction to kernel-based learning algorithms," IEEE Transactions on Neural Networks, vol. 12, pp. 181–201, 2001.
[8] C. J. C. Burges, "A tutorial on support vector machines for pattern recognition," Data Mining and Knowledge Discovery, vol. 2, no. 2, pp. 121–167, 1998.
[9] L. P. Wang, Ed., Support Vector Machines: Theory and Application, Springer, Berlin, Germany, 2005.
[10] L. P. Wang, B. Liu, and C. R. Wan, "Classification using support vector machines with graded resolution," in Proceedings of the IEEE International Conference on Granular Computing, vol. 2, pp. 666–670, July 2005.
[11] G. P. Zhang, "Neural networks for classification: a survey," IEEE Transactions on Systems, Man, and Cybernetics C: Applications and Reviews, vol. 30, no. 4, pp. 451–462, 2000.
[12] J. Trejos-Zelaya and M. Villalobos-Arias, "Partitioning by particle swarm optimization," in Selected Contributions in Data Analysis and Classification, Studies in Classification, Data Analysis, and Knowledge Organization, pp. 235–244, 2007.
[13] M. Sugisaka and X. Fan, "An effective search method for neural network based face detection using particle swarm optimization," IEICE Transactions on Information and Systems, vol. E88-D, no. 2, pp. 214–222, 2005.
[14] Z. Q. Wang, X. Sun, and D. X. Zhang, "A PSO-based classification rule mining algorithm," in Proceedings of the 3rd International Conference on Intelligent Computing: Advanced Intelligent Computing Theories and Applications, pp. 377–384, 2007.
[15] T. Sousa, A. Silva, and A. Neves, "Particle swarm based data mining algorithms for classification tasks," Parallel Computing, vol. 30, no. 5-6, pp. 767–783, 2004.
[16] Y. J. Zheng, H. F. Ling, J. Y. Xue, et al., "Population classification in fire evacuation: a multiobjective particle swarm optimization approach," IEEE Transactions on Evolutionary Computation, vol. 18, no. 1, pp. 70–81, 2014.
[17] N. P. Holden and A. A. Freitas, "A hybrid PSO/ACO algorithm for classification," in Proceedings of the 9th Annual Genetic and Evolutionary Computation Conference (GECCO '07), pp. 2745–2750, July 2007.
[18] X. Zhu, "Semi-supervised learning literature survey," Computer Sciences Technical Report 1530, University of Wisconsin-Madison, 2008.
[19] A. Paul, "Dynamic power management for ubiquitous network devices," Advanced Science Letters, vol. 19, no. 7, pp. 2046–2049, 2013.
[20] J. Wu, "A framework for learning comprehensible theories in XML document classification," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 1, pp. 1–14, 2012.
[21] I. Koprinska, J. Poon, J. Clark, and J. Chan, "Learning to classify e-mail," Information Sciences, vol. 177, no. 10, pp. 2167–2187, 2007.
[22] X. Zhao, X. Li, Y. P. Chao, and S. Wang, "Human action recognition based on semi-supervised discriminant analysis with global constraint," Neurocomputing, vol. 105, pp. 45–50, 2013.
[23] A. Cervantes, I. M. Galvan, and P. Isasi, "AMPSO: a new particle swarm method for nearest neighborhood classification," IEEE Transactions on Systems, Man, and Cybernetics B: Cybernetics, vol. 39, no. 5, pp. 1082–1091, 2009.
[24] T. M. Cover and P. E. Hart, "Nearest neighbor pattern classification," IEEE Transactions on Information Theory, vol. 13, no. 1, pp. 21–27, 1967.
[25] F. Aurenhammer, "Voronoi diagrams: a survey of a fundamental geometric data structure," ACM Computing Surveys, vol. 23, no. 3, pp. 345–405, 1991.
[26] F. Fernandez and P. Isasi, "Evolutionary design of nearest prototype classifiers," Journal of Heuristics, vol. 10, no. 4, pp. 431–454, 2004.
[27] I. de Falco, A. Della Cioppa, and E. Tarantino, "Evaluation of particle swarm optimization effectiveness in classification," Lecture Notes in Computer Science, vol. 3849, pp. 164–171, 2006.
[28] K. Y. Huang, "A hybrid particle swarm optimization approach for clustering and classification of datasets," Knowledge-Based Systems, vol. 24, no. 3, pp. 420–426, 2011.
[29] L. Zhang, L. Wang, and W. Lin, "Semisupervised biased maximum margin analysis for interactive image retrieval," IEEE Transactions on Image Processing, vol. 21, no. 4, pp. 2294–2308, 2012.
[30] B. Liu, C. Wan, and L. Wang, "An efficient semi-unsupervised gene selection method via spectral biclustering," IEEE Transactions on Nanobioscience, vol. 5, no. 2, pp. 110–114, 2006.
[31] X. Zhang, Y. He, N. Zhou, and Y. Zheng, "Semisupervised dimensionality reduction of hyperspectral images via local scaling cut criterion," IEEE Geoscience and Remote Sensing Letters, vol. 10, no. 6, pp. 1547–1551, 2013.
[32] Y. Shi and R. Eberhart, "A modified particle swarm optimizer," in Proceedings of the IEEE International Conference on Evolutionary Computation, pp. 69–73, 1998.
[33] D. J. Newman, S. Hettich, C. L. Blake, and C. J. Merz, "UCI Repository of machine learning databases," Department of Information and Computer Science, University of California, Irvine, Calif, USA, 1998, http://www.ics.uci.edu/~mlearn/MLRepository.html.
