Hindawi Publishing Corporation, Mathematical Problems in Engineering, Volume 2016, Article ID 9329812, 14 pages; http://dx.doi.org/10.1155/2016/9329812



Research Article
Binary Classification of Multigranulation Searching Algorithm Based on Probabilistic Decision

Qinghua Zhang^{1,2} and Tao Zhang^{1}

^1 The Chongqing Key Laboratory of Computational Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
^2 School of Science, Chongqing University of Posts and Telecommunications, Chongqing 400065, China

Correspondence should be addressed to Qinghua Zhang; zhangqh@cqupt.edu.cn

Received 6 June 2016; Revised 5 September 2016; Accepted 26 September 2016

Academic Editor: Kishin Sadarangani

Copyright © 2016 Q. Zhang and T. Zhang. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Multigranulation computing, which adequately embodies the model of human intelligence in the process of solving complex problems, aims to decompose a complex problem into many subproblems in different granularity spaces; the subproblems are then solved and synthesized to obtain the solution of the original problem. In this paper, an efficient binary classification multigranulation searching algorithm, which has the optimal mathematical expectation of classification times for classifying the objects of the whole domain, is established. It can solve binary classification problems, such as the blood analysis case, based on both the multigranulation computing mechanism and probability statistics principles. Given the binary classifier, the negative sample ratio, and the total number of objects in the domain, this model can find the minimum mathematical expectation of classification times and the optimal classification granularity spaces for mining all the negative samples. The experimental results demonstrate that, as granules are divided into subgranules, the efficiency of the proposed method gradually increases and tends to be stable. In addition, the complexity of solving the problem is greatly reduced.

1. Introduction

With the rapid development of modern science and technology, the daily information which people face is dramatically increasing, and it is urgent to find a simple and effective way to process this complex information. The rise and development of multigranulation computing [1–8] have been promoted by this demand. Information granulation has attracted researchers' great attention since a paper focused on discussing information granulation was published by Professor Zadeh in 1997 [9]. In 1985, a paper named "Granularity" was published by Professor Hobbs at the International Joint Conference on Artificial Intelligence held in Los Angeles, United States; it focuses on the granulation of decomposition and synthesis and on how to obtain and generate different granularities [10]. These studies play a leading role not only in granular computing methodology [11–17] but also in dealing with complex information [18–22]. Subsequently, the number of studies focused on granular computing has rapidly increased. Many scholars successfully use the basic theoretical model of multigranulation computing to deal with practical problems [23–29]. Currently, multigranulation computing has become a basic theoretical model for solving complex problems and discovering knowledge from massive information [30, 31].

Multigranulation computing methods mainly aim to establish a multilevel or multidimensional computing model, in which solving strategies and synthesis strategies are found in different granularity spaces for solving a complex problem. Complex information is subdivided into lots of simple information in the different granularity spaces [32–35]; then effective solutions are obtained by applying data mining and knowledge discovery techniques to the simple information. So complex problems can be solved in different granularities or dimensions. For large-scale binary classification problems, it is an important issue to determine the class of all objects in the domain by using as few classification times as possible, and this issue has attracted many researchers' attention [36–39]. Suppose that we have a binary classification algorithm, $N$ is the number of all objects in the domain, and $p$ is the probability of negative samples in the domain; we need to give the class of every object in the domain. The traditional method searches each object one by one with the binary classification algorithm; it is simple but entails a heavy workload when facing a large set of objects. Therefore, some researchers have proposed group classification, in which each test object is composed of many samples, and this method improves the searching efficiency [40–42]. For example, how can we effectively complete binary classification tasks with minimum classification times when facing massive blood analysis? The traditional method is to test each person once. However, this is a heavy workload when facing many thousands of objects. To a certain extent, the single-level granulation method (namely, the single-level group testing method) can reduce the workload.

In 1986, Professors Mingmin and Junli proposed that using the single-level group testing method can reduce the workload of massive blood analysis when the prevalence rate $p$ of a sickness is less than about 0.3 [43]. In this method, all objects are subdivided into many small subgroups, and then every subgroup is tested. If the testing result of a subgroup is sickness, we diagnose each object by testing all the objects in this subgroup one by one; if its result is health, we can diagnose that all objects in this subgroup are healthy. But for millions or even billions of objects, can this kind of single-level granulation method still effectively solve the complex problem? At present, there are many methods for the single-level granulation searching model [44–54], but studies on multilevel granulation searching models are few [55–59]. A binary classification multilevel granulation searching algorithm, namely, an efficient multigranulation binary classification searching model based on the hierarchical quotient space structure, is proposed in this paper. The algorithm combines the falsity and truth preserving principles of quotient space theory with mathematical expectation theory. Obviously, assuming that $p$ is the probability of negative samples in the domain, the smaller $p$ is, the higher the efficiency of this algorithm. A large number of experimental results indicate that the proposed algorithm has high efficiency and universality.

The rest of this paper is organized as follows. First, some preliminary concepts and conclusions are reviewed in Section 2. Then, the binary classification multigranulation searching algorithm is proposed in Section 3. Next, the experimental analysis is discussed in Section 4. Finally, the paper is concluded in Section 5.

2. Preliminary

For convenience, some preliminary concepts are reviewed or defined at first.

Definition 1 (quotient space model [60]). Suppose that the triplet $(X, F, T)$ describes a problem space, or simply the space $(X, F)$, where $X$ denotes the universe, $F$ is the structure of the universe $X$, and $T$ indicates the attributes (or features) of the universe $X$. Suppose that $X$ represents the universe with the finest granularity. When we view the same universe $X$ from a coarser granularity, we have a coarse-granularity universe, denoted by $[X]$. Then we have a new problem space $([X], [F], [T])$. The coarser universe $[X]$ can be defined by an equivalence relation $R$ on $X$. That is, an element in $[X]$ is equivalent to a set of elements in $X$, namely, an equivalence class $[x]$. So $[X]$ consists of all equivalence classes induced by $R$. From $F$ and $T$, we can define the corresponding $[F]$ and $[T]$. Then we have a new space $([X], [F], [T])$, called a quotient space of the original space $(X, F, T)$.

Theorem 2 (falsity preserving principle [61]). If a problem $[A] \rightarrow [B]$ on the quotient space $([X], [F], [T])$ has no solution, then the problem $A \rightarrow B$ on its original space $(X, F, T)$ has no solution either. In other words, if $A \rightarrow B$ on $(X, F, T)$ has a solution, then $[A] \rightarrow [B]$ on $([X], [F], [T])$ has a solution as well.
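As an illustration of Definition 1 (a sketch, not part of the paper), a quotient set $[X]$ induced by an equivalence relation can be built in a few lines of Python; representing the relation $R$ by a `key` function is an assumption of this sketch:

```python
from collections import defaultdict

def quotient_set(X, key):
    """Partition the universe X into the equivalence classes induced by `key`:
    two elements are equivalent iff they share the same key value."""
    classes = defaultdict(list)
    for x in X:
        classes[key(x)].append(x)
    return list(classes.values())

# Example: granulate the integers 0..9 by remainder mod 3 (a coarser universe [X]).
blocks = quotient_set(range(10), key=lambda x: x % 3)

# The natural projection h maps each element of X to its equivalence class.
h = {x: tuple(b) for b in blocks for x in b}
```

Here `blocks` plays the role of $[X]$, and `h` is the natural projection of Theorem 3 below in finite form.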

Theorem 3 (truth preserving principle [61]). A problem $[A] \rightarrow [B]$ on the quotient space $([X], [F], [T])$ has a solution if, for every $[x]$, $h^{-1}([x])$ is a connected set on $X$ and the problem $A \rightarrow B$ on $(X, F, T)$ has a solution, where $h: X \rightarrow [X]$ is the natural projection defined as follows:

$$h(x) = [x], \qquad h^{-1}(u) = \{x \mid h(x) \in u\}. \tag{1}$$

Definition 4 (expectation [62]). Let $X$ be a discrete random variable. The expectation, or mean, of $X$ is defined as $\mu = E(X) = \sum_{x} x\,p(X = x)$, where $p(X = x)$ is the probability of $X = x$.

In the case that $X$ takes values from an infinite number set, $\mu$ becomes an infinite series. If the series converges absolutely, we say the expectation $E(X)$ exists; otherwise, we say that the expectation of $X$ does not exist.

Lemma 5 (see [43]). Let $f_q(x) = 1/x + 1 - q^x$ ($x = 2, 3, 4, \ldots$; $q \in (0, 1)$) be a function with an integer variable. If $f_q(x) < 1$ holds for some $x$, then $q \in (e^{-e^{-1}}, 1)$.
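A quick numeric check of Lemma 5 (a sketch, not part of the paper): $f_q(x)$ is the expected number of tests per object with group size $x$, and below the threshold $e^{-e^{-1}} \approx 0.692$ no integer group size brings it under 1, while for $q$ comfortably above the threshold some size does:

```python
import math

def f(q, x):
    """Expected tests per object with group size x: f_q(x) = 1/x + 1 - q^x."""
    return 1.0 / x + 1.0 - q ** x

def best_group_size(q, max_x=10_000):
    """Integer group size minimizing f_q over x = 2..max_x (brute force)."""
    return min(range(2, max_x + 1), key=lambda x: f(q, x))

threshold = math.exp(-math.exp(-1))  # e^{-1/e} ≈ 0.6922, from Lemma 5
```

For example, `f(0.75, best_group_size(0.75))` falls below 1, while for `q = 0.6` the minimum stays at or above 1.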

Lemma 5 is the basis of the following discussion

Lemma 6 (see [42, 63]). Let $f(x) = \lceil x \rceil$ denote the ceiling (top integral) function, and let $f_q(x) = 1/x + 1 - q^x$ ($x = 2, 3, 4, \ldots$; $q \in (0, 1)$) be a function with an integer variable. Then $f_q(x)$ reaches its minimum value when $x = \lceil 1/\sqrt{p + p^2/2} \rceil$ or $x = \lceil 1/\sqrt{p + p^2/2} \rceil + 1$, where $p = 1 - q$ and $x \in (1, 1 - 1/\ln q]$.

Lemma 7. Let $g(x) = x q^x$ ($e^{-e^{-1}} < q < 1$) be a function. If $2 \le c \le 1 - 1/\ln q$ and $1 \le i < c/2$, then the inequality $(g(i) + g(c - i))/2 \le (g((c - 1)/2) + g((c + 1)/2))/2 < g(c/2)$ holds.
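The closed form in Lemma 6 can be checked against brute force (a sketch, not from the paper; the $\lceil 1/\sqrt{p + p^2/2} \rceil$ form is a reconstruction motivated by $-\ln q = p + p^2/2 + O(p^3)$, so treat it as an assumption being verified):

```python
import math

def f(q, x):
    """f_q(x) = 1/x + 1 - q^x, expected tests per object for group size x."""
    return 1.0 / x + 1.0 - q ** x

def argmin_bruteforce(p, max_x=5_000):
    """Exhaustively find the integer group size minimizing f_q."""
    q = 1.0 - p
    return min(range(2, max_x + 1), key=lambda x: f(q, x))

def argmin_lemma6(p):
    """Lemma 6 candidate: ceil(1/sqrt(p + p^2/2)) or that value + 1."""
    c = math.ceil(1.0 / math.sqrt(p + p * p / 2.0))
    q = 1.0 - p
    return c if f(q, c) <= f(q, c + 1) else c + 1
```

For $p = 0.001$ both routes give the group size 32 used later in the text.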

(Figure 1: property of a convex function, illustrating the chain $(f(i) + f(c - i))/2 = M_2 < (f((c - 1)/2) + f((c + 1)/2))/2 = M_1 < f(c/2) = M_0$ on the curve $y = f(x)$.)

Proof. We can obtain the first and second derivatives of $g(x)$ as follows:

$$g'(x) = q^x (1 + x \ln q), \qquad g''(x) = q^x \ln q \,(2 + x \ln q),$$

$$g''(x) \begin{cases} < 0, & 1 \le x < x_0, \\ = 0, & x = x_0, \\ > 0, & x > x_0, \end{cases} \qquad \text{where } x_0 = -\frac{2}{\ln q},\ e^{-e^{-1}} < q < 1. \tag{2}$$

So we can draw the conclusion that $g(x)$ is a convex function (the property of a convex function is shown in Figure 1) when $1 \le x < 1 - 1/\ln q < x_0 = -2/\ln q$. According to the definition of a convex function, the inequality $(g(i) + g(c - i))/2 \le (g((c - 1)/2) + g((c + 1)/2))/2 < g(c/2)$ holds permanently. So Lemma 7 is proved completely.
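The inequality chain of Lemma 7 can also be verified numerically (a sketch, not part of the paper; the tolerance `1e-12` guards the equality case $i = (c-1)/2$ against floating-point noise):

```python
import math

def g(x, q):
    """g(x) = x * q^x, the function from Lemma 7."""
    return x * q ** x

def lemma7_holds(q, c):
    """Check (g(i)+g(c-i))/2 <= (g((c-1)/2)+g((c+1)/2))/2 < g(c/2)
    for all integers 1 <= i < c/2."""
    mid = (g((c - 1) / 2, q) + g((c + 1) / 2, q)) / 2
    if not mid < g(c / 2, q):
        return False
    return all((g(i, q) + g(c - i, q)) / 2 <= mid + 1e-12
               for i in range(1, math.ceil(c / 2)))
```

For instance, with $q = 0.9$ the admissible range is $2 \le c \le 1 - 1/\ln 0.9 \approx 10.5$, and the chain holds for every such $c$.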

3. Binary Classification of Multigranulation Searching Model Based on Probability and Statistics

Granulation is seen as a way of constructing simple theories out of more complex ones [1]. At the same time, the transformation between two different granularity layers is mainly based on the falsity and truth preserving principles in quotient space theory, and it can solve many classical problems, such as the scale ball game and branch decisions. All the above instances clearly embody the idea of solving complex problems with multigranulation methods.

In the second section, the relevant concepts of multigranulation computing theory and probability theory were reviewed. A multigranulation binary classification searching model not only solves practical complex problems at less cost but also can be easily constructed, and this model may also inspire further applications of multigranulation computing theory.

Generally, supposing a nonempty finite set $U = \{x_1, x_2, \ldots, x_n\}$, where $x_i$ ($i = 1, 2, 3, \ldots, n$) is a binary classification object, the triplet $(U, 2^U, p)$ is called a probability quotient space with probability $p$, where $p$ is the negative sample rate in $U$ and $2^U$ is the structure of the universe $U$. And let $(U_1, U_2, \ldots, U_t)$ ($\bigcup_{i=1}^{t} U_i = U$; $U_m \cap U_n = \emptyset$; $m, n \in \{1, 2, \ldots, t\}$; $m \ne n$) be called a random partition space on the probability quotient space. The following is an example of the binary classification multigranulation searching model.

Example 8. Assume that many people need a blood analysis in a physical examination for diagnosing a disease (there are two classes: normal, which stands for health, and abnormal, which stands for sickness); the domain $U = \{x_1, x_2, \ldots, x_n\}$ stands for all the people. Let $N$ denote the number of all people and let $p$ stand for the prevalence rate. So the quotient space of the blood analysis is $(U, 2^U, p)$. Besides, we also have a binary classifier (or a reagent) that diagnoses the disease by testing a blood sample. How can we complete all the blood analyses with the minimal classification times? Namely, how can we determine the class of all objects in the domain? There are usually three ways, as follows.

Method 9 (traditional method). In order to accurately diagnose all objects, every blood sample is tested one by one, so this method needs to search $N$ times. This method is just like the classification process of machine learning.

Method 10 (single-level granulation method). This method mixes $k$ blood samples into a group, where $k$ may be $1, 2, 3, \ldots, n$; namely, the original quotient space is randomly partitioned into $(U_1, U_2, \ldots, U_t)$ ($\bigcup_{i=1}^{t} U_i = U$; $U_m \cap U_n = \emptyset$; $m, n \in \{1, 2, \ldots, t\}$; $m \ne n$). Then each mixed blood group is tested once. If the testing result of a group is abnormal, then according to Theorem 2 this abnormal group contains abnormal object(s), and in order to make a diagnosis, all of the objects in this group are tested once again one by one. Similarly, if the testing result of a group is normal, then according to Theorem 3 all of the objects in this group are normal; therefore, all $k$ objects in this group need only one test to make a diagnosis. The binary classifier can also classify the new blood sample that consists of $k$ blood samples in this process.
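Method 10 can be simulated directly (a minimal sketch, not from the paper; the seed, $p$, and $N$ below are illustrative assumptions):

```python
import random

def single_level_tests(statuses, k):
    """Count tests under Method 10: pool objects into groups of size k,
    test each pool once, and retest every member of an abnormal pool."""
    tests = 0
    for start in range(0, len(statuses), k):
        group = statuses[start:start + k]
        tests += 1                     # one test for the pooled group
        if any(group):                 # abnormal group: retest one by one
            tests += len(group)
    return tests

random.seed(0)
p, N = 0.001, 10_000
samples = [random.random() < p for _ in range(N)]  # True = abnormal
total = single_level_tests(samples, 32)
```

With a low prevalence rate the count stays far below the $N$ tests of Method 9.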

If every group is mixed from large-scale blood samples (namely, $k$ is a large number) and some groups are tested to be abnormal, then lots of objects must be tested once again one by one. In order to reduce the classification times, this paper proposes a multilevels granulation model.

Method 11 (multilevels granulation method). Firstly, each mixed blood group, which contains $k_1$ samples (objects), is tested once, where $k_1$ may be $1, 2, 3, \ldots, n$; namely, the original quotient space is randomly partitioned into $(U_1, U_2, \ldots, U_t)$ ($\bigcup_{i=1}^{t} U_i = U$; $U_m \cap U_n = \emptyset$; $m, n \in \{1, 2, \ldots, t\}$; $m \ne n$). Next, if some groups are tested to be normal, then all objects in those groups are normal (healthy), so all $k_1$ objects in such a group need only one test to make a diagnosis. If some groups are tested to be abnormal, those groups are subdivided into many smaller subgroups, each containing $k_2$ ($k_2 < k_1$) objects; namely, the quotient space of an abnormal group $U_i$ is randomly partitioned into $(U_{i1}, U_{i2}, \ldots, U_{il})$ ($\bigcup_{j=1}^{l} U_{ij} = U_i$; $U_{im} \cap U_{in} = \emptyset$; $m, n \in \{1, 2, \ldots, l\}$; $m \ne n$; $l < k_1$). Finally, each subgroup is tested once again. Similarly, if a subgroup is tested to be normal, it is confirmed that all objects in the corresponding subgroup are healthy; if a subgroup is tested to be abnormal, it is subdivided once again into smaller subgroups, each containing $k_3$ ($k_3 < k_2$) objects. Therefore, the testing results of all objects can be ensured by repeating the above process on a group until the number of objects is equal to 1 or the testing result of a subgroup is normal. The searching efficiency of the above three methods is analyzed as follows.
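The recursion in Method 11 can be sketched as follows (not the paper's own code; it assumes groups are split by simple slicing and that an abnormal pool with no finer level left is retested individually):

```python
def multilevel_tests(statuses, sizes):
    """Count tests under Method 11. `sizes` = [k1, k2, ...] gives the group
    size used at each granulation level; an abnormal group is refined with
    the next size, or retested one by one when no finer level remains."""
    k, rest = sizes[0], sizes[1:]
    tests = 0
    for start in range(0, len(statuses), k):
        group = statuses[start:start + k]
        tests += 1                              # one pooled test per group
        if any(group):                          # abnormal: refine or retest
            if rest and rest[0] < len(group):
                tests += multilevel_tests(group, rest)
            else:
                tests += len(group)
    return tests
```

For example, with sizes `[32, 16]`, one abnormal object among 64 costs 1 group test, 2 subgroup tests, 16 retests, plus 1 test for the healthy group: 20 tests in total.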

In Method 9, every object has to be tested once for diagnosing the disease, so it takes up $N$ times in total.

In Method 10, the original problem space is subdivided into many disjoint subspaces (subsets). If some subsets are tested to be normal, all of their objects need only one test. Therefore, the classification times can be reduced to some degree if the probability $p$ is small enough [9].

In Method 11, the key is trying to find the optimal multigranulation space for searching all objects, so a multilevels granulation model needs to be established. There are two questions. One is the grouping strategy, namely, how many objects should be contained in a group. The other is the optimal number of granulation levels, namely, how many levels the original problem space should be granulated into.

In this paper, we mainly solve the two questions in Method 11. According to the truth and falsity preserving principles in quotient space theory, all normal parts of blood samples can be ignored. Hence, the original problem is simplified to a smaller subspace. This idea not only reduces the complexity of the problem but also improves the efficiency of searching for abnormal objects.

Algorithm Strategy. Example 8 can be regarded as a tree structure in which each node (which stands for a group) is an $x$-tuple. Obviously, the searching problem of the tree has been transformed into a hierarchical reasoning process over a monotonic relation sequence. The original space has been transformed into a hierarchical structure in which all subproblems are solved in different granularity levels.

Table 1: The probability distribution of $Y_1$.

$Y_1$ | $1/k_1$ | $1/k_1 + 1$
$p(Y_1 = y_1)$ | $q^{k_1}$ | $1 - q^{k_1}$

Firstly, the general rule can be concluded by analyzing the simplest hierarchy and grouping case, which is the single-level granulation. Secondly, we can calculate the mathematical expectation of the classification times of blood analysis. Finally, an optimal hierarchical granulation model is established by comparing the expectations of classification times.

Analysis. Suppose that there is an object set $Y = \{y_1, y_2, \ldots, y_n\}$ and the prevalence rate is $p$. So the probability that an object appears normal in blood analysis is $q = 1 - p$. The probability that a group is tested to be normal is $q^{k_1}$, and abnormal is $1 - q^{k_1}$, where $k_1$ is the number of objects in a group.

3.1. The Single-Level Granulation. Firstly, the domain (which contains $N$ objects) is randomly subdivided into many subgroups, where each subset contains $k_1$ objects. In other words, a new quotient space $[Y_1]$ is obtained based on an equivalence relation $R$. Then, supposing that the average classification times of each object is a random variable $Y_1$, the probability distribution of $Y_1$ is shown in Table 1.

Thus, the mathematical expectation of $Y_1$ can be obtained as follows:

$$E_1(Y_1) = \frac{1}{k_1} \times q^{k_1} + \left(1 + \frac{1}{k_1}\right) \times (1 - q^{k_1}) = \frac{1}{k_1} + (1 - q^{k_1}). \tag{3}$$

Then, the total mathematical expectation for the domain can be obtained as follows:

$$N \times E_1(Y_1) = N \times \left[\frac{1}{k_1} \times q^{k_1} + \left(1 + \frac{1}{k_1}\right) \times (1 - q^{k_1})\right]. \tag{4}$$

When the probability $p$ remains unchanged and $k_1$ satisfies the inequality $E_1(Y_1) < 1$, this single-level granulation method can reduce the classification times. For example, if $p = 0.5$ and $k_1 > 1$, then according to Lemma 5, $E_1(Y_1) > 1$ no matter what the value of $k_1$ is, so the single-level granulation method is worse than the traditional method (namely, testing every object in turn). Conversely, if $p = 0.001$ and $k_1 = 32$, $E_1(Y_1)$ reaches its minimum value, and the classification times of the single-level granulation method are fewer than those of the traditional method. Let $N = 10000$; the total of classification times is approximately equal to 628, as follows:

$$N \times E_1(Y_1) = 10000 \times \left[\frac{1}{32} \times q^{32} + \left(1 + \frac{1}{32}\right) \times (1 - q^{32})\right] \approx 628. \tag{5}$$

(Figure 2: double-levels granulation.)

This shows that the method can greatly improve the efficiency of diagnosis, reducing the classification times by 93.72% compared with the traditional method. If there is an extremely low prevalence rate, for example, $p = 0.000001$, the total of classification times reaches its minimum value when each group contains 1001 objects (namely, $k_1 = 1001$). If every group is subdivided into many smaller subgroups again and the above method is repeated, can the total of classification times be further reduced?
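The optimal group size and the expected totals quoted above can be reproduced from equations (3) and (4) (a sketch, not from the paper; the brute-force search bound is an assumption):

```python
def expected_tests(p, k, N):
    """N * E1(Y1), with E1(Y1) = 1/k + (1 - q^k) and q = 1 - p (eqs. (3)-(4))."""
    q = 1.0 - p
    return N * (1.0 / k + 1.0 - q ** k)

def optimal_k(p, max_k=5_000):
    """Group size minimizing the per-object expectation, by brute force."""
    return min(range(2, max_k + 1), key=lambda k: expected_tests(p, k, 1))

k_star = optimal_k(0.001)                       # 32, as in the text
total = expected_tests(0.001, k_star, 10_000)   # ≈ 628
```

For $p = 0.000001$ the same search lands on a group size of about 1000-1001, matching the text.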

3.2. The Double-Levels Granulation. After the objects of the domain are granulated by the method of Section 3.1, the original object space becomes a new quotient space in which each group has $k_1$ objects. According to the falsity and truth preserving principles in quotient space theory, if a group is tested to be abnormal, it can be granulated into many smaller subgroups. The double-levels granulation is shown in Figure 2.

Then the probability distribution of the double-levelsgranulation is discussed as follows

If each group contains $k_1$ objects and is tested once in the 1st layer, the average classification times is $1/k_1$ for each object. Similarly, the average classification times of each object is $1/k_2$ in the 2nd layer. When a subgroup contains $k_2$ objects and is tested to be abnormal, every object in this subgroup has to be retested one by one once again, so the total classification times of each such object is equal to $1/k_2 + 1$.

For simplicity, suppose that every group in the 1st layer is subdivided into two subgroups, which contain $k_{21}$ and $k_{22}$ objects, respectively, in the 2nd layer.

The classification times are shown in Table 2 (M represents an abnormal testing result, and a blank represents a normal one).

Table 2: The average classification times of each object with different results.

Times | $k_1$ | $k_{21}$ | $k_{22}$
$3 + k_{21}$ | M | M |
$3 + k_{22}$ | M | | M
$3 + k_1$ | M | M | M

Table 3: The probability distribution of $Y_2$.

$Y_2$ | $1/k_1$ | $1/k_1 + 1/k_2$ | $1/k_1 + 1/k_2 + 1$
$p(Y_2 = y_2)$ | $q^{k_1}$ | $(1 - q^{k_1}) \times q^{k_2}$ | $(1 - q^{k_1}) \times (1 - q^{k_2})$

For instance, let $k_1 = 8$, $k_{21} = 4$, and $k_{22} = 4$; there are four possible cases.

Case 1. If a group is tested to be normal in the 1st layer, the total classification times of this group is $k_1 \times 1/k_1 = 1$.

Case 2. If a group is tested to be abnormal in the 1st layer and, in the 2nd layer, its first subgroup is tested to be abnormal while the other subgroup is tested to be normal, the total classification times of this group is $k_{21} \times (1/k_1 + 1/k_{21} + 1) + k_{22} \times (1/k_1 + 1/k_{22}) = 3 + k_{21} = 7$.

Case 3. If a group is tested to be abnormal in the 1st layer and, in the 2nd layer, its first subgroup is tested to be normal while the other subgroup is tested to be abnormal, the total classification times of this group is $k_{21} \times (1/k_1 + 1/k_{21}) + k_{22} \times (1/k_1 + 1/k_{22} + 1) = 3 + k_{22} = 7$.

Case 4. If a group is tested to be abnormal in the 1st layer and both of its subgroups are tested to be abnormal in the 2nd layer, the total classification times of this group is $k_{21} \times (1/k_1 + 1/k_{21} + 1) + k_{22} \times (1/k_1 + 1/k_{22} + 1) = 3 + k_1 = 11$.
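The four case totals can be checked mechanically (a sketch, not from the paper; the function simply tallies the group test, the two subgroup tests, and the individual retests):

```python
def double_level_cost(k21, k22, sub1_abnormal, sub2_abnormal):
    """Total tests for one first-layer group split into two subgroups,
    following Cases 1-4 of the text."""
    if not (sub1_abnormal or sub2_abnormal):
        return 1                     # Case 1: the pooled group tests normal
    cost = 1 + 2                     # group test + two subgroup tests
    if sub1_abnormal:
        cost += k21                  # retest subgroup 1 one by one
    if sub2_abnormal:
        cost += k22                  # retest subgroup 2 one by one
    return cost
```

With $k_1 = 8$ and $k_{21} = k_{22} = 4$ this yields 1, 7, 7, and 11 tests for the four cases, matching the formulas above.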

Suppose each group contains $k_1$ objects in the 1st layer and every subgroup has $k_2$ objects in the 2nd layer. Supposing that the average classification times of each object is a random variable $Y_2$, the probability distribution of $Y_2$ is shown in Table 3.

Thus, in the 2nd layer, the mathematical expectation of $Y_2$, which is the average classification times of each object, is obtained as follows:

$$E_2(Y_2) = \frac{1}{k_1} \times q^{k_1} + \left(\frac{1}{k_1} + \frac{1}{k_2}\right) \times (1 - q^{k_1}) \times q^{k_2} + \left(1 + \frac{1}{k_1} + \frac{1}{k_2}\right) \times (1 - q^{k_1}) \times (1 - q^{k_2}) = \frac{1}{k_1} + (1 - q^{k_1}) \times \left(\frac{1}{k_2} + 1 - q^{k_2}\right). \tag{6}$$

As long as the number of granulation levels increases to 2, the average classification times of each object will be further reduced, for instance, when $p = 0.001$ and $N = 10000$.

Table 4: The probability distribution of $Y_i$.

$Y_i$ | $p(Y_i = y_i)$
$1/k_1$ | $q^{k_1}$
$1/k_1 + 1/k_2$ | $(1 - q^{k_1}) \times q^{k_2}$
$\sum_{j=1}^{i} 1/k_j$ | $(1 - q^{k_1}) \times (1 - q^{k_2}) \times \cdots \times (1 - q^{k_{i-1}}) \times q^{k_i}$
$\sum_{j=1}^{i} 1/k_j + 1$ | $(1 - q^{k_1}) \times (1 - q^{k_2}) \times \cdots \times (1 - q^{k_{i-1}}) \times (1 - q^{k_i})$

As we know, the minimum expectation of the total classification times is about 628 with $k_1 = 32$ in the single-level granulation. According to (6) and Lemma 6, $E_2(Y_2)$ reaches its minimum value when $k_2 = 16$. The minimum mathematical expectation of the total classification times is shown as follows:

$$N \times E_2(Y_2) = N \times \left[\frac{1}{k_1} \times q^{k_1} + \left(\frac{1}{k_1} + \frac{1}{k_2}\right) \times (1 - q^{k_1}) \times q^{k_2} + \left(1 + \frac{1}{k_1} + \frac{1}{k_2}\right) \times (1 - q^{k_1}) \times (1 - q^{k_2})\right] \approx 338. \tag{7}$$
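Equation (6) can be evaluated directly to check the two-level total (a sketch, not from the paper; the exact value rounds to about 337, close to the ≈ 338 reported in the text):

```python
def E2(p, k1, k2):
    """Equation (6): E2(Y2) = 1/k1 + (1 - q^k1) * (1/k2 + 1 - q^k2)."""
    q = 1.0 - p
    return 1.0 / k1 + (1.0 - q ** k1) * (1.0 / k2 + 1.0 - q ** k2)

N = 10_000
total_two_level = N * E2(0.001, 32, 16)   # ≈ 337 (the text reports ≈ 338)
```

Either way, the two-level total is roughly half of the ≈ 628 tests of the single-level scheme.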

The mathematical expectation of classification times saves 96.62% compared with the traditional method and 46.18% compared with the single-level granulation method. Next, we will discuss the $i$-levels granulation ($i = 3, 4, 5, \ldots, n$).

3.3. The $i$-Levels Granulation. For the blood analysis case, the granulation strategy in the $i$th layer is determined by the known numbers of objects of each group in the previous layers (namely, $k_1, k_2, \ldots, k_{i-1}$ are known and only $k_i$ is unknown). Following the double-levels granulation method, and supposing that the classification times of each object is a random variable $Y_i$ in the $i$-levels granulation, the probability distribution of $Y_i$ is shown in Table 4.

Obviously, the sum of the probability distribution is equal to 1 in each layer.

Proof

Case 1 (the single-level granulation). One has

$$q^{k_1} + 1 - q^{k_1} = 1. \quad (8)$$

Case 2 (the double-levels granulation). One has

$$q^{k_1} + (1 - q^{k_1}) \times q^{k_2} + (1 - q^{k_1}) \times (1 - q^{k_2}) = q^{k_1} + (1 - q^{k_1}) \times (q^{k_2} + 1 - q^{k_2}) = q^{k_1} + (1 - q^{k_1}) \times 1 = 1. \quad (9)$$

Case 3 (the $i$-levels granulation). One has

$$q^{k_1} + (1 - q^{k_1}) \times q^{k_2} + \cdots + (1 - q^{k_1}) \times (1 - q^{k_2}) \times \cdots \times (1 - q^{k_{i-1}}) \times q^{k_i} + (1 - q^{k_1}) \times (1 - q^{k_2}) \times \cdots \times (1 - q^{k_i})$$
$$= q^{k_1} + (1 - q^{k_1}) \times q^{k_2} + \cdots + (1 - q^{k_1}) \times (1 - q^{k_2}) \times \cdots \times (1 - q^{k_{i-1}}) \times \left( q^{k_i} + (1 - q^{k_i}) \right)$$
$$= q^{k_1} + (1 - q^{k_1}) \times q^{k_2} + \cdots + (1 - q^{k_1}) \times (1 - q^{k_2}) \times \cdots \times (1 - q^{k_{i-1}})$$
$$= \cdots = q^{k_1} + (1 - q^{k_1}) \times 1 = 1. \quad (10)$$

The proof is completed
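The telescoping argument above can also be confirmed numerically; the group sizes in this sketch are illustrative, not prescribed by the text.

```python
from math import prod

def table4_probs(q, ks):
    """The probabilities listed in Table 4 for the levels ks = [k1, ..., ki]."""
    probs = [prod(1 - q ** kj for kj in ks[:m]) * q ** k
             for m, k in enumerate(ks)]
    probs.append(prod(1 - q ** kj for kj in ks))  # every level abnormal
    return probs

print(abs(sum(table4_probs(0.999, [32, 16, 8, 4])) - 1.0) < 1e-12)  # True
```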

Definition 12 (classification times expectation of granulation). In a probability quotient space, a multilevels granulation model can be established from the domain $U = \{x_1, x_2, \ldots, x_n\}$, which is a nonempty finite set, where the healthy rate is $q$, the maximum number of granulation levels is $L$, and the objects number of each group in the $i$th layer is $k_i$ ($i = 1, 2, \ldots, L$). Then the average classification times of each object over the $L$ layers is

$$E_L(Y_L) = \frac{1}{k_1} + \sum_{i=2}^{L} \left[ \frac{1}{k_i} \times \prod_{j=1}^{i-1} \left( 1 - q^{k_j} \right) \right] + \prod_{i=1}^{L} \left( 1 - q^{k_i} \right). \quad (11)$$

In this paper, we mainly focus on establishing a minimum granulation expectation model of classification times by the multigranulation computing method. For simplicity, the mathematical expectation of classification times is regarded as the measure for comparing searching efficiency. According to Lemma 5, the multilevels granulation model can simplify the complex problem only if the prevalence rate $p \in (0, 1 - e^{-e^{-1}})$ in the blood analysis case.
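Formula (11) can be evaluated either in its sum form or in the equivalent nested (recursive) form used later in the proof of Theorem 20. A small sketch (group sizes assumed for illustration) confirms that the two agree:

```python
from math import prod

def E_sum(p, ks):
    """Eq. (11): 1/k1 + per-level terms + product of all failure terms."""
    q = 1.0 - p
    body = sum((1.0 / k) * prod(1 - q ** kj for kj in ks[:m])
               for m, k in enumerate(ks))
    return body + prod(1 - q ** k for k in ks)

def E_nested(p, ks):
    """The same expectation written recursively."""
    q = 1.0 - p
    e = 1.0
    for k in reversed(ks):
        e = 1.0 / k + (1.0 - q ** k) * e
    return e

ks = [32, 16, 8, 4]
print(abs(E_sum(0.001, ks) - E_nested(0.001, ks)) < 1e-12)  # True
```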

Theorem 13. Let the prevalence rate $p \in (0, 0.3)$. If a group is tested to be abnormal in the 1st layer (namely, this group contains abnormal objects), the average classification times of each object will be further reduced by subdividing this group once again.

Proof. The expectation difference between the single-level granulation $E_1(Y_1)$ and the double-levels granulation $E_2(Y_2)$ adequately embodies their relative efficiency. Under the conditions $e^{-e^{-1}} < q < 1$ and $1 \le k_2 < k_1$, and according to (3) and (6), the expectation difference $E_1(Y_1) - E_2(Y_2)$ is shown as follows:

$$E_1(Y_1) - E_2(Y_2) = \frac{1}{k_1} + (1 - q^{k_1}) - \left[ \frac{1}{k_1} + (1 - q^{k_1}) \times \left( \frac{1}{k_2} + 1 - q^{k_2} \right) \right] = (1 - q^{k_1}) \times \left[ 1 - \left( \frac{1}{k_2} + 1 - q^{k_2} \right) \right] > 0. \quad (12)$$

According to Lemma 5, $(1 - q^{k_1}) > 0$ and $f_q(k_2) = \frac{1}{k_2} + 1 - q^{k_2} < 1$ always hold; then we can get $1 - (\frac{1}{k_2} + 1 - q^{k_2}) > 0$. So the inequality $E_1(Y_1) - E_2(Y_2) > 0$ is proved successfully.

Theorem 13 illustrates that classification times can be reduced by continuing to granulate the abnormal groups into the 2nd layer when $k_1 > 1$. We now attempt to prove that the total classification times will be further reduced by continuously granulating the abnormal groups into the $i$th layer as long as a group's objects number is no less than 1.

Theorem 14. Suppose the prevalence rate $p \in (0, 0.3)$. If a group is tested to be abnormal (namely, this group contains abnormal objects), the average classification times of each object will be reduced by continuously subdividing the abnormal group as long as the objects number of its subgroup is no less than 1.

Proof. The expectation difference between the $(i-1)$-levels granulation $E_{i-1}(Y_{i-1})$ and the $i$-levels granulation $E_i(Y_i)$ reflects their relative efficiency. On the condition of $e^{-e^{-1}} < q < 1$ and $1 \le k_i < k_{i-1}$, and according to (11), the expectation difference $E_{i-1}(Y_{i-1}) - E_i(Y_i)$ is shown as follows:

$$E_{i-1}(Y_{i-1}) - E_i(Y_i) = \frac{1}{k_1} + \sum_{l=2}^{i-1} \left[ \frac{1}{k_l} \times \prod_{j=1}^{l-1} \left( 1 - q^{k_j} \right) \right] + \prod_{l=1}^{i-1} \left( 1 - q^{k_l} \right) - \left\{ \frac{1}{k_1} + \sum_{l=2}^{i} \left[ \frac{1}{k_l} \times \prod_{j=1}^{l-1} \left( 1 - q^{k_j} \right) \right] + \prod_{l=1}^{i} \left( 1 - q^{k_l} \right) \right\}$$
$$= (1 - q^{k_1}) \times \cdots \times (1 - q^{k_{i-1}}) \times \left[ 1 - \left( \frac{1}{k_i} + 1 - q^{k_i} \right) \right] > 0. \quad (13)$$

Because $(1 - q^{k_1}) \times \cdots \times (1 - q^{k_{i-1}}) > 0$ is known, and, according to Lemma 5 and $k_i \ge 1$, we can get $\frac{1}{k_i} + 1 - q^{k_i} < 1$, namely, $1 - (\frac{1}{k_i} + 1 - q^{k_i}) > 0$. So $E_{i-1}(Y_{i-1}) - E_i(Y_i) > 0$ is proved successfully.
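A quick numeric illustration of Theorem 14 (a sketch; the halving chain of group sizes is assumed): each additional granulation level strictly lowers the per-object expectation.

```python
def E_nested(p, ks):
    """Per-object expectation for nested group sizes ks (cf. Eq. (11))."""
    q = 1.0 - p
    e = 1.0
    for k in reversed(ks):
        e = 1.0 / k + (1.0 - q ** k) * e
    return e

chain = [32, 16, 8, 4]          # illustrative chain for p = 0.001
vals = [E_nested(0.001, chain[:m]) for m in range(1, len(chain) + 1)]
print(all(a > b for a, b in zip(vals, vals[1:])))  # True: E1 > E2 > E3 > E4
```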

Theorem 14 shows that this method continuously improves searching efficiency while granulating abnormal groups from the 1st layer to the $i$th layer, because $E_{i-1}(Y_{i-1}) - E_i(Y_i) > 0$ always holds. However, it is found that the classification times cannot be reduced further when the objects number of an abnormal group is less than or equal to 4, so the objects of such an abnormal group should be tested one by one. In order to achieve the best efficiency, we next explore how to determine the optimum granulation, namely, how to determine the optimum objects number of each group and how to obtain the optimum number of granulation levels.

3.4. The Optimum Granulation. It is a difficult and key point to explore an appropriate granularity space for dealing with a complex problem: it requires us not only to keep the integrity of the original information but also to simplify the complex problem. So we take the blood analysis case as an example to explain how to obtain the optimum granularity space. Suppose the condition $e^{-e^{-1}} < q < 1$ always holds.

Case 1 (granulating abnormal groups from the 1st layer to the 2nd layer). (a) If $k_1$ is an even number, every abnormal group which contains $k_1$ objects in the 1st layer will be subdivided into two subgroups in the 2nd layer.

Scheme 15. Suppose one subgroup of the 2nd layer has $i$ $(1 \le i < k_1/2)$ objects; according to formula (6), the expectation of classification times for each object is $E_2(i)$. The other subgroup has $(k_1 - i)$ objects, so the expectation of classification times for each object is $E_2(k_1 - i)$. The average expectation of classification times for each object in the 2nd layer is shown as follows:

$$\frac{i \times E_2(i) + (k_1 - i) \times E_2(k_1 - i)}{k_1}. \quad (14)$$

Scheme 16. Suppose every abnormal group in the 1st layer is evenly subdivided into two subgroups; namely, each subgroup has $k_1/2$ objects in the 2nd layer. According to formula (6), the average expectation of classification times for each object in the 2nd layer is shown as follows:

$$\frac{2 \times (k_1/2) \times E_2(k_1/2)}{k_1} = \frac{k_1 \times E_2(k_1/2)}{k_1}. \quad (15)$$

The expectation difference between the above two schemes embodies their relative efficiency. In order to prove that Scheme 16 is more efficient than Scheme 15, we only need to prove that the following inequality holds:

$$\frac{i \times E_2(i) + (k_1 - i) \times E_2(k_1 - i)}{k_1} - \frac{k_1 \times E_2(k_1/2)}{k_1} > 0 \quad (e^{-e^{-1}} < q < 1,\ k_1 > 1). \quad (16)$$


Table 5: The changes of average expectation with different objects numbers in two groups.

$(k_{21}, k_{22})$  (1, 15)  (2, 14)  (3, 13)  (4, 12)  (5, 11)  (6, 10)  (7, 9)   (8, 8)
$E_2$               0.07367  0.07329  0.07297  0.07270  0.07249  0.07234  0.07225  0.07222

Proof. Let $g(x) = x q^x$ $(e^{-e^{-1}} < q < 1)$; according to Lemma 7, we have

$$\frac{g(i) + g(k_1 - i)}{2} < g\left( \frac{k_1}{2} \right) \Longrightarrow \frac{i \times q^i + (k_1 - i) \times q^{k_1 - i}}{2} < \frac{k_1}{2} \times q^{k_1/2} \Longrightarrow i \times q^i + (k_1 - i) \times q^{k_1 - i} < k_1 \times q^{k_1/2}$$
$$\Longrightarrow i \times \left[ \frac{1}{k_1} + (1 - q^{k_1}) \times \left( 1 + \frac{1}{i} - q^i \right) \right] + (k_1 - i) \times \left[ \frac{1}{k_1} + (1 - q^{k_1}) \times \left( 1 + \frac{1}{k_1 - i} - q^{k_1 - i} \right) \right] > k_1 \times \left[ \frac{1}{k_1} + (1 - q^{k_1}) \times \left( 1 + \frac{2}{k_1} - q^{k_1/2} \right) \right]$$
$$\Longrightarrow \frac{i \times E_2(i) + (k_1 - i) \times E_2(k_1 - i)}{k_1} - \frac{k_1 \times E_2(k_1/2)}{k_1} > 0. \quad (17)$$

The proof is completed

Therefore, if every abnormal group which has $k_1$ ($k_1$ is an even number and $k_1 > 1$) objects in the 1st layer needs to be subdivided into two subgroups, Scheme 16 is more efficient than Scheme 15.

The experimental results in Table 5 verify the above conclusion. Let $p = 0.004$ and $k_1 = 16$. When every subgroup contains 8 objects in the 2nd layer, the expectation of classification times for each object obtains the minimum value, where $k_{21}$ is the objects number of one subgroup in the 2nd layer, $k_{22}$ is the objects number of the other subgroup, and $E_2$ is the corresponding expectation of classification times for each object.
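Table 5 can be reproduced with a short sketch. The expression for $E_2$ below is taken from the proof of (17) (an assumption spelled out from the text): $E_2(m) = 1/k_1 + (1 - q^{k_1})(1/m + 1 - q^m)$ for a second-layer subgroup of $m$ objects.

```python
p, k1 = 0.004, 16
q = 1.0 - p

def E2(m):
    # per-object expectation when an abnormal first-layer group of k1
    # objects is retested through a second-layer subgroup of m objects
    return 1.0 / k1 + (1.0 - q ** k1) * (1.0 / m + 1.0 - q ** m)

def avg_split(i):
    """Average per-object expectation for the two-way split (i, k1 - i)."""
    return (i * E2(i) + (k1 - i) * E2(k1 - i)) / k1

for i in range(1, 9):
    print((i, k1 - i), round(avg_split(i), 5))
# agrees with Table 5 to about four decimal places; the minimum is at (8, 8)
```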

(b) If $k_1$ is an even number, suppose instead that every abnormal group which contains $k_1$ objects in the 1st layer is subdivided into three subgroups in the 2nd layer.

Scheme 17. In the 2nd layer, if the first subgroup has $i$ $(1 \le i < k_1/2)$ objects, the expectation of classification times for each object is $E_2(i)$; if the second subgroup has $j$ $(1 \le j < k_1/2)$ objects, the expectation of classification times for each object is $E_2(j)$; then the third subgroup has $(k_1 - i - j)$ objects, and the expectation of classification times for each object is $E_2(k_1 - i - j)$. So the average expectation of classification times for each object in the 2nd layer is shown as follows:

$$\frac{i \times E_2(i) + j \times E_2(j) + (k_1 - i - j) \times E_2(k_1 - i - j)}{k_1}. \quad (18)$$

Similarly, it is easy to prove that Scheme 16 is also more efficient than Scheme 17. In other words, we only need to prove the following inequality:

$$\frac{i \times E_2(i) + j \times E_2(j) + (k_1 - i - j) \times E_2(k_1 - i - j)}{k_1} - \frac{k_1 \times E_2(k_1/2)}{k_1} > 0 \quad (e^{-e^{-1}} < q < 1,\ k_1 > 1). \quad (19)$$

Proof. Let $g(x) = x q^x$ $(e^{-e^{-1}} < q < 1)$; applying Lemma 7 repeatedly to pairs of subgroup sizes, we have

$$i \times q^i + j \times q^j + (k_1 - i - j) \times q^{k_1 - i - j} < k_1 \times q^{k_1/2}$$
$$\Longrightarrow i \times \left[ \frac{1}{k_1} + (1 - q^{k_1}) \times \left( 1 + \frac{1}{i} - q^i \right) \right] + j \times \left[ \frac{1}{k_1} + (1 - q^{k_1}) \times \left( 1 + \frac{1}{j} - q^j \right) \right] + (k_1 - i - j) \times \left[ \frac{1}{k_1} + (1 - q^{k_1}) \times \left( 1 + \frac{1}{k_1 - i - j} - q^{k_1 - i - j} \right) \right] > k_1 \times \left[ \frac{1}{k_1} + (1 - q^{k_1}) \times \left( 1 + \frac{2}{k_1} - q^{k_1/2} \right) \right]$$
$$\Longrightarrow \frac{i \times E_2(i) + j \times E_2(j) + (k_1 - i - j) \times E_2(k_1 - i - j)}{k_1} - \frac{k_1 \times E_2(k_1/2)}{k_1} > 0. \quad (20)$$

The proof is completed

Therefore, if every abnormal group which contains $k_1$ ($k_1$ is an even number and $k_1 > 1$) objects in the 1st layer needs to be subdivided into three subgroups, Scheme 16 is more efficient than Scheme 17.

The experimental results in Table 6 verify the above conclusion. Let $p = 0.004$ and $k_1 = 16$. When every subgroup contains 8 objects in the 2nd layer, the average expectation of classification times reaches the minimum value for each object. In Table 6, the column index stands for the objects number of the first subgroup in the 2nd layer, the row index stands for the objects number of the second subgroup, and each entry is the corresponding average expectation of classification times. For example, (1, 1, 7.7143) expresses that the objects numbers


Table 6: The changes of average expectation with different objects numbers of three groups (expectation $\times 10^{-2}$).

      1        2        3        4        5
1     7.7143   7.6786   7.6488   7.6250   7.6072
2     7.6786   7.6458   7.6189   7.5980   7.5831
3     7.6488   7.6189   7.5950   7.5770   7.5651
4     7.6250   7.5980   7.5770   7.5620   7.5530
5     7.6072   7.5831   7.5651   7.5530   7.5470
6     7.5953   7.5742   7.5591   7.5500   7.5470
7     7.5894   7.5712   7.5591   7.5530   7.5530
8     7.5894   7.5742   7.5651   7.5620   7.5651
9     7.5953   7.5831   7.5770   7.5770   7.5980

of the three subgroups are 1, 1, and 14, respectively, and the average classification times for each object are $E_2 = 0.077143$ in the 2nd layer.
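The same sketch extends to the three-way splits of Table 6 (again assuming the $E_2$ expression used in the proofs):

```python
p, k1 = 0.004, 16
q = 1.0 - p

def E2(m):
    # per-object expectation for a second-layer subgroup of m objects
    return 1.0 / k1 + (1.0 - q ** k1) * (1.0 / m + 1.0 - q ** m)

def avg3(i, j):
    """Average per-object expectation for the split (i, j, k1 - i - j)."""
    r = k1 - i - j
    return (i * E2(i) + j * E2(j) + r * E2(r)) / k1

print(round(avg3(1, 1), 6))  # entry (1, 1) of Table 6 (about 0.077143)
print(all(avg3(i, j) > E2(8) for i in range(1, 6) for j in range(1, 10)))
# True: every three-way split is worse than the even two-way split (8, 8)
```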

(c) When an abnormal group contains $k_1$ (even) objects and needs to be further granulated into the 2nd layer, Scheme 16 still has the best efficiency.

There are two granulation schemes: in Scheme 18, the abnormal groups of the 1st layer are randomly subdivided into $s$ $(s < k_1)$ subgroups, and, in Scheme 16, the abnormal groups of the 1st layer are evenly subdivided into two subgroups.

Scheme 18. Suppose an abnormal group is subdivided into $s$ $(s < k_1)$ subgroups. The first subgroup has $x_1$ $(1 \le x_1 < k_1/2)$ objects, and the average expectation of classification times for each object is $E_2(x_1)$; the 2nd subgroup has $x_2$ $(1 \le x_2 < k_1/2)$ objects, and the average expectation of classification times for each object is $E_2(x_2)$; ...; the $i$th subgroup has $x_i$ $(1 \le x_i < k_1/2)$ objects, and the average expectation of classification times for each object is $E_2(x_i)$; ...; the $s$th subgroup has $x_s$ $(1 \le x_s < k_1/2)$ objects, and the average expectation of classification times for each object is $E_2(x_s)$. Hence, the average expectation of classification times for each object in the 2nd layer is shown as follows:

$$\frac{1}{k_1} \times \sum_{j=1}^{s} x_j \times E_2(x_j). \quad (21)$$

Similarly, in order to prove that Scheme 16 is more efficient than Scheme 18, we only need to prove the following inequality:

$$\frac{1}{k_1} \times \sum_{j=1}^{s} x_j \times E_2(x_j) - \frac{k_1 \times E_2(k_1/2)}{k_1} > 0 \quad (e^{-e^{-1}} < q < 1,\ k_1 > 1). \quad (22)$$

Proof. Let $g(x) = x q^x$ $(e^{-e^{-1}} < q < 1)$; according to Lemma 7, we have

$$\frac{\sum_{i=1}^{s} g(x_i)}{2} < g\left( \frac{k_1}{2} \right) \Longrightarrow \frac{\sum_{i=1}^{s} x_i \times q^{x_i}}{2} < \frac{k_1}{2} \times q^{k_1/2}$$
$$\Longrightarrow \sum_{j=1}^{s} x_j \times \left[ \frac{1}{k_1} + (1 - q^{k_1}) \times \left( 1 + \frac{1}{x_j} - q^{x_j} \right) \right] > k_1 \times \left[ \frac{1}{k_1} + (1 - q^{k_1}) \times \left( 1 + \frac{2}{k_1} - q^{k_1/2} \right) \right]$$
$$\Longrightarrow \frac{1}{k_1} \times \sum_{j=1}^{s} x_j \times E_2(x_j) - \frac{k_1 \times E_2(k_1/2)}{k_1} > 0. \quad (23)$$

The proof is completed

Therefore, when every abnormal group which contains $k_1$ ($k_1$ is an even number and $k_1 > 1$) objects in the 1st layer needs to be granulated into many subgroups, Scheme 16 is more efficient than the other schemes.

(d) In a similar way, when every abnormal group which contains $k_1$ ($k_1$ is an odd number and $k_1 > 1$) objects in the 1st layer is granulated into many subgroups, the best scheme is to subdivide every abnormal group as evenly as possible into two subgroups; namely, each subgroup contains $(k_1 - 1)/2$ or $(k_1 + 1)/2$ objects in the 2nd layer.

Case 2 (granulating abnormal groups from the 1st layer to the $i$th layer).

Theorem 19. In the $i$th layer, if the objects number of each abnormal group is more than 4, then the total classification times can be reduced by continuing to subdivide the abnormal groups into two subgroups which contain equal objects numbers as far as possible. Namely, if each group contains $k_i$ objects in the $i$th layer, then each subgroup may contain $k_i/2$ or $(k_i - 1)/2$ or $(k_i + 1)/2$ objects in the $(i+1)$th layer.

Proof. In the multigranulation method, the objects number of each subgroup in the next layer is determined by the objects number of each group in the current layer. In other words, the objects number of each subgroup in the $(i+1)$th layer is determined by the known objects number of each group in the $i$th layer.

According to the recursive idea, the process of granulating an abnormal group from the $i$th layer into the $(i+1)$th layer is similar to that from the 1st layer into the 2nd layer. It is known that the most efficient way, when granulating an abnormal group from the 1st layer into the 2nd layer, is to subdivide it as evenly as possible into two subgroups. Therefore, the most efficient way is likewise to subdivide each abnormal group in the $i$th layer as evenly as possible into two subgroups in the $(i+1)$th layer. The proof is completed.

Based on $k_1$, which is the optimum objects number of each group in the 1st layer, the optimum granulation levels


Table 7: The best testing strategy in different layers with different prevalence rates.

$p$        $E_i$            $(k_1, k_2, \ldots, k_i)$
0.01       0.157649743271   (11, 5, 2)
0.001      0.034610328332   (32, 16, 8, 4)
0.0001     0.010508158027   (101, 50, 25, 12, 6, 3)
0.00001    0.003301655870   (317, 158, 79, 39, 19, 9, 4)
0.000001   0.001041044160   (1001, 500, 250, 125, 62, 31, 15, 7, 3)

and their corresponding objects numbers of each group can then be obtained by Theorem 19. That is to say, $k_{i+1} = k_i/2$ (or $k_{i+1} = (k_i - 1)/2$, or $k_{i+1} = (k_i + 1)/2$), where $k_i$ $(k_i > 4)$ is the objects number of each abnormal group in the $i$th $(1 \le i \le s - 1)$ layer and $s$ is the optimum number of granulation levels. Namely, in this multilevels granulation method, the final structure obtained by granulating an abnormal group from the 1st layer to the last layer is similar to a binary tree, and the original space can be granulated into a structure which contains many binary trees.

According to Theorem 19, the multigranulation strategy can be used to solve the blood analysis case. Facing different prevalence rates, such as $p_1 = 0.01$, $p_2 = 0.001$, $p_3 = 0.0001$, $p_4 = 0.00001$, and $p_5 = 0.000001$, the best searching strategies (the objects numbers of each group in the different layers) are shown in Table 7, where $k_i$ stands for the objects number of each group in the $i$th layer and $E_i$ stands for the average expectation of classification times for each object.
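Table 7's strategies can be regenerated from the two rules stated above: $k_1$ minimizes the single-level expectation $f(k) = 1/k + 1 - q^k$, and each deeper level halves the group size (Theorem 19) until a group holds at most 4 objects. A brute-force sketch (the search range is an assumption):

```python
def best_k1(p, k_max=2000):
    """Group size minimizing the single-level expectation 1/k + 1 - q^k."""
    q = 1.0 - p
    return min(range(2, k_max), key=lambda k: 1.0 / k + 1.0 - q ** k)

def halving_chain(k1):
    """Theorem 19: halve the group size until it is at most 4."""
    chain = [k1]
    while chain[-1] > 4:
        chain.append(chain[-1] // 2)
    return chain

for p in (0.01, 0.001, 0.0001):
    print(p, halving_chain(best_k1(p)))
# 0.01   -> [11, 5, 2]
# 0.001  -> [32, 16, 8, 4]
# 0.0001 -> [101, 50, 25, 12, 6, 3]
```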

Theorem 20. In the above multilevels granulation method, if $p$, the prevalence rate of a sickness (or the negative sample ratio in the domain), tends to 0, the average classification times for each object tend to $1/k_1$; in other words, the following equation always holds:

$$\lim_{p \to 0} E_i = \frac{1}{k_1}. \quad (24)$$

Proof. According to Definition 12, let $q = 1 - p$, $q \to 1$; we have

$$E_i = \frac{1}{k_1} + (1 - q^{k_1}) \times \left( \frac{1}{k_2} + (1 - q^{k_2}) \times \left( \frac{1}{k_3} + (1 - q^{k_3}) \times \left( \frac{1}{k_4} + (1 - q^{k_4}) \times \left( \cdots \times \left( \frac{1}{k_{i-1}} + (1 - q^{k_{i-1}}) \times \left( \frac{1}{k_i} + (1 - q^{k_i}) \right) \right) \cdots \right) \right) \right) \right). \quad (25)$$

According to Lemma 6, $k_1 = \lfloor \sqrt{1/(p + p^2/2)} \rfloor$ or $k_1 = \lfloor \sqrt{1/(p + p^2/2)} \rfloor + 1$. And then let

$$T = \frac{1}{k_2} + (1 - q^{k_2}) \times \left( \frac{1}{k_3} + (1 - q^{k_3}) \times \left( \frac{1}{k_4} + (1 - q^{k_4}) \times \left( \cdots \times \left( \frac{1}{k_{i-1}} + (1 - q^{k_{i-1}}) \times \left( \frac{1}{k_i} + (1 - q^{k_i}) \right) \right) \cdots \right) \right) \right), \quad (26)$$

Figure 3: The changing trend of $T$ and $E$ with $q$.

so $\lim_{q \to 1} T = 0$ and $\lim_{q \to 1} E_i = 1/k_1$. The proof is completed.

Let $E = (1/k_1)/E_i$. The changing trend of $T$ and $E$ with the variable $q$ is shown in Figure 3.
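Theorem 20 can be illustrated numerically (a sketch; the strategy is fixed at an assumed chain): as $p \to 0$, the ratio $E_i/(1/k_1)$ tends to 1.

```python
def E_nested(p, ks):
    """Per-object expectation for nested group sizes ks (cf. Eq. (25))."""
    q = 1.0 - p
    e = 1.0
    for k in reversed(ks):
        e = 1.0 / k + (1.0 - q ** k) * e
    return e

ks = [32, 16, 8, 4]
for p in (1e-3, 1e-5, 1e-7):
    print(p, E_nested(p, ks) * ks[0])  # ratio E_i / (1/k1), tending to 1
```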

3.5. Binary Classification of Multigranulation Searching Algorithm. In this paper, a kind of efficient binary classification of multigranulation searching algorithm is proposed on the basis of the above discussion of the best testing strategy for the blood analysis case. The algorithm is illustrated as follows.

Algorithm 21 Binary classification of multigranulationsearching algorithm (BCMSA)

Input: a probability quotient space $Q = (U, 2^U, p)$.
Output: the average classification times expectation of each object, $E$.

Step 1. Obtain $k_1$ based on Lemma 6; initialize $i = 1$, $j = 0$, and searching_numbers = 0.

Step 2. Randomly divide $U_{ij}$ into $s_i$ subgroups $U_{i1}, U_{i2}, \ldots, U_{is_i}$ ($s_i = N_{ij}/k_i$, where $N_{ij}$ stands for the objects number of $U_{ij}$, and $U_{10} = U_1$).

Step 3. For $i$ to $\lfloor \log_2 N_{ij} \rfloor$ ($\lfloor \cdot \rfloor$ rounds down to the nearest integer):
For $j$ to $s_i$:
If Test($U_{ij}$) > 0 and $N_{ij} > 4$, then searching_numbers + 1, $U_{i+1} = U_{ij}$, $i + 1$, and go to Step 2 (Test is the group searching method).
If Test($U_{ij}$) = 0, then searching_numbers + 1 and $i + 1$.
If $N_{ij} \le 4$, go to Step 4.

Step 4. searching_numbers + $\sum U_{ij}$; $E$ = (searching_numbers + $U_N$)/$N$.

Step 5. Return $E$.

The algorithm flowchart is shown in Figure 4.
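The steps above can be sketched in Python. This is an illustrative reimplementation, not the paper's code: the goto-style Steps 2–3 become a recursion, `test_group` stands in for the group classifier Test($U_{ij}$) (returning True when the group is abnormal), and subgroup sizes follow Theorem 19's halving rule.

```python
def bcmsa(samples, k1, test_group, counter):
    """Classify all objects; the number of classifier calls accumulates
    in counter[0] (individual tests of small abnormal groups included)."""
    def search(group, k):
        counter[0] += 1                  # one test of this (sub)group
        if not test_group(group):
            return                       # normal: every object cleared
        if len(group) <= 4:
            counter[0] += len(group)     # small abnormal group: one by one
            return
        half = max(k // 2, 1)            # Theorem 19: halve the group size
        for start in range(0, len(group), half):
            search(group[start:start + half], half)
    for start in range(0, len(samples), k1):
        search(samples[start:start + k1], k1)

# Toy run: 0 = sick (negative sample), 1 = healthy.
samples = [1] * 100
samples[37] = 0
counter = [0]
bcmsa(samples, k1=32, test_group=lambda g: 0 in g, counter=counter)
print(counter[0])   # 14 tests here, instead of 100 one-by-one tests
```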


Figure 4: Flowchart of BCMSA.

Complexity Analysis of Algorithm 21. In this algorithm, the best case occurs when the prevalence rate $p$ tends to 0: the total classification times are $N \times E_i \approx N/k_1$, so the time complexity of computing tends to $O(1)$ relative to one-by-one testing. The worst case occurs when $p$ tends to 0.3: the classification times tend to $N$, so the time complexity of computing is $O(N)$.

4. Comparative Analysis on Experimental Results

In order to verify the efficiency of the proposed BCMSA, suppose there are two large domains, $N_1 = 1 \times 10^4$ and $N_2 = 100 \times 10^4$, and five different prevalence rates, $p_1 = 0.01$, $p_2 = 0.001$, $p_3 = 0.0001$, $p_4 = 0.00001$, and $p_5 = 0.000001$. In the experiment of the blood analysis case, the number "0" stands for a sick sample (negative sample) and "1" stands for a healthy sample (positive sample); then $N$ numbers are randomly generated, in which the probability of generating "0" is $p$ and the probability of generating "1" is $1 - p$, and these numbers stand for all the domain objects. The binary classifier sums all the numbers in a group (subgroup): if the sum is less than the objects number of the group, this group is tested to be abnormal (it contains at least one "0"), and, if the sum equals the objects number of the group, this group is tested to be normal.

The experimental environment is 4 G RAM, 2.5 GHz CPU, and the WIN 8 system; the program language is Python; and the experimental results are shown in Table 8.
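A sketch of this simulation (parameters assumed; not the paper's script), combining the random sample generation described above with the halving strategy of Table 7:

```python
import random

def simulate(N, p, k1, seed=0):
    """Generate N samples ('0' sick with probability p) and count the
    classification times of the multilevels halving strategy."""
    rng = random.Random(seed)
    samples = [0 if rng.random() < p else 1 for _ in range(N)]
    tests = [0]

    def search(group, k):
        tests[0] += 1
        if all(group):                  # sum equals the group size: normal
            return
        if len(group) <= 4:
            tests[0] += len(group)      # abnormal small group: one by one
            return
        half = max(k // 2, 1)
        for s in range(0, len(group), half):
            search(group[s:s + half], half)

    for s in range(0, N, k1):
        search(samples[s:s + k1], k1)
    return tests[0]

print(simulate(N=10000, p=0.0001, k1=101))  # on the order of 100-odd tests
```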

In Table 8, item "$p$" stands for the prevalence rate, item "Levels" stands for the granulation levels of the different methods, and item "$E(X)$" stands for the average expectation of classification times for each object. Item "$k_1$" stands for the objects number of each group in the 1st layer. Item "$\ell$" stands for the degree to which $E(X)$ approaches $1/k_1$. $N_1 = 1 \times 10^4$ and $N_2 = 1 \times 10^6$, respectively, stand for the objects numbers of the two original domains. Items "Method 9" and "Method 10", respectively, stand for the efficiency improvement of Method 11 compared with Method 9 and with Method 10.

From Table 8, diagnosing all objects needs 10000 classification times with Method 9 (the traditional method), 201 classification times with Method 10 (the single-level grouping method), and only 113 classification times with Method 11 (the multilevels grouping method) when $N_1 = 1 \times 10^4$ and $p = 0.0001$. Obviously, the proposed algorithm is more efficient than Methods 9 and 10, and the classification times can be reduced by 98.89% and 47.33%, respectively. At the same time, as the probability $p$ gradually decreases, BCMSA becomes progressively more efficient than Method 10 and $\ell$ tends to 100%; that is to say, the average classification times for each object tend to $1/k_1$ in BCMSA. In addition,


Table 8: Comparative result of efficiency among 3 kinds of methods.

$p$        Levels         $E(X)$           $k_1$   $\ell$ (%)   $N_1$   $N_2$    Method 9 (%)   Method 10 (%)
0.01       Single-level   0.19557083665    11      46.48        1944    195817   81.44          —
0.01       2 levels       0.15764974327    11      57.67        1627    164235   83.58          16.13
0.001      Single-level   0.06275892424    32      49.79        633     62674    94.72          —
0.001      4 levels       0.03461032833    32      90.29        413     41184    96.82          34.62
0.0001     Single-level   0.01995065634    101     49.62        201     19799    98.00          —
0.0001     6 levels       0.01050815802    101     94.22        113     11212    98.89          47.33
0.00001    Single-level   0.00631957079    318     49.76        63.3    6325     99.37          —
0.00001    7 levels       0.00330165587    318     95.26        33.3    3324     99.67          47.75
0.000001   Single-level   0.00199950067    1001    49.96        15      2001     99.80          —
0.000001   9 levels       0.00104104416    1001    95.96        15      1022     99.89          47.94

Figure 5: Comparative analysis between 2 kinds of methods.

the BCMSA can save 0%–50% of the classification times compared with Method 10. The efficiency of Method 10 (the single-level granulation method) and Method 11 (the multilevels granulation method) is shown in Figure 5, where the $x$-axis stands for the prevalence rate (or the negative sample ratio) and the $y$-axis stands for the average expectation of classification times for each object.

In this paper, BCMSA is proposed, and it can greatly improve searching efficiency when dealing with complex searching problems. If there is a binary classifier which is valid not only for a single object but also for a group of many objects, the efficiency of searching all objects can be enhanced by BCMSA, as in the blood analysis case. At the same time, it may play an important role in promoting the development of granular computing. Of course, this algorithm also has some limitations. For example, if the prevalence rate of a sickness (or the occurrence rate of event $A$) satisfies $p > 0.3$, it has no advantage over the traditional method; in other words, the original problem need not be subdivided into many subproblems when $p > 0.3$. And when the prevalence rate of a sickness (or the negative sample ratio in the domain) is unknown, this algorithm needs to be further improved so that it can adapt to the new environment.

5 Conclusions

With the development of intelligent computation, multigranulation computing has gradually become an important

tool for processing complex problems. Especially in the process of knowledge cognition, granulating a huge problem into lots of small subproblems means simplifying the original complex problem and dealing with these subproblems in different granularity spaces [64]. This hierarchical computing model is very effective for obtaining a complete or approximate solution of the original problem due to its idea of divide and conquer. Recently, many scholars have paid attention to efficient searching algorithms based on granular computing theory. For example, a kind of algorithm for dealing with complex networks on the basis of the quotient space model was proposed by L. Zhang and B. Zhang [65]. In this paper, combining the hierarchical multigranulation computing model and the principles of probability statistics, a new efficient binary classification multigranulation searching algorithm is established on the basis of the mathematical expectation of probability statistics, and this searching algorithm is designed according to a recursive method in multigranulation spaces. Many experimental results have shown that the proposed method is effective and can save many classification times. These results may promote the development of intelligent computation and speed up the application of multigranulation computing. However, this method also has some shortcomings. For example, on the one hand, it has a strict limitation on the probability value of $p$, namely, $p < 0.3$; on the contrary, if $p > 0.3$, the proposed searching algorithm is probably not the most effective method, and improved methods need to be found. On the other hand, it needs a binary classifier which is valid not only for a single object but also for a group of many objects. In the end, as the probability value of $p$ decreases (even infinitely approaching zero), the mathematical expectation of searching times for every object gradually approaches $1/k_1$. In our future research, we will focus on how to granulate a huge granular space without any probability value of each object, and we will try our best to establish a kind of effective searching algorithm for the case in which the probability of negative samples in the domain is unknown. We hope these researches can promote the development of artificial intelligence.


Competing Interests

The authors declare that they have no conflict of interests related to this work.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (no. 61472056) and the Natural Science Foundation of Chongqing of China (no. CSTC2013jjb40003).

References

[1] A. Gacek, "Signal processing and time series description: a perspective of computational intelligence and granular computing," Applied Soft Computing Journal, vol. 27, pp. 590–601, 2015.

[2] O. Hryniewicz and K. Kaczmarek, "Bayesian analysis of time series using granular computing approach," Applied Soft Computing, vol. 47, pp. 644–652, 2016.

[3] C. Liu, "Covering multi-granulation rough sets based on maximal descriptors," Information Technology Journal, vol. 13, no. 7, pp. 1396–1400, 2014.

[4] Z. Y. Li, "Covering-based multi-granulation decision-theoretic rough sets model," Journal of Lanzhou University, no. 2, pp. 245–250, 2014.

[5] Y. Y. Yao and Y. She, "Rough set models in multigranulation spaces," Information Sciences, vol. 327, pp. 40–56, 2016.

[6] J. Xu, Y. Zhang, D. Zhou et al., "Uncertain multi-granulation time series modeling based on granular computing and the clustering practice," Journal of Nanjing University, vol. 50, no. 1, pp. 86–94, 2014.

[7] Y. T. Guo, "Variable precision β multi-granulation rough sets based on limited tolerance relation," Journal of Minnan Normal University, no. 1, pp. 1–11, 2015.

[8] X. U. Yi, J. H. Yang, and J. I. Xia, "Neighborhood multi-granulation rough set model based on double granulate criterion," Control and Decision, vol. 30, no. 8, pp. 1469–1478, 2015.

[9] L. A. Zadeh, "Towards a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic," Fuzzy Sets and Systems, vol. 19, pp. 111–127, 1997.

[10] J. R. Hobbs, "Granularity," in Proceedings of the 9th International Joint Conference on Artificial Intelligence, Los Angeles, Calif, USA, 1985.

[11] L. Zhang and B. Zhang, "Theory of fuzzy quotient space (methods of fuzzy granular computing)," Journal of Software, vol. 14, no. 4, pp. 770–776, 2003.

[12] J. Li, C. Mei, W. Xu, and Y. Qian, "Concept learning via granular computing: a cognitive viewpoint," Information Sciences, vol. 298, no. 1, pp. 447–467, 2015.

[13] X. Hu, W. Pedrycz, and X. Wang, "Comparative analysis of logic operators: a perspective of statistical testing and granular computing," International Journal of Approximate Reasoning, vol. 66, pp. 73–90, 2015.

[14] M. G. C. A. Cimino, B. Lazzerini, F. Marcelloni, and W. Pedrycz, "Genetic interval neural networks for granular data regression," Information Sciences, vol. 257, pp. 313–330, 2014.

[15] P. Honko, "Upgrading a granular computing based data mining framework to a relational case," International Journal of Intelligent Systems, vol. 29, no. 5, pp. 407–438, 2014.

[16] M.-Y. Chen and B.-T. Chen, "A hybrid fuzzy time series model based on granular computing for stock price forecasting," Information Sciences, vol. 294, pp. 227–241, 2015.

[17] R. Al-Hmouz, W. Pedrycz, and A. Balamash, "Description and prediction of time series: a general framework of Granular Computing," Expert Systems with Applications, vol. 42, no. 10, pp. 4830–4839, 2015.

[18] M. Hilbert, "Big data for development: a review of promises and challenges," Social Science Electronic Publishing, vol. 34, no. 1, pp. 135–174, 2016.

[19] T. J. Sejnowski, S. P. Churchland, and J. A. Movshon, "Putting big data to good use in neuroscience," Nature Neuroscience, vol. 17, no. 11, pp. 1440–1441, 2014.

[20] G. George, M. R. Haas, and A. Pentland, "Big data and management," Academy of Management Journal, vol. 30, no. 2, pp. 39–52, 2014.

[21] M. Chen, S. Mao, and Y. Liu, "Big data: a survey," Mobile Networks and Applications, vol. 19, no. 2, pp. 171–209, 2014.

[22] X. Wu, X. Zhu, G. Q. Wu, and W. Ding, "Data mining with big data," IEEE Transactions on Knowledge & Data Engineering, vol. 26, no. 1, pp. 97–107, 2014.

[23] Y. Shuo and Y. Lin, "Decomposition of decision systems based on granular computing," in Proceedings of the IEEE International Conference on Granular Computing (GrC '11), pp. 590–595, Garden Villa, Kaohsiung, Taiwan, 2011.

[24] H. Hu and Z. Zhong, "Perception learning as granular computing," Natural Computation, vol. 3, pp. 272–276, 2008.

[25] Z.-H. Chen, Y. Zhang, and G. Xie, "Mining algorithm for concise decision rules based on granular computing," Control and Decision, vol. 30, no. 1, pp. 143–148, 2015.

[26] K. Kambatla, G. Kollias, V. Kumar, and A. Grama, "Trends in big data analytics," Journal of Parallel & Distributed Computing, vol. 74, no. 7, pp. 2561–2573, 2014.

[27] A. Katal, M. Wazid, and R. H. Goudar, "Big data: issues, challenges, tools and good practices," in Proceedings of the 6th International Conference on Contemporary Computing (IC3 '13), pp. 404–409, IEEE, New Delhi, India, August 2013.

[28] V. Cevher, S. Becker, and M. Schmidt, "Convex optimization for big data: scalable, randomized, and parallel algorithms for big data analytics," IEEE Signal Processing Magazine, vol. 31, no. 5, pp. 32–43, 2014.

[29] J. Fan, F. Han, and H. Liu, "Challenges of big data analysis," National Science Review, vol. 1, no. 2, pp. 293–314, 2014.

[30] Q. H. Zhang, K. Xu, and G. Y. Wang, "Fuzzy equivalence relation and its multigranulation spaces," Information Sciences, vol. 346-347, pp. 44–57, 2016.

[31] Z. Liu and Y. Hu, "Multi-granularity pattern ant colony optimization algorithm and its application in path planning," Journal of Central South University (Science and Technology), vol. 9, pp. 3713–3722, 2013.

[32] Q. H. Zhang, G. Y. Wang, and X. Q. Liu, "Hierarchical structure analysis of fuzzy quotient space," Pattern Recognition and Artificial Intelligence, vol. 21, no. 5, pp. 627–634, 2008.

[33] Z. C. Shi, Y. X. Xia, and J. Z. Zhou, "Discrete algorithm based on granular computing and its application," Computer Science, vol. 40, pp. 133–135, 2013.

[34] Y. P. Zhang, B. Luo, Y. Y. Yao, D. Q. Miao, L. Zhang, and B. Zhang, Quotient Space and Granular Computing: The Theory and Method of Problem Solving on Structured Problems, Science Press, Beijing, China, 2010.


[35] G. Y. Wang, Q. H. Zhang, and J. Hu, "A survey on the granular computing," Transactions on Intelligent Systems, vol. 6, no. 2, pp. 8–26, 2007.

[36] J. Jonnagaddala, R. T. Jue, and H. J. Dai, "Binary classification of Twitter posts for adverse drug reactions," in Proceedings of the Social Media Mining Shared Task Workshop at the Pacific Symposium on Biocomputing, pp. 4–8, Big Island, Hawaii, USA, 2016.

[37] M. Haungs, P. Sallee, and M. Farrens, "Branch transition rate: a new metric for improved branch classification analysis," in Proceedings of the International Symposium on High-Performance Computer Architecture (HPCA '00), pp. 241–250, 2000.

[38] R. W. Proctor and Y. S. Cho, "Polarity correspondence: a general principle for performance of speeded binary classification tasks," Psychological Bulletin, vol. 132, no. 3, pp. 416–442, 2006.

[39] T. H. Chow, P. Berkhin, E. Eneva et al., "Evaluating performance of binary classification systems," US Patent 8554622 B2, 2013.

[40] D. G. Li, D. Q. Miao, D. X. Zhang, and H. Y. Zhang, "An overview of granular computing," Computer Science, vol. 9, pp. 1–12, 2005.

[41] X. Gang and L. Jing, "A review of the present studying state and prospect of granular computing," Journal of Software, vol. 3, pp. 5–10, 2011.

[42] L. X. Zhong, "The predication about optimal blood analyze method," Academic Forum of Nandu, vol. 6, pp. 70–71, 1996.

[43] X. Mingmin and S. Junli, "The mathematical proof of method of group blood test and a new formula in quest of optimum number in group," Journal of Sichuan Institute of Building Materials, vol. 1, pp. 97–104, 1986.

[44] B. Zhang and L. Zhang, "Discussion on future development of granular computing," Journal of Chongqing University of Posts and Telecommunications: Natural Science Edition, vol. 22, no. 5, pp. 538–540, 2010.

[45] A. Skowron, J. Stepaniuk, and R. Swiniarski, "Modeling rough granular computing based on approximation spaces," Information Sciences, vol. 184, no. 1, pp. 20–43, 2012.

[46] J. T. Yao, A. V. Vasilakos, and W. Pedrycz, "Granular computing: perspectives and challenges," IEEE Transactions on Cybernetics, vol. 43, no. 6, pp. 1977–1989, 2013.

[47] Y. Y. Yao, N. Zhang, D. Q. Miao, and F. F. Xu, "Set-theoretic approaches to granular computing," Fundamenta Informaticae, vol. 115, no. 2-3, pp. 247–264, 2012.

[48] H. Li and X. P. Ma, "Research on four-element model of granular computing," Computer Engineering and Applications, vol. 49, no. 4, pp. 9–13, 2013.

[49] J. Hu and C. Guan, "Granular computing model based on quantum computing theory," in Proceedings of the 10th International Conference on Computational Intelligence and Security, pp. 156–160, November 2014.

[50] Y. Shuo and Y. Lin, "Decomposition of decision systems based on granular computing," in Proceedings of the IEEE International Conference on Granular Computing (GrC '11), pp. 590–595, IEEE, Kaohsiung, Taiwan, November 2011.

[51] F. Li, J. Xie, and K. Xie, "Granular computing theory in the application of fault diagnosis," in Proceedings of the Chinese Control and Decision Conference (CCDC '08), pp. 595–597, July 2008.

[52] Q.-H. Zhang, Y.-K. Xing, and Y.-L. Zhou, "The incremental knowledge acquisition algorithm based on granular computing," Journal of Electronics and Information Technology, vol. 33, no. 2, pp. 435–441, 2011.

[53] Y. Zeng, Y. Y. Yao, and N. Zhong, "The knowledge search base on the granular structure," Computer Science, vol. 35, no. 3, pp. 194–196, 2008.

[54] G.-Y. Wang, Q.-H. Zhang, X.-A. Ma, and Q.-S. Yang, "Granular computing models for knowledge uncertainty," Journal of Software, vol. 22, no. 4, pp. 676–694, 2011.

[55] J. Li, Y. Ren, C. Mei, Y. Qian, and X. Yang, "A comparative study of multigranulation rough sets and concept lattices via rule acquisition," Knowledge-Based Systems, vol. 91, pp. 152–164, 2016.

[56] H.-L. Yang and Z.-L. Guo, "Multigranulation decision-theoretic rough sets in incomplete information systems," International Journal of Machine Learning & Cybernetics, vol. 6, no. 6, pp. 1005–1018, 2015.

[57] M. A. Waller and S. E. Fawcett, "Data science, predictive analytics, and big data: a revolution that will transform supply chain design and management," Journal of Business Logistics, vol. 34, no. 2, pp. 77–84, 2013.

[58] R. Kitchin, "The real-time city? Big data and smart urbanism," GeoJournal, vol. 79, no. 1, pp. 1–14, 2014.

[59] X. Dong and D. Srivastava, Big Data Integration, Morgan & Claypool, 2015.

[60] L. Zhang and B. Zhang, Theory and Applications of Problem Solving: Quotient Space Based Granular Computing (The Second Version), Tsinghua University Press, Beijing, China, 2007.

[61] L. Zhang and B. Zhang, "The quotient space theory of problem solving," in Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing, G. Wang, Q. Liu, Y. Yao, and A. Skowron, Eds., vol. 2639 of Lecture Notes in Computer Science, pp. 11–15, Springer, Berlin, Germany, 2003.

[62] J. Sheng, S. Q. Xie, and C. Y. Pan, Probability Theory and Mathematical Statistics, Higher Education Press, Beijing, China, 4th edition, 2008.

[63] L. Z. Zhang, X. Zhao, and Y. Ma, "The simple math demonstration and precise calculation method of the blood group test," Mathematics in Practice and Theory, vol. 22, pp. 143–146, 2010.

[64] J. Chen, S. Zhao, and Y. Zhang, "Hierarchical covering algorithm," Tsinghua Science & Technology, vol. 19, no. 1, pp. 76–81, 2014.

[65] L. Zhang and B. Zhang, "Dynamic quotient space model and its basic properties," Pattern Recognition and Artificial Intelligence, vol. 25, no. 2, pp. 181–185, 2012.


researchers' attention [36–39]. Suppose that we have a binary classification algorithm, that N is the number of all objects in the domain, and that p is the probability of negative samples in the domain; we then need to determine the class of every object in the domain. The traditional method searches the objects one by one with the binary classification algorithm; it is simple but involves a heavy workload when facing a large object set. Therefore, some researchers have proposed group classification, in which each tested unit is composed of many samples, and this method improves the searching efficiency [40–42]. For example, how can we effectively complete the binary classification task with minimum classification times when facing massive blood analysis? The traditional method is to test each person once. However, this is a heavy workload when facing many thousands of objects. To a certain extent, the single-level granulation method (namely, the single-level group testing method) can reduce the workload.

In 1986, Professors Mingmin and Junli proposed that the single-level group testing method can reduce the workload of massive blood analysis when the prevalence rate p of a sickness is less than about 0.3 [43]. In this method, all objects are subdivided into many small subgroups, and then every subgroup is tested. If the testing result of a subgroup is sickness, we diagnose each object by testing all the objects in this subgroup one by one; if its result is health, we can diagnose that all objects in this subgroup are healthy. But for millions and even billions of objects, can this kind of single-level granulation method still effectively solve the complex problem? At present, there are lots of methods about the single-level granulation searching model [44–54], but studies on the multilevel granulation searching model are few [55–59]. A binary classification multilevel granulation searching algorithm, namely, an efficient multigranulation binary classification searching model based on the hierarchical quotient space structure, is proposed in this paper. This algorithm combines the falsity and truth preserving principles of quotient space theory with mathematical expectation theory. Obviously, assuming that p is the probability of negative samples in the domain, the smaller p is, the higher the efficiency of this algorithm. A large number of experimental results indicate that the proposed algorithm has high efficiency and universality.

The rest of this paper is organized as follows. First, some preliminary concepts and conclusions are reviewed in Section 2. Then, the binary classification multigranulation searching algorithm is proposed in Section 3. Next, the experimental analysis is discussed in Section 4. Finally, the paper is concluded in Section 5.

2. Preliminary

For convenience, some preliminary concepts are reviewed or defined at first.

Definition 1 (quotient space model [60]). Suppose that the triplet (X, F, T) describes a problem space, or simply a space (X, F, T), where X denotes the universe, F is the structure of universe X, and T indicates the attributes (or features) of universe X. Suppose that X represents the universe with the finest granularity. When we view the same universe X from a coarser granularity, we have a coarse-granularity universe denoted by [X]. Then we have a new problem space ([X], [F], [T]). The coarser universe [X] can be defined by an equivalence relation R on X. That is, an element in [X] is equivalent to a set of elements in X, namely, an equivalence class [x]. So [X] consists of all equivalence classes induced by R. From F and T, we can define the corresponding [F] and [T]. Then we have a new space ([X], [F], [T]), called a quotient space of the original space (X, F, T).

Theorem 2 (falsity preserving principle [61]). If a problem [A] → [B] on quotient space ([X], [F], [T]) has no solution, then problem A → B on its original space (X, F, T) has no solution either. In other words, if A → B on (X, F, T) has a solution, then [A] → [B] on ([X], [F], [T]) has a solution as well.

Theorem 3 (truth preserving principle [61]). A problem [A] → [B] on quotient space ([X], [F], [T]) has a solution if, for each [x], h^{-1}([x]) is a connected set on X and problem A → B on (X, F, T) has a solution, where h: X → [X] is the natural projection defined as follows:

h(x) = [x],  h^{-1}(u) = {x | h(x) ∈ u}.  (1)
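The natural projection h and its preimage h^{-1} can be made concrete with a small sketch (illustrative only, not from the paper; the universe X and the choice of congruence mod 3 as the equivalence relation R are assumptions):

```python
# A toy quotient space: X = {0,...,8}, R = congruence mod 3.
X = set(range(9))

def R(a, b):
    """An equivalence relation on X (assumed here: same residue mod 3)."""
    return a % 3 == b % 3

def h(x):
    """Natural projection: map x to its equivalence class [x]."""
    return frozenset(y for y in X if R(x, y))

def h_inv(u):
    """Preimage of a set u of equivalence classes."""
    return {x for x in X if h(x) in u}

quotient = {h(x) for x in X}   # the coarser universe [X]
assert len(quotient) == 3      # three equivalence classes
assert h_inv(quotient) == X    # projecting back recovers all of X
```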

Definition 4 (expectation [62]). Let X be a discrete random variable. The expectation, or mean, of X is defined as μ = E(X) = Σ_x x · p(X = x), where p(X = x) is the probability of X = x.

In the case that X takes values from an infinite number set, μ becomes an infinite series. If the series converges absolutely, we say that the expectation E(X) exists; otherwise, we say that the expectation of X does not exist.

Lemma 5 (see [43]). Let f_q(x) = 1/x + 1 − q^x (x = 2, 3, 4, ...; q ∈ (0, 1)) be a function with integer variable. If f_q(x) < 1 always holds for all x, then q ∈ (e^{−e^{−1}}, 1).

Lemma 5 is the basis of the following discussion.

Lemma 6 (see [42, 63]). Let f(x) = ⌈x⌉ denote the top integral (ceiling) function, and let f_q(x) = 1/x + 1 − q^x (x = 2, 3, 4, ...; q ∈ (0, 1)) be a function with integer variable. Then f_q(x) reaches its minimum value when x = ⌈1/√(p + p²/2)⌉ or x = ⌈1/√(p + p²/2)⌉ + 1, where x ∈ (1, 1 − 1/ln q].

Lemma 7. Let g(x) = x q^x (e^{−e^{−1}} < q < 1) be a function. If 2 ≤ c ≤ 1 − 1/ln q and 1 ≤ i < c/2, then the inequality (g(i) + g(c − i))/2 ≤ (g((c − 1)/2) + g((c + 1)/2))/2 < g(c/2) holds.
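Lemma 6 can be checked numerically. The sketch below (illustrative, not the paper's code) compares a brute-force minimizer of f_q(x) = 1/x + 1 − q^x with the closed-form estimate ⌈1/√(p + p²/2)⌉ for several prevalence rates:

```python
import math

def f(q, x):
    # f_q(x) = 1/x + 1 - q^x: expected tests per object at group size x
    return 1.0 / x + 1.0 - q ** x

for p in (0.1, 0.05, 0.01, 0.001):
    q = 1.0 - p
    best = min(range(2, 2000), key=lambda x: f(q, x))   # brute-force optimum
    k = math.ceil(1.0 / math.sqrt(p + p * p / 2.0))     # Lemma 6 estimate
    assert best in (k, k + 1), (p, best, k)
```

For p = 0.001 both approaches give a group size of 32, which matches the value used in Section 3.1.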


Figure 1: Property of a convex function y = f(x): for i < (c − 1)/2 < c/2 < (c + 1)/2 < c − i, (f(i) + f(c − i))/2 = M_2 < (f((c − 1)/2) + f((c + 1)/2))/2 = M_1 < f(c/2) = M_0.

Proof. We can obtain the first and second derivatives of g(x) as follows:

g'(x) = q^x (1 + x ln q),
g''(x) = q^x ln q (2 + x ln q),

so that

g''(x) < 0 for 1 ≤ x < x_0,  g''(x) = 0 for x = x_0,  g''(x) > 0 for x > x_0,

where x_0 = −2/ln q and e^{−e^{−1}} < q < 1.  (2)

So we can draw the conclusion that g(x) is a convex function (the property of a convex function is shown in Figure 1) when 1 ≤ x < 1 − 1/ln q < x_0 = −2/ln q. According to the definition of a convex function, the inequality (g(i) + g(c − i))/2 ≤ (g((c − 1)/2) + g((c + 1)/2))/2 < g(c/2) holds permanently. So Lemma 7 is proved completely.
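The chain of inequalities in Lemma 7 can also be spot-checked numerically (an illustrative sketch; q = 0.9 is an arbitrary admissible value with e^{−e^{−1}} < q < 1):

```python
import math

q = 0.9
g = lambda x: x * q ** x

c_max = int(1 - 1 / math.log(q))      # c must satisfy c <= 1 - 1/ln q
for c in range(2, c_max + 1):
    for i in range(1, (c + 1) // 2):  # integers with 1 <= i < c/2
        lhs = (g(i) + g(c - i)) / 2
        mid = (g((c - 1) / 2) + g((c + 1) / 2)) / 2
        # Lemma 7: lhs <= mid < g(c/2)
        assert lhs <= mid < g(c / 2)
```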

3. Binary Classification of Multigranulation Searching Model Based on Probability and Statistics

Granulation is seen as a way of constructing simple theories out of more complex ones [1]. At the same time, the transformation between two different granularity layers is mainly based on the falsity and truth preserving principles in quotient space theory, and it can solve many classical problems, such as the scale ball game and branch decisions. All the above instances clearly embody the idea of solving complex problems with multigranulation methods.

In the second section, the relevant concepts of multigranulation computing theory and probability theory were reviewed. A multigranulation binary classification searching model not only solves practical complex problems at less cost but also can be easily constructed, and this model may also play a certain role in inspiring applications of multigranulation computing theory.

Generally, supposing a nonempty finite set U = {x_1, x_2, ..., x_n}, where x_i (i = 1, 2, 3, ..., n) is a binary classification object, the triple (U, 2^U, p) is called a probability quotient space with probability p, where p is the negative sample rate in U and 2^U is the structure of universe U. And let (U_1, U_2, ..., U_t) (⋃_{i=1}^{t} U_i = U; U_m ∩ U_n = ∅, m, n ∈ {1, 2, ..., t}, m ≠ n) be called a random partition space on the probability quotient space. An example of the binary classification multigranulation searching model follows.

Example 8. On the assumption that many people need to take a blood analysis in a physical examination for diagnosing a disease (there are two classes: normal, which stands for health, and abnormal, which stands for sickness), the domain U = {x_1, x_2, ..., x_n} stands for all the people. Let N denote the number of all people and p the prevalence rate. So the quotient space of the blood analysis is (U, 2^U, p). Besides, we also have a binary classifier (or a reagent) that diagnoses the disease by testing a blood sample. How can we complete all the blood analysis with the minimal classification times? Namely, how can we determine the class of all objects in the domain? There are usually three ways, as follows.

Method 9 (traditional method). In order to accurately diagnose all objects, every blood sample is tested one by one, so this method needs N searches. This method is just like the classification process of machine learning.

Method 10 (single-level granulation method). This method mixes k_1 blood samples into a group, where k_1 may be 1, 2, 3, ..., n; namely, the original quotient space is randomly partitioned into (U_1, U_2, ..., U_t) (⋃_{i=1}^{t} U_i = U; U_m ∩ U_n = ∅, m, n ∈ {1, 2, ..., t}, m ≠ n). Then each mixed blood group is tested once. If the testing result of a group is abnormal, then according to Theorem 2 this abnormal group contains abnormal object(s); in order to make a diagnosis, all of the objects in this group must be tested once again one by one. Similarly, if the testing result of a group is normal, then according to Theorem 3 all of the objects in this group are normal. Therefore, all k_1 objects in such a group need only one test to make a diagnosis. The binary classifier can also classify the new blood sample that consists of k_1 blood samples in this process.
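The procedure of Method 10 can be sketched as follows (an illustrative Python simulation, not the paper's code; the sample data, the group size, and the function name are assumptions):

```python
def single_level_tests(samples, k):
    """Method 10 sketch: test each mixed group of k samples once; if a
    group is abnormal (falsity preserving), retest its members one by one."""
    tests = 0
    for i in range(0, len(samples), k):
        group = samples[i:i + k]
        tests += 1                # one test on the mixed group
        if any(group):            # abnormal group: contains negative sample(s)
            tests += len(group)   # retest every object individually
    return tests

# Illustrative data: 10 abnormal objects spread through N = 10000
N, k1 = 10000, 32
samples = [False] * N
for idx in range(0, N, 1000):
    samples[idx] = True

# 313 group tests + 10 abnormal groups * 32 retests = 633, far fewer than N
assert single_level_tests(samples, k1) == 633
```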

If every group is mixed from a large number of blood samples (namely, k_1 is a large number), then whenever some groups are tested to be abnormal, lots of objects must be retested one by one. In order to reduce the classification times, this paper proposes a multilevel granulation model.

Method 11 (multilevel granulation method). Firstly, each mixed blood group, which contains k_1 samples (objects), is tested once, where k_1 may be 1, 2, 3, ..., n; namely, the original quotient space is randomly partitioned into (U_1, U_2, ..., U_t) (⋃_{i=1}^{t} U_i = U; U_m ∩ U_n = ∅, m, n ∈ {1, 2, ..., t}, m ≠ n). Next, if some groups are tested to be normal, then all objects in those groups are normal (healthy), and all k_1 objects in such a group need only one test to make a diagnosis. If some groups are tested to be abnormal, those groups are subdivided into many smaller subgroups which contain k_2 (k_2 < k_1) objects; namely, the quotient space of an abnormal group U_i is randomly partitioned into (U_{i1}, U_{i2}, ..., U_{il}) (⋃_{j=1}^{l} U_{ij} = U_i; U_{im} ∩ U_{in} = ∅, m, n ∈ {1, 2, ..., l}, m ≠ n, l < k_1). Finally, each subgroup is tested once again. Similarly, if a subgroup is tested to be normal, all objects in the corresponding subgroup are confirmed healthy; if a subgroup is tested to be abnormal, it is subdivided into still smaller subgroups which contain k_3 (k_3 < k_2) objects once again. Therefore, the testing results of all objects can be ensured by repeating the above process in a group until the number of objects is equal to 1 or the testing result of a subgroup is normal. The searching efficiency of the above three methods is analyzed as follows.

In Method 9, every object has to be tested once for diagnosing a disease, so it takes N tests in total.

In Method 10, the original problem space is subdivided into many disjoint subspaces (subsets). If some subsets are tested to be normal, all of their objects need only one test. Therefore, the classification times can be reduced when the probability p is small enough [9].

In Method 11, the key is to find the optimal multigranulation space for searching all objects, so a multilevel granulation model needs to be established. There are two questions. One is the grouping strategy, namely, how many objects should be contained in a group. The other is the optimal number of granulation levels, namely, how many levels the original problem space should be granulated into.

In this paper, we mainly solve the two questions in Method 11. According to the truth and falsity preserving principles in quotient space theory, all normal parts of the blood samples can be ignored. Hence, the original problem is simplified to a smaller subspace. This idea not only reduces the complexity of the problem but also improves the efficiency of searching for abnormal objects.
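The recursive procedure of Method 11 can be sketched as follows (illustrative Python, not the paper's code; the level sizes (32, 8) and the placement of abnormal objects are assumptions made for the check):

```python
def multilevel_tests(group, sizes):
    """Method 11 sketch: test the mixed group once; if abnormal, split it
    by the next size in `sizes` and recurse, or retest object by object
    when no finer level is left."""
    tests = 1                           # one test on the mixed group
    if not any(group) or len(group) == 1:
        return tests                    # all normal, or a single object
    if not sizes:
        return tests + len(group)       # no finer level: retest one by one
    k, rest = sizes[0], sizes[1:]
    for i in range(0, len(group), k):
        tests += multilevel_tests(group[i:i + k], rest)
    return tests

def method11(samples, sizes):
    """Partition the domain into first-layer groups of sizes[0]."""
    k, rest = sizes[0], sizes[1:]
    return sum(multilevel_tests(samples[i:i + k], rest)
               for i in range(0, len(samples), k))

# Same illustrative data as before: 10 abnormal objects among N = 10000
N = 10000
samples = [False] * N
for idx in range(0, N, 1000):
    samples[idx] = True

# 303 normal groups (1 test each) + 10 abnormal groups (13 tests each) = 433,
# fewer than the 633 tests of the single-level scheme with k1 = 32
assert method11(samples, (32, 8)) == 433
```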

Algorithm Strategy. Example 8 can be regarded as a tree structure in which each node (which stands for a group) is an x-tuple. Obviously, the searching problem of the tree has been transformed into a hierarchical reasoning process over a monotonous relation sequence. The original space has been transformed into a hierarchical structure where all subproblems are solved in different granularity levels.

Table 1: The probability distribution of Y_1.

Y_1 = y_1    | 1/k_1   | 1/k_1 + 1
p(Y_1 = y_1) | q^{k_1} | 1 − q^{k_1}

Firstly, the general rule can be concluded by analyzing the simplest hierarchy and grouping case, which is the single-level granulation. Secondly, we can calculate the mathematical expectation of the classification times of blood analysis. Finally, an optimal hierarchical granulation model is established by comparing the expectations of classification times.

Analysis. Supposing that there is an object set Y = {y_1, y_2, ..., y_n} and that the prevalence rate is p, the probability that an object appears normal in blood analysis is q = 1 − p. The probability that a group is tested to be normal is q^{k_1}, and to be abnormal is 1 − q^{k_1}, where k_1 is the number of objects in a group.

3.1. The Single-Level Granulation. Firstly, the domain (which contains N objects) is randomly subdivided into many subgroups, where each subset contains k_1 objects. In other words, a new quotient space [Y_1] is obtained based on an equivalence relation R. Supposing that the average classification times of each object is a random variable Y_1, the probability distribution of Y_1 is shown in Table 1.

Thus, the mathematical expectation of Y_1 can be obtained as follows:

E_1(Y_1) = (1/k_1) × q^{k_1} + (1 + 1/k_1) × (1 − q^{k_1}) = 1/k_1 + (1 − q^{k_1}).  (3)

Then the total mathematical expectation for the domain can be obtained as follows:

N × E_1(Y_1) = N × [(1/k_1) × q^{k_1} + (1 + 1/k_1) × (1 − q^{k_1})].  (4)

When the probability p keeps unchanged and k_1 satisfies the inequality E_1(Y_1) < 1, this single-level granulation method can reduce classification times. For example, if p = 0.5 and k_1 > 1, then according to Lemma 5, E_1(Y_1) > 1 no matter what the value of k_1 is, and this single-level granulation method is worse than the traditional method (namely, testing every object in turn). Conversely, if p = 0.001 and k_1 = 32, E_1(Y_1) reaches its minimum value, and the classification times of the single-level granulation method are less than those of the


Figure 2: The double-level granulation (the domain of N objects is divided into first-layer groups of k_1 objects, and abnormal groups are subdivided into second-layer subgroups of k_2 objects).

traditional method. Let N = 10000; the total classification times are approximately equal to 628, as follows:

N × E_1(Y_1) = 10000 × [(1/32) × q^{32} + (1 + 1/32) × (1 − q^{32})] ≈ 628.  (5)

This shows that the method can greatly improve the efficiency of diagnosing, reducing classification times by 93.72% in the single-level granulation. If there is an extremely low prevalence rate, for example, p = 0.000001, the total classification times reach their minimum value when each group contains 1001 objects (namely, k_1 = 1001). If every group is subdivided into many smaller subgroups again, and the above method is repeated, can the total classification times be further reduced?
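These figures can be reproduced directly (an illustrative numerical check of equation (3) and the reported optimum k_1 = 32, not the paper's code):

```python
N, p = 10000, 0.001
q = 1.0 - p

def E1(k):
    # Equation (3): expected classification times per object at group size k
    return (1.0 / k) * q ** k + (1.0 + 1.0 / k) * (1.0 - q ** k)

best_k = min(range(2, 500), key=E1)   # brute-force search for the optimum
assert best_k == 32                   # the k1 reported for p = 0.001
assert round(N * E1(best_k)) == 628   # the total reported in equation (5)
```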

3.2. The Double-Level Granulation. After the objects of the domain are granulated by the method of Section 3.1, the original object space becomes a new quotient space in which each group has k_1 objects. According to the falsity and truth preserving principles in quotient space theory, if a group is tested to be abnormal, it can be granulated into many smaller subgroups. The double-level granulation is shown in Figure 2.

Then the probability distribution of the double-level granulation is discussed as follows.

If each group contains k_1 objects and is tested once in the 1st layer, the average classification times are 1/k_1 for each object. Similarly, the average classification times of each object are 1/k_2 in the 2nd layer. When a subgroup contains k_2 objects and is tested to be abnormal, every object in this subgroup has to be retested one by one, so the classification times contributed by each object in that layer are equal to 1/k_2 + 1.

For simplicity, suppose that every group in the 1st layer is subdivided into two subgroups, which contain k_21 and k_22 objects, respectively, in the 2nd layer.

The classification times are shown in Table 2 (M represents an abnormal testing result, and a blank represents a normal one).

Table 2: The average classification times of a group with different testing results (M: abnormal; blank: normal).

Times    | k_1 group | k_21 subgroup | k_22 subgroup
3 + k_21 | M         | M             |
3 + k_22 | M         |               | M
3 + k_1  | M         | M             | M

Table 3: The probability distribution of Y_2.

Y_2 = y_2    | 1/k_1   | 1/k_1 + 1/k_2           | 1/k_1 + 1/k_2 + 1
p(Y_2 = y_2) | q^{k_1} | (1 − q^{k_1}) × q^{k_2} | (1 − q^{k_1}) × (1 − q^{k_2})

For instance, let $k_1 = 8$, $k_{21} = 4$, and $k_{22} = 4$; then there are four kinds of cases.

Case 1. If a group is tested to be normal in the 1st layer, the total classification times of this group is $k_1 \times (1/k_1) = 1$.

Case 2. If a group is tested to be abnormal in the 1st layer, and its first subgroup is tested to be abnormal while the other subgroup is tested to be normal in the 2nd layer, the total classification times of this group is $k_{21} \times (1/k_1 + 1/k_{21} + 1) + k_{22} \times (1/k_1 + 1/k_{22}) = 3 + k_{21} = 7$.

Case 3. If a group is tested to be abnormal in the 1st layer, its first subgroup is tested to be normal, and the other subgroup is tested to be abnormal in the 2nd layer, the total classification times of this group is $k_{21} \times (1/k_1 + 1/k_{21}) + k_{22} \times (1/k_1 + 1/k_{22} + 1) = 3 + k_{22} = 7$.

Case 4. If a group is tested to be abnormal in the 1st layer and both of its subgroups are tested to be abnormal in the 2nd layer, the total classification times of this group is $k_{21} \times (1/k_1 + 1/k_{21} + 1) + k_{22} \times (1/k_1 + 1/k_{22} + 1) = 3 + k_1 = 11$.
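The four case totals can be reproduced with a small recursive counter. A sketch (names are ours), where 1 marks an abnormal object and a pooled test reports abnormal whenever the group contains at least one such object:

```python
def count_tests(group, threshold=4):
    """Classification times for one group: one pooled test; if abnormal,
    retest small groups object by object, otherwise split into two halves."""
    tests = 1                          # the pooled test
    if any(group):                     # abnormal group
        if len(group) <= threshold:
            tests += len(group)        # individual retests
        else:
            mid = len(group) // 2
            tests += count_tests(group[:mid], threshold)
            tests += count_tests(group[mid:], threshold)
    return tests

# k1 = 8, k21 = k22 = 4, as in the example
print(count_tests([0, 0, 0, 0, 0, 0, 0, 0]))  # Case 1: 1
print(count_tests([1, 0, 0, 0, 0, 0, 0, 0]))  # Case 2: 3 + k21 = 7
print(count_tests([0, 0, 0, 0, 1, 0, 0, 0]))  # Case 3: 3 + k22 = 7
print(count_tests([1, 0, 0, 0, 1, 0, 0, 0]))  # Case 4: 3 + k1 = 11
```

The recursion mirrors the binary-tree granulation structure derived in Section 3.4.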

Suppose each group contains $k_1$ objects in the 1st layer and every subgroup has $k_2$ objects in the 2nd layer. Supposing that the average classification times of each object is a random variable $Y_2$, the probability distribution of $Y_2$ is shown in Table 3.

Thus, in the 2nd layer, the mathematical expectation of $Y_2$, which is the average classification times of each object, is obtained as follows:

$$E_2(Y_2) = \frac{1}{k_1} \times q^{k_1} + \left(\frac{1}{k_1} + \frac{1}{k_2}\right) \times (1 - q^{k_1}) \times q^{k_2} + \left(1 + \frac{1}{k_1} + \frac{1}{k_2}\right) \times (1 - q^{k_1}) \times (1 - q^{k_2}) = \frac{1}{k_1} + (1 - q^{k_1}) \times \left(\frac{1}{k_2} + 1 - q^{k_2}\right). \quad (6)$$

As long as the number of granulation levels increases to 2, the average classification times of each object will be further reduced, for instance, when $p = 0.001$ and $N = 10000$.

6 Mathematical Problems in Engineering

Table 4: The probability distribution of $Y_i$.

$y_i$:                                         $p(Y_i = y_i)$:
$\dfrac{1}{k_1}$                               $q^{k_1}$
$\dfrac{1}{k_1} + \dfrac{1}{k_2}$              $(1 - q^{k_1}) \times q^{k_2}$
$\cdots$                                       $\cdots$
$\sum_{j=1}^{i} \dfrac{1}{k_j}$                $(1 - q^{k_1}) \times (1 - q^{k_2}) \times \cdots \times (1 - q^{k_{i-1}}) \times q^{k_i}$
$\sum_{j=1}^{i} \dfrac{1}{k_j} + 1$            $(1 - q^{k_1}) \times (1 - q^{k_2}) \times \cdots \times (1 - q^{k_i})$

As we know, the minimum expectation of the total classification times is about 628 with $k_1 = 32$ in the single-level granulation. According to (6) and Lemma 6, $E_2(Y_2)$ reaches its minimum value when $k_2 = 16$. The minimum mathematical expectation of the total classification times is shown as follows:

$$N \times E_2(Y_2) = N \times \left[\frac{1}{k_1} \times q^{k_1} + \left(\frac{1}{k_1} + \frac{1}{k_2}\right) \times (1 - q^{k_1}) \times q^{k_2} + \left(1 + \frac{1}{k_1} + \frac{1}{k_2}\right) \times (1 - q^{k_1}) \times (1 - q^{k_2})\right] \approx 338. \quad (7)$$
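Formula (6) and the total in (7) are easy to verify numerically; a sketch under the paper's values $p = 0.001$, $N = 10000$, $k_1 = 32$, $k_2 = 16$ (the function name is ours):

```python
def E2(k1, k2, q):
    """Two-level granulation: expected classification times per object, eq. (6)."""
    return 1 / k1 + (1 - q ** k1) * (1 / k2 + 1 - q ** k2)

p, N = 0.001, 10000
q = 1 - p
total = N * E2(32, 16, q)
print(total)  # close to the 338 reported in (7)
```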

The mathematical expectation of classification times can save 96.62% compared with the traditional method and 46.18% compared with the single-level granulation method. Next, we will discuss $i$-level granulation ($i = 3, 4, 5, \ldots, n$).

3.3. The i-Level Granulation. For the blood analysis case, the granulation strategy in the $i$th layer is determined by the known object numbers of each group in the previous layers (namely, $k_1, k_2, \ldots, k_{i-1}$ are known and only $k_i$ is unknown). Following the double-level granulation method and supposing that the classification times of each object is a random variable $Y_i$ in the $i$-level granulation, the probability distribution of $Y_i$ is shown in Table 4.

Obviously, the sum of the probability distribution is equal to 1 in each layer.

Proof

Case 1 (the single-level granulation). One has

$$q^{k_1} + 1 - q^{k_1} = 1. \quad (8)$$

Case 2 (the double-level granulation). One has

$$q^{k_1} + (1 - q^{k_1}) \times q^{k_2} + (1 - q^{k_1}) \times (1 - q^{k_2}) = q^{k_1} + (1 - q^{k_1}) \times (q^{k_2} + 1 - q^{k_2}) = q^{k_1} + (1 - q^{k_1}) \times 1 = 1. \quad (9)$$

Case 3 (the $i$-level granulation). One has

$$\begin{aligned} &q^{k_1} + (1 - q^{k_1}) \times q^{k_2} + \cdots + (1 - q^{k_1}) \times (1 - q^{k_2}) \times \cdots \times (1 - q^{k_{i-1}}) \times q^{k_i} + (1 - q^{k_1}) \times (1 - q^{k_2}) \times \cdots \times (1 - q^{k_i}) \\ &= q^{k_1} + (1 - q^{k_1}) \times q^{k_2} + \cdots + (1 - q^{k_1}) \times (1 - q^{k_2}) \times \cdots \times (1 - q^{k_{i-1}}) \times \bigl(q^{k_i} + (1 - q^{k_i})\bigr) \\ &= q^{k_1} + (1 - q^{k_1}) \times q^{k_2} + \cdots + (1 - q^{k_1}) \times (1 - q^{k_2}) \times \cdots \times (1 - q^{k_{i-1}}) \\ &= \cdots = q^{k_1} + (1 - q^{k_1}) \times 1 = 1. \end{aligned} \quad (10)$$

The proof is completed

Definition 12 (classification times expectation of granulation). In a probability quotient space, a multilevel granulation model can be established from the domain $U = \{x_1, x_2, \ldots, x_n\}$, which is a nonempty finite set; the health rate is $q$; the maximum number of granulation levels is $L$; and the object number of each group in the $i$th layer is $k_i$, $i = 1, 2, \ldots, L$. Then the average classification times of each object in the $i$th layer, $E_i(Y_i)$, is

$$E_i(Y_i) = \frac{1}{k_1} + \sum_{l=2}^{i} \left[\frac{1}{k_l} \times \prod_{j=1}^{l-1} (1 - q^{k_j})\right] + \prod_{j=1}^{i} (1 - q^{k_j}). \quad (11)$$
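This expectation can be evaluated directly for any sequence of layer sizes; a sketch (the function name is ours), which for a single layer reduces to $1/k_1 + 1 - q^{k_1}$ and for two layers reduces to the two-level formula $1/k_1 + (1 - q^{k_1})(1/k_2 + 1 - q^{k_2})$:

```python
def expected_times(ks, q):
    """Average classification times per object for multilevel granulation:
    1/k_1, plus each deeper layer's 1/k_l weighted by the probability that
    all previous layers tested abnormal, plus the final individual-retest term."""
    total = 1 / ks[0]
    reach = 1.0                       # prob. the object's group is abnormal so far
    for prev, k in zip(ks, ks[1:]):
        reach *= 1 - q ** prev
        total += reach / k
    reach *= 1 - q ** ks[-1]          # all layers abnormal: one-by-one retest
    return total + reach

q = 0.999
print(expected_times([32], q))        # single level: 1/32 + 1 - q**32
print(expected_times([32, 16], q))    # two levels
```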

In this paper, we mainly focus on establishing a minimum-expectation model of classification times by the multigranulation computing method. For simplicity, the mathematical expectation of classification times is regarded as the measure for comparing searching efficiency. According to Lemma 5, the multilevel granulation model can simplify the complex problem only if the prevalence rate $p \in (0, 1 - e^{-e^{-1}})$ in the blood analysis case.

Theorem 13. Let the prevalence rate $p \in (0, 0.3)$. If a group is tested to be abnormal in the 1st layer (namely, this group contains abnormal objects), the average classification times of each object will be further reduced by subdividing this group once again.

Proof. The expectation difference between the single-level granulation $E_1(Y_1)$ and the double-level granulation $E_2(Y_2)$ adequately embodies their relative efficiency. Under the conditions of $e^{-e^{-1}} < q < 1$ and $1 \le k_2 < k_1$, and according to (3) and (6), the expectation difference $E_1(Y_1) - E_2(Y_2)$ is shown as follows:

$$E_1(Y_1) - E_2(Y_2) = \left[\frac{1}{k_1} + (1 - q^{k_1})\right] - \left[\frac{1}{k_1} + (1 - q^{k_1}) \times \left(\frac{1}{k_2} + 1 - q^{k_2}\right)\right] = (1 - q^{k_1}) \times \left[1 - \left(\frac{1}{k_2} + 1 - q^{k_2}\right)\right] > 0. \quad (12)$$

According to Lemma 5, $(1 - q^{k_1}) > 0$ and $f_q(k_2) = 1/k_2 + 1 - q^{k_2} < 1$ always hold; then we can get $1 - (1/k_2 + 1 - q^{k_2}) > 0$. So the inequality $E_1(Y_1) - E_2(Y_2) > 0$ is proved successfully.

Theorem 13 illustrates that the classification times can be reduced by continuing to granulate the abnormal groups into the 2nd layer when $k_1 > 1$. We next attempt to prove that the total classification times will be further reduced by continuously granulating the abnormal groups into the $i$th layer, as long as each subgroup contains no less than 1 object.

Theorem 14. Suppose the prevalence rate $p \in (0, 0.3)$. If a group is tested to be abnormal (namely, this group contains abnormal objects), the average classification times of each object will be reduced by continuously subdividing the abnormal group, as long as the object number of each subgroup is no less than 1.

Proof. The expectation difference between the $(i-1)$-level granulation $E_{i-1}(Y_{i-1})$ and the $i$-level granulation $E_i(Y_i)$ reflects their relative efficiency. Under the conditions of $e^{-e^{-1}} < q < 1$ and $1 \le k_i < k_{i-1}$, and according to (11), the expectation difference $E_{i-1}(Y_{i-1}) - E_i(Y_i)$ is shown as follows:

$$\begin{aligned} E_{i-1}(Y_{i-1}) - E_i(Y_i) &= \left\{\frac{1}{k_1} + \sum_{l=2}^{i-1} \left[\frac{1}{k_l} \times \prod_{j=1}^{l-1} (1 - q^{k_j})\right] + \prod_{l=1}^{i-1} (1 - q^{k_l})\right\} - \left\{\frac{1}{k_1} + \sum_{l=2}^{i} \left[\frac{1}{k_l} \times \prod_{j=1}^{l-1} (1 - q^{k_j})\right] + \prod_{l=1}^{i} (1 - q^{k_l})\right\} \\ &= (1 - q^{k_1}) \times \cdots \times (1 - q^{k_{i-1}}) \times \left[1 - \left(\frac{1}{k_i} + 1 - q^{k_i}\right)\right] > 0. \end{aligned} \quad (13)$$

Because $(1 - q^{k_1}) \times \cdots \times (1 - q^{k_{i-1}}) > 0$ is known according to Lemma 5, and $k_i \ge 1$, we can get $(1/k_i + 1 - q^{k_i}) < 1$, namely, $1 - (1/k_i + 1 - q^{k_i}) > 0$. So $E_{i-1}(Y_{i-1}) - E_i(Y_i) > 0$ is proved successfully.

Theorem 14 shows that this method continuously improves the searching efficiency in the process of granulating abnormal groups from the 1st layer to the $i$th layer, because $E_{i-1}(Y_{i-1}) - E_i(Y_i) > 0$ always holds. However, it is found that the classification times cannot be reduced further when the object number of an abnormal group is less than or equal to 4, so the objects of such an abnormal group should be tested one by one. In order to achieve the best efficiency, we will next explore how to determine the optimum granulation, namely, how to determine the optimum object number of each group and how to obtain the optimum number of granulation levels.

3.4. The Optimum Granulation. It is a difficult and key point to explore an appropriate granularity space for dealing with a complex problem, which requires us not only to keep the integrity of the original information but also to simplify the complex problem. So we take the blood analysis case as an example to explain how to obtain the optimum granularity space in this paper. Suppose the condition $e^{-e^{-1}} < q < 1$ always holds.

Case 1 (granulating abnormal groups from the 1st layer to the 2nd layer). (a) Suppose $k_1$ is an even number, and every group which contains $k_1$ objects in the 1st layer will be subdivided into two subgroups in the 2nd layer.

Scheme 15. Suppose one subgroup in the 2nd layer has $i$ ($1 \le i < k_1/2$) objects; according to formula (6), the expectation of classification times for each of its objects is $E_2(i)$. The other subgroup has $(k_1 - i)$ objects, so the expectation of classification times for each of its objects is $E_2(k_1 - i)$. The average expectation of classification times for each object in the 2nd layer is shown as follows:

$$\frac{i \times E_2(i) + (k_1 - i) \times E_2(k_1 - i)}{k_1}. \quad (14)$$

Scheme 16. Suppose every abnormal group in the 1st layer is evenly subdivided into two subgroups; namely, each subgroup has $k_1/2$ objects in the 2nd layer. According to formula (6), the average expectation of classification times for each object in the 2nd layer is shown as follows:

$$\frac{2 \times (k_1/2) \times E_2(k_1/2)}{k_1} = \frac{k_1 \times E_2(k_1/2)}{k_1}. \quad (15)$$

The expectation difference between the above two schemes embodies their relative efficiency. In order to prove that Scheme 16 is more efficient than Scheme 15, we only need to prove that the following inequality holds:

$$\frac{i \times E_2(i) + (k_1 - i) \times E_2(k_1 - i)}{k_1} - \frac{k_1 \times E_2(k_1/2)}{k_1} > 0 \quad (e^{-e^{-1}} < q < 1,\ k_1 > 1). \quad (16)$$


Table 5: The changes of average expectation with different object numbers in two subgroups.

$(k_{21}, k_{22})$   (1, 15)   (2, 14)   (3, 13)   (4, 12)   (5, 11)   (6, 10)   (7, 9)    (8, 8)
$E_2$                0.07367   0.07329   0.07297   0.07270   0.07249   0.07234   0.07225   0.07222

Proof. Let $g(x) = xq^x$ ($e^{-e^{-1}} < q < 1$); according to Lemma 7, we then have

$$\begin{aligned} &\frac{g(i) + g(c - i)}{2} < g\left(\frac{c}{2}\right) \Longrightarrow \frac{i \times q^i + (k_1 - i) \times q^{(k_1 - i)}}{2} < \frac{k_1}{2} \times q^{k_1/2} \Longrightarrow i \times q^i + (k_1 - i) \times q^{(k_1 - i)} < k_1 \times q^{k_1/2} \\ &\Longrightarrow i \times \left(\frac{1}{k_1} + (1 - q^{k_1}) \times \left(1 + \frac{1}{i} - q^i\right)\right) + (k_1 - i) \times \left(\frac{1}{k_1} + (1 - q^{k_1}) \times \left(1 + \frac{1}{k_1 - i} - q^{(k_1 - i)}\right)\right) > k_1 \times \left(\frac{1}{k_1} + (1 - q^{k_1}) \times \left(1 + \frac{2}{k_1} - q^{k_1/2}\right)\right) \\ &\Longrightarrow \frac{i \times E_2(i) + (k_1 - i) \times E_2(k_1 - i)}{k_1} - \frac{k_1 \times E_2(k_1/2)}{k_1} > 0. \end{aligned} \quad (17)$$

The proof is completed

Therefore, if every group which has $k_1$ ($k_1$ is an even number and $k_1 > 1$) objects in the 1st layer needs to be subdivided into two subgroups, Scheme 16 is more efficient than Scheme 15.

The experimental results in Table 5 verify the above conclusion. Let $p = 0.004$ and $k_1 = 16$. When each subgroup contains 8 objects in the 2nd layer, the expectation of classification times for each object reaches its minimum value, where $k_{21}$ is the object number of one subgroup in the 2nd layer, $k_{22}$ is the object number of the other subgroup, and $E_2$ is the corresponding expectation of classification times for each object.

(b) Suppose $k_1$ is an even number, and every group which contains $k_1$ objects in the 1st layer will be subdivided into three subgroups in the 2nd layer.

Scheme 17. In the 2nd layer, if the first subgroup has $i$ ($1 \le i < k_1/2$) objects, the expectation of classification times for each of its objects is $E_2(i)$; if the second subgroup has $j$ ($1 \le j < k_1/2$) objects, the expectation of classification times for each of its objects is $E_2(j)$; then the third subgroup has $(k_1 - i - j)$ objects, and the expectation of classification times for each of its objects is $E_2(k_1 - i - j)$. So the average expectation of classification times for each object in the 2nd layer is shown as follows:

$$\frac{i \times E_2(i) + j \times E_2(j) + (k_1 - i - j) \times E_2(k_1 - i - j)}{k_1}. \quad (18)$$

Similarly, it is easy to prove that Scheme 16 is also more efficient than Scheme 17. In other words, we only need to prove the following inequality:

$$\frac{i \times E_2(i) + j \times E_2(j) + (k_1 - i - j) \times E_2(k_1 - i - j)}{k_1} - \frac{k_1 \times E_2(k_1/2)}{k_1} > 0 \quad (e^{-e^{-1}} < q < 1,\ k_1 > 1). \quad (19)$$

Proof. Let $g(x) = xq^x$ ($e^{-e^{-1}} < q < 1$); according to Lemma 7, we then have

$$\begin{aligned} &\frac{g(t) + g(c - t)}{2} < g\left(\frac{c}{2}\right) \Longrightarrow \frac{t \times q^t + (k_1 - t) \times q^{(k_1 - t)}}{2} < \frac{k_1}{2} \times q^{k_1/2} \Longrightarrow i \times q^i + j \times q^j + (k_1 - i - j) \times q^{(k_1 - i - j)} < k_1 \times q^{k_1/2} \\ &\Longrightarrow i \times \left(\frac{1}{k_1} + (1 - q^{k_1}) \times \left(1 + \frac{1}{i} - q^i\right)\right) + j \times \left(\frac{1}{k_1} + (1 - q^{k_1}) \times \left(1 + \frac{1}{j} - q^j\right)\right) + (k_1 - i - j) \times \left(\frac{1}{k_1} + (1 - q^{k_1}) \times \left(1 + \frac{1}{k_1 - i - j} - q^{(k_1 - i - j)}\right)\right) \\ &\quad > k_1 \times \left(\frac{1}{k_1} + (1 - q^{k_1}) \times \left(1 + \frac{2}{k_1} - q^{k_1/2}\right)\right) \\ &\Longrightarrow \frac{i \times E_2(i) + j \times E_2(j) + (k_1 - i - j) \times E_2(k_1 - i - j)}{k_1} - \frac{k_1 \times E_2(k_1/2)}{k_1} > 0. \end{aligned} \quad (20)$$

The proof is completed

Therefore, if every group which contains $k_1$ ($k_1$ is an even number and $k_1 > 1$) objects in the 1st layer needs to be subdivided into three subgroups, Scheme 16 is more efficient than Scheme 17.

The experimental results in Table 6 verify the above conclusion. Let $p = 0.004$ and $k_1 = 16$. When each subgroup contains 8 objects in the 2nd layer, the average expectation of classification times for each object reaches its minimum value. In Table 6, the first row stands for the object number of the first subgroup in the 2nd layer, the first column stands for the object number of the second subgroup, and the data stand for the corresponding average expectation of classification times. For example, (1, 1, 7.7143) expresses that the object numbers of the three subgroups are 1, 1, and 14, respectively, and the average classification times for each object is $E_2 = 0.077143$ in the 2nd layer.

Table 6: The changes of average expectation with different object numbers of three subgroups.

Expectation ($\times 10^{-2}$)

Objects   1        2        3        4        5
1         7.7143   7.6786   7.6488   7.6250   7.6072
2         7.6786   7.6458   7.6189   7.5980   7.5831
3         7.6488   7.6189   7.5950   7.5770   7.5651
4         7.6250   7.5980   7.5770   7.5620   7.5530
5         7.6072   7.5831   7.5651   7.5530   7.5470
6         7.5953   7.5742   7.5591   7.5500   7.5470
7         7.5894   7.5712   7.5591   7.5530   7.5530
8         7.5894   7.5742   7.5651   7.5620   7.5651
9         7.5953   7.5831   7.5770   7.5770   7.5980

(c) When an abnormal group contains $k_1$ (even) objects and needs to be further granulated into the 2nd layer, Scheme 16 is still the most efficient.

Consider two granulation schemes: in Scheme 18 the abnormal groups in the 1st layer are randomly subdivided into $s$ ($s < k_1$) subgroups, and in Scheme 16 the abnormal groups in the 1st layer are evenly subdivided into two subgroups.

Scheme 18. Suppose an abnormal group will be subdivided into $s$ ($s < k_1$) subgroups: the first subgroup has $x_1$ ($1 \le x_1 < k_1/2$) objects, and the average expectation of classification times for each of its objects is $E_2(x_1)$; the second subgroup has $x_2$ ($1 \le x_2 < k_1/2$) objects, and the average expectation of classification times for each of its objects is $E_2(x_2)$; $\ldots$; the $i$th subgroup has $x_i$ ($1 \le x_i < k_1/2$) objects, and the average expectation of classification times for each of its objects is $E_2(x_i)$; $\ldots$; the $s$th subgroup has $x_s$ ($1 \le x_s < k_1/2$) objects, and the average expectation of classification times for each of its objects is $E_2(x_s)$. Hence the average expectation of classification times for each object in the 2nd layer is shown as follows:

$$\frac{1}{k_1} \times \sum_{j=1}^{s} x_j \times E_2(x_j). \quad (21)$$

Similarly, in order to prove that Scheme 16 is more efficient than Scheme 18, we only need to prove the following inequality:

$$\frac{1}{k_1} \times \sum_{j=1}^{s} x_j \times E_2(x_j) - \frac{k_1 \times E_2(k_1/2)}{k_1} > 0 \quad (e^{-e^{-1}} < q < 1,\ k_1 > 1). \quad (22)$$

Proof. Let $g(x) = xq^x$ ($e^{-e^{-1}} < q < 1$); according to Lemma 7, we then have

$$\begin{aligned} &\frac{\sum_{i=1}^{s} g(x_i)}{2} < g\left(\frac{c}{2}\right) \Longrightarrow \frac{\sum_{i=1}^{s} x_i \times q^{x_i}}{2} < \frac{k_1}{2} \times q^{k_1/2} \\ &\Longrightarrow \sum_{j=1}^{s} x_j \times \left(\frac{1}{k_1} + (1 - q^{k_1}) \times \left(1 + \frac{1}{x_j} - q^{x_j}\right)\right) > k_1 \times \left(\frac{1}{k_1} + (1 - q^{k_1}) \times \left(1 + \frac{2}{k_1} - q^{k_1/2}\right)\right) \\ &\Longrightarrow \frac{1}{k_1} \times \sum_{j=1}^{s} x_j \times E_2(x_j) - \frac{k_1 \times E_2(k_1/2)}{k_1} > 0. \end{aligned} \quad (23)$$

The proof is completed

Therefore, when every abnormal group which contains $k_1$ ($k_1$ is an even number and $k_1 > 1$) objects in the 1st layer needs to be granulated into many subgroups, Scheme 16 is more efficient than the other schemes.

(d) In a similar way, when every abnormal group which contains $k_1$ ($k_1$ is an odd number and $k_1 > 1$) objects in the 1st layer is granulated into many subgroups, the best scheme is to subdivide each abnormal group into two subgroups as evenly as possible; namely, one subgroup contains $(k_1 - 1)/2$ objects and the other contains $(k_1 + 1)/2$ objects in the 2nd layer.

Case 2 (granulating abnormal groups from the 1st layer to the $i$th layer).

Theorem 19. In the $i$th layer, if the object number of an abnormal group is more than 4, then the total classification times can be reduced by continuing to subdivide the abnormal group into two subgroups which contain equal objects as far as possible. Namely, if each group contains $k_i$ objects in the $i$th layer, then each subgroup may contain $k_i/2$ or $(k_i - 1)/2$ or $(k_i + 1)/2$ objects in the $(i+1)$th layer.

Proof. In the multigranulation method, the object number of each subgroup in the next layer is determined by the object number of the group in the current layer. In other words, the object number of each subgroup in the $(i+1)$th layer is determined by the known object number of each group in the $i$th layer.

According to the recursive idea, the process of granulating an abnormal group from the $i$th layer into the $(i+1)$th layer is similar to that from the 1st layer into the 2nd layer. It is known that the most efficient way, when granulating an abnormal group from the 1st layer into the 2nd layer, is to subdivide it into two subgroups in the next layer as evenly as possible. Therefore, the most efficient way is also to subdivide each abnormal group in the $i$th layer into two subgroups in the $(i+1)$th layer as evenly as possible. The proof is completed.

Based on $k_1$, which is the optimum object number of each group in the 1st layer, the optimum granulation levels and the corresponding object number of each group can then be obtained by Theorem 19. That is to say, $k_{i+1} = k_i/2$ (or $k_{i+1} = (k_i - 1)/2$, or $k_{i+1} = (k_i + 1)/2$), where $k_i$ ($k_i > 4$) is the object number of each abnormal group in the $i$th ($1 \le i \le s - 1$) layer and $s$ is the optimum number of granulation levels. Namely, in this multilevel granulation method, the final structure of granulating an abnormal group from the 1st layer to the last layer is similar to a binary tree, and the original space can be granulated into a structure which contains many binary trees.

Table 7: The best testing strategy in different layers with different prevalence rates.

$p$        $E_i$            $(k_1, k_2, \ldots, k_i)$
0.01       0.157649743271   (11, 5, 2)
0.001      0.034610328332   (32, 16, 8, 4)
0.0001     0.010508158027   (101, 50, 25, 12, 6, 3)
0.00001    0.003301655870   (317, 158, 79, 39, 19, 9, 4)
0.000001   0.001041044160   (1001, 500, 250, 125, 62, 31, 15, 7, 3)

According to Theorem 19, the multigranulation strategy can be used to solve the blood analysis case. For different prevalence rates, such as $p_1 = 0.01$, $p_2 = 0.001$, $p_3 = 0.0001$, $p_4 = 0.00001$, and $p_5 = 0.000001$, the best searching strategy, namely, the object number of each group in the different layers, is shown in Table 7 ($k_i$ stands for the object number of each group in the $i$th layer and $E_i$ stands for the average expectation of classification times for each object).
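Most strategies in Table 7 can be reproduced by minimizing the single-level expectation to get $k_1$ and then halving (rounding down) until a group size of at most 4 is reached; a sketch under that reading of Lemma 6 and Theorem 19 (the brute-force search and its bound are ours):

```python
def best_k1(p, k_max=2000):
    """Group size minimizing the single-level expectation 1/k + 1 - q^k."""
    q = 1 - p
    return min(range(2, k_max), key=lambda k: 1 / k + 1 - q ** k)

def strategy(p):
    """Layer sizes (k_1, k_2, ...): halve each abnormal group until size <= 4."""
    ks = [best_k1(p)]
    while ks[-1] > 4:
        ks.append(ks[-1] // 2)
    return ks

print(strategy(0.001))   # [32, 16, 8, 4], as in Table 7
print(strategy(0.0001))  # [101, 50, 25, 12, 6, 3]
```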

Theorem 20. In the above multilevel granulation method, if $p$, the prevalence rate of a sickness (or the negative sample ratio in the domain), tends to 0, the average classification times for each object tend to $1/k_1$; in other words, the following equation always holds:

$$\lim_{p \to 0} E_i = \frac{1}{k_1}. \quad (24)$$

Proof. According to Definition 12, let $q = 1 - p$, so that $q \to 1$; we have

$$E_i = \frac{1}{k_1} + (1 - q^{k_1}) \times \left(\frac{1}{k_2} + (1 - q^{k_2}) \times \left(\frac{1}{k_3} + (1 - q^{k_3}) \times \left(\frac{1}{k_4} + (1 - q^{k_4}) \times \left(\cdots \times \left(\frac{1}{k_{i-1}} + (1 - q^{k_{i-1}}) \times \left(\frac{1}{k_i} + (1 - q^{k_i})\right)\right) \cdots\right)\right)\right)\right). \quad (25)$$

According to Lemma 6, $k_1 = \left[\sqrt{1/(p + p^2/2)}\,\right]$ or $k_1 = \left[\sqrt{1/(p + p^2/2)}\,\right] + 1$. And then let

$$T = \frac{1}{k_2} + (1 - q^{k_2}) \times \left(\frac{1}{k_3} + (1 - q^{k_3}) \times \left(\frac{1}{k_4} + (1 - q^{k_4}) \times \left(\cdots \times \left(\frac{1}{k_{i-1}} + (1 - q^{k_{i-1}}) \times \left(\frac{1}{k_i} + (1 - q^{k_i})\right)\right) \cdots\right)\right)\right), \quad (26)$$

Figure 3: The changing trend of $T$ and $E$ with $q$.

so $\lim_{q \to 1} T = 0$ and $\lim_{q \to 1} E_i = 1/k_1$. The proof is completed.

Let $E = (1/k_1)/E_i$. The changing trend of $T$ and $E$ with the variable $q$ is shown in Figure 3.

3.5. Binary Classification of Multigranulation Searching Algorithm. In this paper, an efficient binary classification of multigranulation searching algorithm (BCMSA) is proposed through discussing the best testing strategy of the blood analysis case. The algorithm is illustrated as follows.

Algorithm 21. Binary classification of multigranulation searching algorithm (BCMSA).

Input: A probability quotient space $Q = (U, 2^U, p)$.

Output: The average classification times expectation of each object, $E$.

Step 1. Obtain $k_1$ based on Lemma 6, and initialize $i = 1$, $j = 0$, and searching_numbers = 0.

Step 2. Randomly divide $U_{ij}$ into $s_i$ subgroups $U_{i1}, U_{i2}, \ldots, U_{is_i}$ ($s_i = N_{ij}/k_i$, where $N_{ij}$ stands for the number of objects in $U_{ij}$, and $U_{10} = U_1$).

Step 3. For $i$ to $\lfloor \log_2 N_{ij} \rfloor$ ($\lfloor \log_2 N_{ij} \rfloor$ rounds $\log_2 N_{ij}$ down to the nearest integer):
  For $j$ to $s_i$:
    If Test($U_{ij}$) > 0 and $N_{ij} > 4$, then searching_numbers + 1, $U_{i+1} = U_{ij}$, $i + 1$, and go to Step 2 (Test is a searching method).
    If Test($U_{ij}$) = 0, then searching_numbers + 1 and $i + 1$.
    If $N_{ij} \le 4$, go to Step 4.

Step 4. searching_numbers + $\sum U_{ij}$; $E$ = (searching_numbers + $U_N$)/$N$.

Step 5. Return $E$.

The algorithm flowchart is shown in Figure 4.
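Steps 2 and 3 can be read as a recursion over abnormal groups. A runnable sketch of that recursion (our simplification of Algorithm 21: the pooled test is simulated by summing, 0 marks a sick sample as in Section 4, and groups of at most 4 objects are retested one by one):

```python
import random

def bcmsa_tests(samples, k1, threshold=4):
    """Total classification times over the domain: partition into groups of k1,
    then recursively halve every abnormal group down to the retest threshold."""
    def tests(group):
        n = 1                            # one pooled test for this group
        if sum(group) < len(group):      # contains a 0 (sick sample): abnormal
            if len(group) <= threshold:
                n += len(group)          # retest each object individually
            else:
                mid = len(group) // 2
                n += tests(group[:mid]) + tests(group[mid:])
        return n
    return sum(tests(samples[i:i + k1]) for i in range(0, len(samples), k1))

random.seed(0)
p, N, k1 = 0.001, 10000, 32
samples = [0 if random.random() < p else 1 for _ in range(N)]
print(bcmsa_tests(samples, k1))  # a few hundred tests instead of N = 10000
```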


Figure 4: Flowchart of BCMSA.

Complexity Analysis of Algorithm 21. In this algorithm, the best case is that the prevalence rate $p$ tends to 0, and the classification times are $N \times E_i \approx N/k_1 \approx N \times \sqrt{p + p^2/2}$, which tends to 1, so the time complexity of computing is $O(1)$. The worst case is that $p$ tends to 0.3 and the classification times tend to $N$, so the time complexity of computing is $O(N)$.

4. Comparative Analysis on Experimental Results

In order to verify the efficiency of the proposed BCMSA, suppose there are two large domains, $N_1 = 1 \times 10^4$ and $N_2 = 100 \times 10^4$, and five different prevalence rates, $p_1 = 0.01$, $p_2 = 0.001$, $p_3 = 0.0001$, $p_4 = 0.00001$, and $p_5 = 0.000001$. In the experiment of the blood analysis case, the number "0" stands for a sick sample (negative sample) and "1" stands for a healthy sample (positive sample); then $N$ numbers are randomly generated, in which the probability of generating "0" is $p$ and the probability of generating "1" is $1 - p$, standing for all the domain objects. The binary classifier sums all the numbers in a group (subgroup): if the sum is less than the number of objects in the group, this group is tested to be abnormal, and if the sum equals the number of objects, this group is tested to be normal.

The experimental environment is a 4 GB RAM, 2.5 GHz CPU, Windows 8 system; the programming language is Python; and the experimental results are shown in Table 8.

In Table 8, item "$p$" stands for the prevalence rate, item "Levels" stands for the granulation levels of the different methods, and item "$E(X)$" stands for the average expectation of classification times for each object. Item "$k_1$" stands for the object number of each group in the 1st layer. Item "ℓ" stands for the degree to which $E(X)$ is close to $1/k_1$. $N_1 = 1 \times 10^4$ and $N_2 = 1 \times 10^6$, respectively, stand for the object numbers of the two original domains. Items "Method 9" and "Method 10", respectively, stand for the efficiency improvement of Method 11 compared with Method 9 and with Method 10.

From Table 8, diagnosing all objects expends 10000 classification times with Method 9 (the traditional method), 201 classification times with Method 10 (the single-level grouping method), and only 113 classification times with Method 11 (the multilevel grouping method) when $N_1 = 1 \times 10^4$ and $p = 0.0001$. Obviously, the proposed algorithm is more efficient than Method 9 and Method 10, and the classification times can be reduced by 98.89% and 47.33%, respectively. At the same time, as the probability $p$ is gradually reduced, BCMSA gradually becomes more efficient than Method 10, and ℓ tends to 100%; that is to say, the average classification times for each object tend to $1/k_1$ in the BCMSA. In addition,


Table 8: Comparative result of efficiency among 3 kinds of methods.

$p$        Levels         $E(X)$          $k_1$   ℓ        $N_1$   $N_2$    Method 9   Method 10
0.01       Single-level   0.19557083665   11      46.48%   1944    195817   81.44%     —
0.01       2 levels       0.15764974327   11      57.67%   1627    164235   83.58%     16.13%
0.001      Single-level   0.06275892424   32      49.79%   633     62674    93.72%     —
0.001      4 levels       0.03461032833   32      90.29%   413     41184    96.82%     34.62%
0.0001     Single-level   0.01995065634   101     49.62%   201     19799    98.00%     —
0.0001     6 levels       0.01050815802   101     94.22%   113     11212    98.89%     47.33%
0.00001    Single-level   0.00631957079   318     49.76%   63.3    6325     99.37%     —
0.00001    7 levels       0.00330165587   318     95.26%   33.3    3324     99.67%     47.75%
0.000001   Single-level   0.00199950067   1001    49.96%   15      2001     99.80%     —
0.000001   9 levels       0.00104104416   1001    95.96%   15      1022     99.89%     47.94%

Figure 5: Comparative analysis between 2 kinds of methods (single-level granulation method and multilevel granulation method).

the BCMSA can save 0%–50% of the classification times compared with Method 10. The efficiency of Method 10 (the single-level granulation method) and Method 11 (the multilevel granulation method) is shown in Figure 5, where the $x$-axis stands for the prevalence rate (or the negative sample rate) and the $y$-axis stands for the average expectation of classification times for each object.

In this paper, BCMSA is proposed, and it can greatly improve searching efficiency when dealing with complex searching problems. If there is a binary classifier which is valid not only for a single object but also for a group of many objects, the efficiency of searching all objects will be enhanced by BCMSA, as in the blood analysis case. At the same time, it may play an important role in promoting the development of granular computing. Of course, this algorithm also has some limitations. For example, if the prevalence rate of a sickness (or the occurrence rate of event A) satisfies p > 0.3, it has no advantage compared with the traditional method; in other words, the original problem need not be subdivided into many subproblems when p > 0.3. And when the prevalence rate of a sickness (or the negative sample rate in the domain) is unknown, this algorithm needs to be further improved so that it can adapt to the new environment.

5. Conclusions

With the development of intelligent computation, multigranulation computing has gradually become an important tool for processing complex problems. Especially in the process of knowledge cognition, granulating a huge problem into lots of small subproblems means simplifying the original complex problem and dealing with these subproblems in different granularity spaces [64]. This hierarchical computing model is very effective for getting a complete or approximate solution of the original problem due to its idea of divide and conquer. Recently, many scholars have paid attention to efficient searching algorithms based on granular computing theory. For example, a kind of algorithm for dealing with complex networks on the basis of the quotient space model was proposed by L. Zhang and B. Zhang [65]. In this paper, combining the hierarchical multigranulation computing model and the principle of probability statistics, a new efficient binary classification multigranulation searching algorithm is established on the basis of the mathematical expectation of probability statistics, and this searching algorithm is constructed by a recursive method in multigranulation spaces. Many experimental results have shown that the proposed method is effective and can save lots of classification times. These results may promote the development of intelligent computation and speed up the application of multigranulation computing. However, this method also has some shortcomings. On the one hand, it has a strict limitation on the probability value of p, namely, p < 0.3; if p > 0.3, the proposed searching algorithm is probably not the most effective method, and improved methods need to be found. On the other hand, it needs a binary classifier which is valid not only for a single object but also for a group of many objects. In the end, with the decrease of the probability value of p (even as it infinitely approaches zero), the mathematical expectation of searching times for every object will gradually approach 1/k₁. In our future research, we will focus on the issue of how to granulate the huge granule space without any probability value for each object, and try our best to establish a kind of effective searching algorithm for the case where the probability of negative samples in the domain is unknown. We hope this research can promote the development of artificial intelligence.


Competing Interests

The authors declare that they have no conflict of interests related to this work.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (no. 61472056) and the Natural Science Foundation of Chongqing of China (no. CSTC2013jjb40003).

References

[1] A. Gacek, "Signal processing and time series description: a perspective of computational intelligence and granular computing," Applied Soft Computing Journal, vol. 27, pp. 590–601, 2015.
[2] O. Hryniewicz and K. Kaczmarek, "Bayesian analysis of time series using granular computing approach," Applied Soft Computing, vol. 47, pp. 644–652, 2016.
[3] C. Liu, "Covering multi-granulation rough sets based on maximal descriptors," Information Technology Journal, vol. 13, no. 7, pp. 1396–1400, 2014.
[4] Z. Y. Li, "Covering-based multi-granulation decision-theoretic rough sets model," Journal of Lanzhou University, no. 2, pp. 245–250, 2014.
[5] Y. Y. Yao and Y. She, "Rough set models in multigranulation spaces," Information Sciences, vol. 327, pp. 40–56, 2016.
[6] J. Xu, Y. Zhang, D. Zhou et al., "Uncertain multi-granulation time series modeling based on granular computing and the clustering practice," Journal of Nanjing University, vol. 50, no. 1, pp. 86–94, 2014.
[7] Y. T. Guo, "Variable precision β multi-granulation rough sets based on limited tolerance relation," Journal of Minnan Normal University, no. 1, pp. 1–11, 2015.
[8] X. U. Yi, J. H. Yang, and J. I. Xia, "Neighborhood multi-granulation rough set model based on double granulate criterion," Control and Decision, vol. 30, no. 8, pp. 1469–1478, 2015.
[9] L. A. Zadeh, "Towards a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic," Fuzzy Sets and Systems, vol. 19, pp. 111–127, 1997.
[10] J. R. Hobbs, "Granularity," in Proceedings of the 9th International Joint Conference on Artificial Intelligence, Los Angeles, Calif, USA, 1985.
[11] L. Zhang and B. Zhang, "Theory of fuzzy quotient space (methods of fuzzy granular computing)," Journal of Software, vol. 14, no. 4, pp. 770–776, 2003.
[12] J. Li, C. Mei, W. Xu, and Y. Qian, "Concept learning via granular computing: a cognitive viewpoint," Information Sciences, vol. 298, no. 1, pp. 447–467, 2015.
[13] X. Hu, W. Pedrycz, and X. Wang, "Comparative analysis of logic operators: a perspective of statistical testing and granular computing," International Journal of Approximate Reasoning, vol. 66, pp. 73–90, 2015.
[14] M. G. C. A. Cimino, B. Lazzerini, F. Marcelloni, and W. Pedrycz, "Genetic interval neural networks for granular data regression," Information Sciences, vol. 257, pp. 313–330, 2014.
[15] P. Honko, "Upgrading a granular computing based data mining framework to a relational case," International Journal of Intelligent Systems, vol. 29, no. 5, pp. 407–438, 2014.
[16] M.-Y. Chen and B.-T. Chen, "A hybrid fuzzy time series model based on granular computing for stock price forecasting," Information Sciences, vol. 294, pp. 227–241, 2015.
[17] R. Al-Hmouz, W. Pedrycz, and A. Balamash, "Description and prediction of time series: a general framework of Granular Computing," Expert Systems with Applications, vol. 42, no. 10, pp. 4830–4839, 2015.
[18] M. Hilbert, "Big data for development: a review of promises and challenges," Social Science Electronic Publishing, vol. 34, no. 1, pp. 135–174, 2016.
[19] T. J. Sejnowski, S. P. Churchland, and J. A. Movshon, "Putting big data to good use in neuroscience," Nature Neuroscience, vol. 17, no. 11, pp. 1440–1441, 2014.
[20] G. George, M. R. Haas, and A. Pentland, "Big data and management," Academy of Management Journal, vol. 30, no. 2, pp. 39–52, 2014.
[21] M. Chen, S. Mao, and Y. Liu, "Big data: a survey," Mobile Networks and Applications, vol. 19, no. 2, pp. 171–209, 2014.
[22] X. Wu, X. Zhu, G. Q. Wu, and W. Ding, "Data mining with big data," IEEE Transactions on Knowledge & Data Engineering, vol. 26, no. 1, pp. 97–107, 2014.
[23] Y. Shuo and Y. Lin, "Decomposition of decision systems based on granular computing," in Proceedings of the IEEE International Conference on Granular Computing (GrC '11), pp. 590–595, Garden Villa, Kaohsiung, Taiwan, 2011.
[24] H. Hu and Z. Zhong, "Perception learning as granular computing," Natural Computation, vol. 3, pp. 272–276, 2008.
[25] Z.-H. Chen, Y. Zhang, and G. Xie, "Mining algorithm for concise decision rules based on granular computing," Control and Decision, vol. 30, no. 1, pp. 143–148, 2015.
[26] K. Kambatla, G. Kollias, V. Kumar, and A. Grama, "Trends in big data analytics," Journal of Parallel & Distributed Computing, vol. 74, no. 7, pp. 2561–2573, 2014.
[27] A. Katal, M. Wazid, and R. H. Goudar, "Big data: issues, challenges, tools and good practices," in Proceedings of the 6th International Conference on Contemporary Computing (IC3 '13), pp. 404–409, IEEE, New Delhi, India, August 2013.
[28] V. Cevher, S. Becker, and M. Schmidt, "Convex optimization for big data: scalable, randomized, and parallel algorithms for big data analytics," IEEE Signal Processing Magazine, vol. 31, no. 5, pp. 32–43, 2014.
[29] J. Fan, F. Han, and H. Liu, "Challenges of big data analysis," National Science Review, vol. 1, no. 2, pp. 293–314, 2014.
[30] Q. H. Zhang, K. Xu, and G. Y. Wang, "Fuzzy equivalence relation and its multigranulation spaces," Information Sciences, vol. 346-347, pp. 44–57, 2016.
[31] Z. Liu and Y. Hu, "Multi-granularity pattern ant colony optimization algorithm and its application in path planning," Journal of Central South University (Science and Technology), vol. 9, pp. 3713–3722, 2013.
[32] Q. H. Zhang, G. Y. Wang, and X. Q. Liu, "Hierarchical structure analysis of fuzzy quotient space," Pattern Recognition and Artificial Intelligence, vol. 21, no. 5, pp. 627–634, 2008.
[33] Z. C. Shi, Y. X. Xia, and J. Z. Zhou, "Discrete algorithm based on granular computing and its application," Computer Science, vol. 40, pp. 133–135, 2013.
[34] Y. P. Zhang, B. Luo, Y. Y. Yao, D. Q. Miao, L. Zhang, and B. Zhang, Quotient Space and Granular Computing: The Theory and Method of Problem Solving on Structured Problems, Science Press, Beijing, China, 2010.
[35] G. Y. Wang, Q. H. Zhang, and J. Hu, "A survey on the granular computing," Transactions on Intelligent Systems, vol. 6, no. 2, pp. 8–26, 2007.
[36] J. Jonnagaddala, R. T. Jue, and H. J. Dai, "Binary classification of Twitter posts for adverse drug reactions," in Proceedings of the Social Media Mining Shared Task Workshop at the Pacific Symposium on Biocomputing, pp. 4–8, Big Island, Hawaii, USA, 2016.
[37] M. Haungs, P. Sallee, and M. Farrens, "Branch transition rate: a new metric for improved branch classification analysis," in Proceedings of the International Symposium on High-Performance Computer Architecture (HPCA '00), pp. 241–250, 2000.
[38] R. W. Proctor and Y. S. Cho, "Polarity correspondence: a general principle for performance of speeded binary classification tasks," Psychological Bulletin, vol. 132, no. 3, pp. 416–442, 2006.
[39] T. H. Chow, P. Berkhin, E. Eneva et al., "Evaluating performance of binary classification systems," U.S. Patent US 8554622 B2, 2013.
[40] D. G. Li, D. Q. Miao, D. X. Zhang, and H. Y. Zhang, "An overview of granular computing," Computer Science, vol. 9, pp. 1–12, 2005.
[41] X. Gang and L. Jing, "A review of the present studying state and prospect of granular computing," Journal of Software, vol. 3, pp. 5–10, 2011.
[42] L. X. Zhong, "The predication about optimal blood analyze method," Academic Forum of Nandu, vol. 6, pp. 70–71, 1996.
[43] X. Mingmin and S. Junli, "The mathematical proof of method of group blood test and a new formula in quest of optimum number in group," Journal of Sichuan Institute of Building Materials, vol. 1, pp. 97–104, 1986.
[44] B. Zhang and L. Zhang, "Discussion on future development of granular computing," Journal of Chongqing University of Posts and Telecommunications: Natural Science Edition, vol. 22, no. 5, pp. 538–540, 2010.
[45] A. Skowron, J. Stepaniuk, and R. Swiniarski, "Modeling rough granular computing based on approximation spaces," Information Sciences, vol. 184, no. 1, pp. 20–43, 2012.
[46] J. T. Yao, A. V. Vasilakos, and W. Pedrycz, "Granular computing: perspectives and challenges," IEEE Transactions on Cybernetics, vol. 43, no. 6, pp. 1977–1989, 2013.
[47] Y. Y. Yao, N. Zhang, D. Q. Miao, and F. F. Xu, "Set-theoretic approaches to granular computing," Fundamenta Informaticae, vol. 115, no. 2-3, pp. 247–264, 2012.
[48] H. Li and X. P. Ma, "Research on four-element model of granular computing," Computer Engineering and Applications, vol. 49, no. 4, pp. 9–13, 2013.
[49] J. Hu and C. Guan, "Granular computing model based on quantum computing theory," in Proceedings of the 10th International Conference on Computational Intelligence and Security, pp. 156–160, November 2014.
[50] Y. Shuo and Y. Lin, "Decomposition of decision systems based on granular computing," in Proceedings of the IEEE International Conference on Granular Computing (GrC '11), pp. 590–595, IEEE, Kaohsiung, Taiwan, November 2011.
[51] F. Li, J. Xie, and K. Xie, "Granular computing theory in the application of fault diagnosis," in Proceedings of the Chinese Control and Decision Conference (CCDC '08), pp. 595–597, July 2008.
[52] Q.-H. Zhang, Y.-K. Xing, and Y.-L. Zhou, "The incremental knowledge acquisition algorithm based on granular computing," Journal of Electronics and Information Technology, vol. 33, no. 2, pp. 435–441, 2011.
[53] Y. Zeng, Y. Y. Yao, and N. Zhong, "The knowledge search base on the granular structure," Computer Science, vol. 35, no. 3, pp. 194–196, 2008.
[54] G.-Y. Wang, Q.-H. Zhang, X.-A. Ma, and Q.-S. Yang, "Granular computing models for knowledge uncertainty," Journal of Software, vol. 22, no. 4, pp. 676–694, 2011.
[55] J. Li, Y. Ren, C. Mei, Y. Qian, and X. Yang, "A comparative study of multigranulation rough sets and concept lattices via rule acquisition," Knowledge-Based Systems, vol. 91, pp. 152–164, 2016.
[56] H.-L. Yang and Z.-L. Guo, "Multigranulation decision-theoretic rough sets in incomplete information systems," International Journal of Machine Learning & Cybernetics, vol. 6, no. 6, pp. 1005–1018, 2015.
[57] M. A. Waller and S. E. Fawcett, "Data science, predictive analytics, and big data: a revolution that will transform supply chain design and management," Journal of Business Logistics, vol. 34, no. 2, pp. 77–84, 2013.
[58] R. Kitchin, "The real-time city? Big data and smart urbanism," GeoJournal, vol. 79, no. 1, pp. 1–14, 2014.
[59] X. Dong and D. Srivastava, Big Data Integration, Morgan & Claypool, 2015.
[60] L. Zhang and B. Zhang, Theory and Applications of Problem Solving: Quotient Space Based Granular Computing (The Second Version), Tsinghua University Press, Beijing, China, 2007.
[61] L. Zhang and B. Zhang, "The quotient space theory of problem solving," in Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing, G. Wang, Q. Liu, Y. Yao, and A. Skowron, Eds., vol. 2639 of Lecture Notes in Computer Science, pp. 11–15, Springer, Berlin, Germany, 2003.
[62] J. Sheng, S. Q. Xie, and C. Y. Pan, Probability Theory and Mathematical Statistics, Higher Education Press, Beijing, China, 4th edition, 2008.
[63] L. Z. Zhang, X. Zhao, and Y. Ma, "The simple math demonstration and precise calculation method of the blood group test," Mathematics in Practice and Theory, vol. 22, pp. 143–146, 2010.
[64] J. Chen, S. Zhao, and Y. Zhang, "Hierarchical covering algorithm," Tsinghua Science & Technology, vol. 19, no. 1, pp. 76–81, 2014.
[65] L. Zhang and B. Zhang, "Dynamic quotient space model and its basic properties," Pattern Recognition and Artificial Intelligence, vol. 25, no. 2, pp. 181–185, 2012.


[Figure 1: Property of a convex function. For y = f(x) with i < (c − 1)/2 < c/2 < (c + 1)/2 < c − i, the plot illustrates (f(i) + f(c − i))/2 = M₂ < (f((c − 1)/2) + f((c + 1)/2))/2 = M₁ < f(c/2) = M₀.]

Proof. We can obtain the first and second derivatives of g(x) as follows:

g′(x) = q^x (1 + x ln q),
g″(x) = q^x ln q (2 + x ln q),

so that

g″(x) < 0 for 1 ≤ x < x₀;  g″(x) = 0 for x = x₀;  g″(x) > 0 for x > x₀,  (2)

where x₀ = −2/ln q and e^(−e^(−1)) < q < 1.

So we can draw the conclusion that g(x) is a convex function (the property of a convex function is shown in Figure 1) when 1 ≤ x < 1 − 1/ln q < x₀ = −2/ln q. According to the definition of a convex function, the inequality (g(i) + g(c − i))/2 ≤ (g((c − 1)/2) + g((c + 1)/2))/2 < g(c/2) holds permanently. So Lemma 7 is proved completely.
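The sign pattern of g″ used above can be checked numerically, assuming g(x) = x·q^x (the form consistent with the stated derivative g′(x) = q^x(1 + x ln q)); the choice q = 0.9 below is purely illustrative:

```python
import math

# Check the sign pattern of g''(x) = q^x * ln(q) * (2 + x*ln(q)),
# assuming g(x) = x * q^x as implied by the derivative in the proof.
q = 0.9
x0 = -2.0 / math.log(q)  # inflection point, about 18.98 for q = 0.9

def g_second(x):
    return (q ** x) * math.log(q) * (2.0 + x * math.log(q))

assert g_second(x0 / 2) < 0       # concave branch on 1 <= x < x0
assert abs(g_second(x0)) < 1e-12  # g'' vanishes at x0
assert g_second(2 * x0) > 0       # sign flips for x > x0
print("x0 =", round(x0, 2))
```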

3. Binary Classification of Multigranulation Searching Model Based on Probability and Statistics

Granulation is seen as a way of constructing simple theories out of more complex ones [1]. At the same time, the transformation between two different granularity layers is mainly based on the falsity- and truth-preserving principles in quotient space theory, and it can solve many classical problems, such as the scale ball game and branch decisions. All the above instances clearly embody the idea of solving complex problems with multigranulation methods.

In the second section, the relevant concepts of multigranulation computing theory and probability theory are reviewed. Of course, a multigranulation binary classification searching model not only solves practical complex problems with less cost but also can be easily constructed, and this model may also play a certain role in inspiring applications of multigranulation computing theory.

Generally, supposing a nonempty finite set U = {x₁, x₂, …, xₙ}, where xᵢ (i = 1, 2, 3, …, n) is a binary classification object, the triple (U, 2^U, p) is called a probability quotient space with probability p, where p is the negative sample rate in U and 2^U is the structure of universe U. And let (U₁, U₂, …, Uₜ) (⋃ᵢ₌₁ᵗ Uᵢ = U; Uₘ ∩ Uₙ = ∅; m, n ∈ {1, 2, …, t}; m ≠ n) be called a random partition space on the probability quotient space. An example of a binary classification multigranulation searching model follows.

Example 8. On the assumption that many people need to do blood analysis in a physical examination for diagnosing a disease (there are two classes: normal stands for health and abnormal stands for sickness), the domain U = {x₁, x₂, …, xₙ} stands for all the people. Let N denote the number of all people and let p stand for the prevalence rate. So the quotient space of blood analysis of the physical examination is (U, 2^U, p). Besides, we also know a binary classifier (or a reagent) that diagnoses the disease by testing a blood sample. How can we complete all the blood analysis with the minimal classification times? Namely, how can we determine the class of all objects in the domain? There are usually three ways, as follows.

Method 9 (traditional method). In order to accurately diagnose all objects, every blood sample will be tested one by one, so this method needs to search N times. This method is just like the classification process of machine learning.

Method 10 (single-level granulation method). This method is to mix k blood samples into a group, where k may be 1, 2, 3, …, n; namely, the original quotient space will be randomly partitioned into (U₁, U₂, …, Uₜ) (⋃ᵢ₌₁ᵗ Uᵢ = U; Uₘ ∩ Uₙ = ∅; m, n ∈ {1, 2, …, t}; m ≠ n). And then each mixed blood group will be tested once. If the testing result of a group is abnormal, according to Theorem 2 we know that this abnormal group has abnormal object(s), and in order to make a diagnosis, all of the objects in this group should be tested once again one by one. Similarly, if the testing result of a group is normal, according to Theorem 3 we know that all of the objects in this group are normal; therefore, all k objects in this group need only one time to make a diagnosis. The binary classifier can also classify the new blood sample that consists of k blood samples in this process.

If every group is mixed from large-scale blood samples (namely, k is a large number), then when some groups are tested to be abnormal, lots of objects must be tested once again one by one. In order to reduce the classification times, this paper proposes a multilevels granulation model.

Method 11 (multilevels granulation method). Firstly, each mixed blood group, which contains k₁ samples (objects), will be tested once, where k₁ may be 1, 2, 3, …, n; namely, the original quotient space will be randomly partitioned into (U₁, U₂, …, Uₜ) (⋃ᵢ₌₁ᵗ Uᵢ = U; Uₘ ∩ Uₙ = ∅; m, n ∈ {1, 2, …, t}; m ≠ n). Next, if some groups are tested to be normal, all objects in those groups are normal (healthy); therefore, all k₁ objects in such a group need only one time to make a diagnosis. Else, if some groups are tested to be abnormal, those groups will be subdivided into many smaller subsets (subgroups) which contain k₂ (k₂ < k₁) objects; namely, the quotient space of an abnormal group Uᵢ will be randomly partitioned into (Uᵢ₁, Uᵢ₂, …, Uᵢₗ) (⋃ⱼ₌₁ˡ Uᵢⱼ = Uᵢ; Uᵢₘ ∩ Uᵢₙ = ∅; m, n ∈ {1, 2, …, l}; m ≠ n; l < k₁). Finally, each subgroup will be tested once again. Similarly, if a subgroup is tested to be normal, it is confirmed that all objects in the corresponding subgroup are healthy; if a subgroup is tested to be abnormal, it will be subdivided into smaller subgroups which contain k₃ (k₃ < k₂) objects once again. Therefore, the testing results of all objects can be ensured by repeating the above process in a group until the number of objects is equal to 1 or the testing result of a subgroup is normal. Then the searching efficiency of the above three methods is analyzed as follows.
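The recursion described above can be sketched in a few lines of Python; this is a minimal simulation (assumptions: the group classifier is simulated with `any()` over the samples, and the per-level group sizes k₁ > k₂ > ⋯ are supplied as a list):

```python
import random

def count_tests(samples, sizes):
    """Return the number of classifier calls needed to label every sample.

    samples: list of booleans, True = abnormal (negative sample).
    sizes:   remaining per-level subgroup sizes, e.g. [16, 8]; when empty,
             an abnormal group is retested object by object.
    """
    if not samples:
        return 0
    tests = 1                    # test the whole group once
    if not any(samples):         # normal group: all objects diagnosed at once
        return tests
    if len(samples) == 1:        # a single object is settled by this one test
        return tests
    if not sizes:                # no further granulation: retest one by one
        return tests + len(samples)
    k = sizes[0]
    for i in range(0, len(samples), k):
        tests += count_tests(samples[i:i + k], sizes[1:])
    return tests

random.seed(0)
p, N = 0.001, 10_000
population = [random.random() < p for _ in range(N)]
# Method 10 (single-level, k1 = 32) versus Method 11 (two levels, k1 = 32, k2 = 16).
single = sum(count_tests(population[i:i + 32], []) for i in range(0, N, 32))
double = sum(count_tests(population[i:i + 32], [16]) for i in range(0, N, 32))
print(single, double)  # both far below the N = 10000 tests of Method 9
```

With p = 0.001 the expected totals are about 628 and 338 (cf. (5) and (7)), although any single random draw fluctuates around those values.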

In Method 9, every object has to be tested once for diagnosing a disease, so it must take up N times in total.

In Method 10, the original problem space is subdivided into many disjoint subspaces (subsets). If a subset is tested to be normal, all of its objects need to be tested only that once. Therefore, the classification times can be reduced if the probability p is small enough [9].

In Method 11, the key is to find the optimal multigranulation space for searching all objects, so a multilevels granulation model needs to be established. There are two questions: one is the grouping strategy, namely, how many objects are contained in a group; the other is the optimal granulation levels, namely, how many levels should be granulated from the original problem space.

In this paper, we mainly solve the two questions in Method 11. According to the truth- and falsity-preserving principle in quotient space theory, all normal parts of blood samples can be ignored. Hence, the original problem is simplified to a smaller subspace. This idea not only reduces the complexity of the problem but also improves the efficiency of searching for abnormal objects.

Algorithm Strategy. Example 8 can be regarded as a tree structure in which each node (which stands for a group) is an x-tuple. Obviously, the searching problem of the tree has been transformed into a hierarchical reasoning process on a monotonous relation sequence. The original space has been transformed into a hierarchical structure where all subproblems will be solved in different granularity levels.

Table 1: The probability distribution of Y₁.

| Y₁ | 1/k₁ | 1/k₁ + 1 |
|---|---|---|
| P(Y₁ = y₁) | q^k₁ | 1 − q^k₁ |

Firstly, the general rule can be concluded by analyzing the simplest hierarchy and grouping case, which is the single-level granulation. Secondly, we can calculate the mathematical expectation of the classification times of blood analysis. Finally, an optimal hierarchical granulation model will be established by comparing the expectations of classification times.

Analysis. Supposing that there is an object set Y = {y₁, y₂, …, yₙ} and the prevalence rate is p, the probability that an object appears normal in blood analysis is q = 1 − p. The probability that a group is tested to be normal is q^k₁, and to be abnormal is 1 − q^k₁, where k₁ is the number of objects in a group.

3.1. The Single-Level Granulation. Firstly, the domain (which contains N objects) is randomly subdivided into many subgroups, where each subset contains k₁ objects. In other words, a new quotient space [Y₁] is obtained based on an equivalence relation R. Then, supposing that the average classification times of each object is a random variable Y₁, the probability distribution of Y₁ is shown in Table 1.

Thus, the mathematical expectation of Y₁ can be obtained as follows:

E₁(Y₁) = (1/k₁) × q^k₁ + (1 + 1/k₁) × (1 − q^k₁) = 1/k₁ + (1 − q^k₁).  (3)

Then the total mathematical expectation for the domain can be obtained as follows:

N × E₁(Y₁) = N × [(1/k₁) × q^k₁ + (1 + 1/k₁) × (1 − q^k₁)].  (4)

When the probability p keeps unchanged and k₁ satisfies the inequality E₁(Y₁) < 1, this single-level granulation method can reduce the classification times. For example, if p = 0.5 and k₁ > 1, then according to Lemma 5, E₁(Y₁) > 1 no matter what the value of k₁ is, so this single-level granulation method is worse than the traditional method (namely, testing every object in turn). Conversely, if p = 0.001 and k₁ = 32, E₁(Y₁) will reach its minimum value, and the classification times of the single-level granulation method are less than those of the


[Figure 2: Double-levels granulation. The N objects of the domain are first partitioned into groups of k₁ objects, and each abnormal group is further partitioned into subgroups of k₂ objects.]

traditional method. Let N = 10000; then the total of classification times is approximately equal to 628, as follows:

N × E₁(Y₁) = 10000 × [(1/32) × q³² + (1 + 1/32) × (1 − q³²)] ≈ 628.  (5)

This shows that this method can greatly improve the efficiency of diagnosing, reducing the classification times by 93.72% in the single-level granulation method. If there is an extremely low prevalence rate, for example, p = 0.000001, the total of classification times reaches its minimum value when each group contains 1001 objects (namely, k₁ = 1001). If every group is subdivided into many smaller subgroups again and the above method is repeated, can the total of classification times be further reduced?
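The optimal group sizes quoted here (k₁ = 32 for p = 0.001 and k₁ = 1001 for p = 0.000001) can be recovered by minimizing E₁(Y₁) = 1/k + 1 − q^k over integer k; a quick brute-force sketch:

```python
def e1(p, k):
    # Expected classification times per object for single-level grouping, eq. (3).
    return 1.0 / k + 1.0 - (1.0 - p) ** k

def best_group_size(p, k_max=2000):
    # Brute-force the integer group size minimizing E1.
    return min(range(2, k_max + 1), key=lambda k: e1(p, k))

print(best_group_size(0.001))     # 32, giving N*E1 of about 628 for N = 10000
print(best_group_size(0.000001))  # 1001, the extremely-low-prevalence case
```

The continuous optimum is close to 1/√p, which explains why halving-style group sizes work so well at low prevalence rates.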

3.2. The Double-Levels Granulation. After the objects of the domain are granulated by the method of Section 3.1, the original object space becomes a new quotient space in which each group has k₁ objects. According to the falsity- and truth-preserving principles in quotient space theory, if a group is tested to be abnormal, it can be granulated into many smaller subgroups. The double-levels granulation is shown in Figure 2.

Then the probability distribution of the double-levelsgranulation is discussed as follows

If each group contains k₁ objects and is tested once in the 1st layer, the average classification times are 1/k₁ for each object. Similarly, the average classification times of each object are 1/k₂ in the 2nd layer. When a subgroup contains k₂ objects and is tested to be abnormal, every object in this subgroup has to be retested one by one once again, so the total classification times of each such object in the 2nd layer are equal to 1/k₂ + 1.

For simplicity, suppose that every group in the 1st layer will be subdivided into two subgroups, which respectively contain k₂₁ and k₂₂ objects in the 2nd layer.

The classification times are shown in Table 2 (M represents an abnormal testing result; a blank cell represents a normal one).

Table 2: The average classification times of each object with different results.

| Times | k₁ | k₂₁ | k₂₂ |
|---|---|---|---|
| 1 | | | |
| 3 + k₂₁ | M | M | |
| 3 + k₂₂ | M | | M |
| 3 + k₁ | M | M | M |

Table 3: The probability distribution of Y₂.

| Y₂ | 1/k₁ | 1/k₁ + 1/k₂ | 1/k₁ + 1/k₂ + 1 |
|---|---|---|---|
| P(Y₂ = y₂) | q^k₁ | (1 − q^k₁) × q^k₂ | (1 − q^k₁) × (1 − q^k₂) |

For instance, let k₁ = 8, k₂₁ = 4, and k₂₂ = 4; then there are four kinds of cases to happen.

Case 1. If a group is tested to be normal in the 1st layer, then the total classification times of this group are k₁ × (1/k₁) = 1.

Case 2. If a group is tested to be abnormal in the 1st layer and, in the 2nd layer, its first subgroup is tested to be abnormal while the other subgroup is tested to be normal, then the total classification times of this group are k₂₁ × (1/k₁ + 1/k₂₁ + 1) + k₂₂ × (1/k₁ + 1/k₂₂) = 3 + k₂₁ = 7.

Case 3. If a group is tested to be abnormal in the 1st layer and, in the 2nd layer, its first subgroup is tested to be normal while the other subgroup is tested to be abnormal, then the total classification times of this group are k₂₁ × (1/k₁ + 1/k₂₁) + k₂₂ × (1/k₁ + 1/k₂₂ + 1) = 3 + k₂₂ = 7.

Case 4. If a group is tested to be abnormal in the 1st layer and both of its subgroups are tested to be abnormal in the 2nd layer, then the total classification times of this group are k₂₁ × (1/k₁ + 1/k₂₁ + 1) + k₂₂ × (1/k₁ + 1/k₂₂ + 1) = 3 + k₁ = 11.
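The per-group totals of the four cases can be checked mechanically from the per-object costs (1/k₁ for the first-layer test, 1/k₂ⱼ for the second-layer test, plus 1 when a subgroup must be retested object by object); a short sketch with the example values k₁ = 8, k₂₁ = k₂₂ = 4:

```python
k1, k21, k22 = 8, 4, 4

def group_total(sub1_abnormal, sub2_abnormal):
    # Total classification times of one abnormal first-layer group,
    # summed over its k21 + k22 = k1 objects (Case 1, a normal group, costs 1).
    t1 = k21 * (1 / k1 + 1 / k21 + (1 if sub1_abnormal else 0))
    t2 = k22 * (1 / k1 + 1 / k22 + (1 if sub2_abnormal else 0))
    return t1 + t2

assert group_total(True, False) == 3 + k21  # Case 2: 7
assert group_total(False, True) == 3 + k22  # Case 3: 7
assert group_total(True, True) == 3 + k1    # Case 4: 11
print(group_total(True, False), group_total(False, True), group_total(True, True))
```

Since k₂₁ + k₂₂ = k₁, the shared part always sums to 1 (first-layer test) + 2 (two subgroup tests), which is where the constant 3 comes from.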

Suppose each group contains k₁ objects in the 1st layer and every subgroup has k₂ objects in the 2nd layer. Supposing that the average classification times of each object is a random variable Y₂, the probability distribution of Y₂ is shown in Table 3.

Thus, in the 2nd layer, the mathematical expectation of Y₂, which is the average classification times of each object, is obtained as follows:

E₂(Y₂) = (1/k₁) × q^k₁ + (1/k₁ + 1/k₂) × (1 − q^k₁) × q^k₂ + (1 + 1/k₁ + 1/k₂) × (1 − q^k₁) × (1 − q^k₂)
       = 1/k₁ + (1 − q^k₁) × (1/k₂ + 1 − q^k₂).  (6)

As long as the number of granulation levels increases to 2, the average classification times of each object will be further reduced, for instance, when p = 0.001 and N = 10000.


Table 4: The probability distribution of Yᵢ.

| Yᵢ | P(Yᵢ = yᵢ) |
|---|---|
| 1/k₁ | q^k₁ |
| 1/k₁ + 1/k₂ | (1 − q^k₁) × q^k₂ |
| ⋯ | ⋯ |
| 1/k₁ + 1/k₂ + ⋯ + 1/kᵢ | (1 − q^k₁) × (1 − q^k₂) × ⋯ × (1 − q^k_(i−1)) × q^k_i |
| 1/k₁ + 1/k₂ + ⋯ + 1/kᵢ + 1 | (1 − q^k₁) × (1 − q^k₂) × ⋯ × (1 − q^k_(i−1)) × (1 − q^k_i) |

As we know, the minimum expectation of the total of classification times is about 628 with $k_1 = 32$ in the single-level granulation. According to (6) and Lemma 6, $E_2(Y_2)$ reaches its minimum value when $k_2 = 16$. The minimum mathematical expectation of the total of classification times over all objects is shown as follows:

$$N \times E_2(Y_2) = N \times \left[\frac{1}{k_1}\,q^{k_1} + \left(\frac{1}{k_1} + \frac{1}{k_2}\right)\left(1-q^{k_1}\right)q^{k_2} + \left(1 + \frac{1}{k_1} + \frac{1}{k_2}\right)\left(1-q^{k_1}\right)\left(1-q^{k_2}\right)\right] \approx 338. \quad (7)$$
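The optimal group sizes and totals quoted here can be checked numerically. The following sketch (not from the paper; the function names are ours) evaluates the single-level expectation and equation (6) for $p = 0.001$ and $N = 10^4$:

```python
# Sketch: numerically checking the expectations behind equations (6) and (7)
# for the blood analysis case with p = 0.001 and N = 10^4.

def E1(k1, q):
    """Single-level granulation: average classification times per object."""
    return 1 / k1 + (1 - q ** k1)

def E2(k1, k2, q):
    """Double-levels granulation, equation (6)."""
    return 1 / k1 + (1 - q ** k1) * (1 / k2 + 1 - q ** k2)

p, N = 0.001, 10000
q = 1 - p

# k1 = 32 minimizes E1 (about 628 total classification times for N = 10^4) ...
k1 = min(range(2, 500), key=lambda k: E1(k, q))
# ... and, with k1 fixed, k2 = 16 (= k1/2) minimizes E2 over the feasible
# subgroup sizes 1 <= k2 <= k1/2.
k2 = min(range(1, k1 // 2 + 1), key=lambda k: E2(k1, k, q))

print(k1, round(N * E1(k1, q)))      # 32 628
print(k2, round(N * E2(k1, k2, q)))  # 16 337  (the paper rounds this to about 338)
```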

The mathematical expectation of classification times can thus save 96.62% compared with the traditional method and 46.18% compared with the single-level granulation method. Next, we will discuss the $i$-levels granulation ($i = 3, 4, 5, \ldots, n$).

3.3. The $i$-Levels Granulation. For the blood analysis case, the granulation strategy in the $i$th layer is determined by the known objects numbers of each group in the previous layers (namely, $k_1, k_2, \ldots, k_{i-1}$ are known and only $k_i$ is unknown). Following the double-levels granulation method, and supposing that the classification times of each object in the $i$-levels granulation is a random variable $Y_i$, the probability distribution of $Y_i$ is shown in Table 4.

Obviously, the sum of the probability distribution is equal to 1 in each layer.

Proof.

Case 1 (the single-level granulation). One has
$$q^{k_1} + 1 - q^{k_1} = 1. \quad (8)$$

Case 2 (the double-levels granulation). One has
$$q^{k_1} + \left(1-q^{k_1}\right)q^{k_2} + \left(1-q^{k_1}\right)\left(1-q^{k_2}\right) = q^{k_1} + \left(1-q^{k_1}\right)\left(q^{k_2} + 1 - q^{k_2}\right) = q^{k_1} + \left(1-q^{k_1}\right) \times 1 = 1. \quad (9)$$

Case 3 (the $i$-levels granulation). One has
$$\begin{aligned}
&q^{k_1} + \left(1-q^{k_1}\right)q^{k_2} + \cdots + \left(1-q^{k_1}\right)\left(1-q^{k_2}\right)\cdots\left(1-q^{k_{i-1}}\right)q^{k_i} + \left(1-q^{k_1}\right)\left(1-q^{k_2}\right)\cdots\left(1-q^{k_i}\right) \\
&\quad = q^{k_1} + \left(1-q^{k_1}\right)q^{k_2} + \cdots + \left(1-q^{k_1}\right)\left(1-q^{k_2}\right)\cdots\left(1-q^{k_{i-1}}\right)\left(q^{k_i} + \left(1-q^{k_i}\right)\right) \\
&\quad = q^{k_1} + \left(1-q^{k_1}\right)q^{k_2} + \cdots + \left(1-q^{k_1}\right)\left(1-q^{k_2}\right)\cdots\left(1-q^{k_{i-1}}\right) \\
&\quad = \cdots = q^{k_1} + \left(1-q^{k_1}\right) \times 1 = 1.
\end{aligned} \quad (10)$$

The proof is completed.

Definition 12 (classification times expectation of granulation). In a probability quotient space, a multilevels granulation model is established from the domain $U = \{x_1, x_2, \ldots, x_n\}$, which is a nonempty finite set; the health rate is $q$; the max granular level is $L$; and the objects number of each group in the $i$th layer is $k_i$ ($i = 1, 2, \ldots, L$). Then the average classification times of each object in the $i$th layer, $E_i(Y_i)$, is

$$E_i(Y_i) = \frac{1}{k_1} + \sum_{l=2}^{i}\left[\frac{1}{k_l}\prod_{j=1}^{l-1}\left(1-q^{k_j}\right)\right] + \prod_{l=1}^{i}\left(1-q^{k_l}\right). \quad (11)$$

In this paper, we mainly focus on establishing a minimum granulation expectation model of classification times by the multigranulation computing method. For simplicity, the mathematical expectation of classification times is regarded as the measure of searching efficiency. According to Lemma 5, the multilevels granulation model can simplify the complex problem only if the prevalence rate $p \in (0, 1 - e^{-e^{-1}})$ in the blood analysis case.
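Formula (11) is straightforward to evaluate directly. The sketch below (our own helper, not the paper's code) computes $E_i(Y_i)$ for an arbitrary level sequence and checks that it collapses to the single-level expectation (3) and the double-levels expectation (6) as special cases:

```python
# Sketch: evaluating equation (11) for an arbitrary granulation sequence
# (k_1, ..., k_i), and checking it against the special cases (3) and (6).

def expectation(ks, q):
    """E_i(Y_i) of equation (11) for the level sizes ks = [k_1, ..., k_i]."""
    total, reach = 1 / ks[0], 1.0   # 'reach' accumulates (1-q^{k_1})...(1-q^{k_{l-1}})
    for prev, k in zip(ks, ks[1:]):
        reach *= 1 - q ** prev
        total += reach / k          # the (1/k_l) * product term of the sum in (11)
    reach *= 1 - q ** ks[-1]
    return total + reach            # final product term of (11)

q = 1 - 0.001
# single-level special case, equation (3):
assert abs(expectation([32], q) - (1 / 32 + 1 - q ** 32)) < 1e-12
# double-levels special case, equation (6):
assert abs(expectation([32, 16], q)
           - (1 / 32 + (1 - q ** 32) * (1 / 16 + 1 - q ** 16))) < 1e-12

print(expectation([32, 16, 8, 4], q))  # about 0.0333 for p = 0.001
```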

Theorem 13. Let the prevalence rate $p \in (0, 0.3)$. If a group is tested to be abnormal in the 1st layer (namely, this group contains abnormal objects), the average classification times of each object will be further reduced by subdividing this group once again.

Proof. The expectation difference between the single-level granulation $E_1(Y_1)$ and the double-levels granulation $E_2(Y_2)$ adequately embodies their relative efficiency. Under the conditions $e^{-e^{-1}} < q < 1$ and $1 \le k_2 < k_1$, and according to (3) and (6), the expectation difference $E_1(Y_1) - E_2(Y_2)$ is shown as follows:

$$E_1(Y_1) - E_2(Y_2) = \left[\frac{1}{k_1} + \left(1-q^{k_1}\right)\right] - \left[\frac{1}{k_1} + \left(1-q^{k_1}\right)\left(\frac{1}{k_2} + 1 - q^{k_2}\right)\right] = \left(1-q^{k_1}\right)\left[1 - \left(\frac{1}{k_2} + 1 - q^{k_2}\right)\right] > 0. \quad (12)$$

According to Lemma 5, $(1-q^{k_1}) > 0$ and $f_q(k_2) = \frac{1}{k_2} + 1 - q^{k_2} < 1$ always hold; then we can get $1 - (\frac{1}{k_2} + 1 - q^{k_2}) > 0$. So the inequality $E_1(Y_1) - E_2(Y_2) > 0$ is proved.

Theorem 13 illustrates that the classification times can be reduced by continuing to granulate the abnormal groups into the 2nd layer when $k_1 > 1$. Next, we attempt to prove that the total of classification times will be further reduced by continuously granulating the abnormal groups into the $i$th layer, as long as the objects number of each subgroup is no less than 1.

Theorem 14. Suppose the prevalence rate $p \in (0, 0.3)$. If a group is tested to be abnormal (namely, this group contains abnormal objects), the average classification times of each object will be reduced by continuously subdividing the abnormal group, as long as the objects number of each subgroup is no less than 1.

Proof. The expectation difference between the $(i-1)$-levels granulation $E_{i-1}(Y_{i-1})$ and the $i$-levels granulation $E_i(Y_i)$ reflects their relative efficiency. On the condition of $e^{-e^{-1}} < q < 1$ and $1 \le k_i < k_{i-1}$, and according to (11), the expectation difference $E_{i-1}(Y_{i-1}) - E_i(Y_i)$ is shown as follows:

$$\begin{aligned}
E_{i-1}(Y_{i-1}) - E_i(Y_i) &= \left[\frac{1}{k_1} + \sum_{l=2}^{i-1}\left(\frac{1}{k_l}\prod_{j=1}^{l-1}\left(1-q^{k_j}\right)\right) + \prod_{l=1}^{i-1}\left(1-q^{k_l}\right)\right] \\
&\quad - \left[\frac{1}{k_1} + \sum_{l=2}^{i}\left(\frac{1}{k_l}\prod_{j=1}^{l-1}\left(1-q^{k_j}\right)\right) + \prod_{l=1}^{i}\left(1-q^{k_l}\right)\right] \\
&= \left(1-q^{k_1}\right) \times \cdots \times \left(1-q^{k_{i-1}}\right)\left[1 - \left(\frac{1}{k_i} + 1 - q^{k_i}\right)\right] > 0.
\end{aligned} \quad (13)$$

Because $(1-q^{k_1}) \times \cdots \times (1-q^{k_{i-1}}) > 0$ is known according to Lemma 5, and $k_i > 1$, we can get $\frac{1}{k_i} + 1 - q^{k_i} < 1$, namely, $1 - (\frac{1}{k_i} + 1 - q^{k_i}) > 0$. So $E_{i-1}(Y_{i-1}) - E_i(Y_i) > 0$ is proved.

Theorem 14 shows that this method continuously improves the searching efficiency in the process of granulating abnormal groups from the 1st layer to the $i$th layer, because $E_{i-1}(Y_{i-1}) - E_i(Y_i) > 0$ always holds. However, it is found that the classification times cannot be reduced when the objects number of an abnormal group is less than or equal to 4, so the objects of such an abnormal group should be tested one by one. In order to achieve the best efficiency, we will next explore how to determine the optimum granulation, namely, how to determine the optimum objects number of each group and the optimum number of granulation levels.

3.4. The Optimum Granulation. It is a difficult and key point to explore an appropriate granularity space for dealing with a complex problem, which requires us not only to keep the integrity of the original information but also to simplify the complex problem. So we take the blood analysis case as an example to explain how to obtain the optimum granularity space in this paper. Suppose the condition $e^{-e^{-1}} < q < 1$ always holds.

Case 1 (granulating abnormal groups from the 1st layer to the 2nd layer). (a) Suppose $k_1$ is an even number, and every group which contains $k_1$ objects in the 1st layer is subdivided into two subgroups in the 2nd layer.

Scheme 15. Suppose one subgroup of the 2nd layer has $i$ ($1 \le i < k_1/2$) objects; according to formula (6), the expectation of classification times for each object is $E_2(i)$. The other subgroup has $(k_1 - i)$ objects, so the expectation of classification times for each object is $E_2(k_1 - i)$. The average expectation of classification times for each object in the 2nd layer is shown as follows:

$$\frac{i \times E_2(i) + (k_1 - i) \times E_2(k_1 - i)}{k_1}. \quad (14)$$

Scheme 16. Suppose every abnormal group in the 1st layer is averagely subdivided into two subgroups; namely, each subgroup has $k_1/2$ objects in the 2nd layer. According to formula (6), the average expectation of classification times for each object in the 2nd layer is shown as follows:

$$\frac{2 \times \frac{k_1}{2} \times E_2(k_1/2)}{k_1} = E_2\!\left(\frac{k_1}{2}\right). \quad (15)$$

The expectation difference between the above two schemes embodies their relative efficiency. In order to prove that Scheme 16 is more efficient than Scheme 15, we only need to prove that the following inequality holds:

$$\frac{i \times E_2(i) + (k_1 - i) \times E_2(k_1 - i)}{k_1} - E_2\!\left(\frac{k_1}{2}\right) > 0 \quad \left(e^{-e^{-1}} < q < 1,\; k_1 > 1\right). \quad (16)$$


Table 5: The changes of average expectation with different objects numbers in the two subgroups.

$(k_{21}, k_{22})$ | (1, 15) | (2, 14) | (3, 13) | (4, 12) | (5, 11) | (6, 10) | (7, 9) | (8, 8)
$E_2$ | 0.07367 | 0.07329 | 0.07297 | 0.07270 | 0.07249 | 0.07234 | 0.07225 | 0.07222

Proof. Let $g(x) = x q^x$ ($e^{-e^{-1}} < q < 1$); according to Lemma 7, we have

$$\begin{aligned}
\frac{g(i) + g(k_1 - i)}{2} < g\!\left(\frac{k_1}{2}\right) &\Longrightarrow \frac{i \times q^i + (k_1 - i) \times q^{k_1 - i}}{2} < \frac{k_1}{2} \times q^{k_1/2} \\
&\Longrightarrow i \times q^i + (k_1 - i) \times q^{k_1 - i} < k_1 \times q^{k_1/2} \\
&\Longrightarrow i \times \left[\frac{1}{k_1} + \left(1-q^{k_1}\right)\left(\frac{1}{i} + 1 - q^i\right)\right] + (k_1 - i) \times \left[\frac{1}{k_1} + \left(1-q^{k_1}\right)\left(\frac{1}{k_1 - i} + 1 - q^{k_1 - i}\right)\right] \\
&\qquad > k_1 \times \left[\frac{1}{k_1} + \left(1-q^{k_1}\right)\left(\frac{2}{k_1} + 1 - q^{k_1/2}\right)\right] \\
&\Longrightarrow \frac{i \times E_2(i) + (k_1 - i) \times E_2(k_1 - i)}{k_1} - E_2\!\left(\frac{k_1}{2}\right) > 0.
\end{aligned} \quad (17)$$

The proof is completed.

Therefore, if every group which has $k_1$ ($k_1$ is an even number and $k_1 > 1$) objects in the 1st layer needs to be subdivided into two subgroups, Scheme 16 is more efficient than Scheme 15.

The experimental results in Table 5 verify the above conclusion. Let $p = 0.004$ and $k_1 = 16$. When each subgroup contains 8 objects in the 2nd layer, the expectation of classification times for each object reaches its minimum value, where $k_{21}$ is the objects number of the one subgroup in the 2nd layer, $k_{22}$ is the objects number of the other subgroup, and $E_2$ is the corresponding average expectation of classification times for each object.
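Table 5 can be reproduced from formulas (6) and (14). In this sketch (not the paper's code), `E2(i)` denotes equation (6) with the second-layer size $k_2$ replaced by $i$:

```python
# Sketch: reproducing Table 5 with p = 0.004 and k1 = 16.

p, k1 = 0.004, 16
q = 1 - p

def E2(i):
    """Equation (6) with k2 = i."""
    return 1 / k1 + (1 - q ** k1) * (1 / i + 1 - q ** i)

def split_avg(k21):
    """Formula (14): average expectation when the split is (k21, k1 - k21)."""
    k22 = k1 - k21
    return (k21 * E2(k21) + k22 * E2(k22)) / k1

for k21 in range(1, k1 // 2 + 1):
    print((k21, k1 - k21), round(split_avg(k21), 5))
# The even split (8, 8) yields the smallest value, consistent with Table 5
# (whose entries appear to truncate the last digit).
```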

(b) Suppose $k_1$ is an even number, and every group which contains $k_1$ objects in the 1st layer is subdivided into three subgroups in the 2nd layer.

Scheme 17. In the 2nd layer, if the first subgroup has $i$ ($1 \le i < k_1/2$) objects, the average expectation of classification times for each object is $E_2(i)$; if the second subgroup has $j$ ($1 \le j < k_1/2$) objects, the expectation of classification times for each object is $E_2(j)$; then the third subgroup has $(k_1 - i - j)$ objects, and the average expectation of classification times for each object is $E_2(k_1 - i - j)$. So the average expectation of classification times for each object in the 2nd layer is shown as follows:

$$\frac{i \times E_2(i) + j \times E_2(j) + (k_1 - i - j) \times E_2(k_1 - i - j)}{k_1}. \quad (18)$$

Similarly, it is easy to prove that Scheme 16 is also more efficient than Scheme 17. In other words, we only need to prove the following inequality:

$$\frac{i \times E_2(i) + j \times E_2(j) + (k_1 - i - j) \times E_2(k_1 - i - j)}{k_1} - E_2\!\left(\frac{k_1}{2}\right) > 0 \quad \left(e^{-e^{-1}} < q < 1,\; k_1 > 1\right). \quad (19)$$

Proof. Let $g(x) = x q^x$ ($e^{-e^{-1}} < q < 1$); applying the inequality of Lemma 7 repeatedly, we have

$$\begin{aligned}
i \times q^i + j \times q^j + (k_1 - i - j) \times q^{k_1 - i - j} &< k_1 \times q^{k_1/2} \\
\Longrightarrow i \times \left[\frac{1}{k_1} + \left(1-q^{k_1}\right)\left(\frac{1}{i} + 1 - q^i\right)\right] + j \times \left[\frac{1}{k_1} + \left(1-q^{k_1}\right)\left(\frac{1}{j} + 1 - q^j\right)\right] &\\
+ (k_1 - i - j) \times \left[\frac{1}{k_1} + \left(1-q^{k_1}\right)\left(\frac{1}{k_1 - i - j} + 1 - q^{k_1 - i - j}\right)\right] &> k_1 \times \left[\frac{1}{k_1} + \left(1-q^{k_1}\right)\left(\frac{2}{k_1} + 1 - q^{k_1/2}\right)\right] \\
\Longrightarrow \frac{i \times E_2(i) + j \times E_2(j) + (k_1 - i - j) \times E_2(k_1 - i - j)}{k_1} - E_2\!\left(\frac{k_1}{2}\right) &> 0.
\end{aligned} \quad (20)$$

The proof is completed.

Therefore, if every group which contains $k_1$ ($k_1$ is an even number and $k_1 > 1$) objects in the 1st layer needs to be subdivided into three subgroups, Scheme 16 is more efficient than Scheme 17.

The experimental results in Table 6 verify the above conclusion. Let $p = 0.004$ and $k_1 = 16$. When every subgroup contains 8 objects in the 2nd layer, the average expectation of classification times reaches its minimum value for each object. In Table 6, the first line stands for the objects number of the first subgroup in the 2nd layer, the first row stands for the objects number of the second subgroup, and the data stand for the corresponding average expectation of classification times. For example, (1, 1, 7.7143) expresses that the objects numbers of the three subgroups are 1, 1, and 14, respectively, and the average classification times for each object is $E_2 = 0.077143$ in the 2nd layer.

Table 6: The changes of average expectation with different objects numbers of the three subgroups (expectation $\times 10^{-2}$).

Objects | 1 | 2 | 3 | 4 | 5
1 | 7.7143 | 7.6786 | 7.6488 | 7.6250 | 7.6072
2 | 7.6786 | 7.6458 | 7.6189 | 7.5980 | 7.5831
3 | 7.6488 | 7.6189 | 7.5950 | 7.5770 | 7.5651
4 | 7.6250 | 7.5980 | 7.5770 | 7.5620 | 7.5530
5 | 7.6072 | 7.5831 | 7.5651 | 7.5530 | 7.5470
6 | 7.5953 | 7.5742 | 7.5591 | 7.5500 | 7.5470
7 | 7.5894 | 7.5712 | 7.5591 | 7.5530 | 7.5530
8 | 7.5894 | 7.5742 | 7.5651 | 7.5620 | 7.5651
9 | 7.5953 | 7.5831 | 7.5770 | 7.5770 | 7.5980

(c) When an abnormal group contains $k_1$ (even) objects and needs to be further granulated into the 2nd layer, Scheme 16 still has the best efficiency.

There are two granulation schemes: in Scheme 18, the abnormal groups of the 1st layer are randomly subdivided into $s$ ($s < k_1$) subgroups, while in Scheme 16, the abnormal groups of the 1st layer are averagely subdivided into two subgroups.

Scheme 18. Suppose an abnormal group is subdivided into $s$ ($s < k_1$) subgroups: the first subgroup has $x_1$ ($1 \le x_1 < k_1/2$) objects, and the average expectation of classification times for each object is $E_2(x_1)$; the second subgroup has $x_2$ ($1 \le x_2 < k_1/2$) objects, with average expectation $E_2(x_2)$; the $i$th subgroup has $x_i$ ($1 \le x_i < k_1/2$) objects, with average expectation $E_2(x_i)$; and the $s$th subgroup has $x_s$ ($1 \le x_s < k_1/2$) objects, with average expectation $E_2(x_s)$. Hence, the average expectation of classification times for each object in the 2nd layer is shown as follows:

$$\frac{1}{k_1} \times \sum_{j=1}^{s} x_j \times E_2(x_j). \quad (21)$$

Similarly, in order to prove that Scheme 16 is more efficient than Scheme 18, we only need to prove the following inequality:

$$\frac{1}{k_1} \times \sum_{j=1}^{s} x_j \times E_2(x_j) - E_2\!\left(\frac{k_1}{2}\right) > 0 \quad \left(e^{-e^{-1}} < q < 1,\; k_1 > 1\right). \quad (22)$$

Proof. Let $g(x) = x q^x$ ($e^{-e^{-1}} < q < 1$); applying the inequality of Lemma 7 repeatedly, we have

$$\begin{aligned}
\sum_{i=1}^{s} x_i \times q^{x_i} &< k_1 \times q^{k_1/2} \\
\Longrightarrow \sum_{j=1}^{s} x_j \times \left[\frac{1}{k_1} + \left(1-q^{k_1}\right)\left(\frac{1}{x_j} + 1 - q^{x_j}\right)\right] &> k_1 \times \left[\frac{1}{k_1} + \left(1-q^{k_1}\right)\left(\frac{2}{k_1} + 1 - q^{k_1/2}\right)\right] \\
\Longrightarrow \frac{1}{k_1} \times \sum_{j=1}^{s} x_j \times E_2(x_j) - E_2\!\left(\frac{k_1}{2}\right) &> 0.
\end{aligned} \quad (23)$$

The proof is completed.

Therefore, when every abnormal group which contains $k_1$ ($k_1$ is an even number and $k_1 > 1$) objects in the 1st layer needs to be granulated into many subgroups, Scheme 16 is more efficient than the other schemes.

(d) In a similar way, when every abnormal group which contains $k_1$ ($k_1$ is an odd number and $k_1 > 1$) objects in the 1st layer is granulated into many subgroups, the best scheme is to subdivide it into two subgroups as uniformly as possible; namely, one subgroup contains $(k_1 - 1)/2$ objects and the other contains $(k_1 + 1)/2$ objects in the 2nd layer.

Case 2 (granulating abnormal groups from the 1st layer to the $i$th layer).

Theorem 19. In the $i$th layer, if the objects number of each abnormal group is more than 4, then the total of classification times can be reduced by continuing to subdivide the abnormal groups into two subgroups which contain equal objects as far as possible. Namely, if each abnormal group contains $k_i$ objects in the $i$th layer, then each subgroup contains $k_i/2$, or $(k_i - 1)/2$, or $(k_i + 1)/2$ objects in the $(i+1)$th layer.

Proof. In the multigranulation method, the objects number of each subgroup in the next layer is determined by the objects number of each group in the current layer. In other words, the objects number of each subgroup in the $(i+1)$th layer is determined by the known objects number of each group in the $i$th layer.

According to the recursive idea, the process of granulating an abnormal group from the $i$th layer into the $(i+1)$th layer is similar to that from the 1st layer into the 2nd layer. It is known that the most efficient way when granulating an abnormal group from the 1st layer into the 2nd layer is to subdivide it into two subgroups as uniformly as possible. Therefore, the most efficient way is also to subdivide each abnormal group in the $i$th layer into two subgroups as uniformly as possible in the $(i+1)$th layer. The proof is completed.

Based on $k_1$, which is the optimum objects number of each group in the 1st layer, the optimum granulation levels and the corresponding objects number of each group in every layer can be obtained by Theorem 19. That is to say, $k_{i+1} = k_i/2$ (or $k_{i+1} = (k_i - 1)/2$, or $k_{i+1} = (k_i + 1)/2$), where $k_i$ ($k_i > 4$) is the objects number of each abnormal group in the $i$th layer ($1 \le i \le s - 1$) and $s$ is the optimum number of granulation levels. Namely, in this multilevels granulation method, the final structure of granulating an abnormal group from the 1st layer to the last layer is similar to a binary tree, and the original space can be granulated into a structure which contains many binary trees.

Table 7: The best testing strategy in different layers with the different prevalence rates.

$p$ | $E_i$ | $(k_1, k_2, \ldots, k_i)$
0.01 | 0.157649743271 | (11, 5, 2)
0.001 | 0.034610328332 | (32, 16, 8, 4)
0.0001 | 0.010508158027 | (101, 50, 25, 12, 6, 3)
0.00001 | 0.003301655870 | (317, 158, 79, 39, 19, 9, 4)
0.000001 | 0.001041044160 | (1001, 500, 250, 125, 62, 31, 15, 7, 3)

According to Theorem 19, the multigranulation strategy can be used to solve the blood analysis case. Facing different prevalence rates, such as $p_1 = 0.01$, $p_2 = 0.001$, $p_3 = 0.0001$, $p_4 = 0.00001$, and $p_5 = 0.000001$, the best searching strategy gives the objects number of each group in the different layers as shown in Table 7 ($k_i$ stands for the objects number of each group in the $i$th layer, and $E_i$ stands for the average expectation of classification times for each object).
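The sequences in Table 7 follow a simple recipe: pick $k_1$ as the integer minimizing the single-level expectation, then halve the group size (rounding down) while a group still holds more than 4 objects. The sketch below is our reconstruction of that recipe, not the paper's code:

```python
# Sketch (assumptions: k1 minimizes the single-level expectation 1/k + 1 - q^k,
# and deeper layers halve the group size while it still exceeds 4 objects,
# which is the pattern visible in Table 7).

def strategy(p):
    q = 1 - p
    k1 = min(range(2, 2000), key=lambda k: 1 / k + 1 - q ** k)
    ks = [k1]
    while ks[-1] > 4:
        ks.append(ks[-1] // 2)  # near-even split per Theorem 19
    return ks

print(strategy(0.01))    # [11, 5, 2]
print(strategy(0.001))   # [32, 16, 8, 4]
print(strategy(0.0001))  # [101, 50, 25, 12, 6, 3]
```

These three outputs match the corresponding rows of Table 7.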

Theorem 20. In the above multilevels granulation method, if $p$, which is the prevalence rate of a sickness (or the negative sample ratio in the domain), tends to 0, the average classification times for each object tend to $1/k_1$; in other words, the following equation always holds:

$$\lim_{p \to 0} E_i = \frac{1}{k_1}. \quad (24)$$

Proof. According to Definition 12, let $q = 1 - p$, $q \to 1$. We have

$$E_i = \frac{1}{k_1} + \left(1-q^{k_1}\right)\left(\frac{1}{k_2} + \left(1-q^{k_2}\right)\left(\frac{1}{k_3} + \left(1-q^{k_3}\right)\left(\frac{1}{k_4} + \left(1-q^{k_4}\right)\left(\cdots\left(\frac{1}{k_{i-1}} + \left(1-q^{k_{i-1}}\right)\left(\frac{1}{k_i} + \left(1-q^{k_i}\right)\right)\right)\cdots\right)\right)\right)\right). \quad (25)$$

According to Lemma 6, $k_1 = \lfloor 1/(p + p^2/2) \rfloor$ or $k_1 = \lfloor 1/(p + p^2/2) \rfloor + 1$. Then let

$$T = \frac{1}{k_2} + \left(1-q^{k_2}\right)\left(\frac{1}{k_3} + \left(1-q^{k_3}\right)\left(\frac{1}{k_4} + \left(1-q^{k_4}\right)\left(\cdots\left(\frac{1}{k_{i-1}} + \left(1-q^{k_{i-1}}\right)\left(\frac{1}{k_i} + \left(1-q^{k_i}\right)\right)\right)\cdots\right)\right)\right), \quad (26)$$

so that $E_i = 1/k_1 + (1 - q^{k_1}) \times T$,

Figure 3: The changing trend of $T$ and $E$ with $q$.

and so $\lim_{q \to 1} T = 0$ and $\lim_{q \to 1} E_i = 1/k_1$. The proof is completed.

Let $E = (1/k_1)/E_i$. The changing trend of $T$ and $E$ with the variable $q$ is shown in Figure 3.
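Theorem 20 can also be observed numerically by evaluating the nested form (25) on the strategies of Table 7: as $p$ decreases ($q \to 1$), the ratio $E = (1/k_1)/E_i$ climbs toward 1. A sketch (not the paper's code):

```python
# Sketch: the ratio (1/k1)/E_i approaches 1 as q -> 1, evaluated with the
# nested form of equation (25) on the halving strategies of Table 7.

def nested_E(ks, q):
    """Equation (25), accumulated from the innermost level outwards."""
    acc = 1.0  # innermost bracket ends with (1 - q^{k_i}) * 1
    for k in reversed(ks):
        acc = 1 / k + (1 - q ** k) * acc
    return acc

ratios = []
for p, ks in [(0.01, [11, 5, 2]), (0.001, [32, 16, 8, 4]),
              (0.0001, [101, 50, 25, 12, 6, 3])]:
    ratios.append((1 / ks[0]) / nested_E(ks, 1 - p))
    print(p, round(ratios[-1], 4))
# The printed ratio increases toward 1 as p shrinks.
```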

3.5. Binary Classification of Multigranulation Searching Algorithm. In this paper, an efficient binary classification of multigranulation searching algorithm is proposed through discussing the best testing strategy of the blood analysis case. The algorithm is illustrated as follows.

Algorithm 21 (binary classification of multigranulation searching algorithm (BCMSA)).

Input: A probability quotient space $Q = (U, 2^U, p)$.
Output: The average classification times expectation of each object, $E$.
Step 1. Obtain $k_1$ based on Lemma 6, and initialize $i = 1$, $j = 0$, and searching_numbers $= 0$.
Step 2. Randomly divide $U_{ij}$ into $s_i$ subgroups $U_{i1}, U_{i2}, \ldots, U_{is_i}$ ($s_i = N_{ij}/k_i$, where $N_{ij}$ stands for the number of objects in $U_{ij}$, and $U_{10} = U_1$).
Step 3. For $i$ to $\lfloor \log_2 N_{ij} \rfloor$ ($\lfloor \log_2 N_{ij} \rfloor$ stands for $\log_2 N_{ij}$ rounded down to the nearest integer):
  For $j$ to $s_i$:
    If Test($U_{ij}$) $> 0$ and $N_{ij} > 4$, then searching_numbers $+ 1$, $U_{i+1} = U_{ij}$, $i + 1$, and go to Step 2 (Test is the group classification function).
    If Test($U_{ij}$) $= 0$, then searching_numbers $+ 1$, $i + 1$.
    If $N_{ij} \le 4$, go to Step 4.
Step 4. searching_numbers $+ \sum U_{ij}$; $E = ($searching_numbers $+\, U_N)/N$.
Step 5. Return $E$.

The algorithm flowchart is shown in Figure 4.


Figure 4: Flowchart of BCMSA.
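As a concrete illustration, the following simulation is our own sketch of the BCMSA strategy, not the paper's implementation. It uses the experiment's encoding ("0" is a negative, i.e., sick, sample; "1" is a positive, healthy, sample), halves abnormal groups per Theorem 19, and tests abnormal groups of at most 4 objects one by one:

```python
# Sketch of the BCMSA testing strategy on randomly generated samples.
import random

def is_abnormal(group):
    return 0 in group                    # the group-level binary classifier

def search(group, counter):
    counter[0] += 1                      # one classification of the whole group
    if not is_abnormal(group):
        return
    if len(group) <= 4:
        counter[0] += len(group)         # small abnormal group: test one by one
        return
    mid = len(group) // 2                # near-even split (Theorem 19)
    search(group[:mid], counter)
    search(group[mid:], counter)

def bcmsa(domain, k1):
    counter = [0]
    for i in range(0, len(domain), k1):  # first-layer groups of size k1
        search(domain[i:i + k1], counter)
    return counter[0]

random.seed(0)
N, p, k1 = 10000, 0.001, 32
domain = [0 if random.random() < p else 1 for _ in range(N)]
print(bcmsa(domain, k1), N)              # far fewer classifications than N
```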

Complexity Analysis of Algorithm 21. In this algorithm, the best case is that the prevalence rate $p$ tends to 0: the classification times are $N \times E_i \approx N/k_1 \approx N \times (p + p^2/2)$, which tends to 1, so the time complexity of computing is $O(1)$. The worst case is that $p$ tends to 0.3: the classification times tend to $N$, so the time complexity of computing is $O(N)$.

4. Comparative Analysis on Experimental Results

In order to verify the efficiency of the proposed BCMSA, suppose there are two large domains, $N_1 = 1 \times 10^4$ and $N_2 = 100 \times 10^4$, and five different prevalence rates, $p_1 = 0.01$, $p_2 = 0.001$, $p_3 = 0.0001$, $p_4 = 0.00001$, and $p_5 = 0.000001$. In the experiment of the blood analysis case, the number "0" stands for a sick sample (negative sample) and "1" stands for a healthy sample (positive sample); then $N$ numbers are randomly generated, in which the probability of generating "0" is $p$ and the probability of generating "1" is $1 - p$, and these numbers stand for all the domain objects. The binary classifier counts the "0" samples in a group (subgroup): if the count is more than 0, this group is tested to be abnormal, and if the count equals 0, this group is tested to be normal.

The experimental environment is a 4 GB RAM, 2.5 GHz CPU, Windows 8 system; the programming language is Python; the experimental results are shown in Table 8.

In Table 8, item "$p$" stands for the prevalence rate, item "Levels" stands for the granulation levels of the different methods, and item "$E(X)$" stands for the average expectation of classification times for each object. Item "$k_1$" stands for the objects number of each group in the 1st layer. Item "$\ell$" stands for the degree to which $E(X)$ is close to $1/k_1$. $N_1 = 1 \times 10^4$ and $N_2 = 1 \times 10^6$, respectively, stand for the objects numbers of the two original domains. Items "Method 9" and "Method 10", respectively, stand for the efficiency improvement of Method 11 compared with Method 9 and with Method 10.

From Table 8, diagnosing all objects needs 10000 classification times by Method 9 (the traditional method), 201 classification times by Method 10 (the single-level grouping method), and only 113 classification times by Method 11 (the multilevels grouping method) when $N_1 = 1 \times 10^4$ and $p = 0.0001$. Obviously, the proposed algorithm is more efficient than Methods 9 and 10, and the classification times can be reduced by 98.89% and 47.33%, respectively. At the same time, as the probability $p$ is gradually reduced, BCMSA gradually becomes more efficient than Method 10, and $\ell$ tends to 100%; that is to say, the average classification times for each object tend to $1/k_1$ in the BCMSA.


Table 8: Comparative result of efficiency among 3 kinds of methods.

$p$ | Levels | $E(X)$ | $k_1$ | $\ell$ | $N_1$ | $N_2$ | Method 9 | Method 10
0.01 | Single-level | 0.19557083665 | 11 | 46.48% | 1944 | 195817 | 81.44% | -
0.01 | 2 levels | 0.15764974327 | 11 | 57.67% | 1627 | 164235 | 83.58% | 16.13%
0.001 | Single-level | 0.06275892424 | 32 | 49.79% | 633 | 62674 | 94.72% | -
0.001 | 4 levels | 0.03461032833 | 32 | 90.29% | 413 | 41184 | 96.82% | 34.62%
0.0001 | Single-level | 0.01995065634 | 101 | 49.62% | 201 | 19799 | 98.00% | -
0.0001 | 6 levels | 0.01050815802 | 101 | 94.22% | 113 | 11212 | 98.89% | 47.33%
0.00001 | Single-level | 0.00631957079 | 318 | 49.76% | 63.3 | 6325 | 99.37% | -
0.00001 | 7 levels | 0.00330165587 | 318 | 95.26% | 33.3 | 3324 | 99.67% | 47.75%
0.000001 | Single-level | 0.00199950067 | 1001 | 49.96% | 15 | 2001 | 99.80% | -
0.000001 | 9 levels | 0.00104104416 | 1001 | 95.96% | 15 | 1022 | 99.89% | 47.94%

Figure 5: Comparative analysis between 2 kinds of methods.

In addition, the BCMSA can save 0%-50% of the classification times compared with Method 10. The efficiency of Method 10 (the single-level granulation method) and Method 11 (the multilevels granulation method) is shown in Figure 5: the $x$-axis stands for the prevalence rate (or the negative sample ratio), and the $y$-axis stands for the average expectation of classification times for each object.

In this paper, BCMSA is proposed, and it can greatly improve searching efficiency when dealing with complex searching problems. If there is a binary classifier which is valid not only for a single object but also for a group with many objects, the efficiency of searching all objects will be enhanced by BCMSA, as in the blood analysis case. At the same time, it may play an important role in promoting the development of granular computing. Of course, this algorithm also has some limitations. For example, if the prevalence rate of a sickness (or the occurrence rate of event $A$) satisfies $p > 0.3$, it has no advantage compared with the traditional method; in other words, the original problem need not be subdivided into many subproblems when $p > 0.3$. And when the prevalence rate of a sickness (or the negative sample ratio in the domain) is unknown, this algorithm needs to be further improved so that it can adapt to the new environment.

5. Conclusions

With the development of intelligent computation, multigranulation computing has gradually become an important tool to process complex problems. Especially in the process of knowledge cognition, granulating a huge problem into lots of small subproblems means simplifying the original complex problem and dealing with these subproblems in different granularity spaces [64]. This hierarchical computing model is very effective for getting a complete or approximate solution of the original problem due to its idea of divide and conquer. Recently, many scholars have paid attention to efficient searching algorithms based on granular computing theory. For example, a kind of algorithm for dealing with complex networks on the basis of the quotient space model was proposed by L. Zhang and B. Zhang [65]. In this paper, combining the hierarchical multigranulation computing model and the principles of probability statistics, a new efficient binary classification of multigranulation searching algorithm is established on the basis of the mathematical expectation of probability statistics, and this searching algorithm is realized by a recursive method in multigranulation spaces. Many experimental results have shown that the proposed method is effective and can save lots of classification times. These results may promote the development of intelligent computation and speed up the application of multigranulation computing. However, this method also has some shortcomings. For example, on the one hand, it has a strict limitation on the probability value of $p$, namely, $p < 0.3$; on the contrary, if $p > 0.3$, the proposed searching algorithm is probably not the most effective method, and improved methods need to be found. On the other hand, it needs a binary classifier which is valid not only for a single object but also for a group with many objects. In the end, with the decrease of the probability value of $p$ (even if it infinitely approaches zero), the mathematical expectation of searching times for every object will gradually approach $1/k_1$. In our future research, we will focus on the issue of how to granulate a huge granule space without any probability value of each object, and try our best to establish an effective searching algorithm for the case in which we do not know the probability of negative samples in the domain. We hope these researches can promote the development of artificial intelligence.

Mathematical Problems in Engineering 13

Competing Interests

The authors declare that they have no conflict of interests related to this work.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (no. 61472056) and the Natural Science Foundation of Chongqing of China (no. CSTC2013jjb40003).

References

[1] A. Gacek, "Signal processing and time series description: a perspective of computational intelligence and granular computing," Applied Soft Computing Journal, vol. 27, pp. 590–601, 2015.
[2] O. Hryniewicz and K. Kaczmarek, "Bayesian analysis of time series using granular computing approach," Applied Soft Computing, vol. 47, pp. 644–652, 2016.
[3] C. Liu, "Covering multi-granulation rough sets based on maximal descriptors," Information Technology Journal, vol. 13, no. 7, pp. 1396–1400, 2014.
[4] Z. Y. Li, "Covering-based multi-granulation decision-theoretic rough sets model," Journal of Lanzhou University, no. 2, pp. 245–250, 2014.
[5] Y. Y. Yao and Y. She, "Rough set models in multigranulation spaces," Information Sciences, vol. 327, pp. 40–56, 2016.
[6] J. Xu, Y. Zhang, D. Zhou et al., "Uncertain multi-granulation time series modeling based on granular computing and the clustering practice," Journal of Nanjing University, vol. 50, no. 1, pp. 86–94, 2014.
[7] Y. T. Guo, "Variable precision β multi-granulation rough sets based on limited tolerance relation," Journal of Minnan Normal University, no. 1, pp. 1–11, 2015.
[8] X. U. Yi, J. H. Yang, and J. I. Xia, "Neighborhood multi-granulation rough set model based on double granulate criterion," Control and Decision, vol. 30, no. 8, pp. 1469–1478, 2015.
[9] L. A. Zadeh, "Towards a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic," Fuzzy Sets and Systems, vol. 19, pp. 111–127, 1997.
[10] J. R. Hobbs, "Granularity," in Proceedings of the 9th International Joint Conference on Artificial Intelligence, Los Angeles, Calif, USA, 1985.
[11] L. Zhang and B. Zhang, "Theory of fuzzy quotient space (methods of fuzzy granular computing)," Journal of Software, vol. 14, no. 4, pp. 770–776, 2003.
[12] J. Li, C. Mei, W. Xu, and Y. Qian, "Concept learning via granular computing: a cognitive viewpoint," Information Sciences, vol. 298, no. 1, pp. 447–467, 2015.
[13] X. Hu, W. Pedrycz, and X. Wang, "Comparative analysis of logic operators: a perspective of statistical testing and granular computing," International Journal of Approximate Reasoning, vol. 66, pp. 73–90, 2015.
[14] M. G. C. A. Cimino, B. Lazzerini, F. Marcelloni, and W. Pedrycz, "Genetic interval neural networks for granular data regression," Information Sciences, vol. 257, pp. 313–330, 2014.
[15] P. Honko, "Upgrading a granular computing based data mining framework to a relational case," International Journal of Intelligent Systems, vol. 29, no. 5, pp. 407–438, 2014.
[16] M.-Y. Chen and B.-T. Chen, "A hybrid fuzzy time series model based on granular computing for stock price forecasting," Information Sciences, vol. 294, pp. 227–241, 2015.
[17] R. Al-Hmouz, W. Pedrycz, and A. Balamash, "Description and prediction of time series: a general framework of Granular Computing," Expert Systems with Applications, vol. 42, no. 10, pp. 4830–4839, 2015.
[18] M. Hilbert, "Big data for development: a review of promises and challenges," Social Science Electronic Publishing, vol. 34, no. 1, pp. 135–174, 2016.
[19] T. J. Sejnowski, S. P. Churchland, and J. A. Movshon, "Putting big data to good use in neuroscience," Nature Neuroscience, vol. 17, no. 11, pp. 1440–1441, 2014.
[20] G. George, M. R. Haas, and A. Pentland, "Big data and management," Academy of Management Journal, vol. 30, no. 2, pp. 39–52, 2014.
[21] M. Chen, S. Mao, and Y. Liu, "Big data: a survey," Mobile Networks and Applications, vol. 19, no. 2, pp. 171–209, 2014.
[22] X. Wu, X. Zhu, G. Q. Wu, and W. Ding, "Data mining with big data," IEEE Transactions on Knowledge & Data Engineering, vol. 26, no. 1, pp. 97–107, 2014.
[23] Y. Shuo and Y. Lin, "Decomposition of decision systems based on granular computing," in Proceedings of the IEEE International Conference on Granular Computing (GrC '11), pp. 590–595, Garden Villa, Kaohsiung, Taiwan, 2011.
[24] H. Hu and Z. Zhong, "Perception learning as granular computing," Natural Computation, vol. 3, pp. 272–276, 2008.
[25] Z.-H. Chen, Y. Zhang, and G. Xie, "Mining algorithm for concise decision rules based on granular computing," Control and Decision, vol. 30, no. 1, pp. 143–148, 2015.
[26] K. Kambatla, G. Kollias, V. Kumar, and A. Grama, "Trends in big data analytics," Journal of Parallel & Distributed Computing, vol. 74, no. 7, pp. 2561–2573, 2014.
[27] A. Katal, M. Wazid, and R. H. Goudar, "Big data: issues, challenges, tools and good practices," in Proceedings of the 6th International Conference on Contemporary Computing (IC3 '13), pp. 404–409, IEEE, New Delhi, India, August 2013.
[28] V. Cevher, S. Becker, and M. Schmidt, "Convex optimization for big data: scalable, randomized, and parallel algorithms for big data analytics," IEEE Signal Processing Magazine, vol. 31, no. 5, pp. 32–43, 2014.
[29] J. Fan, F. Han, and H. Liu, "Challenges of big data analysis," National Science Review, vol. 1, no. 2, pp. 293–314, 2014.
[30] Q. H. Zhang, K. Xu, and G. Y. Wang, "Fuzzy equivalence relation and its multigranulation spaces," Information Sciences, vol. 346-347, pp. 44–57, 2016.
[31] Z. Liu and Y. Hu, "Multi-granularity pattern ant colony optimization algorithm and its application in path planning," Journal of Central South University (Science and Technology), vol. 9, pp. 3713–3722, 2013.
[32] Q. H. Zhang, G. Y. Wang, and X. Q. Liu, "Hierarchical structure analysis of fuzzy quotient space," Pattern Recognition and Artificial Intelligence, vol. 21, no. 5, pp. 627–634, 2008.
[33] Z. C. Shi, Y. X. Xia, and J. Z. Zhou, "Discrete algorithm based on granular computing and its application," Computer Science, vol. 40, pp. 133–135, 2013.
[34] Y. P. Zhang, B. Luo, Y. Y. Yao, D. Q. Miao, L. Zhang, and B. Zhang, Quotient Space and Granular Computing: The Theory and Method of Problem Solving on Structured Problems, Science Press, Beijing, China, 2010.
[35] G. Y. Wang, Q. H. Zhang, and J. Hu, "A survey on the granular computing," Transactions on Intelligent Systems, vol. 6, no. 2, pp. 8–26, 2007.
[36] J. Jonnagaddala, R. T. Jue, and H. J. Dai, "Binary classification of Twitter posts for adverse drug reactions," in Proceedings of the Social Media Mining Shared Task Workshop at the Pacific Symposium on Biocomputing, pp. 4–8, Big Island, Hawaii, USA, 2016.
[37] M. Haungs, P. Sallee, and M. Farrens, "Branch transition rate: a new metric for improved branch classification analysis," in Proceedings of the International Symposium on High-Performance Computer Architecture (HPCA '00), pp. 241–250, 2000.
[38] R. W. Proctor and Y. S. Cho, "Polarity correspondence: a general principle for performance of speeded binary classification tasks," Psychological Bulletin, vol. 132, no. 3, pp. 416–442, 2006.
[39] T. H. Chow, P. Berkhin, E. Eneva et al., "Evaluating performance of binary classification systems," US Patent 8554622 B2, 2013.
[40] D. G. Li, D. Q. Miao, D. X. Zhang, and H. Y. Zhang, "An overview of granular computing," Computer Science, vol. 9, pp. 1–12, 2005.
[41] X. Gang and L. Jing, "A review of the present studying state and prospect of granular computing," Journal of Software, vol. 3, pp. 5–10, 2011.
[42] L. X. Zhong, "The predication about optimal blood analyze method," Academic Forum of Nandu, vol. 6, pp. 70–71, 1996.
[43] X. Mingmin and S. Junli, "The mathematical proof of method of group blood test and a new formula in quest of optimum number in group," Journal of Sichuan Institute of Building Materials, vol. 1, pp. 97–104, 1986.
[44] B. Zhang and L. Zhang, "Discussion on future development of granular computing," Journal of Chongqing University of Posts and Telecommunications: Natural Science Edition, vol. 22, no. 5, pp. 538–540, 2010.
[45] A. Skowron, J. Stepaniuk, and R. Swiniarski, "Modeling rough granular computing based on approximation spaces," Information Sciences, vol. 184, no. 1, pp. 20–43, 2012.
[46] J. T. Yao, A. V. Vasilakos, and W. Pedrycz, "Granular computing: perspectives and challenges," IEEE Transactions on Cybernetics, vol. 43, no. 6, pp. 1977–1989, 2013.
[47] Y. Y. Yao, N. Zhang, D. Q. Miao, and F. F. Xu, "Set-theoretic approaches to granular computing," Fundamenta Informaticae, vol. 115, no. 2-3, pp. 247–264, 2012.
[48] H. Li and X. P. Ma, "Research on four-element model of granular computing," Computer Engineering and Applications, vol. 49, no. 4, pp. 9–13, 2013.
[49] J. Hu and C. Guan, "Granular computing model based on quantum computing theory," in Proceedings of the 10th International Conference on Computational Intelligence and Security, pp. 156–160, November 2014.
[50] Y. Shuo and Y. Lin, "Decomposition of decision systems based on granular computing," in Proceedings of the IEEE International Conference on Granular Computing (GrC '11), pp. 590–595, IEEE, Kaohsiung, Taiwan, November 2011.
[51] F. Li, J. Xie, and K. Xie, "Granular computing theory in the application of fault diagnosis," in Proceedings of the Chinese Control and Decision Conference (CCDC '08), pp. 595–597, July 2008.
[52] Q.-H. Zhang, Y.-K. Xing, and Y.-L. Zhou, "The incremental knowledge acquisition algorithm based on granular computing," Journal of Electronics and Information Technology, vol. 33, no. 2, pp. 435–441, 2011.
[53] Y. Zeng, Y. Y. Yao, and N. Zhong, "The knowledge search base on the granular structure," Computer Science, vol. 35, no. 3, pp. 194–196, 2008.
[54] G.-Y. Wang, Q.-H. Zhang, X.-A. Ma, and Q.-S. Yang, "Granular computing models for knowledge uncertainty," Journal of Software, vol. 22, no. 4, pp. 676–694, 2011.
[55] J. Li, Y. Ren, C. Mei, Y. Qian, and X. Yang, "A comparative study of multigranulation rough sets and concept lattices via rule acquisition," Knowledge-Based Systems, vol. 91, pp. 152–164, 2016.
[56] H.-L. Yang and Z.-L. Guo, "Multigranulation decision-theoretic rough sets in incomplete information systems," International Journal of Machine Learning & Cybernetics, vol. 6, no. 6, pp. 1005–1018, 2015.
[57] M. A. Waller and S. E. Fawcett, "Data science, predictive analytics, and big data: a revolution that will transform supply chain design and management," Journal of Business Logistics, vol. 34, no. 2, pp. 77–84, 2013.
[58] R. Kitchin, "The real-time city? Big data and smart urbanism," GeoJournal, vol. 79, no. 1, pp. 1–14, 2014.
[59] X. Dong and D. Srivastava, Big Data Integration, Morgan & Claypool, 2015.
[60] L. Zhang and B. Zhang, Theory and Applications of Problem Solving: Quotient Space Based Granular Computing (The Second Version), Tsinghua University Press, Beijing, China, 2007.
[61] L. Zhang and B. Zhang, "The quotient space theory of problem solving," in Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing, G. Wang, Q. Liu, Y. Yao, and A. Skowron, Eds., vol. 2639 of Lecture Notes in Computer Science, pp. 11–15, Springer, Berlin, Germany, 2003.
[62] J. Sheng, S. Q. Xie, and C. Y. Pan, Probability Theory and Mathematical Statistics, Higher Education Press, Beijing, China, 4th edition, 2008.
[63] L. Z. Zhang, X. Zhao, and Y. Ma, "The simple math demonstration and precise calculation method of the blood group test," Mathematics in Practice and Theory, vol. 22, pp. 143–146, 2010.
[64] J. Chen, S. Zhao, and Y. Zhang, "Hierarchical covering algorithm," Tsinghua Science & Technology, vol. 19, no. 1, pp. 76–81, 2014.
[65] L. Zhang and B. Zhang, "Dynamic quotient space model and its basic properties," Pattern Recognition and Artificial Intelligence, vol. 25, no. 2, pp. 181–185, 2012.


groups are tested to be abnormal, which means that lots of objects must be tested once again one by one. In order to reduce the classification times, this paper proposes a multilevels granulation model.

Method 11 (multilevels granulation method). Firstly, each mixed blood group, which contains $k_1$ samples (objects), will be tested once, where $k_1$ may be $1, 2, 3, \ldots, n$; namely, the original quotient space will be randomly partitioned into $(U_1, U_2, \ldots, U_t)$ ($\bigcup_{i=1}^{t} U_i = U$, $U_m \cap U_n = \Phi$, $m, n \in \{1, 2, \ldots, t\}$, $m \neq n$). Next, if some groups are tested to be normal, that means all objects in those groups are normal (healthy); therefore, all $k_1$ objects in such a group need only one test to make a diagnosis. Else, if some groups are tested to be abnormal, those groups will be subdivided into many smaller subsets (subgroups), each of which contains $k_2$ ($k_2 < k_1$) objects; namely, the quotient space of an abnormal group $U_i$ will be randomly partitioned into $(U_{i1}, U_{i2}, \ldots, U_{il})$ ($\bigcup_{j=1}^{l} U_{ij} = U_i$, $U_{im} \cap U_{in} = \Phi$, $m, n \in \{1, 2, \ldots, l\}$, $m \neq n$, $l < k_1$). Finally, each subgroup will be tested once again. Similarly, if a subgroup is tested to be normal, it is confirmed that all objects in the corresponding subgroup are healthy; and if a subgroup is tested to be abnormal, it will be subdivided once again into smaller subgroups which contain $k_3$ ($k_3 < k_2$) objects. Therefore, the testing results of all objects can be determined by repeating the above process in a group until the number of objects is equal to 1 or the testing result of a subgroup is normal. Then the searching efficiency of the above three methods is analyzed as follows.

In Method 9, every object has to be tested once for diagnosing a disease, so it must take up $N$ tests in total.

In Method 10, the original problem space is subdivided into many disjoint subspaces (subsets). If some subsets are tested to be normal, all of their objects need only to be tested once. Therefore, the classification times can be reduced to some degree if the probability $p$ is small enough [9].

In Method 11, the key is trying to find the optimal multigranulation space for searching all objects, so a multilevels granulation model needs to be established. There are two questions. One is the grouping strategy, namely, how many objects should be contained in a group. The other is the optimal number of granulation levels, namely, how many levels the original problem space should be granulated into.

In this paper, we mainly solve these two questions in Method 11. According to the truth and falsity preserving principle in quotient space theory, all normal parts of blood samples can be ignored. Hence, the original problem is simplified to a smaller subspace. This idea not only reduces the complexity of the problem but also improves the efficiency of searching for abnormal objects.

Algorithm Strategy. Example 8 can be regarded as a tree structure in which each node (which stands for a group) is an $x$-tuple. Obviously, the searching problem over the tree has been transformed into a hierarchical reasoning process along a monotonous relation sequence. The original space has been transformed into a hierarchical structure where all subproblems are solved in different granularity levels.

Table 1: The probability distribution of $Y_1$.

$Y_1 = y_1$:        $1/k_1$       $1/k_1 + 1$
$p\{Y_1 = y_1\}$:   $q^{k_1}$     $1 - q^{k_1}$

Firstly, the general rule can be concluded by analyzing the simplest hierarchy and grouping case, which is the single-level granulation. Secondly, we can calculate the mathematical expectation of the classification times of blood analysis. Finally, an optimal hierarchical granulation model will be established by comparing the expectations of classification times.

Analysis. Supposing that there is an object set $Y = \{y_1, y_2, \ldots, y_n\}$ and the prevalence rate is $p$, the probability that an object appears normal in blood analysis is $q = 1 - p$. The probability that a group is tested to be normal is $q^{k_1}$, and to be abnormal is $1 - q^{k_1}$, where $k_1$ is the number of objects in a group.

3.1. The Single-Level Granulation. Firstly, the domain (which contains $N$ objects) is randomly subdivided into many subgroups, where each subset contains $k_1$ objects. In other words, a new quotient space $[Y_1]$ is obtained based on an equivalence relation $R$. Then, supposing that the average classification time of each object is a random variable $Y_1$, the probability distribution of $Y_1$ is shown in Table 1.

Thus, the mathematical expectation of $Y_1$ can be obtained as follows:

$$E_1(Y_1) = \frac{1}{k_1} \times q^{k_1} + \left(1 + \frac{1}{k_1}\right) \times (1 - q^{k_1}) = \frac{1}{k_1} + (1 - q^{k_1}). \quad (3)$$

Then, the total mathematical expectation for the domain can be obtained as follows:

$$N \times E_1(Y_1) = N \times \left[\frac{1}{k_1} \times q^{k_1} + \left(1 + \frac{1}{k_1}\right) \times (1 - q^{k_1})\right]. \quad (4)$$

When the probability $p$ keeps unchanged and $k_1$ satisfies the inequality $E_1(Y_1) < 1$, this single-level granulation method can reduce the classification times. For example, if $p = 0.5$ and $k_1 > 1$, then according to Lemma 5, $E_1(Y_1) > 1$ no matter what the value of $k_1$ is, so the single-level granulation method is worse than the traditional method (namely, testing every object in turn). Conversely, if $p = 0.001$ and $k_1 = 32$, $E_1(Y_1)$ reaches its minimum value, and the classification times of the single-level granulation method are fewer than those of the traditional method. Let $N = 10000$; then the total of classification times is approximately equal to 628, as follows:

$$N \times E_1(Y_1) = 10000 \times \left[\frac{1}{32} \times q^{32} + \left(1 + \frac{1}{32}\right) \times (1 - q^{32})\right] \approx 628. \quad (5)$$

This shows that the method can greatly improve the efficiency of diagnosis, reducing the classification times by 93.72% in the single-level granulation. If there is an extremely low prevalence rate, for example, $p = 0.000001$, the total of classification times reaches its minimum value when each group contains 1001 objects (namely, $k_1 = 1001$). If every group is subdivided into many smaller subgroups again and the above method is repeated, can the total of classification times be further reduced?

Figure 2: Double-levels granulation ($N$ objects are partitioned into groups of $k_1$; abnormal groups are subdivided into subgroups of $k_2$).
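The single-level figures quoted above are easy to reproduce. In the sketch below, `E1` implements formula (3), and a brute-force scan over group sizes stands in for the paper's closed-form optimum (the function name and scan ranges are our choices):

```python
def E1(k, p):
    """Formula (3): expected classification times per object, group size k."""
    q = 1.0 - p
    return 1.0 / k + (1.0 - q ** k)

# p = 0.001: the expectation is minimized at k1 = 32, about 628 tests for N = 10000.
best_k = min(range(1, 200), key=lambda k: E1(k, 0.001))
print(best_k, round(10000 * E1(best_k, 0.001)))   # 32 628

# p = 0.000001: the optimal group size grows to 1001, as stated in the text.
print(min(range(1, 5000), key=lambda k: E1(k, 1e-6)))   # 1001
```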

3.2. The Double-Levels Granulation. After the objects of the domain are granulated by the method of Section 3.1, the original object space becomes a new quotient space in which each group has $k_1$ objects. According to the falsity and truth preserving principles in quotient space theory, if a group is tested to be abnormal, it can be granulated into many smaller subgroups. The double-levels granulation is shown in Figure 2.

Then, the probability distribution of the double-levels granulation is discussed as follows.

If each group contains $k_1$ objects and is tested once in the 1st layer, the average classification time is $1/k_1$ for each object. Similarly, the average classification time of each object is $1/k_2$ in the 2nd layer. When a subgroup contains $k_2$ objects and is tested to be abnormal, every object in this subgroup has to be retested one by one, so the total classification time of each such object is equal to $1/k_2 + 1$.

For simplicity, suppose that every group in the 1st layer is subdivided into two subgroups, which respectively contain $k_{21}$ and $k_{22}$ objects in the 2nd layer.

The classification times are shown in Table 2 (M represents an abnormal testing result; a blank represents normal).

Table 2: The average classification times of each object with different results.

Times           $k_1$    $k_{21}$    $k_{22}$
$1$
$3 + k_{21}$    M        M
$3 + k_{22}$    M                    M
$3 + k_1$       M        M           M

Table 3: The probability distribution of $Y_2$.

$Y_2 = y_2$:        $1/k_1$       $1/k_1 + 1/k_2$                     $1/k_1 + 1/k_2 + 1$
$p\{Y_2 = y_2\}$:   $q^{k_1}$     $(1 - q^{k_1}) \times q^{k_2}$      $(1 - q^{k_1}) \times (1 - q^{k_2})$

For instance, let $k_1 = 8$, $k_{21} = 4$, and $k_{22} = 4$; there are four kinds of cases.

Case 1. If a group is tested to be normal in the 1st layer, the total of classification times of this group is $k_1 \times 1/k_1 = 1$.

Case 2. If a group is tested to be abnormal in the 1st layer, and its first subgroup is tested to be abnormal while the other subgroup is tested to be normal in the 2nd layer, the total of classification times of this group is $k_{21} \times (1/k_1 + 1/k_{21} + 1) + k_{22} \times (1/k_1 + 1/k_{22}) = 3 + k_{21} = 7$.

Case 3. If a group is tested to be abnormal in the 1st layer, its first subgroup is tested to be normal, and the other subgroup is tested to be abnormal in the 2nd layer, the total of classification times of this group is $k_{21} \times (1/k_1 + 1/k_{21}) + k_{22} \times (1/k_1 + 1/k_{22} + 1) = 3 + k_{22} = 7$.

Case 4. If a group is tested to be abnormal in the 1st layer and both of its subgroups are tested to be abnormal in the 2nd layer, the total of classification times of this group is $k_{21} \times (1/k_1 + 1/k_{21} + 1) + k_{22} \times (1/k_1 + 1/k_{22} + 1) = 3 + k_1 = 11$.

Suppose each group contains $k_1$ objects in the 1st layer and every subgroup has $k_2$ objects in the 2nd layer. Supposing that the average classification time of each object is a random variable $Y_2$, the probability distribution of $Y_2$ is shown in Table 3.

Thus, in the 2nd layer, the mathematical expectation of $Y_2$, which is the average classification time of each object, is obtained as follows:

$$E_2(Y_2) = \frac{1}{k_1} \times q^{k_1} + \left(\frac{1}{k_1} + \frac{1}{k_2}\right) \times (1 - q^{k_1}) \times q^{k_2} + \left(1 + \frac{1}{k_1} + \frac{1}{k_2}\right) \times (1 - q^{k_1}) \times (1 - q^{k_2}) = \frac{1}{k_1} + (1 - q^{k_1}) \times \left(\frac{1}{k_2} + 1 - q^{k_2}\right). \quad (6)$$

As long as the number of granulation levels increases to 2, the average classification times of each object will be further reduced; for instance, take $p = 0.001$ and $N = 10000$.

6 Mathematical Problems in Engineering

Table 4: The probability distribution of $Y_i$.

$y_i$: $1/k_1$; $\quad 1/k_1 + 1/k_2$; $\quad \ldots$; $\quad \sum_{j=1}^{i} 1/k_j$; $\quad \sum_{j=1}^{i} 1/k_j + 1$.
$p\{Y_i = y_i\}$: $q^{k_1}$; $\quad (1 - q^{k_1}) \times q^{k_2}$; $\quad \ldots$; $\quad (1 - q^{k_1}) \times (1 - q^{k_2}) \times \cdots \times (1 - q^{k_{i-1}}) \times q^{k_i}$; $\quad (1 - q^{k_1}) \times (1 - q^{k_2}) \times \cdots \times (1 - q^{k_{i-1}}) \times (1 - q^{k_i})$.

As we know, the minimum expectation of the total of classification times is about 628 with $k_1 = 32$ in the single-level granulation. According to (6) and Lemma 6, $E_2(Y_2)$ reaches its minimum value when $k_2 = 16$. The minimum mathematical expectation of the total of classification times is shown as follows:

$$N \times E_2(Y_2) = N \times \left[\frac{1}{k_1} \times q^{k_1} + \left(\frac{1}{k_1} + \frac{1}{k_2}\right) \times (1 - q^{k_1}) \times q^{k_2} + \left(1 + \frac{1}{k_1} + \frac{1}{k_2}\right) \times (1 - q^{k_1}) \times (1 - q^{k_2})\right] \approx 338. \quad (7)$$

The mathematical expectation of classification times saves 96.62% compared with the traditional method and 46.18% compared with the single-level granulation method. Next, we will discuss the $i$-levels granulation ($i = 3, 4, 5, \ldots, n$).

3.3. The $i$-Levels Granulation. For the blood analysis case, the granulation strategy in the $i$th layer is determined by the known number of objects of each group in the previous layers (namely, $k_1, k_2, \ldots, k_{i-1}$ are known and only $k_i$ is unknown). Following the double-levels granulation method and supposing that the classification time of each object is a random variable $Y_i$ in the $i$-levels granulation, the probability distribution of $Y_i$ is shown in Table 4.
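Before moving on, the double-levels figures above ($k_2 = 16$, a total near 338) can be verified numerically with formula (6). Restricting $k_2$ to divisors of $k_1 = 32$, so that abnormal groups split evenly, is our assumption for the scan:

```python
def E2(k1, k2, p):
    """Formula (6): expected classification times per object, double-levels granulation."""
    q = 1.0 - p
    return 1.0 / k1 + (1.0 - q ** k1) * (1.0 / k2 + 1.0 - q ** k2)

p, N, k1 = 0.001, 10000, 32
best_k2 = min((2, 4, 8, 16), key=lambda k2: E2(k1, k2, p))
print(best_k2, N * E2(k1, best_k2, p))   # 16, and a total close to the ~338 above
```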

Obviously, the sum of the probability distribution is equal to 1 in each layer.

Proof.

Case 1 (the single-level granulation). One has

$$q^{k_1} + 1 - q^{k_1} = 1. \quad (8)$$

Case 2 (the double-levels granulation). One has

$$q^{k_1} + (1 - q^{k_1}) \times q^{k_2} + (1 - q^{k_1}) \times (1 - q^{k_2}) = q^{k_1} + (1 - q^{k_1}) \times (q^{k_2} + 1 - q^{k_2}) = q^{k_1} + (1 - q^{k_1}) \times 1 = 1. \quad (9)$$

Case 3 (the $i$-levels granulation). One has

$$q^{k_1} + (1 - q^{k_1}) \times q^{k_2} + \cdots + (1 - q^{k_1}) \times (1 - q^{k_2}) \times \cdots \times (1 - q^{k_{i-1}}) \times q^{k_i} + (1 - q^{k_1}) \times (1 - q^{k_2}) \times \cdots \times (1 - q^{k_i})$$
$$= q^{k_1} + (1 - q^{k_1}) \times q^{k_2} + \cdots + (1 - q^{k_1}) \times (1 - q^{k_2}) \times \cdots \times (1 - q^{k_{i-1}}) \times \left(q^{k_i} + (1 - q^{k_i})\right)$$
$$= q^{k_1} + (1 - q^{k_1}) \times q^{k_2} + \cdots + (1 - q^{k_1}) \times (1 - q^{k_2}) \times \cdots \times \left(q^{k_{i-1}} + (1 - q^{k_{i-1}})\right) = \cdots = q^{k_1} + (1 - q^{k_1}) \times 1 = 1. \quad (10)$$

The proof is completed.
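This telescoping normalization is easy to confirm numerically for an arbitrary $q$ and arbitrary group sizes (the helper name is ours):

```python
def layer_probabilities(q, sizes):
    """Probabilities of the i + 1 outcomes of an i-levels granulation (Table 4)."""
    probs, survive = [], 1.0
    for k in sizes:
        probs.append(survive * q ** k)   # abnormality ruled out at this layer
        survive *= (1.0 - q ** k)        # still abnormal, granulate further
    probs.append(survive)                # abnormal through the last layer
    return probs

ps = layer_probabilities(0.999, [32, 16, 8, 4, 2, 1])
print(abs(sum(ps) - 1.0) < 1e-12)   # True: the distribution sums to 1
```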

Definition 12 (classification times expectation of granulation). In a probability quotient space, a multilevels granulation model is established from the domain $U = \{x_1, x_2, \ldots, x_n\}$, which is a nonempty finite set; the health rate is $q$; the maximum number of granulation levels is $L$; and the number of objects of each group in the $l$th layer is $k_l$, $l = 1, 2, \ldots, L$. Then the average classification time of each object in the $i$th layer ($i \le L$) is $E_i(Y_i)$:

$$E_i(Y_i) = \frac{1}{k_1} + \sum_{l=2}^{i} \left[\frac{1}{k_l} \times \prod_{j=1}^{l-1} (1 - q^{k_j})\right] + \prod_{l=1}^{i} (1 - q^{k_l}). \quad (11)$$
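Formula (11) can be computed iteratively. The helper below (our naming) reduces to formula (3) for one level and to formula (6) for two, which serves as a sanity check, and appending further levels keeps lowering the expectation:

```python
def expectation(q, sizes):
    """Formula (11): expected classification times per object, sizes = [k_1, ..., k_i]."""
    total = 1.0 / sizes[0]
    survive = 1.0 - q ** sizes[0]      # probability a first-layer group is abnormal
    for k in sizes[1:]:
        total += survive / k           # 1/k more tests per object in this layer
        survive *= (1.0 - q ** k)      # still abnormal after this layer
    return total + survive             # remaining objects are retested one by one

q = 0.999
print(expectation(q, [32]))        # equals formula (3) with k1 = 32
print(expectation(q, [32, 16]))    # equals formula (6) with k1 = 32, k2 = 16
print([expectation(q, s) for s in ([32], [32, 16], [32, 16, 8])])   # strictly decreasing
```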

In this paper, we mainly focus on establishing a minimum-expectation granulation model of classification times by the multigranulation computing method. For simplicity, the mathematical expectation of classification times is regarded as the measure for comparing searching efficiency. According to Lemma 5, the multilevels granulation model can simplify the complex problem only if the prevalence rate $p \in (0, 1 - e^{-e^{-1}})$ in the blood analysis case.

Theorem 13. Let the prevalence rate $p \in (0, 0.3)$. If a group is tested to be abnormal in the 1st layer (namely, this group contains abnormal objects), the average classification times of each object will be further reduced by subdividing this group once again.

Proof. The expectation difference between the single-level granulation $E_1(Y_1)$ and the double-levels granulation $E_2(Y_2)$ adequately embodies their efficiency. Under the conditions of $e^{-e^{-1}} < q < 1$ and $1 \le k_2 < k_1$, and according to (3) and (6), the expectation difference $E_1(Y_1) - E_2(Y_2)$ is shown as follows:

$$E_1(Y_1) - E_2(Y_2) = \frac{1}{k_1} + (1 - q^{k_1}) - \left[\frac{1}{k_1} + (1 - q^{k_1}) \times \left(\frac{1}{k_2} + 1 - q^{k_2}\right)\right] = (1 - q^{k_1}) \times \left[1 - \left(\frac{1}{k_2} + 1 - q^{k_2}\right)\right] > 0. \quad (12)$$

According to Lemma 5, $(1 - q^{k_1}) > 0$ and $f_q(k_2) = 1/k_2 + 1 - q^{k_2} < 1$ always hold; then we can get $1 - (1/k_2 + 1 - q^{k_2}) > 0$. So the inequality $E_1(Y_1) - E_2(Y_2) > 0$ is proved successfully.

Theorem 13 illustrates that classification times can be reduced by continuing to granulate the abnormal groups into the 2nd layer when $k_1 > 1$. We now attempt to prove that the total of classification times will be further reduced by continuously granulating the abnormal groups into the $i$th layer until the number of objects in a group is no less than 1.

Theorem 14. Suppose the prevalence rate $p \in (0, 0.3)$. If a group is tested to be abnormal (namely, this group contains abnormal objects), the average classification times of each object will be reduced by continuously subdividing the abnormal group until the number of objects in its subgroup is no less than 1.

Proof. The expectation difference between the $(i-1)$-levels granulation $E_{i-1}(Y_{i-1})$ and the $i$-levels granulation $E_i(Y_i)$ reflects their efficiency. On the condition of $e^{-e^{-1}} < q < 1$ and $1 \le k_i < k_{i-1}$, and according to (11), the expectation difference $E_{i-1}(Y_{i-1}) - E_i(Y_i)$ is shown as follows:

$$E_{i-1}(Y_{i-1}) - E_i(Y_i) = \frac{1}{k_1} + \sum_{l=2}^{i-1} \left[\frac{1}{k_l} \times \prod_{j=1}^{l-1} (1 - q^{k_j})\right] + \prod_{l=1}^{i-1} (1 - q^{k_l}) - \left\{\frac{1}{k_1} + \sum_{l=2}^{i} \left[\frac{1}{k_l} \times \prod_{j=1}^{l-1} (1 - q^{k_j})\right] + \prod_{l=1}^{i} (1 - q^{k_l})\right\} = (1 - q^{k_1}) \times \cdots \times (1 - q^{k_{i-1}}) \times \left[1 - \left(\frac{1}{k_i} + 1 - q^{k_i}\right)\right] > 0. \quad (13)$$

Because $(1 - q^{k_1}) \times \cdots \times (1 - q^{k_{i-1}}) > 0$ is known, and according to Lemma 5 and $k_i \ge 1$, we can get $(1/k_i + 1 - q^{k_i}) < 1$, namely, $1 - (1/k_i + 1 - q^{k_i}) > 0$. So $E_{i-1}(Y_{i-1}) - E_i(Y_i) > 0$ is proved successfully.

Theorem 14 shows that this method continuously improves the searching efficiency in the process of granulating abnormal groups from the 1st layer to the i-th layer, because E_{i−1}(Y_{i−1}) − E_i(Y_i) > 0 always holds. However, it is found that the classification times cannot be reduced when the number of objects in an abnormal group is less than or equal to 4, so the objects of such an abnormal group should be tested one by one. In order to achieve the best efficiency, we will next explore how to determine the optimum granulation, namely, how to determine the optimum number of objects in each group and how to obtain the optimum number of granulation levels.
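To make Theorem 14 concrete, the per-object expectation can be evaluated numerically. The sketch below assumes the nested form of the multilevel expectation, E = 1/k_1 + (1 − q^{k_1})(1/k_2 + (1 − q^{k_2})(⋯)), which is algebraically equivalent to the sum-product form in (13); the function name `expectation` is ours, not the paper's.

```python
def expectation(q, ks):
    """Per-object expectation of classification times for level sizes ks,
    built from the innermost level outwards (nested multilevel form)."""
    acc = 1.0 / ks[-1] + (1.0 - q ** ks[-1])   # innermost term: 1/k_i + 1 - q^k_i
    for k in reversed(ks[:-1]):
        acc = 1.0 / k + (1.0 - q ** k) * acc   # wrap one more granulation level
    return acc

q = 1 - 0.004                       # q = 1 - p, blood-analysis example
E1 = expectation(q, [16])           # single-level grouping
E2 = expectation(q, [16, 8])        # granulate abnormal groups once more
E3 = expectation(q, [16, 8, 4])     # ... and once more again
assert E1 > E2 > E3                 # each extra layer lowers the expectation
```

With p = 0.004 this gives E ≈ 0.1246, 0.0722, and 0.0708 for one, two, and three levels, matching the downward trend the theorem predicts.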

3.4. The Optimum Granulation. It is a difficult and key point to explore an appropriate granularity space for dealing with a complex problem: it requires us not only to keep the integrity of the original information but also to simplify the complex problem. So we take the blood analysis case as an example to explain how to obtain the optimum granularity space. Suppose the condition e^{−e^{−1}} < q < 1 always holds.

Case 1 (granulating abnormal groups from the 1st layer to the 2nd layer). (a) If k_1 is an even number, every group which contains k_1 objects in the 1st layer will be subdivided into two subgroups in the 2nd layer.

Scheme 15. Suppose one subgroup in the 2nd layer has i (1 ≤ i < k_1/2) objects; according to formula (6), the expectation of classification times for each object is E_2(i). The other subgroup has (k_1 − i) objects, so the expectation of classification times for each object is E_2(k_1 − i). The average expectation of classification times for each object in the 2nd layer is as follows:

[i × E_2(i) + (k_1 − i) × E_2(k_1 − i)] / k_1.    (14)

Scheme 16. Suppose every abnormal group in the 1st layer is evenly subdivided into two subgroups, namely, each subgroup has k_1/2 objects in the 2nd layer. According to formula (6), the average expectation of classification times for each object in the 2nd layer is as follows:

[2 × (k_1/2) × E_2(k_1/2)] / k_1 = [k_1 × E_2(k_1/2)] / k_1.    (15)

The expectation difference between the above two schemes embodies their relative efficiency. In order to prove that Scheme 16 is more efficient than Scheme 15, we only need to prove that the following inequality holds:

[i × E_2(i) + (k_1 − i) × E_2(k_1 − i)] / k_1 − [k_1 × E_2(k_1/2)] / k_1 > 0,
  (e^{−e^{−1}} < q < 1, k_1 > 1).    (16)

8 Mathematical Problems in Engineering

Table 5: The changes of the average expectation with different objects numbers in the two groups.

(k_21, k_22)  (1, 15)  (2, 14)  (3, 13)  (4, 12)  (5, 11)  (6, 10)  (7, 9)   (8, 8)
E_2           0.07367  0.07329  0.07297  0.07270  0.07249  0.07234  0.07225  0.07222

Proof Let 119892(119909) = 119909119902119909 (119890minus119890minus1 lt 119902 lt 1) and according toLemma 7 then we have

[g(i) + g(c − i)]/2 < g(c/2)  (taking c = k_1) ⟹
[i × q^i + (k_1 − i) × q^{k_1−i}]/2 < (k_1/2) × q^{k_1/2} ⟹
i × q^i + (k_1 − i) × q^{k_1−i} < k_1 × q^{k_1/2} ⟹
i × [1/k_1 + (1 − q^{k_1}) × (1/i + 1 − q^i)] + (k_1 − i) × [1/k_1 + (1 − q^{k_1}) × (1/(k_1 − i) + 1 − q^{k_1−i})]
  > k_1 × [1/k_1 + (1 − q^{k_1}) × (2/k_1 + 1 − q^{k_1/2})] ⟹
[i × E_2(i) + (k_1 − i) × E_2(k_1 − i)] / k_1 − [k_1 × E_2(k_1/2)] / k_1 > 0.    (17)

The proof is completed

Therefore, if every group having k_1 (k_1 is an even number and k_1 > 1) objects in the 1st layer needs to be subdivided into two subgroups, Scheme 16 is more efficient than Scheme 15.

The experimental results in Table 5 verify the above conclusion. Let p = 0.004 and k_1 = 16. When every subgroup contains 8 objects in the 2nd layer, the expectation of classification times for each object reaches its minimum value, where k_21 is the number of objects in the one subgroup in the 2nd layer, k_22 is the number in the other subgroup, and E_2 is the corresponding expectation of classification times for each object.
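Table 5 can be reproduced with a short script (a sketch assuming, per formula (6), that E_2(x) = 1/k_1 + (1 − q^{k_1})(1/x + 1 − q^x); the helper names are ours):

```python
p, k1 = 0.004, 16
q = 1 - p

def E2(x):
    # two-level per-object expectation when one 2nd-layer subgroup has x objects
    return 1 / k1 + (1 - q ** k1) * (1 / x + 1 - q ** x)

def avg(i):
    # average expectation over a split into subgroups of i and k1 - i objects
    return (i * E2(i) + (k1 - i) * E2(k1 - i)) / k1

table = {i: round(avg(i), 5) for i in range(1, k1 // 2 + 1)}
best = min(table, key=table.get)
print(table[best], best)   # 0.07222 at the even split (8, 8), as in Table 5
```

The remaining entries agree with Table 5 up to rounding in the last printed digit.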

(b) If k_1 is an even number, every group which contains k_1 objects in the 1st layer will be subdivided into three subgroups in the 2nd layer.

Scheme 17. In the 2nd layer, if the first subgroup has i (1 ≤ i < k_1/2) objects, the expectation of classification times for each object is E_2(i); if the second subgroup has j (1 ≤ j < k_1/2) objects, the expectation of classification times for each object is E_2(j); then the third subgroup has (k_1 − i − j) objects, and the expectation of classification times for each object is E_2(k_1 − i − j). So the average expectation of classification times for each object in the 2nd layer is as follows:

[i × E_2(i) + j × E_2(j) + (k_1 − i − j) × E_2(k_1 − i − j)] / k_1.    (18)

Similarly, it is easy to prove that Scheme 16 is also more efficient than Scheme 17. In other words, we only need to prove the following inequality:

[i × E_2(i) + j × E_2(j) + (k_1 − i − j) × E_2(k_1 − i − j)] / k_1 − [k_1 × E_2(k_1/2)] / k_1 > 0,
  (e^{−e^{−1}} < q < 1, k_1 > 1).    (19)

Proof. Let g(x) = x q^x (e^{−e^{−1}} < q < 1); according to Lemma 7, we have

[g(t) + g(c − t)]/2 < g(c/2)  (taking c = k_1) ⟹
[t × q^t + (k_1 − t) × q^{k_1−t}]/2 < (k_1/2) × q^{k_1/2} ⟹
[i × q^i + j × q^j + (k_1 − i − j) × q^{k_1−i−j}]/2 < (k_1/2) × q^{k_1/2} ⟹
i × q^i + j × q^j + (k_1 − i − j) × q^{k_1−i−j} < k_1 × q^{k_1/2} ⟹
i × [1/k_1 + (1 − q^{k_1}) × (1/i + 1 − q^i)] + j × [1/k_1 + (1 − q^{k_1}) × (1/j + 1 − q^j)]
  + (k_1 − i − j) × [1/k_1 + (1 − q^{k_1}) × (1/(k_1 − i − j) + 1 − q^{k_1−i−j})]
  > k_1 × [1/k_1 + (1 − q^{k_1}) × (2/k_1 + 1 − q^{k_1/2})] ⟹
[i × E_2(i) + j × E_2(j) + (k_1 − i − j) × E_2(k_1 − i − j)] / k_1 − [k_1 × E_2(k_1/2)] / k_1 > 0.    (20)

The proof is completed

Therefore, if every group which contains k_1 (k_1 is an even number and k_1 > 1) objects in the 1st layer needs to be subdivided into three subgroups, Scheme 16 is more efficient than Scheme 17.

The experimental results in Table 6 verify the above conclusion. Let p = 0.004 and k_1 = 16. When every subgroup contains 8 objects in the 2nd layer, the average expectation of classification times for each object reaches its minimum value. In Table 6, the first row stands for the number of objects in the first subgroup in the 2nd layer, the first column stands for the number of objects in the second subgroup, and each entry is the corresponding average expectation of classification times. For example, the entry (1, 1, 7.7143) expresses that the numbers of objects in the three subgroups are 1, 1, and 14, respectively, and the average classification times for each object in the 2nd layer is E_2 = 0.077143.

Table 6: The changes of the average expectation with different objects numbers in the three groups (entries ×10^{−2}).

     1       2       3       4       5
1    7.7143  7.6786  7.6488  7.6250  7.6072
2    7.6786  7.6458  7.6189  7.5980  7.5831
3    7.6488  7.6189  7.5950  7.5770  7.5651
4    7.6250  7.5980  7.5770  7.5620  7.5530
5    7.6072  7.5831  7.5651  7.5530  7.5470
6    7.5953  7.5742  7.5591  7.5500  7.5470
7    7.5894  7.5712  7.5591  7.5530  7.5530
8    7.5894  7.5742  7.5651  7.5620  7.5651
9    7.5953  7.5831  7.5770  7.5770  7.5980
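A similar check (same assumed E_2 form, p = 0.004, k_1 = 16; `avg3` is a hypothetical helper of ours) confirms that even the best three-way split of an abnormal group is worse than the even two-way split:

```python
p, k1 = 0.004, 16
q = 1 - p

def E2(x):
    # per-object expectation when one 2nd-layer subgroup has x objects
    return 1 / k1 + (1 - q ** k1) * (1 / x + 1 - q ** x)

def avg3(i, j):
    # average expectation for a split into subgroups of i, j and k1 - i - j objects
    r = k1 - i - j
    return (i * E2(i) + j * E2(j) + r * E2(r)) / k1

best3 = min(avg3(i, j) for i in range(1, k1 - 1) for j in range(1, k1 - i))
even2 = E2(k1 // 2)      # average expectation of the even two-way split
assert even2 < best3     # Scheme 16 beats every Scheme 17 split
```

The minimum three-way value is ≈ 0.075470 (the bottom of Table 6), still above the two-way optimum ≈ 0.072225.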

(c) When an abnormal group contains k_1 (even) objects and needs to be further granulated into the 2nd layer, Scheme 16 still has the best efficiency.

There are two granulation schemes: Scheme 18, in which the abnormal groups in the 1st layer are randomly subdivided into s (s < k_1) subgroups, and Scheme 16, in which the abnormal groups in the 1st layer are evenly subdivided into two subgroups.

Scheme 18. Suppose an abnormal group is subdivided into s (s < k_1) subgroups. The first subgroup has x_1 (1 ≤ x_1 < k_1/2) objects, and the average expectation of classification times for each object is E_2(x_1); the second subgroup has x_2 (1 ≤ x_2 < k_1/2) objects, and the average expectation of classification times for each object is E_2(x_2); …; the i-th subgroup has x_i (1 ≤ x_i < k_1/2) objects, and the average expectation of classification times for each object is E_2(x_i); …; the s-th subgroup has x_s (1 ≤ x_s < k_1/2) objects, and the average expectation of classification times for each object is E_2(x_s). Hence, the average expectation of classification times for each object in the 2nd layer is as follows:

(1/k_1) × Σ_{j=1}^{s} x_j × E_2(x_j).    (21)

Similarly, in order to prove that Scheme 16 is more efficient than Scheme 18, we only need to prove the following inequality:

(1/k_1) × Σ_{j=1}^{s} x_j × E_2(x_j) − [k_1 × E_2(k_1/2)] / k_1 > 0,
  (e^{−e^{−1}} < q < 1, k_1 > 1).    (22)

Proof. Let g(x) = x q^x (e^{−e^{−1}} < q < 1); according to Lemma 7, we have

[Σ_{i=1}^{s} g(x_i)]/2 < g(c/2)  (taking c = k_1) ⟹
[Σ_{i=1}^{s} x_i × q^{x_i}]/2 < (k_1/2) × q^{k_1/2} ⟹
Σ_{j=1}^{s} x_j × [1/k_1 + (1 − q^{k_1}) × (1/x_j + 1 − q^{x_j})]
  > k_1 × [1/k_1 + (1 − q^{k_1}) × (2/k_1 + 1 − q^{k_1/2})] ⟹
(1/k_1) × Σ_{j=1}^{s} x_j × E_2(x_j) − [k_1 × E_2(k_1/2)] / k_1 > 0.    (23)

The proof is completed

Therefore, when every abnormal group which contains k_1 (k_1 is an even number and k_1 > 1) objects in the 1st layer needs to be granulated into many subgroups, Scheme 16 is more efficient than the other schemes.
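The claim of (c) can also be spot-checked by brute force: enumerating every partition of k_1 = 16 into parts smaller than k_1/2 (a sketch with our own helper names, same assumed E_2 form) shows that no Scheme 18 split beats the even two-way split:

```python
p, k1 = 0.004, 16
q = 1 - p

def E2(x):
    return 1 / k1 + (1 - q ** k1) * (1 / x + 1 - q ** x)

def partitions(n, max_part):
    # all multisets of parts (each between 1 and max_part) summing to n
    if n == 0:
        yield ()
        return
    for part in range(min(n, max_part), 0, -1):
        for rest in partitions(n - part, part):
            yield (part,) + rest

even_split = E2(k1 // 2)
best_uneven = min(sum(x * E2(x) for x in ptn) / k1
                  for ptn in partitions(k1, k1 // 2 - 1))
assert even_split < best_uneven   # the even two-way split beats every Scheme 18 split
```

The best partition into parts smaller than 8 reaches ≈ 0.075470, again above the two-way optimum ≈ 0.072225.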

(d) In a similar way, when every abnormal group which contains k_1 (k_1 is an odd number and k_1 > 1) objects in the 1st layer is granulated into many subgroups, the best scheme is that every abnormal group is subdivided into two subgroups as evenly as possible, namely, the subgroups contain (k_1 − 1)/2 and (k_1 + 1)/2 objects in the 2nd layer.

Case 2 (granulating abnormal groups from the 1st layer to the i-th layer).

Theorem 19. In the i-th layer, if the number of objects of each abnormal group is more than 4, then the total classification times can be reduced by continuing to subdivide the abnormal groups into two subgroups containing equal numbers of objects as far as possible. Namely, if each group contains k_i objects in the i-th layer, then each subgroup contains k_i/2, or (k_i − 1)/2, or (k_i + 1)/2 objects in the (i + 1)-th layer.

Proof. In the multigranulation method, the number of objects of each subgroup in the next layer is determined by the number of objects of its group in the current layer. In other words, the number of objects of each subgroup in the (i + 1)-th layer is determined by the known number of objects of each group in the i-th layer.

According to the recursive idea, the process of granulating an abnormal group from the i-th layer into the (i + 1)-th layer is similar to that from the 1st layer into the 2nd layer. It is known that the most efficient way when granulating an abnormal group from the 1st layer into the 2nd layer is to subdivide it as evenly as possible into two subgroups. Therefore, the most efficient way is also to subdivide each abnormal group in the i-th layer as evenly as possible into two subgroups in the (i + 1)-th layer. The proof is completed.

Based on k_1, the optimum number of objects of each group in the 1st layer, the optimum granulation levels and the corresponding number of objects of each group can be obtained by Theorem 19. That is to say, k_{i+1} = k_i/2 (or k_{i+1} = (k_i − 1)/2, or k_{i+1} = (k_i + 1)/2), where k_i (k_i > 4) is the number of objects of each abnormal group in the i-th (1 ≤ i ≤ s − 1) layer and s is the optimum number of granulation levels. Namely, in this multilevel granulation method, the final structure of granulating an abnormal group from the 1st layer to the last layer is similar to a binary tree, and the original space can be granulated into a structure which contains many binary trees.

Table 7: The best testing strategy in different layers with the different prevalence rates.

p          E_i              (k_1, k_2, ..., k_i)
0.01       0.157649743271   (11, 5, 2)
0.001      0.034610328332   (32, 16, 8, 4)
0.0001     0.010508158027   (101, 50, 25, 12, 6, 3)
0.00001    0.003301655870   (317, 158, 79, 39, 19, 9, 4)
0.000001   0.001041044160   (1001, 500, 250, 125, 62, 31, 15, 7, 3)

According to Theorem 19, this multigranulation strategy can be used to solve the blood analysis case. For the different prevalence rates p_1 = 0.01, p_2 = 0.001, p_3 = 0.0001, p_4 = 0.00001, and p_5 = 0.000001, the best searching strategy, namely the number of objects of each group in the different layers, is shown in Table 7 (k_i stands for the number of objects of each group in the i-th layer, and E_i stands for the average expectation of classification times for each object).
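Theorem 19's halving rule can be written directly (a sketch; odd sizes are rounded down here, which is one of the allowed choices (k_i − 1)/2), and it reproduces the level sequences listed in Table 7:

```python
def level_sizes(k1):
    """Group sizes per layer: halve (floor) while the current size exceeds 4."""
    ks = [k1]
    while ks[-1] > 4:
        ks.append(ks[-1] // 2)
    return ks

# first-layer sizes k1 taken from Table 7 for the five prevalence rates
for k1, expected in [(11, [11, 5, 2]),
                     (32, [32, 16, 8, 4]),
                     (101, [101, 50, 25, 12, 6, 3]),
                     (317, [317, 158, 79, 39, 19, 9, 4]),
                     (1001, [1001, 500, 250, 125, 62, 31, 15, 7, 3])]:
    assert level_sizes(k1) == expected
```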

Theorem 20. In the above multilevel granulation method, if p, which is the prevalence rate of a sickness (or the negative sample ratio in the domain), tends to 0, the average classification times for each object tend to 1/k_1; in other words, the following equation always holds:

lim_{p→0} E_i = 1/k_1.    (24)

Proof. According to Definition 12, let q = 1 − p, q → 1; we have

E_i = 1/k_1 + (1 − q^{k_1}) × (1/k_2 + (1 − q^{k_2}) × (1/k_3 + (1 − q^{k_3}) × (1/k_4 + (1 − q^{k_4}) × (⋯ × (1/k_{i−1} + (1 − q^{k_{i−1}}) × (1/k_i + (1 − q^{k_i}))) ⋯)))).    (25)

According to Lemma 6, k_1 = [1/(p + p²/2)] or k_1 = [1/(p + p²/2)] + 1. Then let

T = 1/k_2 + (1 − q^{k_2}) × (1/k_3 + (1 − q^{k_3}) × (1/k_4 + (1 − q^{k_4}) × (⋯ × (1/k_{i−1} + (1 − q^{k_{i−1}}) × (1/k_i + (1 − q^{k_i}))) ⋯))).    (26)

Figure 3: The changing trend of T and E with q.

Since T is bounded, lim_{q→1} (1 − q^{k_1}) × T = 0, and therefore lim_{q→1} E_i = 1/k_1. The proof is completed.

Let E = (1/k_1)/E_i. The changing trends of T and E with the variable q are shown in Figure 3.
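Theorem 20 can also be checked numerically with the nested expectation form of (25) (a sketch; `expectation` is our helper name, and the level sequence is the p = 10^{−6} strategy from Table 7):

```python
def expectation(q, ks):
    # nested per-object expectation as in (25), innermost term 1/k_i + (1 - q^k_i)
    acc = 1.0 / ks[-1] + (1.0 - q ** ks[-1])
    for k in reversed(ks[:-1]):
        acc = 1.0 / k + (1.0 - q ** k) * acc
    return acc

ks = [1001, 500, 250, 125, 62, 31, 15, 7, 3]   # strategy for p = 1e-6 (Table 7)
for p in (1e-6, 1e-8, 1e-10):
    ratio = ks[0] * expectation(1.0 - p, ks)
    print(p, ratio)   # the ratio k1 * E_i approaches 1 as p -> 0
```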

3.5. Binary Classification of Multigranulation Searching Algorithm. In this paper, a kind of efficient binary classification of multigranulation searching algorithm is proposed by discussing the best testing strategy of the blood analysis case. The algorithm is illustrated as follows.

Algorithm 21. Binary classification of multigranulation searching algorithm (BCMSA).

Input: A probability quotient space Q = (U, 2^U, p).
Output: The average expectation of classification times of each object, E.
Step 1. Obtain k_1 based on Lemma 6; initialize i = 1, j = 0, and searching_numbers = 0.
Step 2. Randomly divide U_ij into s_i subgroups U_i1, U_i2, ..., U_{i s_i} (s_i = N_ij / k_i, where N_ij stands for the number of objects in U_ij, and U_10 = U_1).
Step 3. For i to ⌊log_2 N_ij⌋ (⌊log_2 N_ij⌋ stands for log_2 N_ij rounded down to the nearest integer):
  For j to s_i:
    If Test(U_ij) > 0 and N_ij > 4, then searching_numbers + 1, U_{i+1} = U_ij, i + 1, and go to Step 2 (Test is the group classification function).
    If Test(U_ij) = 0, then searching_numbers + 1, i + 1.
    If N_ij ≤ 4, go to Step 4.
Step 4. searching_numbers = searching_numbers + Σ N_ij (the remaining small abnormal groups are tested object by object); E = searching_numbers/N.

Step 5. Return E.

The algorithm flowchart is shown in Figure 4.

Figure 4: Flowchart of BCMSA.

Complexity Analysis of Algorithm 21. In this algorithm, the best case is that the prevalence rate p tends to 0; the number of classification times is N × E_i ≈ N/k_1 ≈ N × (p + p²/2), which tends to 1, so the time complexity of computing is O(1). The worst case is that p tends to 0.3; the classification times then tend to N, so the time complexity of computing is O(N).

4. Comparative Analysis of Experimental Results

In order to verify the efficiency of the proposed BCMSA, suppose there are two large domains with N_1 = 1 × 10^4 and N_2 = 100 × 10^4 objects and five different prevalence rates: p_1 = 0.01, p_2 = 0.001, p_3 = 0.0001, p_4 = 0.00001, and p_5 = 0.000001. In the experiment of the blood analysis case, the number "0" stands for a sick sample (negative sample) and "1" stands for a healthy sample (positive sample); then N numbers are randomly generated, in which the probability of generating "0" is p and the probability of generating "1" is 1 − p, and these numbers stand for all the domain objects. The binary classifier sums all the numbers in a group (subgroup): if the sum is less than the number of objects in the group, the group contains at least one "0" and is tested to be abnormal; if the sum equals the number of objects in the group, the group is tested to be normal.

The experimental environment is a machine with 4 GB RAM and a 2.5 GHz CPU running Windows 8; the programming language is Python, and the experimental results are shown in Table 8.

In Table 8, item "p" stands for the prevalence rate, item "Levels" stands for the granulation levels of the different methods, and item "E(X)" stands for the average expectation of classification times for each object. Item "k_1" stands for the number of objects of each group in the 1st layer. Item "ℓ" stands for the degree to which E(X) is close to 1/k_1. N_1 = 1 × 10^4 and N_2 = 1 × 10^6 stand for the numbers of objects of the two original domains. Items "Method 9" and "Method 10" stand for the efficiency improvement of Method 11 compared with Method 9 and Method 10, respectively.

From Table 8, diagnosing all objects requires 10000 classification times with Method 9 (the traditional method), 201 classification times with Method 10 (the single-level grouping method), and only 113 classification times with Method 11 (the multilevel grouping method) when N_1 = 1 × 10^4 and p = 0.0001. Obviously, the proposed algorithm is more efficient than Method 9 and Method 10; the classification times are reduced by 98.89% and 47.33%, respectively. At the same time, as the probability p gradually decreases, BCMSA becomes increasingly more efficient than Method 10, and ℓ tends to 100%; that is to say, the average classification times for each object tend to 1/k_1 in BCMSA.


Table 8: Comparative results of efficiency among the 3 kinds of methods.

p         Levels        E(X)           k_1   ℓ (%)  N_1    N_2     Method 9 (%)  Method 10 (%)
0.01      Single-level  0.19557083665  11    46.48  1944   195817  81.44         —
0.01      2 levels      0.15764974327  11    57.67  1627   164235  83.58         16.13
0.001     Single-level  0.06275892424  32    49.79  633    62674   94.72         —
0.001     4 levels      0.03461032833  32    90.29  413    41184   96.82         34.62
0.0001    Single-level  0.01995065634  101   49.62  201    19799   98.00         —
0.0001    6 levels      0.01050815802  101   94.22  113    11212   98.89         47.33
0.00001   Single-level  0.00631957079  318   49.76  63.3   6325    99.37         —
0.00001   7 levels      0.00330165587  318   95.26  33.3   3324    99.67         47.75
0.000001  Single-level  0.00199950067  1001  49.96  15     2001    99.80         —
0.000001  9 levels      0.00104104416  1001  95.96  15     1022    99.89         47.94

Figure 5: Comparative analysis between the 2 kinds of methods (the single-level granulation method versus the multilevel granulation method).

In addition, BCMSA can save 0–50% of the classification times compared with Method 10. The efficiency of Method 10 (the single-level granulation method) and Method 11 (the multilevel granulation method) is shown in Figure 5, where the x-axis stands for the prevalence rate (or the negative sample ratio) and the y-axis stands for the average expectation of classification times for each object.

In this paper, BCMSA is proposed, and it can greatly improve searching efficiency when dealing with complex searching problems. If there is a binary classifier which is valid not only for a single object but also for a group with many objects, the efficiency of searching all objects will be enhanced by BCMSA, as in the blood analysis case. At the same time, it may play an important role in promoting the development of granular computing. Of course, this algorithm also has some limitations. For example, if the prevalence rate of a sickness (or the occurrence rate of event A) satisfies p > 0.3, it has no advantage over the traditional method; in other words, the original problem need not be subdivided into many subproblems when p > 0.3. And when the prevalence rate of a sickness (or the negative sample ratio in the domain) is unknown, this algorithm needs to be further improved so that it can adapt to the new environment.

5. Conclusions

With the development of intelligent computation, multigranulation computing has gradually become an important tool for processing complex problems. Especially in the process of knowledge cognition, granulating a huge problem into lots of small subproblems means simplifying the original complex problem and dealing with these subproblems in different granularity spaces [64]. This hierarchical computing model is very effective for getting a complete or approximate solution of the original problem due to its idea of divide and conquer. Recently, many scholars have paid attention to efficient searching algorithms based on granular computing theory. For example, a kind of algorithm for dealing with complex networks on the basis of the quotient space model was proposed by L. Zhang and B. Zhang [65]. In this paper, combining the hierarchical multigranulation computing model with the principles of probability statistics, a new efficient binary classification multigranulation searching algorithm is established on the basis of the mathematical expectation of probability statistics, and this searching algorithm is constructed according to a recursive method in multigranulation spaces. Many experimental results have shown that the proposed method is effective and can save a large number of classification times. These results may promote the development of intelligent computation and speed up the application of multigranulation computing. However, this method also has some shortcomings. On the one hand, it has a strict limitation on the probability value of p, namely p < 0.3; if p > 0.3, the proposed searching algorithm is probably not the most effective method, and improved methods need to be found. On the other hand, it needs a binary classifier which is valid not only for a single object but also for a group with many objects. In the end, as the probability value of p decreases (even infinitely close to zero), the mathematical expectation of searching times for every object gradually approaches 1/k_1. In our future research, we will focus on the issue of how to granulate the huge granule space without any probability value of each object, and try our best to establish a kind of effective searching algorithm under which we do not know the probability of negative samples in the domain. We hope these researches can promote the development of artificial intelligence.


Competing Interests

The authors declare that they have no conflict of interests related to this work.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (no. 61472056) and the Natural Science Foundation of Chongqing of China (no. CSTC2013jjb40003).

References

[1] A. Gacek, "Signal processing and time series description: a perspective of computational intelligence and granular computing," Applied Soft Computing Journal, vol. 27, pp. 590–601, 2015.

[2] O. Hryniewicz and K. Kaczmarek, "Bayesian analysis of time series using granular computing approach," Applied Soft Computing, vol. 47, pp. 644–652, 2016.

[3] C. Liu, "Covering multi-granulation rough sets based on maximal descriptors," Information Technology Journal, vol. 13, no. 7, pp. 1396–1400, 2014.

[4] Z. Y. Li, "Covering-based multi-granulation decision-theoretic rough sets model," Journal of Lanzhou University, no. 2, pp. 245–250, 2014.

[5] Y. Y. Yao and Y. She, "Rough set models in multigranulation spaces," Information Sciences, vol. 327, pp. 40–56, 2016.

[6] J. Xu, Y. Zhang, D. Zhou, et al., "Uncertain multi-granulation time series modeling based on granular computing and the clustering practice," Journal of Nanjing University, vol. 50, no. 1, pp. 86–94, 2014.

[7] Y. T. Guo, "Variable precision β multi-granulation rough sets based on limited tolerance relation," Journal of Minnan Normal University, no. 1, pp. 1–11, 2015.

[8] Y. Xu, J. H. Yang, and X. Ji, "Neighborhood multi-granulation rough set model based on double granulate criterion," Control and Decision, vol. 30, no. 8, pp. 1469–1478, 2015.

[9] L. A. Zadeh, "Towards a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic," Fuzzy Sets and Systems, vol. 19, pp. 111–127, 1997.

[10] J. R. Hobbs, "Granularity," in Proceedings of the 9th International Joint Conference on Artificial Intelligence, Los Angeles, Calif, USA, 1985.

[11] L. Zhang and B. Zhang, "Theory of fuzzy quotient space (methods of fuzzy granular computing)," Journal of Software, vol. 14, no. 4, pp. 770–776, 2003.

[12] J. Li, C. Mei, W. Xu, and Y. Qian, "Concept learning via granular computing: a cognitive viewpoint," Information Sciences, vol. 298, no. 1, pp. 447–467, 2015.

[13] X. Hu, W. Pedrycz, and X. Wang, "Comparative analysis of logic operators: a perspective of statistical testing and granular computing," International Journal of Approximate Reasoning, vol. 66, pp. 73–90, 2015.

[14] M. G. C. A. Cimino, B. Lazzerini, F. Marcelloni, and W. Pedrycz, "Genetic interval neural networks for granular data regression," Information Sciences, vol. 257, pp. 313–330, 2014.

[15] P. Honko, "Upgrading a granular computing based data mining framework to a relational case," International Journal of Intelligent Systems, vol. 29, no. 5, pp. 407–438, 2014.

[16] M.-Y. Chen and B.-T. Chen, "A hybrid fuzzy time series model based on granular computing for stock price forecasting," Information Sciences, vol. 294, pp. 227–241, 2015.

[17] R. Al-Hmouz, W. Pedrycz, and A. Balamash, "Description and prediction of time series: a general framework of Granular Computing," Expert Systems with Applications, vol. 42, no. 10, pp. 4830–4839, 2015.

[18] M. Hilbert, "Big data for development: a review of promises and challenges," Social Science Electronic Publishing, vol. 34, no. 1, pp. 135–174, 2016.

[19] T. J. Sejnowski, S. P. Churchland, and J. A. Movshon, "Putting big data to good use in neuroscience," Nature Neuroscience, vol. 17, no. 11, pp. 1440–1441, 2014.

[20] G. George, M. R. Haas, and A. Pentland, "Big data and management," Academy of Management Journal, vol. 30, no. 2, pp. 39–52, 2014.

[21] M. Chen, S. Mao, and Y. Liu, "Big data: a survey," Mobile Networks and Applications, vol. 19, no. 2, pp. 171–209, 2014.

[22] X. Wu, X. Zhu, G. Q. Wu, and W. Ding, "Data mining with big data," IEEE Transactions on Knowledge & Data Engineering, vol. 26, no. 1, pp. 97–107, 2014.

[23] Y. Shuo and Y. Lin, "Decomposition of decision systems based on granular computing," in Proceedings of the IEEE International Conference on Granular Computing (GrC '11), pp. 590–595, Garden Villa, Kaohsiung, Taiwan, 2011.

[24] H. Hu and Z. Zhong, "Perception learning as granular computing," Natural Computation, vol. 3, pp. 272–276, 2008.

[25] Z.-H. Chen, Y. Zhang, and G. Xie, "Mining algorithm for concise decision rules based on granular computing," Control and Decision, vol. 30, no. 1, pp. 143–148, 2015.

[26] K. Kambatla, G. Kollias, V. Kumar, and A. Grama, "Trends in big data analytics," Journal of Parallel & Distributed Computing, vol. 74, no. 7, pp. 2561–2573, 2014.

[27] A. Katal, M. Wazid, and R. H. Goudar, "Big data: issues, challenges, tools and good practices," in Proceedings of the 6th International Conference on Contemporary Computing (IC3 '13), pp. 404–409, IEEE, New Delhi, India, August 2013.

[28] V. Cevher, S. Becker, and M. Schmidt, "Convex optimization for big data: scalable, randomized, and parallel algorithms for big data analytics," IEEE Signal Processing Magazine, vol. 31, no. 5, pp. 32–43, 2014.

[29] J. Fan, F. Han, and H. Liu, "Challenges of big data analysis," National Science Review, vol. 1, no. 2, pp. 293–314, 2014.

[30] Q. H. Zhang, K. Xu, and G. Y. Wang, "Fuzzy equivalence relation and its multigranulation spaces," Information Sciences, vol. 346-347, pp. 44–57, 2016.

[31] Z. Liu and Y. Hu, "Multi-granularity pattern ant colony optimization algorithm and its application in path planning," Journal of Central South University (Science and Technology), vol. 9, pp. 3713–3722, 2013.

[32] Q. H. Zhang, G. Y. Wang, and X. Q. Liu, "Hierarchical structure analysis of fuzzy quotient space," Pattern Recognition and Artificial Intelligence, vol. 21, no. 5, pp. 627–634, 2008.

[33] Z. C. Shi, Y. X. Xia, and J. Z. Zhou, "Discrete algorithm based on granular computing and its application," Computer Science, vol. 40, pp. 133–135, 2013.

[34] Y. P. Zhang, B. Luo, Y. Y. Yao, D. Q. Miao, L. Zhang, and B. Zhang, Quotient Space and Granular Computing: The Theory and Method of Problem Solving on Structured Problems, Science Press, Beijing, China, 2010.

[35] G. Y. Wang, Q. H. Zhang, and J. Hu, "A survey on the granular computing," Transactions on Intelligent Systems, vol. 6, no. 2, pp. 8–26, 2007.

[36] J. Jonnagaddala, R. T. Jue, and H. J. Dai, "Binary classification of Twitter posts for adverse drug reactions," in Proceedings of the Social Media Mining Shared Task Workshop at the Pacific Symposium on Biocomputing, pp. 4–8, Big Island, Hawaii, USA, 2016.

[37] M. Haungs, P. Sallee, and M. Farrens, "Branch transition rate: a new metric for improved branch classification analysis," in Proceedings of the International Symposium on High-Performance Computer Architecture (HPCA '00), pp. 241–250, 2000.

[38] R. W. Proctor and Y. S. Cho, "Polarity correspondence: a general principle for performance of speeded binary classification tasks," Psychological Bulletin, vol. 132, no. 3, pp. 416–442, 2006.

[39] T. H. Chow, P. Berkhin, E. Eneva, et al., "Evaluating performance of binary classification systems," US patent US 8554622 B2, 2013.

[40] D. G. Li, D. Q. Miao, D. X. Zhang, and H. Y. Zhang, "An overview of granular computing," Computer Science, vol. 9, pp. 1–12, 2005.

[41] X. Gang and L. Jing, "A review of the present studying state and prospect of granular computing," Journal of Software, vol. 3, pp. 5–10, 2011.

[42] L. X. Zhong, "The predication about optimal blood analyze method," Academic Forum of Nandu, vol. 6, pp. 70–71, 1996.

[43] X. Mingmin and S. Junli, "The mathematical proof of method of group blood test and a new formula in quest of optimum number in group," Journal of Sichuan Institute of Building Materials, no. 1, pp. 97–104, 1986.

[44] B. Zhang and L. Zhang, "Discussion on future development of granular computing," Journal of Chongqing University of Posts and Telecommunications: Natural Science Edition, vol. 22, no. 5, pp. 538–540, 2010.

[45] A. Skowron, J. Stepaniuk, and R. Swiniarski, "Modeling rough granular computing based on approximation spaces," Information Sciences, vol. 184, no. 1, pp. 20–43, 2012.

[46] J. T. Yao, A. V. Vasilakos, and W. Pedrycz, "Granular computing: perspectives and challenges," IEEE Transactions on Cybernetics, vol. 43, no. 6, pp. 1977–1989, 2013.

[47] Y. Y. Yao, N. Zhang, D. Q. Miao, and F. F. Xu, "Set-theoretic approaches to granular computing," Fundamenta Informaticae, vol. 115, no. 2-3, pp. 247–264, 2012.

[48] H. Li and X. P. Ma, "Research on four-element model of granular computing," Computer Engineering and Applications, vol. 49, no. 4, pp. 9–13, 2013.

[49] J. Hu and C. Guan, "Granular computing model based on quantum computing theory," in Proceedings of the 10th International Conference on Computational Intelligence and Security, pp. 156–160, November 2014.

[50] Y. Shuo and Y. Lin, "Decomposition of decision systems based on granular computing," in Proceedings of the IEEE International Conference on Granular Computing (GrC '11), pp. 590–595, IEEE, Kaohsiung, Taiwan, November 2011.

[51] F Li J Xie and K Xie ldquoGranular computing theory in theapplicatiorlpf fault diagnosisrdquo in Proceedings of the ChineseControl and Decision Conference (CCDC rsquo08) pp 595ndash597 July2008

[52] Q-H Zhang Y-K Xing and Y-L Zhou ldquoThe incrementalknowledge acquisition algorithm based on granular comput-ingrdquo Journal of Electronics and Information Technology vol 33no 2 pp 435ndash441 2011

[53] Y Zeng Y Y Yao and N Zhong ldquoThe knowledge search baseon the granular structurerdquo Computer Science vol 35 no 3 pp194ndash196 2008

[54] G-Y Wang Q-H Zhang X-A Ma and Q-S Yang ldquoGranularcomputing models for knowledge uncertaintyrdquo Journal of Soft-ware vol 22 no 4 pp 676ndash694 2011

[55] J Li Y Ren C Mei Y Qian and X Yang ldquoA comparativestudy of multigranulation rough sets and concept lattices viarule acquisitionrdquoKnowledge-Based Systems vol 91 pp 152ndash1642016

[56] H-L Yang and Z-L Guo ldquoMultigranulation decision-theoreticrough sets in incomplete information systemsrdquo InternationalJournal of Machine Learning amp Cybernetics vol 6 no 6 pp1005ndash1018 2015

[57] M AWaller and S E Fawcett ldquoData science predictive analyt-ics and big data a revolution that will transform supply chaindesign and managementrdquo Journal of Business Logistics vol34 no 2 pp 77ndash84 2013

[58] R Kitchin ldquoThe real-time city Big data and smart urbanismrdquoGeoJournal vol 79 no 1 pp 1ndash14 2014

[59] X Dong and D Srivastava Big Data Integration Morgan ampClaypool 2015

[60] L Zhang andB ZhangTheory andApplications of ProblemSolv-ing Quotient Space Based Granular Computing (The SecondVersion) Tsinghua University Press Beijing China 2007

[61] L Zhang and B Zhang ldquoThe quotient space theory of problemsolvingrdquo in Rough Sets Fuzzy Sets Data Mining and GranularComputing G Wang Q Liu Y Yao and A Skowron Eds vol2639 of Lecture Notes in Computer Science pp 11ndash15 SpringerBerlin Germany 2003

[62] J Sheng S Q Xie and C Y Pan Probability Theory andMathematical Statistics Higher Education Press Beijing China4th edition 2008

[63] L Z Zhang X Zhao and Y Ma ldquoThe simple math demon-stration and rrecise calculation method of the blood grouptestrdquoMathematics in Practice and Theory vol 22 pp 143ndash146 2010

[64] J Chen S Zhao and Y Zhang ldquoHierarchical covering algo-rithmrdquo Tsinghua Science amp Technology vol 19 no 1 pp 76ndash812014

[65] L Zhang and B Zhang ldquoDynamic quotient space model and itsbasic propertiesrdquo Pattern Recognition and Artificial Intelligencevol 25 no 2 pp 181ndash185 2012


Mathematical Problems in Engineering 5

Figure 2: Double-levels granulation (a tree with N at the root, first-layer groups of k_1 objects, and second-layer subgroups of k_2 objects).

traditional method. Let N = 10000; the total of classification times is approximately equal to 628, as follows:

N \times E_1(Y_1) = 10000 \times \left[ \frac{1}{32} \times q^{32} + \left( 1 + \frac{1}{32} \right) \times \left( 1 - q^{32} \right) \right] \approx 628.  (5)

This shows that this method can greatly improve the efficiency of diagnosis: the single-level granulation method reduces the classification times by 93.72%. If the prevalence rate is extremely low, for example, p = 0.000001, the total of classification times reaches its minimum value when each group contains 1001 objects (namely, k_1 = 1001). If every group is subdivided into many smaller subgroups and the above method is repeated, can the total of classification times be further reduced?
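The arithmetic above can be checked numerically. The sketch below is our own illustration, not code from the paper: it evaluates the single-level expectation E_1(Y_1) = 1/k_1 + (1 - q^{k_1}) and searches for the optimal group size by brute force (function names are ours).

```python
def expected_tests_single_level(k1: int, p: float) -> float:
    """Average tests per object: one shared group test, plus a full retest
    of all k1 objects whenever the group is abnormal (E1(Y1))."""
    q = 1.0 - p                       # probability that one object is negative
    return 1.0 / k1 + (1.0 - q ** k1)

N, p = 10000, 0.001
total = N * expected_tests_single_level(32, p)      # about 628 tests in all

# Brute-force search for the optimal group size k1.
best_k1 = min(range(1, 2000), key=lambda k: expected_tests_single_level(k, p))
```

With p = 0.001 this reproduces the minimum at k_1 = 32; for p = 0.000001 the same search lands near the paper's k_1 = 1001, although the minimum there is extremely flat, so adjacent group sizes differ only far past the decimal point.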

3.2. The Double-Levels Granulation. After the objects of the domain are granulated by the method of Section 3.1, the original object space becomes a new quotient space in which each group has k_1 objects. According to the falsity-preserving and truth-preserving principles in quotient space theory, a group that is tested to be abnormal can be granulated into many smaller subgroups. The double-levels granulation is shown in Figure 2.

The probability distribution of the double-levels granulation is discussed as follows.

If each group contains k_1 objects and is tested once in the 1st layer, the average number of classification times is 1/k_1 for each object. Similarly, the average number of classification times of each object is 1/k_2 in the 2nd layer. When a subgroup contains k_2 objects and is tested to be abnormal, every object in this subgroup has to be retested one by one, so the total of classification times of each object is equal to 1/k_2 + 1.

For simplicity, suppose that every group in the 1st layer is subdivided into two subgroups, which contain k_{21} and k_{22} objects, respectively, in the 2nd layer.

The classification times are shown in Table 2 (M represents an abnormal testing result; an empty cell represents a normal one).

Table 2: The average classification times of each object with different results.

Times | k_1 group | k_{21} subgroup | k_{22} subgroup
1 | — | — | —
3 + k_{21} | M | M | —
3 + k_{22} | M | — | M
3 + k_1 | M | M | M

Table 3: The probability distribution of Y_2.

Y_2 = y_2 | p(Y_2 = y_2)
1/k_1 | q^{k_1}
1/k_1 + 1/k_2 | (1 - q^{k_1}) \times q^{k_2}
1/k_1 + 1/k_2 + 1 | (1 - q^{k_1}) \times (1 - q^{k_2})

For instance, let k_1 = 8, k_{21} = 4, and k_{22} = 4; then four cases can occur.

Case 1. If a group is tested to be normal in the 1st layer, the total of classification times of this group is k_1 \times 1/k_1 = 1.

Case 2. If a group is tested to be abnormal in the 1st layer and, in the 2nd layer, its first subgroup is tested to be abnormal while the other subgroup is tested to be normal, the total of classification times of this group is k_{21} \times (1/k_1 + 1/k_{21} + 1) + k_{22} \times (1/k_1 + 1/k_{22}) = 3 + k_{21} = 7.

Case 3. If a group is tested to be abnormal in the 1st layer and, in the 2nd layer, its first subgroup is tested to be normal while the other subgroup is tested to be abnormal, the total of classification times of this group is k_{21} \times (1/k_1 + 1/k_{21}) + k_{22} \times (1/k_1 + 1/k_{22} + 1) = 3 + k_{22} = 7.

Case 4. If a group is tested to be abnormal in the 1st layer and both of its subgroups are tested to be abnormal in the 2nd layer, the total of classification times of this group is k_{21} \times (1/k_1 + 1/k_{21} + 1) + k_{22} \times (1/k_1 + 1/k_{22} + 1) = 3 + k_1 = 11.
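The four cases can be tallied directly. A minimal sketch (ours, with illustrative names) counts the tests spent on one first-layer group of k_1 = 8 split into subgroups of k_{21} = k_{22} = 4:

```python
def group_tests(k21: int, k22: int, ab1: bool, ab2: bool) -> int:
    """Total tests for one group: 1 group test; if abnormal, 2 subgroup
    tests, plus one-by-one retests inside each abnormal subgroup."""
    if not (ab1 or ab2):
        return 1                      # Case 1: the group test alone suffices
    tests = 1 + 2                     # group test + both subgroup tests
    if ab1:
        tests += k21                  # retest the first subgroup one by one
    if ab2:
        tests += k22                  # retest the second subgroup one by one
    return tests
```

group_tests(4, 4, ...) returns 1, 7, 7, and 11 for Cases 1-4, matching 1, 3 + k_{21}, 3 + k_{22}, and 3 + k_1.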

Suppose that each group contains k_1 objects in the 1st layer and that each of its subgroups contains k_2 objects in the 2nd layer. Let the average classification times of each object be a random variable Y_2; the probability distribution of Y_2 is shown in Table 3.

Thus, in the 2nd layer, the mathematical expectation of Y_2, the average classification times of each object, is obtained as follows:

E_2(Y_2) = \frac{1}{k_1} \times q^{k_1} + \left( \frac{1}{k_1} + \frac{1}{k_2} \right) \times \left( 1 - q^{k_1} \right) \times q^{k_2} + \left( 1 + \frac{1}{k_1} + \frac{1}{k_2} \right) \times \left( 1 - q^{k_1} \right) \times \left( 1 - q^{k_2} \right) = \frac{1}{k_1} + \left( 1 - q^{k_1} \right) \times \left( \frac{1}{k_2} + 1 - q^{k_2} \right).  (6)

When the number of granulation levels increases to 2, the average classification times of each object can be further reduced; consider, for instance, p = 0.001 and N = 10000.


Table 4: The probability distribution of Y_i.

Y_i = y_i | p(Y_i = y_i)
1/k_1 | q^{k_1}
1/k_1 + 1/k_2 | (1 - q^{k_1}) \times q^{k_2}
\cdots | \cdots
\sum_{j=1}^{i} 1/k_j | (1 - q^{k_1}) \times (1 - q^{k_2}) \times \cdots \times (1 - q^{k_{i-1}}) \times q^{k_i}
\sum_{j=1}^{i} 1/k_j + 1 | (1 - q^{k_1}) \times (1 - q^{k_2}) \times \cdots \times (1 - q^{k_{i-1}}) \times (1 - q^{k_i})

As shown in Section 3.1, the minimum expectation of the total of classification times is about 628 with k_1 = 32 in the single-level granulation. According to (6) and Lemma 6, E_2(Y_2) reaches its minimum value when k_2 = 16. The minimum mathematical expectation of each object's average classification times is as follows:

N \times E_2(Y_2) = N \times \left[ \frac{1}{k_1} \times q^{k_1} + \left( \frac{1}{k_1} + \frac{1}{k_2} \right) \times \left( 1 - q^{k_1} \right) \times q^{k_2} + \left( 1 + \frac{1}{k_1} + \frac{1}{k_2} \right) \times \left( 1 - q^{k_1} \right) \times \left( 1 - q^{k_2} \right) \right] \approx 338.  (7)
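The value in (7) can be reproduced numerically from formula (6); the sketch below is our own check (function name illustrative):

```python
def expected_tests_two_levels(k1: int, k2: int, p: float) -> float:
    """Formula (6): E2(Y2) = 1/k1 + (1 - q^k1) * (1/k2 + 1 - q^k2)."""
    q = 1.0 - p
    return 1.0 / k1 + (1.0 - q ** k1) * (1.0 / k2 + 1.0 - q ** k2)

total2 = 10000 * expected_tests_two_levels(32, 16, 0.001)   # about 337-338
saving_vs_traditional = 1.0 - total2 / 10000                # about 96.6%
```

Floating-point evaluation gives roughly 337.2 tests, which the paper rounds up to 338.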

The mathematical expectation of classification times saves 96.62% compared with the traditional method and 46.18% compared with the single-level granulation method. Next, we discuss i-levels granulation (i = 3, 4, 5, ..., n).

3.3. The i-Levels Granulation. For the blood analysis case, the granulation strategy in the ith layer is determined by the known numbers of objects of each group in the previous layers (namely, k_1, k_2, ..., k_{i-1} are known and only k_i is unknown). Following the double-levels granulation method, and supposing that the classification times of each object in the i-levels granulation is a random variable Y_i, the probability distribution of Y_i is shown in Table 4.

Obviously, the sum of the probability distribution is equal to 1 in each layer.

Proof.

Case 1 (the single-level granulation). One has

q^{k_1} + \left( 1 - q^{k_1} \right) = 1.  (8)

Case 2 (the double-levels granulation). One has

q^{k_1} + \left( 1 - q^{k_1} \right) \times q^{k_2} + \left( 1 - q^{k_1} \right) \times \left( 1 - q^{k_2} \right) = q^{k_1} + \left( 1 - q^{k_1} \right) \times \left( q^{k_2} + 1 - q^{k_2} \right) = q^{k_1} + \left( 1 - q^{k_1} \right) \times 1 = 1.  (9)

Case 3 (the i-levels granulation). One has

q^{k_1} + \left( 1 - q^{k_1} \right) q^{k_2} + \cdots + \left( 1 - q^{k_1} \right) \left( 1 - q^{k_2} \right) \cdots \left( 1 - q^{k_{i-1}} \right) q^{k_i} + \left( 1 - q^{k_1} \right) \left( 1 - q^{k_2} \right) \cdots \left( 1 - q^{k_i} \right)
= q^{k_1} + \left( 1 - q^{k_1} \right) q^{k_2} + \cdots + \left( 1 - q^{k_1} \right) \left( 1 - q^{k_2} \right) \cdots \left( 1 - q^{k_{i-1}} \right) \times \left( q^{k_i} + \left( 1 - q^{k_i} \right) \right)
= q^{k_1} + \left( 1 - q^{k_1} \right) q^{k_2} + \cdots + \left( 1 - q^{k_1} \right) \left( 1 - q^{k_2} \right) \cdots \left( q^{k_{i-1}} + \left( 1 - q^{k_{i-1}} \right) \right)
= \cdots = q^{k_1} + \left( 1 - q^{k_1} \right) \times 1 = 1.  (10)

The proof is completed

Definition 12 (classification times expectation of granulation). In a probability quotient space, a multilevels granulation model is established from the domain U = \{x_1, x_2, ..., x_n\}, a nonempty finite set; the health rate is q; the maximum number of granular levels is L; and the number of objects of each group in the ith layer is k_i, i = 1, 2, ..., L. Then the average classification times of each object in the ith layer is E_i(Y_i):

E_i(Y_i) = \frac{1}{k_1} + \sum_{i=2}^{L} \left[ \frac{1}{k_i} \times \prod_{j=1}^{i-1} \left( 1 - q^{k_j} \right) \right] + \prod_{i=1}^{L} \left( 1 - q^{k_i} \right).  (11)

In this paper, we mainly focus on establishing a minimum-expectation granulation model of classification times by the multigranulation computing method. For simplicity, the mathematical expectation of classification times is regarded as the measure of searching efficiency. According to Lemma 5, the multilevels granulation model can simplify the complex problem only if the prevalence rate p \in (0, 1 - e^{-e^{-1}}) in the blood analysis case.
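Formula (11) can be implemented as a straightforward accumulation over layers; the following sketch (our own, with an illustrative name) agrees with E_1 and E_2 above as special cases:

```python
def expected_tests(ks, p):
    """Average classification times per object, formula (11),
    for a granulation scheme ks = (k_1, ..., k_L)."""
    q = 1.0 - p
    total = 1.0 / ks[0]          # everyone shares the 1st-layer group test
    chain = 1.0 - q ** ks[0]     # P(the 1st-layer group is abnormal)
    for k in ks[1:]:
        total += chain / k       # share of the next-layer subgroup test
        chain *= 1.0 - q ** k    # P(abnormal all the way down this chain)
    return total + chain         # final one-by-one retests
```

For L = 1 this reduces to 1/k_1 + (1 - q^{k_1}), and for L = 2 to formula (6).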

Theorem 13. Let the prevalence rate p \in (0, 0.3). If a group is tested to be abnormal in the 1st layer (namely, this group contains abnormal objects), the average classification times of each object will be further reduced by subdividing this group once again.

Proof. The expectation difference between the single-level granulation E_1(Y_1) and the double-levels granulation E_2(Y_2) embodies their relative efficiency. Under the conditions e^{-e^{-1}} < q < 1 and 1 \le k_2 < k_1, and according to (3) and (6), the expectation difference E_1(Y_1) - E_2(Y_2) is as follows:

E_1(Y_1) - E_2(Y_2) = \left[ \frac{1}{k_1} + \left( 1 - q^{k_1} \right) \right] - \left[ \frac{1}{k_1} + \left( 1 - q^{k_1} \right) \times \left( \frac{1}{k_2} + 1 - q^{k_2} \right) \right] = \left( 1 - q^{k_1} \right) \times \left[ 1 - \left( \frac{1}{k_2} + 1 - q^{k_2} \right) \right] > 0.  (12)

According to Lemma 5, (1 - q^{k_1}) > 0 and f_q(k_2) = 1/k_2 + 1 - q^{k_2} < 1 always hold, so 1 - (1/k_2 + 1 - q^{k_2}) > 0. Hence the inequality E_1(Y_1) - E_2(Y_2) > 0 is proved.

Theorem 13 illustrates that the classification times can be reduced by continuing to granulate the abnormal groups into the 2nd layer when k_1 > 1. We next prove that the total of classification times is further reduced by continuously granulating the abnormal groups into the ith layer, as long as each subgroup contains at least one object.
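Theorem 13 can be spot-checked numerically for the small prevalence rates used in this paper's examples (the exact boundary cases are governed by Lemma 5's conditions); the sketch below is ours:

```python
def e1(k1: int, q: float) -> float:
    """Single-level expectation E1(Y1) = 1/k1 + (1 - q^k1)."""
    return 1.0 / k1 + (1.0 - q ** k1)

def e2(k1: int, k2: int, q: float) -> float:
    """Double-levels expectation, formula (6)."""
    return 1.0 / k1 + (1.0 - q ** k1) * (1.0 / k2 + 1.0 - q ** k2)

# Subdividing an abnormal group (2 <= k2 < k1) lowers the expectation
# whenever q^k2 > 1/k2, which holds comfortably for q near 1.
all_reduced = all(
    e1(k1, q) > e2(k1, k2, q)
    for q in (0.99, 0.999, 0.9999)
    for k1 in range(3, 60)
    for k2 in range(2, k1)
)
```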

Theorem 14. Suppose the prevalence rate p \in (0, 0.3). If a group is tested to be abnormal (namely, this group contains abnormal objects), the average classification times of each object will be reduced by continuously subdividing the abnormal group as long as the number of objects of each subgroup is no less than 1.

Proof. The expectation difference between the (i-1)-levels granulation E_{i-1}(Y_{i-1}) and the i-levels granulation E_i(Y_i) reflects their relative efficiency. Under the conditions e^{-e^{-1}} < q < 1 and 1 \le k_i < k_{i-1}, and according to (11), the expectation difference E_{i-1}(Y_{i-1}) - E_i(Y_i) is as follows:

E_{i-1}(Y_{i-1}) - E_i(Y_i) = \left\{ \frac{1}{k_1} + \sum_{l=2}^{i-1} \left[ \frac{1}{k_l} \times \prod_{j=1}^{l-1} \left( 1 - q^{k_j} \right) \right] + \prod_{l=1}^{i-1} \left( 1 - q^{k_l} \right) \right\} - \left\{ \frac{1}{k_1} + \sum_{l=2}^{i} \left[ \frac{1}{k_l} \times \prod_{j=1}^{l-1} \left( 1 - q^{k_j} \right) \right] + \prod_{l=1}^{i} \left( 1 - q^{k_l} \right) \right\} = \left( 1 - q^{k_1} \right) \times \cdots \times \left( 1 - q^{k_{i-1}} \right) \times \left[ 1 - \left( \frac{1}{k_i} + 1 - q^{k_i} \right) \right] > 0.  (13)

Because (1 - q^{k_1}) \times \cdots \times (1 - q^{k_{i-1}}) > 0 is known and, according to Lemma 5 with k_i \ge 1, we have (1/k_i + 1 - q^{k_i}) < 1, namely 1 - (1/k_i + 1 - q^{k_i}) > 0. So E_{i-1}(Y_{i-1}) - E_i(Y_i) > 0 is proved.

Theorem 14 shows that this method continuously improves the searching efficiency while granulating the abnormal groups from the 1st layer to the ith layer, because E_{i-1}(Y_{i-1}) - E_i(Y_i) > 0 always holds. However, in practice the classification times cannot be reduced further when the number of objects of an abnormal group is less than or equal to 4, so the objects of such an abnormal group should be tested one by one. To achieve the best efficiency, we next explore how to determine the optimum granulation, namely, the optimum number of objects of each group and the optimum number of granulation levels.

3.4. The Optimum Granulation. Exploring an appropriate granularity space for dealing with a complex problem is both difficult and crucial: it requires us not only to keep the integrity of the original information but also to simplify the complex problem. We take the blood analysis case as an example to explain how to obtain the optimum granularity space. Suppose the condition e^{-e^{-1}} < q < 1 always holds.

Case 1 (granulating abnormal groups from the 1st layer to the 2nd layer). (a) Suppose k_1 is an even number, and every group which contains k_1 objects in the 1st layer is subdivided into two subgroups in the 2nd layer.

Scheme 15. Suppose one subgroup in the 2nd layer has i (1 \le i < k_1/2) objects; according to formula (6), the expectation of classification times of each of its objects is E_2(i). The other subgroup has (k_1 - i) objects, so the expectation of classification times of each of its objects is E_2(k_1 - i). The average expectation of classification times of each object in the 2nd layer is as follows:

\frac{i \times E_2(i) + (k_1 - i) \times E_2(k_1 - i)}{k_1}.  (14)

Scheme 16. Suppose every abnormal group in the 1st layer is evenly subdivided into two subgroups; namely, each subgroup has k_1/2 objects in the 2nd layer. According to formula (6), the average expectation of classification times of each object in the 2nd layer is as follows:

\frac{2 \times (k_1/2) \times E_2(k_1/2)}{k_1} = \frac{k_1 \times E_2(k_1/2)}{k_1}.  (15)

The expectation difference between the above two schemes embodies their relative efficiency. To prove that Scheme 16 is more efficient than Scheme 15, we only need to prove the following inequality:

\frac{i \times E_2(i) + (k_1 - i) \times E_2(k_1 - i)}{k_1} - \frac{k_1 \times E_2(k_1/2)}{k_1} > 0 \quad \left( e^{-e^{-1}} < q < 1, \ k_1 > 1 \right).  (16)


Table 5: The changes of the average expectation with different numbers of objects in the two subgroups.

(k_{21}, k_{22}) | (1, 15) | (2, 14) | (3, 13) | (4, 12) | (5, 11) | (6, 10) | (7, 9) | (8, 8)
E_2 | 0.07367 | 0.07329 | 0.07297 | 0.07270 | 0.07249 | 0.07234 | 0.07225 | 0.07222

Proof. Let g(x) = x q^x (e^{-e^{-1}} < q < 1); according to Lemma 7, we have

\frac{g(i) + g(k_1 - i)}{2} < g\left( \frac{k_1}{2} \right) \Longrightarrow
\frac{i \times q^i + (k_1 - i) \times q^{k_1 - i}}{2} < \frac{k_1}{2} \times q^{k_1/2} \Longrightarrow
i \times q^i + (k_1 - i) \times q^{k_1 - i} < k_1 \times q^{k_1/2} \Longrightarrow
i \times \left( \frac{1}{k_1} + \left( 1 - q^{k_1} \right) \times \left( 1 + \frac{1}{i} - q^i \right) \right) + (k_1 - i) \times \left( \frac{1}{k_1} + \left( 1 - q^{k_1} \right) \times \left( 1 + \frac{1}{k_1 - i} - q^{k_1 - i} \right) \right) > k_1 \times \left( \frac{1}{k_1} + \left( 1 - q^{k_1} \right) \times \left( 1 + \frac{2}{k_1} - q^{k_1/2} \right) \right) \Longrightarrow
\frac{i \times E_2(i) + (k_1 - i) \times E_2(k_1 - i)}{k_1} - \frac{k_1 \times E_2(k_1/2)}{k_1} > 0.  (17)

The proof is completed

Therefore, if every abnormal group with k_1 (k_1 is an even number and k_1 > 1) objects in the 1st layer needs to be subdivided into two subgroups, Scheme 16 is more efficient than Scheme 15.

The experimental results in Table 5 verify the above conclusion. Let p = 0.004 and k_1 = 16. When every subgroup contains 8 objects in the 2nd layer, the expectation of classification times of each object attains its minimum value, where k_{21} is the number of objects of one subgroup in the 2nd layer, k_{22} is the number of objects of the other subgroup, and E_2 is the corresponding expectation of classification times of each object.
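Table 5 can be regenerated with a few lines (our own sketch; names illustrative):

```python
p, k1 = 0.004, 16
q = 1.0 - p

def e2(m: int) -> float:
    """Formula (6) with a 2nd-layer subgroup of m objects."""
    return 1.0 / k1 + (1.0 - q ** k1) * (1.0 / m + 1.0 - q ** m)

def split_avg(k21: int) -> float:
    """Average expectation per object for the split (k21, 16 - k21)."""
    k22 = k1 - k21
    return (k21 * e2(k21) + k22 * e2(k22)) / k1

table5 = {k21: split_avg(k21) for k21 in range(1, 9)}   # (1,15) ... (8,8)
best_split = min(table5, key=table5.get)                # the even split (8, 8)
```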

(b) Suppose k_1 is an even number, and every group which contains k_1 objects in the 1st layer is subdivided into three subgroups in the 2nd layer.

Scheme 17. In the 2nd layer, if the first subgroup has i (1 \le i < k_1/2) objects, the expectation of classification times of each of its objects is E_2(i); if the second subgroup has j (1 \le j < k_1/2) objects, the expectation is E_2(j); the third subgroup then has (k_1 - i - j) objects, and the expectation is E_2(k_1 - i - j). So the average expectation of classification times of each object in the 2nd layer is as follows:

\frac{i \times E_2(i) + j \times E_2(j) + (k_1 - i - j) \times E_2(k_1 - i - j)}{k_1}.  (18)

Similarly, to prove that Scheme 16 is also more efficient than Scheme 17, we only need to prove the following inequality:

\frac{i \times E_2(i) + j \times E_2(j) + (k_1 - i - j) \times E_2(k_1 - i - j)}{k_1} - \frac{k_1 \times E_2(k_1/2)}{k_1} > 0 \quad \left( e^{-e^{-1}} < q < 1, \ k_1 > 1 \right).  (19)

Proof. Let g(x) = x q^x (e^{-e^{-1}} < q < 1); applying Lemma 7 repeatedly, we have

\frac{g(i) + g(k_1 - i)}{2} < g\left( \frac{k_1}{2} \right) \Longrightarrow
i \times q^i + j \times q^j + (k_1 - i - j) \times q^{k_1 - i - j} < k_1 \times q^{k_1/2} \Longrightarrow
i \times \left( \frac{1}{k_1} + \left( 1 - q^{k_1} \right) \times \left( 1 + \frac{1}{i} - q^i \right) \right) + j \times \left( \frac{1}{k_1} + \left( 1 - q^{k_1} \right) \times \left( 1 + \frac{1}{j} - q^j \right) \right) + (k_1 - i - j) \times \left( \frac{1}{k_1} + \left( 1 - q^{k_1} \right) \times \left( 1 + \frac{1}{k_1 - i - j} - q^{k_1 - i - j} \right) \right) > k_1 \times \left( \frac{1}{k_1} + \left( 1 - q^{k_1} \right) \times \left( 1 + \frac{2}{k_1} - q^{k_1/2} \right) \right) \Longrightarrow
\frac{i \times E_2(i) + j \times E_2(j) + (k_1 - i - j) \times E_2(k_1 - i - j)}{k_1} - \frac{k_1 \times E_2(k_1/2)}{k_1} > 0.  (20)

The proof is completed

Therefore, if every abnormal group which contains k_1 (k_1 is an even number and k_1 > 1) objects in the 1st layer needs to be subdivided into three subgroups, Scheme 16 is more efficient than Scheme 17.

The experimental results in Table 6 verify the above conclusion. Let p = 0.004 and k_1 = 16. When every subgroup contains 8 objects in the 2nd layer, the average expectation of classification times of each object reaches its minimum value. In Table 6, the column index stands for the number of objects of the first subgroup in the 2nd layer, the row index stands for the number of objects of the second subgroup, and each entry is the corresponding average expectation of classification times. For example, the entry (1, 1, 7.7143) expresses that the numbers of objects of the three subgroups are 1, 1, and 14, respectively, and the average classification times of each object is E_2 = 0.077143 in the 2nd layer.

Table 6: The changes of the average expectation with different numbers of objects in the three subgroups (expectation \times 10^{-2}).

Objects | 1 | 2 | 3 | 4 | 5
1 | 7.7143 | 7.6786 | 7.6488 | 7.6250 | 7.6072
2 | 7.6786 | 7.6458 | 7.6189 | 7.5980 | 7.5831
3 | 7.6488 | 7.6189 | 7.5950 | 7.5770 | 7.5651
4 | 7.6250 | 7.5980 | 7.5770 | 7.5620 | 7.5530
5 | 7.6072 | 7.5831 | 7.5651 | 7.5530 | 7.5470
6 | 7.5953 | 7.5742 | 7.5591 | 7.5500 | 7.5470
7 | 7.5894 | 7.5712 | 7.5591 | 7.5530 | 7.5530
8 | 7.5894 | 7.5742 | 7.5651 | 7.5620 | 7.5651
9 | 7.5953 | 7.5831 | 7.5770 | 7.5770 | 7.5980
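Table 6 can be regenerated the same way, and comparing its minimum against the even two-way split of Table 5 confirms that two subgroups beat three (our own sketch; names illustrative):

```python
p, k1 = 0.004, 16
q = 1.0 - p

def e2(m: int) -> float:
    """Formula (6) with a 2nd-layer subgroup of m objects."""
    return 1.0 / k1 + (1.0 - q ** k1) * (1.0 / m + 1.0 - q ** m)

def three_way_avg(i: int, j: int) -> float:
    """Average expectation per object for the split (i, j, 16 - i - j)."""
    m = k1 - i - j
    return (i * e2(i) + j * e2(j) + m * e2(m)) / k1

best_three_way = min(three_way_avg(i, j)
                     for i in range(1, 10) for j in range(1, 6))
even_two_way = e2(8)                 # Scheme 16: two subgroups of 8 objects
```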

(c) When an abnormal group contains k_1 (k_1 is an even number) objects and needs to be further granulated into the 2nd layer, Scheme 16 is still the most efficient.

Consider two granulation schemes: in Scheme 18 the abnormal groups of the 1st layer are randomly subdivided into s (s < k_1) subgroups, and in Scheme 16 the abnormal groups of the 1st layer are evenly subdivided into two subgroups.

Scheme 18. Suppose an abnormal group is subdivided into s (s < k_1) subgroups. The first subgroup has x_1 (1 \le x_1 < k_1/2) objects, and the average expectation of classification times of each of its objects is E_2(x_1); the second subgroup has x_2 (1 \le x_2 < k_1/2) objects, with average expectation E_2(x_2); ...; the sth subgroup has x_s (1 \le x_s < k_1/2) objects, with average expectation E_2(x_s). Hence the average expectation of classification times of each object in the 2nd layer is as follows:

\frac{1}{k_1} \times \sum_{j=1}^{s} x_j \times E_2(x_j).  (21)

Similarly, to prove that Scheme 16 is more efficient than Scheme 18, we only need to prove the following inequality:

\frac{1}{k_1} \times \sum_{j=1}^{s} x_j \times E_2(x_j) - \frac{k_1 \times E_2(k_1/2)}{k_1} > 0 \quad \left( e^{-e^{-1}} < q < 1, \ k_1 > 1 \right).  (22)

Proof. Let g(x) = x q^x (e^{-e^{-1}} < q < 1); according to Lemma 7, we have

\frac{\sum_{i=1}^{s} g(x_i)}{2} < g\left( \frac{k_1}{2} \right) \Longrightarrow
\frac{\sum_{i=1}^{s} x_i \times q^{x_i}}{2} < \frac{k_1}{2} \times q^{k_1/2} \Longrightarrow
\sum_{j=1}^{s} x_j \times \left( \frac{1}{k_1} + \left( 1 - q^{k_1} \right) \times \left( 1 + \frac{1}{x_j} - q^{x_j} \right) \right) > k_1 \times \left( \frac{1}{k_1} + \left( 1 - q^{k_1} \right) \times \left( 1 + \frac{2}{k_1} - q^{k_1/2} \right) \right) \Longrightarrow
\frac{1}{k_1} \times \sum_{j=1}^{s} x_j \times E_2(x_j) - \frac{k_1 \times E_2(k_1/2)}{k_1} > 0.  (23)

The proof is completed

Therefore, when every abnormal group which contains k_1 (k_1 is an even number and k_1 > 1) objects in the 1st layer needs to be granulated into several subgroups, Scheme 16 is more efficient than the other schemes.

(d) In a similar way, when every abnormal group which contains k_1 (k_1 is an odd number and k_1 > 1) objects in the 1st layer is granulated into several subgroups, the best scheme is to subdivide each abnormal group into two subgroups as evenly as possible; namely, the subgroups contain (k_1 - 1)/2 and (k_1 + 1)/2 objects, respectively, in the 2nd layer.

Case 2 (granulating abnormal groups from the 1st layer to the ith layer).

Theorem 19. In the ith layer, if the number of objects of each abnormal group is more than 4, the total of classification times can be reduced by continuing to subdivide the abnormal groups into two subgroups which contain equal numbers of objects as far as possible; namely, if each group contains k_i objects in the ith layer, then each subgroup contains k_i/2, or (k_i - 1)/2, or (k_i + 1)/2 objects in the (i + 1)th layer.

Proof. In the multigranulation method, the number of objects of each subgroup in the next layer is determined by the number of objects of its group in the current layer; in other words, the number of objects of each subgroup in the (i + 1)th layer is determined by the known number of objects of each group in the ith layer.

By recursion, the process of granulating an abnormal group from the ith layer into the (i + 1)th layer is similar to that from the 1st layer into the 2nd layer. It is known that the most efficient way from the 1st layer into the 2nd layer is to subdivide an abnormal group of the current layer into two subgroups of the next layer as evenly as possible. Therefore, the most efficient way is also to subdivide each abnormal group of the ith layer into two subgroups of the (i + 1)th layer as evenly as possible. The proof is completed.

Based on $k_1$, which is the optimal number of objects of each group in the 1st layer, the optimal granulation levels

10 Mathematical Problems in Engineering

Table 7: The best testing strategy in different layers with the different prevalence rates.

p         E_i              (k_1, k_2, ..., k_i)
0.01      0.157649743271   (11, 5, 2)
0.001     0.034610328332   (32, 16, 8, 4)
0.0001    0.010508158027   (101, 50, 25, 12, 6, 3)
0.00001   0.003301655870   (317, 158, 79, 39, 19, 9, 4)
0.000001  0.001041044160   (1001, 500, 250, 125, 62, 31, 15, 7, 3)

and their corresponding numbers of objects of each group can be obtained by Theorem 19. That is to say, $k_{i+1} = k_i/2$ (or $k_{i+1} = (k_i - 1)/2$ or $k_{i+1} = (k_i + 1)/2$), where $k_i$ ($k_i > 4$) is the number of objects of each abnormal group in the $i$th ($1 \le i \le s-1$) layer and $s$ is the optimal number of granulation levels. Namely, in this multilevels granulation method, the final structure of granulating an abnormal group from the 1st layer to the last layer is similar to a binary tree, and the original space can be granulated into a structure which contains many binary trees.
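This halving schedule can be sketched in a few lines. Taking the floor of $k/2$ at every split (one concrete choice among the $k_i/2$, $(k_i-1)/2$, $(k_i+1)/2$ options allowed by Theorem 19) reproduces the group-size sequences listed in Table 7:

```python
def granulation_schedule(k1):
    """Group sizes per layer when every abnormal group is halved
    (floor division) until a group holds at most 4 objects."""
    ks = [k1]
    while ks[-1] > 4:
        ks.append(ks[-1] // 2)
    return ks
```

For example, `granulation_schedule(32)` yields `[32, 16, 8, 4]`, which is the $p = 0.001$ row of Table 7.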

According to Theorem 19, the multigranulation strategy can be used to solve the blood analysis case. When facing different prevalence rates, such as $p_1 = 0.01$, $p_2 = 0.001$, $p_3 = 0.0001$, $p_4 = 0.00001$, and $p_5 = 0.000001$, the best searching strategy is that the number of objects of each group in the different layers is as shown in Table 7 ($k_i$ stands for the number of objects of each group in the $i$th layer, and $E_i$ stands for the average expectation of classification times for each object).

Theorem 20. In the above multilevels granulation method, if $p$, which is the prevalence rate of a sickness (or the negative sample ratio in the domain), tends to 0, the average classification times for each object tend to $1/k_1$; in other words, the following equation always holds:

$$\lim_{p \to 0} E_i = \frac{1}{k_1}. \quad (24)$$

Proof. According to Definition 12, let $q = 1 - p$; as $q \to 1$, we have

$$E_i = \frac{1}{k_1} + (1 - q^{k_1}) \times \left(\frac{1}{k_2} + (1 - q^{k_2}) \times \left(\frac{1}{k_3} + (1 - q^{k_3}) \times \left(\frac{1}{k_4} + (1 - q^{k_4}) \times \left(\cdots \times \left(\frac{1}{k_{i-1}} + (1 - q^{k_{i-1}}) \times \left(\frac{1}{k_i} + (1 - q^{k_i})\right)\right) \cdots\right)\right)\right)\right). \quad (25)$$

According to Lemma 6, $k_1 = [1/\sqrt{p + p^2/2}]$ or $k_1 = [1/\sqrt{p + p^2/2}] + 1$. And then let

$$T = \frac{1}{k_2} + (1 - q^{k_2}) \times \left(\frac{1}{k_3} + (1 - q^{k_3}) \times \left(\frac{1}{k_4} + (1 - q^{k_4}) \times \left(\cdots \times \left(\frac{1}{k_{i-1}} + (1 - q^{k_{i-1}}) \times \left(\frac{1}{k_i} + (1 - q^{k_i})\right)\right) \cdots\right)\right)\right). \quad (26)$$

Figure 3: The changing trend of $T$ and $E$ with $q$ ($q$ runs from 0.75 to 1 on the x-axis; $T$ and $E = (1/k_1)/E_i$ lie between 0 and 1 on the y-axis).

so $\lim_{q \to 1} T = 0$ and $\lim_{q \to 1} E_i = 1/k_1$. The proof is completed.

Let $E = (1/k_1)/E_i$. The changing trend of $T$ and $E$ with the variable $q$ is shown in Figure 3.
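The limit in Theorem 20 is easy to check numerically. The sketch below evaluates formula (25) from the innermost bracket outwards; it reproduces the single-level value printed in Table 8 for $p = 0.01$, while the multilevel entries of Tables 7 and 8 come from the authors' own runs and are not re-derived here:

```python
def expectation_per_object(ks, p):
    """Evaluate formula (25): E = 1/k1 + (1 - q^k1)(1/k2 + (1 - q^k2)(...)),
    where ks lists the group sizes k1, ..., ki layer by layer."""
    q = 1.0 - p
    acc = 1.0                      # innermost factor multiplying (1 - q^{k_i})
    for k in reversed(ks):
        acc = 1.0 / k + (1.0 - q ** k) * acc
    return acc
```

With $p = 0.01$ and $k_1 = 11$ this gives the single-level expectation $0.19557\ldots$, and shrinking $p$ drives the multilevel value toward $1/11$, as Theorem 20 asserts.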

3.5. Binary Classification of Multigranulation Searching Algorithm. In this paper, a kind of efficient binary classification of multigranulation searching algorithm is proposed through discussing the best testing strategy of the blood analysis case. The algorithm is illuminated as follows.

Algorithm 21 (binary classification of multigranulation searching algorithm (BCMSA)).

Input: A probability quotient space $Q = (U, 2^U, p)$.
Output: The average classification times expectation of each object, $E$.
Step 1. $k_1$ will be obtained based on Lemma 6; $i = 1$, $j = 0$, and searching_numbers = 0 will be initialized.
Step 2. Randomly divide $U_{ij}$ into $s_i$ subgroups $U_{i1}, U_{i2}, \ldots, U_{is_i}$ ($s_i = N_{ij}/k_i$, where $N_{ij}$ stands for the number of objects in $U_{ij}$, and $U_{10} = U_1$).
Step 3. For $i$ to $\lfloor \log_2 N_{ij} \rfloor$ ($\lfloor \log_2 N_{ij} \rfloor$ stands for $\log_2 N_{ij}$ rounded down to the nearest integer):

For $j$ to $s_i$:
If Test($U_{ij}$) > 0 and $N_{ij} > 4$, then searching_numbers + 1, $U_{i+1} = U_{ij}$, $i + 1$, go to Step 2 (Test is a searching method).
If Test($U_{ij}$) = 0, then searching_numbers + 1, $i + 1$.
If $N_{ij} \le 4$, go to Step 4.

Step 4. searching_numbers + $\sum U_{ij}$; $E$ = (searching_numbers + $U_N$)/$N$.

Step 5. Return $E$.

The algorithm flowchart is shown in Figure 4.
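The steps above can be sketched as runnable code, under two assumptions the pseudocode leaves open: the group classifier is simulated by checking whether a group holds a negative sample (coded 0), and abnormal groups are split into two near-equal halves as Theorem 19 recommends; the 4-object cutoff mirrors the $N_{ij} > 4$ condition of Step 3.

```python
def bcmsa_average_tests(objects, k1):
    """Sketch of Algorithm 21 (BCMSA). `objects` is the domain
    (0 = negative sample, 1 = positive sample); one call of the binary
    classifier tests a whole group, and abnormal groups of more than
    4 objects are halved recursively."""
    tests = 0

    def classify(group):
        nonlocal tests
        tests += 1               # one binary classification of the whole group
        return 0 in group        # "Test(U_ij) > 0": the group holds a negative

    def search(group):
        nonlocal tests
        if not classify(group):
            return               # normal group: nothing more to do
        if len(group) > 4:       # abnormal and large: granulate one layer deeper
            mid = len(group) // 2
            search(group[:mid])
            search(group[mid:])
        else:                    # abnormal and small: classify object by object
            tests += len(group)

    for start in range(0, len(objects), k1):   # 1st layer: groups of k1 objects
        search(objects[start:start + k1])
    return tests / len(objects)                # average classification times E
```

On an all-healthy domain of 64 objects with $k_1 = 32$ this performs exactly 2 group tests, that is, $E = 1/32$.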


Figure 4: Flowchart of BCMSA.

Complexity Analysis of Algorithm 21. In this algorithm, the best case is when the prevalence rate $p$ tends to 0: the classification times are $N \times E_i \approx N/k_1 \approx N \times \sqrt{p + p^2/2}$, which tends to 1, so the time complexity of computing is $O(1)$. But the worst case is when $p$ tends to 0.3: the classification times then tend to $N$, so the time complexity of computing is $O(N)$.

4. Comparative Analysis on Experimental Results

In order to verify the efficiency of the proposed BCMSA, suppose there are two large domains, $N_1 = 1 \times 10^4$ and $N_2 = 100 \times 10^4$, and five kinds of different prevalence rates, which are $p_1 = 0.01$, $p_2 = 0.001$, $p_3 = 0.0001$, $p_4 = 0.00001$, and $p_5 = 0.000001$. In the experiment of the blood analysis case, the number "0" stands for a sick sample (negative sample) and "1" stands for a healthy sample (positive sample); then $N$ numbers are randomly generated, in which "0" is generated with probability $p$ and "1" with probability $1 - p$, to stand for all the domain objects. The binary classifier counts the "0"s in a group (subgroup): if the count is more than 0, this group is tested to be abnormal, and if the count equals 0, this group has been tested to be normal.
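The experiment can be replicated in a few lines. The sketch below swaps the random generation for a deterministic domain with exactly 10 negative samples, so the counts are reproducible; with $k_1 = 32$ it yields 633 tests for the single-level method and 413 for the multilevel strategy $(32, 16, 8, 4)$, matching the $N_1$ entries printed in Table 8 for $p = 0.001$ (which suggests the published run saw about 10 negative samples).

```python
def total_tests(domain, sizes):
    """Total classification times when the domain is granulated with the
    given group sizes per layer, e.g. sizes = (32, 16, 8, 4). An object is
    a negative sample iff it equals 0; a group test only reveals whether
    the group holds at least one 0."""
    def count(group, layer):
        tests = 1                              # classify this group as a whole
        if 0 not in group:
            return tests                       # normal group: done
        if layer + 1 < len(sizes):             # granulate one layer deeper
            k = sizes[layer + 1]
            for s in range(0, len(group), k):
                tests += count(group[s:s + k], layer + 1)
        else:                                  # deepest layer: object by object
            tests += len(group)
        return tests

    k1 = sizes[0]
    return sum(count(domain[s:s + k1], 0) for s in range(0, len(domain), k1))

# deterministic domain: 10 negative samples spread over N = 10^4 objects
domain = [1] * 10_000
for j in range(10):
    domain[1000 * j] = 0

method9 = len(domain)                            # traditional: one test per object
method10 = total_tests(domain, (32,))            # single-level grouping
method11 = total_tests(domain, (32, 16, 8, 4))   # multilevel grouping
```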

The experimental environment is 4G RAM, 2.5 GHz CPU, and a WIN 8 system; the program language is Python; and the experimental results are shown in Table 8.

In Table 8, item "$p$" stands for the prevalence rate, item "levels" stands for the granulation levels of the different methods, and item "$E(X)$" stands for the average expectation of classification times for each object. Item "$k_1$" stands for the number of objects of each group in the 1st layer. Item "ℓ" stands for the degree to which $E(X)$ is close to $1/k_1$. $N_1 = 1 \times 10^4$ and $N_2 = 1 \times 10^6$, respectively, stand for the numbers of objects of the two original domains. Items "Method 9" and "Method 10", respectively, stand for the improved efficiency when Method 11 is compared with Method 9 and with Method 10.

From Table 8, diagnosing all objects needs to expend 10000 classification times with Method 9 (the traditional method), 201 classification times with Method 10 (the single-level grouping method), and only 113 classification times with Method 11 (the multilevels grouping method) for confirming the testing results of all objects when $N_1 = 1 \times 10^4$ and $p = 0.0001$. Obviously, the proposed algorithm is more efficient than Method 9 and Method 10, and the classification times can be reduced by 98.89% and 47.33%, respectively. At the same time, when the probability $p$ is gradually reduced, BCMSA gradually becomes more efficient than Method 10, and ℓ tends to 100%; that is to say, the average classification times for each object tend to $1/k_1$ in the BCMSA. In addition,


Table 8: Comparative result of efficiency among 3 kinds of methods.

p         Levels        E(X)           k_1   ℓ       N_1    N_2     Method 9  Method 10
0.01      Single-level  0.19557083665  11    46.48%  1944   195817  81.44%    —
          2 levels      0.15764974327  11    57.67%  1627   164235  83.58%    16.13%
0.001     Single-level  0.06275892424  32    49.79%  633    62674   94.72%    —
          4 levels      0.03461032833  32    90.29%  413    41184   96.82%    34.62%
0.0001    Single-level  0.01995065634  101   49.62%  201    19799   98.00%    —
          6 levels      0.01050815802  101   94.22%  113    11212   98.89%    47.33%
0.00001   Single-level  0.00631957079  318   49.76%  63.3   6325    99.37%    —
          7 levels      0.00330165587  318   95.26%  33.3   3324    99.67%    47.75%
0.000001  Single-level  0.00199950067  1001  49.96%  15     2001    99.80%    —
          9 levels      0.00104104416  1001  95.96%  15     1022    99.89%    47.94%

Figure 5: Comparative analysis between the two kinds of methods (single-level granulation versus multilevels granulation).

the BCMSA can save 0%–50% of the classification times compared with Method 10. The efficiency of Method 10 (the single-level granulation method) and Method 11 (the multilevels granulation method) is shown in Figure 5; the x-axis stands for the prevalence rate (or the negative sample rate), and the y-axis stands for the average expectation of classification times for each object.

In this paper, BCMSA is proposed, and it can greatly improve searching efficiency when dealing with complex searching problems. If there is a binary classifier which is not only valid for a single object but also valid for a group with many objects, the efficiency of searching all objects will be enhanced by BCMSA, as in the blood analysis case. At the same time, it may play an important role in promoting the development of granular computing. Of course, this algorithm also has some limitations. For example, if the prevalence rate of a sickness (or the occurrence rate of event $A$) satisfies $p > 0.3$, it will have no advantage compared with the traditional method; in other words, the original problem need not be subdivided into many subproblems when $p > 0.3$. And when the prevalence rate of a sickness (or the negative sample rate in the domain) is unknown, this algorithm needs to be further improved so that it can adapt to the new environment.

5. Conclusions

With the development of intelligence computation, multigranulation computing has gradually become an important

tool to process complex problems. Specially, in the process of knowledge cognition, granulating a huge problem into lots of small subproblems means simplifying the original complex problem and dealing with these subproblems in different granularity spaces [64]. This hierarchical computing model is very effective for getting a complete solution or an approximate solution of the original problem due to its idea of divide and conquer. Recently, many scholars have paid attention to efficient searching algorithms based on granular computing theory. For example, a kind of algorithm for dealing with complex networks on the basis of the quotient space model was proposed by L. Zhang and B. Zhang [65]. In this paper, combining the hierarchical multigranulation computing model and the principle of probability statistics, a new efficient binary classification of multigranulation searching algorithm is established on the basis of the mathematical expectation of probability statistics, and this searching algorithm is proposed according to a recursive method in multigranulation spaces. Many experimental results have shown that the proposed method is effective and can save lots of classification times. These results may promote the development of intelligent computation and speed up the application of multigranulation computing. However, this method also has some shortcomings. For example, on the one hand, this method has a strict limitation on the probability value of $p$, namely, $p < 0.3$; on the contrary, if $p > 0.3$, the proposed searching algorithm probably is not the most effective method, and improved methods need to be found. On the other hand, it needs a binary classifier which is not only valid for a single object but also valid for a group with many objects. In the end, with the decrease of the probability value of $p$ (even as it infinitely closes to zero), for every object, the mathematical expectation of searching times will gradually close to $1/k_1$. In our future research, we will focus on the issue of how to granulate the huge granule space without any probability value of each object, and we will try our best to establish a kind of effective searching algorithm under which we do not know the probability of negative samples in the domain. We hope these researches can promote the development of artificial intelligence.


Competing Interests

The authors declare that they have no conflict of interests related to this work.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (no. 61472056) and the Natural Science Foundation of Chongqing of China (no. CSTC2013jjb40003).

References

[1] A. Gacek, "Signal processing and time series description: a perspective of computational intelligence and granular computing," Applied Soft Computing Journal, vol. 27, pp. 590–601, 2015.
[2] O. Hryniewicz and K. Kaczmarek, "Bayesian analysis of time series using granular computing approach," Applied Soft Computing, vol. 47, pp. 644–652, 2016.
[3] C. Liu, "Covering multi-granulation rough sets based on maximal descriptors," Information Technology Journal, vol. 13, no. 7, pp. 1396–1400, 2014.
[4] Z. Y. Li, "Covering-based multi-granulation decision-theoretic rough sets model," Journal of Lanzhou University, no. 2, pp. 245–250, 2014.
[5] Y. Y. Yao and Y. She, "Rough set models in multigranulation spaces," Information Sciences, vol. 327, pp. 40–56, 2016.
[6] J. Xu, Y. Zhang, D. Zhou et al., "Uncertain multi-granulation time series modeling based on granular computing and the clustering practice," Journal of Nanjing University, vol. 50, no. 1, pp. 86–94, 2014.
[7] Y. T. Guo, "Variable precision β multi-granulation rough sets based on limited tolerance relation," Journal of Minnan Normal University, no. 1, pp. 1–11, 2015.
[8] Y. Xu, J. H. Yang, and X. Ji, "Neighborhood multi-granulation rough set model based on double granulate criterion," Control and Decision, vol. 30, no. 8, pp. 1469–1478, 2015.
[9] L. A. Zadeh, "Towards a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic," Fuzzy Sets and Systems, vol. 19, pp. 111–127, 1997.
[10] J. R. Hobbs, "Granularity," in Proceedings of the 9th International Joint Conference on Artificial Intelligence, Los Angeles, Calif, USA, 1985.
[11] L. Zhang and B. Zhang, "Theory of fuzzy quotient space (methods of fuzzy granular computing)," Journal of Software, vol. 14, no. 4, pp. 770–776, 2003.
[12] J. Li, C. Mei, W. Xu, and Y. Qian, "Concept learning via granular computing: a cognitive viewpoint," Information Sciences, vol. 298, no. 1, pp. 447–467, 2015.
[13] X. Hu, W. Pedrycz, and X. Wang, "Comparative analysis of logic operators: a perspective of statistical testing and granular computing," International Journal of Approximate Reasoning, vol. 66, pp. 73–90, 2015.
[14] M. G. C. A. Cimino, B. Lazzerini, F. Marcelloni, and W. Pedrycz, "Genetic interval neural networks for granular data regression," Information Sciences, vol. 257, pp. 313–330, 2014.
[15] P. Honko, "Upgrading a granular computing based data mining framework to a relational case," International Journal of Intelligent Systems, vol. 29, no. 5, pp. 407–438, 2014.
[16] M.-Y. Chen and B.-T. Chen, "A hybrid fuzzy time series model based on granular computing for stock price forecasting," Information Sciences, vol. 294, pp. 227–241, 2015.
[17] R. Al-Hmouz, W. Pedrycz, and A. Balamash, "Description and prediction of time series: a general framework of Granular Computing," Expert Systems with Applications, vol. 42, no. 10, pp. 4830–4839, 2015.
[18] M. Hilbert, "Big data for development: a review of promises and challenges," Social Science Electronic Publishing, vol. 34, no. 1, pp. 135–174, 2016.
[19] T. J. Sejnowski, S. P. Churchland, and J. A. Movshon, "Putting big data to good use in neuroscience," Nature Neuroscience, vol. 17, no. 11, pp. 1440–1441, 2014.
[20] G. George, M. R. Haas, and A. Pentland, "Big data and management," Academy of Management Journal, vol. 30, no. 2, pp. 39–52, 2014.
[21] M. Chen, S. Mao, and Y. Liu, "Big data: a survey," Mobile Networks and Applications, vol. 19, no. 2, pp. 171–209, 2014.
[22] X. Wu, X. Zhu, G. Q. Wu, and W. Ding, "Data mining with big data," IEEE Transactions on Knowledge & Data Engineering, vol. 26, no. 1, pp. 97–107, 2014.
[23] Y. Shuo and Y. Lin, "Decomposition of decision systems based on granular computing," in Proceedings of the IEEE International Conference on Granular Computing (GrC '11), pp. 590–595, Garden Villa, Kaohsiung, Taiwan, 2011.
[24] H. Hu and Z. Zhong, "Perception learning as granular computing," Natural Computation, vol. 3, pp. 272–276, 2008.
[25] Z.-H. Chen, Y. Zhang, and G. Xie, "Mining algorithm for concise decision rules based on granular computing," Control and Decision, vol. 30, no. 1, pp. 143–148, 2015.
[26] K. Kambatla, G. Kollias, V. Kumar, and A. Grama, "Trends in big data analytics," Journal of Parallel & Distributed Computing, vol. 74, no. 7, pp. 2561–2573, 2014.
[27] A. Katal, M. Wazid, and R. H. Goudar, "Big data: issues, challenges, tools and good practices," in Proceedings of the 6th International Conference on Contemporary Computing (IC3 '13), pp. 404–409, IEEE, New Delhi, India, August 2013.
[28] V. Cevher, S. Becker, and M. Schmidt, "Convex optimization for big data: scalable, randomized, and parallel algorithms for big data analytics," IEEE Signal Processing Magazine, vol. 31, no. 5, pp. 32–43, 2014.
[29] J. Fan, F. Han, and H. Liu, "Challenges of big data analysis," National Science Review, vol. 1, no. 2, pp. 293–314, 2014.
[30] Q. H. Zhang, K. Xu, and G. Y. Wang, "Fuzzy equivalence relation and its multigranulation spaces," Information Sciences, vol. 346-347, pp. 44–57, 2016.
[31] Z. Liu and Y. Hu, "Multi-granularity pattern ant colony optimization algorithm and its application in path planning," Journal of Central South University (Science and Technology), vol. 9, pp. 3713–3722, 2013.
[32] Q. H. Zhang, G. Y. Wang, and X. Q. Liu, "Hierarchical structure analysis of fuzzy quotient space," Pattern Recognition and Artificial Intelligence, vol. 21, no. 5, pp. 627–634, 2008.
[33] Z. C. Shi, Y. X. Xia, and J. Z. Zhou, "Discrete algorithm based on granular computing and its application," Computer Science, vol. 40, pp. 133–135, 2013.
[34] Y. P. Zhang, B. Luo, Y. Y. Yao, D. Q. Miao, L. Zhang, and B. Zhang, Quotient Space and Granular Computing: The Theory and Method of Problem Solving on Structured Problems, Science Press, Beijing, China, 2010.
[35] G. Y. Wang, Q. H. Zhang, and J. Hu, "A survey on the granular computing," Transactions on Intelligent Systems, vol. 6, no. 2, pp. 8–26, 2007.
[36] J. Jonnagaddala, R. T. Jue, and H. J. Dai, "Binary classification of Twitter posts for adverse drug reactions," in Proceedings of the Social Media Mining Shared Task Workshop at the Pacific Symposium on Biocomputing, pp. 4–8, Big Island, Hawaii, USA, 2016.
[37] M. Haungs, P. Sallee, and M. Farrens, "Branch transition rate: a new metric for improved branch classification analysis," in Proceedings of the International Symposium on High-Performance Computer Architecture (HPCA '00), pp. 241–250, 2000.
[38] R. W. Proctor and Y. S. Cho, "Polarity correspondence: a general principle for performance of speeded binary classification tasks," Psychological Bulletin, vol. 132, no. 3, pp. 416–442, 2006.
[39] T. H. Chow, P. Berkhin, E. Eneva et al., "Evaluating performance of binary classification systems," US Patent 8554622 B2, 2013.
[40] D. G. Li, D. Q. Miao, D. X. Zhang, and H. Y. Zhang, "An overview of granular computing," Computer Science, vol. 9, pp. 1–12, 2005.
[41] X. Gang and L. Jing, "A review of the present studying state and prospect of granular computing," Journal of Software, vol. 3, pp. 5–10, 2011.
[42] L. X. Zhong, "The predication about optimal blood analyze method," Academic Forum of Nandu, vol. 6, pp. 70–71, 1996.
[43] X. Mingmin and S. Junli, "The mathematical proof of method of group blood test and a new formula in quest of optimum number in group," Journal of Sichuan Institute of Building Materials, vol. 1, pp. 97–104, 1986.
[44] B. Zhang and L. Zhang, "Discussion on future development of granular computing," Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), vol. 22, no. 5, pp. 538–540, 2010.
[45] A. Skowron, J. Stepaniuk, and R. Swiniarski, "Modeling rough granular computing based on approximation spaces," Information Sciences, vol. 184, no. 1, pp. 20–43, 2012.
[46] J. T. Yao, A. V. Vasilakos, and W. Pedrycz, "Granular computing: perspectives and challenges," IEEE Transactions on Cybernetics, vol. 43, no. 6, pp. 1977–1989, 2013.
[47] Y. Y. Yao, N. Zhang, D. Q. Miao, and F. F. Xu, "Set-theoretic approaches to granular computing," Fundamenta Informaticae, vol. 115, no. 2-3, pp. 247–264, 2012.
[48] H. Li and X. P. Ma, "Research on four-element model of granular computing," Computer Engineering and Applications, vol. 49, no. 4, pp. 9–13, 2013.
[49] J. Hu and C. Guan, "Granular computing model based on quantum computing theory," in Proceedings of the 10th International Conference on Computational Intelligence and Security, pp. 156–160, November 2014.
[50] Y. Shuo and Y. Lin, "Decomposition of decision systems based on granular computing," in Proceedings of the IEEE International Conference on Granular Computing (GrC '11), pp. 590–595, IEEE, Kaohsiung, Taiwan, November 2011.
[51] F. Li, J. Xie, and K. Xie, "Granular computing theory in the application of fault diagnosis," in Proceedings of the Chinese Control and Decision Conference (CCDC '08), pp. 595–597, July 2008.
[52] Q.-H. Zhang, Y.-K. Xing, and Y.-L. Zhou, "The incremental knowledge acquisition algorithm based on granular computing," Journal of Electronics and Information Technology, vol. 33, no. 2, pp. 435–441, 2011.
[53] Y. Zeng, Y. Y. Yao, and N. Zhong, "The knowledge search base on the granular structure," Computer Science, vol. 35, no. 3, pp. 194–196, 2008.
[54] G.-Y. Wang, Q.-H. Zhang, X.-A. Ma, and Q.-S. Yang, "Granular computing models for knowledge uncertainty," Journal of Software, vol. 22, no. 4, pp. 676–694, 2011.
[55] J. Li, Y. Ren, C. Mei, Y. Qian, and X. Yang, "A comparative study of multigranulation rough sets and concept lattices via rule acquisition," Knowledge-Based Systems, vol. 91, pp. 152–164, 2016.
[56] H.-L. Yang and Z.-L. Guo, "Multigranulation decision-theoretic rough sets in incomplete information systems," International Journal of Machine Learning & Cybernetics, vol. 6, no. 6, pp. 1005–1018, 2015.
[57] M. A. Waller and S. E. Fawcett, "Data science, predictive analytics, and big data: a revolution that will transform supply chain design and management," Journal of Business Logistics, vol. 34, no. 2, pp. 77–84, 2013.
[58] R. Kitchin, "The real-time city? Big data and smart urbanism," GeoJournal, vol. 79, no. 1, pp. 1–14, 2014.
[59] X. Dong and D. Srivastava, Big Data Integration, Morgan & Claypool, 2015.
[60] L. Zhang and B. Zhang, Theory and Applications of Problem Solving: Quotient Space Based Granular Computing, Tsinghua University Press, Beijing, China, 2nd edition, 2007.
[61] L. Zhang and B. Zhang, "The quotient space theory of problem solving," in Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing, G. Wang, Q. Liu, Y. Yao, and A. Skowron, Eds., vol. 2639 of Lecture Notes in Computer Science, pp. 11–15, Springer, Berlin, Germany, 2003.
[62] J. Sheng, S. Q. Xie, and C. Y. Pan, Probability Theory and Mathematical Statistics, Higher Education Press, Beijing, China, 4th edition, 2008.
[63] L. Z. Zhang, X. Zhao, and Y. Ma, "The simple math demonstration and precise calculation method of the blood group test," Mathematics in Practice and Theory, vol. 22, pp. 143–146, 2010.
[64] J. Chen, S. Zhao, and Y. Zhang, "Hierarchical covering algorithm," Tsinghua Science & Technology, vol. 19, no. 1, pp. 76–81, 2014.
[65] L. Zhang and B. Zhang, "Dynamic quotient space model and its basic properties," Pattern Recognition and Artificial Intelligence, vol. 25, no. 2, pp. 181–185, 2012.




Table 4: The probability distribution of $Y_i$.

$y_i$                              $P\{Y_i = y_i\}$
$1/k_1$                            $q^{k_1}$
$1/k_1 + 1/k_2$                    $(1 - q^{k_1}) \times q^{k_2}$
$\cdots$                           $\cdots$
$\sum_{j=1}^{i} 1/k_j$             $(1 - q^{k_1}) \times (1 - q^{k_2}) \times \cdots \times (1 - q^{k_{i-1}}) \times q^{k_i}$
$\sum_{j=1}^{i} 1/k_j + 1$         $(1 - q^{k_1}) \times (1 - q^{k_2}) \times \cdots \times (1 - q^{k_{i-1}}) \times (1 - q^{k_i})$

As we know, the minimum expectation of the total of classification times is about 628 with $k_1 = 32$ in the single-level granulation. And according to (6) and Lemma 6, $E_2(Y_2)$ will reach its minimum value when $k_2 = 16$. The minimum mathematical expectation of each object's average classification times is shown as follows:

$$N \times E_2(X) = N \times \left[\frac{1}{k_1} \times q^{k_1} + \left(\frac{1}{k_1} + \frac{1}{k_2}\right) \times (1 - q^{k_1}) \times q^{k_2} + \left(1 + \frac{1}{k_1} + \frac{1}{k_2}\right) \times (1 - q^{k_1}) \times (1 - q^{k_2})\right] \approx 338. \quad (7)$$
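Equation (7) can be checked with a few lines of arithmetic. Evaluating it for $N = 10^4$, $p = 0.001$, $k_1 = 32$, $k_2 = 16$ gives about 337.2, close to the ≈338 reported in the text:

```python
# evaluate equation (7) for the double-levels granulation
N, p, k1, k2 = 10_000, 0.001, 32, 16
q = 1 - p
E2 = ((1 / k1) * q ** k1
      + (1 / k1 + 1 / k2) * (1 - q ** k1) * q ** k2
      + (1 + 1 / k1 + 1 / k2) * (1 - q ** k1) * (1 - q ** k2))
total = N * E2   # expected total classification times for the whole domain
```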

The mathematical expectation of classification times can save 96.62% compared with the traditional method and save 46.18% compared with the single-level granulation method. Next, we will discuss the $i$-levels granulation ($i = 3, 4, 5, \ldots, n$).

3.3. The i-Levels Granulation. For the blood analysis case, the granulation strategy in the $i$th layer is concluded from the known numbers of objects of each group in the previous layers (namely, $k_1, k_2, \ldots, k_{i-1}$ are known and only $k_i$ is unknown). According to the double-levels granulation method, and supposing that the classification time of each object is a random variable $Y_i$ in the $i$-levels granulation, the probability distribution of $Y_i$ is shown in Table 4.

Obviously, the sum of the probability distribution is equal to 1 in each layer.

Proof.

Case 1 (the single-level granulation). One has

$$q^{k_1} + 1 - q^{k_1} = 1. \quad (8)$$

Case 2 (the double-levels granulation). One has

$$q^{k_1} + (1 - q^{k_1}) \times q^{k_2} + (1 - q^{k_1}) \times (1 - q^{k_2}) = q^{k_1} + (1 - q^{k_1}) \times \left(q^{k_2} + 1 - q^{k_2}\right) = q^{k_1} + (1 - q^{k_1}) \times 1 = 1. \quad (9)$$

Case 3 (the $i$-levels granulation). One has

$$\begin{aligned}
&q^{k_1} + (1 - q^{k_1}) q^{k_2} + \cdots + (1 - q^{k_1})(1 - q^{k_2}) \cdots (1 - q^{k_{i-1}}) q^{k_i} + (1 - q^{k_1})(1 - q^{k_2}) \cdots (1 - q^{k_i}) \\
&\quad = q^{k_1} + (1 - q^{k_1}) q^{k_2} + \cdots + (1 - q^{k_1})(1 - q^{k_2}) \cdots (1 - q^{k_{i-1}}) \left(q^{k_i} + (1 - q^{k_i})\right) \\
&\quad = q^{k_1} + (1 - q^{k_1}) q^{k_2} + \cdots + (1 - q^{k_1})(1 - q^{k_2}) \cdots \left(q^{k_{i-1}} + (1 - q^{k_{i-1}})\right) = \cdots = q^{k_1} + (1 - q^{k_1}) \times 1 = 1.
\end{aligned} \quad (10)$$

The proof is completed.

Definition 12 (classification times expectation of granulation). In a probability quotient space, a multilevels granulation model will be established from the domain $U = \{x_1, x_2, \ldots, x_n\}$, which is a nonempty finite set; the health rate is $q$; the max granular level is $L$; and the number of objects of each group in the $i$th layer is $k_i$, $i = 1, 2, \ldots, L$. Then the average classification times of each object in the $i$th layer is $E_i(Y_i)$:

$$E_i(Y_i) = \frac{1}{k_1} + \sum_{i=2}^{L} \left[\frac{1}{k_i} \times \prod_{j=1}^{i-1} \left(1 - q^{k_j}\right)\right] + \prod_{i=1}^{L} \left(1 - q^{k_i}\right). \quad (11)$$

In this paper, we mainly focus on establishing a minimum granulation expectation model of classification times by the multigranulation computing method. For simplicity, the mathematical expectation of classification times will be regarded as the measure for contrasting the searching efficiency. According to Lemma 5, the multilevels granulation model can simplify the complex problem only if the prevalence rate $p \in (0, 1 - e^{-e^{-1}})$ in the blood analysis case.

Theorem 13. Let the prevalence rate $p \in (0, 0.3)$. If a group is tested to be abnormal in the 1st layer (namely, this group contains abnormal objects), the average classification times of each object will be further reduced by subdividing this group once again.

Proof. The expectation difference between the single-level granulation $E_1(Y_1)$ and the double-levels granulation $E_2(Y_2)$ can adequately embody their efficiency. Under the conditions


of $e^{-e^{-1}} < q < 1$ and $1 \le k_2 < k_1$, and according to (3) and (6), the expectation difference $E_1(Y_1) - E_2(Y_2)$ is shown as follows:

$$E_1(Y_1) - E_2(Y_2) = \frac{1}{k_1} + (1 - q^{k_1}) - \left[\frac{1}{k_1} + (1 - q^{k_1}) \times \left(\frac{1}{k_2} + 1 - q^{k_2}\right)\right] = (1 - q^{k_1}) \times \left[1 - \left(\frac{1}{k_2} + 1 - q^{k_2}\right)\right] > 0. \quad (12)$$

According to Lemma 5, $(1 - q^{k_1}) > 0$ and $f_q(k_2) = 1/k_2 + 1 - q^{k_2} < 1$ always hold; then we can get $1 - (1/k_2 + 1 - q^{k_2}) > 0$. So the inequality $E_1(Y_1) - E_2(Y_2) > 0$ is proved successfully.

Theorem 13 illustrates that classification times can be reduced by continuing to granulate the abnormal groups into the 2nd layer when k_1 > 1. Next we prove that the total classification times can be further reduced by continuously granulating the abnormal groups into the i-th layer, as long as each abnormal group still contains more than one object.

Theorem 14. Suppose the prevalence rate p ∈ (0, 0.3). If a group is tested to be abnormal (namely, this group contains abnormal objects), the average classification times per object can be reduced by continuously subdividing the abnormal group until the size of its subgroups is no less than 1.

Proof. The expectation difference between the (i − 1)-level granulation E_{i−1}(Y_{i−1}) and the i-level granulation E_i(Y_i) reflects their relative efficiency. Under the conditions e^{−e^{−1}} < q < 1 and 1 ≤ k_i < k_{i−1}, and according to (11), the expectation difference E_{i−1}(Y_{i−1}) − E_i(Y_i) is

E_{i-1}(Y_{i-1}) - E_i(Y_i)
= \left\{\frac{1}{k_1} + \sum_{l=2}^{i-1}\left[\frac{1}{k_l}\prod_{j=1}^{l-1}\left(1-q^{k_j}\right)\right] + \prod_{l=1}^{i-1}\left(1-q^{k_l}\right)\right\}
- \left\{\frac{1}{k_1} + \sum_{l=2}^{i}\left[\frac{1}{k_l}\prod_{j=1}^{l-1}\left(1-q^{k_j}\right)\right] + \prod_{l=1}^{i}\left(1-q^{k_l}\right)\right\}
= \left(1-q^{k_1}\right)\cdots\left(1-q^{k_{i-1}}\right)\left[1-\left(\frac{1}{k_i}+1-q^{k_i}\right)\right] > 0.  (13)

Because (1 − q^{k_1}) ⋯ (1 − q^{k_{i−1}}) > 0 and, according to Lemma 5 with k_i ≥ 1, (1/k_i + 1 − q^{k_i}) < 1, namely 1 − (1/k_i + 1 − q^{k_i}) > 0, we obtain E_{i−1}(Y_{i−1}) − E_i(Y_i) > 0. The proof is completed.

Theorem 14 shows that this method continuously improves searching efficiency while granulating abnormal groups from the 1st layer to the i-th layer, because E_{i−1}(Y_{i−1}) − E_i(Y_i) > 0 always holds. However, the classification times can no longer be reduced once the number of objects in an abnormal group is less than or equal to 4; the objects of such a group should simply be tested one by one. To achieve the best efficiency, we therefore explore how to determine the optimum granulation, namely, the optimum number of objects in each group and the optimum number of granulation levels.

3.4. The Optimum Granulation. Choosing an appropriate granularity space for a complex problem is both difficult and crucial: it requires keeping the integrity of the original information while still simplifying the problem. We take the blood analysis case as an example to explain how to obtain the optimum granularity space. Suppose the condition e^{−e^{−1}} < q < 1 always holds.

Case 1 (granulating abnormal groups from the 1st layer to the 2nd layer). (a) If k_1 is an even number, every abnormal group containing k_1 objects in the 1st layer is subdivided into two subgroups in the 2nd layer.

Scheme 15. Suppose one subgroup in the 2nd layer has i (1 ≤ i < k_1/2) objects; according to formula (6), its expectation of classification times per object is E_2(i). The other subgroup then has (k_1 − i) objects, with expectation E_2(k_1 − i). The average expectation of classification times per object in the 2nd layer is

\frac{i \times E_2(i) + (k_1 - i) \times E_2(k_1 - i)}{k_1}.  (14)

Scheme 16. Suppose every abnormal group in the 1st layer is subdivided evenly into two subgroups, namely, each subgroup has k_1/2 objects in the 2nd layer. According to formula (6), the average expectation of classification times per object in the 2nd layer is

\frac{2 \times (k_1/2) \times E_2(k_1/2)}{k_1} = E_2\left(\frac{k_1}{2}\right).  (15)

The expectation difference between the two schemes reflects their relative efficiency. To prove that Scheme 16 is more efficient than Scheme 15, we only need to prove the inequality

\frac{i \times E_2(i) + (k_1 - i) \times E_2(k_1 - i)}{k_1} - E_2\left(\frac{k_1}{2}\right) > 0, \quad e^{-e^{-1}} < q < 1, \; k_1 > 1.  (16)


Table 5: The change of the average expectation with different object numbers in the two subgroups.

(k_21, k_22)   (1, 15)   (2, 14)   (3, 13)   (4, 12)   (5, 11)   (6, 10)   (7, 9)    (8, 8)
E_2            0.07367   0.07329   0.07297   0.07270   0.07249   0.07234   0.07225   0.07222

Proof. Let g(x) = x q^x (e^{−e^{−1}} < q < 1). According to Lemma 7,

\frac{g(i) + g(k_1 - i)}{2} < g\left(\frac{k_1}{2}\right)
\Longrightarrow \frac{i q^i + (k_1 - i) q^{k_1 - i}}{2} < \frac{k_1}{2} q^{k_1/2}
\Longrightarrow i q^i + (k_1 - i) q^{k_1 - i} < k_1 q^{k_1/2}
\Longrightarrow i \left[\frac{1}{k_1} + \left(1 - q^{k_1}\right)\left(\frac{1}{i} + 1 - q^i\right)\right] + (k_1 - i) \left[\frac{1}{k_1} + \left(1 - q^{k_1}\right)\left(\frac{1}{k_1 - i} + 1 - q^{k_1 - i}\right)\right]
> k_1 \left[\frac{1}{k_1} + \left(1 - q^{k_1}\right)\left(\frac{2}{k_1} + 1 - q^{k_1/2}\right)\right]
\Longrightarrow \frac{i \times E_2(i) + (k_1 - i) \times E_2(k_1 - i)}{k_1} - E_2\left(\frac{k_1}{2}\right) > 0.  (17)

The proof is completed.

Therefore, if every abnormal group with k_1 objects (k_1 an even number and k_1 > 1) in the 1st layer needs to be subdivided into two subgroups, Scheme 16 is more efficient than Scheme 15.

The experimental results in Table 5 verify this conclusion. Let p = 0.004 and k_1 = 16. The expectation of classification times per object reaches its minimum when each subgroup contains 8 objects in the 2nd layer, where k_21 is the size of one subgroup, k_22 is the size of the other, and E_2 is the corresponding expectation of classification times per object.
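Table 5 can be reproduced with a few lines of Python (our own script, using the two-level expectation from formula (6) with p = 0.004 and k_1 = 16):

```python
p, k1 = 0.004, 16
q = 1 - p

def e2(m):
    # per-object expectation when the second-layer subgroup has m objects
    return 1 / k1 + (1 - q ** k1) * (1 / m + 1 - q ** m)

def avg(i):
    # weighted average over the two subgroups of sizes i and k1 - i
    return (i * e2(i) + (k1 - i) * e2(k1 - i)) / k1

for i in range(1, 9):
    # values agree with Table 5 to about 1e-5 (the table truncates decimals)
    print((i, k1 - i), avg(i))
```

The averages decrease monotonically as the split becomes more even, with the minimum at the equal split (8, 8).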

(b) If k_1 is an even number, suppose every abnormal group containing k_1 objects in the 1st layer is subdivided into three subgroups in the 2nd layer.

Scheme 17. In the 2nd layer, if the first subgroup has i (1 ≤ i < k_1/2) objects, the expectation of classification times per object is E_2(i); if the second subgroup has j (1 ≤ j < k_1/2) objects, the expectation is E_2(j); the third subgroup then has (k_1 − i − j) objects, with expectation E_2(k_1 − i − j). The average expectation of classification times per object in the 2nd layer is

\frac{i \times E_2(i) + j \times E_2(j) + (k_1 - i - j) \times E_2(k_1 - i - j)}{k_1}.  (18)

Similarly, it is easy to prove that Scheme 16 is also more efficient than Scheme 17. In other words, we only need to prove the inequality

\frac{i \times E_2(i) + j \times E_2(j) + (k_1 - i - j) \times E_2(k_1 - i - j)}{k_1} - E_2\left(\frac{k_1}{2}\right) > 0, \quad e^{-e^{-1}} < q < 1, \; k_1 > 1.  (19)

Proof. Let g(x) = x q^x (e^{−e^{−1}} < q < 1). Applying Lemma 7 repeatedly,

\frac{g(i) + g(k_1 - i)}{2} < g\left(\frac{k_1}{2}\right)
\Longrightarrow i q^i + j q^j + (k_1 - i - j) q^{k_1 - i - j} < k_1 q^{k_1/2}
\Longrightarrow i \left[\frac{1}{k_1} + \left(1 - q^{k_1}\right)\left(\frac{1}{i} + 1 - q^i\right)\right] + j \left[\frac{1}{k_1} + \left(1 - q^{k_1}\right)\left(\frac{1}{j} + 1 - q^j\right)\right] + (k_1 - i - j) \left[\frac{1}{k_1} + \left(1 - q^{k_1}\right)\left(\frac{1}{k_1 - i - j} + 1 - q^{k_1 - i - j}\right)\right]
> k_1 \left[\frac{1}{k_1} + \left(1 - q^{k_1}\right)\left(\frac{2}{k_1} + 1 - q^{k_1/2}\right)\right]
\Longrightarrow \frac{i \times E_2(i) + j \times E_2(j) + (k_1 - i - j) \times E_2(k_1 - i - j)}{k_1} - E_2\left(\frac{k_1}{2}\right) > 0.  (20)

The proof is completed.

Therefore, if every abnormal group containing k_1 objects (k_1 an even number and k_1 > 1) in the 1st layer needs to be subdivided into three subgroups, Scheme 16 is more efficient than Scheme 17.

The experimental results in Table 6 verify this conclusion. Let p = 0.004 and k_1 = 16. The average expectation of classification times per object reaches its minimum when each subgroup contains 8 objects in the 2nd layer. In Table 6, the first row stands for the number of objects in the first subgroup of the 2nd layer, the first column stands for the number of objects in the second subgroup, and each entry is the corresponding average expectation of classification times. For example, the entry at (1, 1), 7.7143, expresses that the three subgroups contain 1, 1, and 14 objects, respectively, and the average classification times per object is E_2 = 0.077143 in the 2nd layer.

Table 6: The change of the average expectation with different object numbers in the three subgroups (expectation ×10^{−2}).

      1        2        3        4        5
1     7.7143   7.6786   7.6488   7.6250   7.6072
2     7.6786   7.6458   7.6189   7.5980   7.5831
3     7.6488   7.6189   7.5950   7.5770   7.5651
4     7.6250   7.5980   7.5770   7.5620   7.5530
5     7.6072   7.5831   7.5651   7.5530   7.5470
6     7.5953   7.5742   7.5591   7.5500   7.5470
7     7.5894   7.5712   7.5591   7.5530   7.5530
8     7.5894   7.5742   7.5651   7.5620   7.5651
9     7.5953   7.5831   7.5770   7.5770   7.5980

(c) When an abnormal group contains k_1 (even) objects and needs to be further granulated into the 2nd layer, Scheme 16 is still the most efficient.

Consider two granulation schemes: in Scheme 18 the abnormal groups in the 1st layer are subdivided into s (s < k_1) arbitrary subgroups, while in Scheme 16 the abnormal groups in the 1st layer are subdivided evenly into two subgroups.

Scheme 18. Suppose an abnormal group is subdivided into s (s < k_1) subgroups. The first subgroup has x_1 (1 ≤ x_1 < k_1/2) objects, with per-object expectation of classification times E_2(x_1); the second subgroup has x_2 (1 ≤ x_2 < k_1/2) objects, with expectation E_2(x_2); ...; the i-th subgroup has x_i (1 ≤ x_i < k_1/2) objects, with expectation E_2(x_i); ...; and the s-th subgroup has x_s (1 ≤ x_s < k_1/2) objects, with expectation E_2(x_s). Hence the average expectation of classification times per object in the 2nd layer is

\frac{1}{k_1} \sum_{j=1}^{s} x_j \times E_2(x_j).  (21)

Similarly, to prove that Scheme 16 is more efficient than Scheme 18, we only need to prove the inequality

\frac{1}{k_1} \sum_{j=1}^{s} x_j \times E_2(x_j) - E_2\left(\frac{k_1}{2}\right) > 0, \quad e^{-e^{-1}} < q < 1, \; k_1 > 1.  (22)

Proof. Let g(x) = x q^x (e^{−e^{−1}} < q < 1). According to Lemma 7,

\frac{\sum_{i=1}^{s} g(x_i)}{2} < g\left(\frac{k_1}{2}\right)
\Longrightarrow \sum_{i=1}^{s} x_i q^{x_i} < k_1 q^{k_1/2}
\Longrightarrow \sum_{j=1}^{s} x_j \left[\frac{1}{k_1} + \left(1 - q^{k_1}\right)\left(\frac{1}{x_j} + 1 - q^{x_j}\right)\right] > k_1 \left[\frac{1}{k_1} + \left(1 - q^{k_1}\right)\left(\frac{2}{k_1} + 1 - q^{k_1/2}\right)\right]
\Longrightarrow \frac{1}{k_1} \sum_{j=1}^{s} x_j \times E_2(x_j) - E_2\left(\frac{k_1}{2}\right) > 0.  (23)

The proof is completed.

Therefore, when every abnormal group containing k_1 objects (k_1 an even number and k_1 > 1) in the 1st layer needs to be granulated into several subgroups, Scheme 16 is more efficient than the other schemes.

(d) In a similar way, when every abnormal group containing k_1 objects (k_1 an odd number and k_1 > 1) in the 1st layer is granulated into several subgroups, the best scheme is to subdivide it into two subgroups as evenly as possible, namely, one subgroup contains (k_1 − 1)/2 objects and the other (k_1 + 1)/2 objects in the 2nd layer.
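The odd case can be checked the same way as Table 5. In this sketch (our own, with the illustrative values k_1 = 15 and p = 0.004) the best two-way split is the near-equal one, (7, 8):

```python
p, k1 = 0.004, 15
q = 1 - p

def e2(m):
    # two-level per-object expectation for a subgroup of m objects
    return 1 / k1 + (1 - q ** k1) * (1 / m + 1 - q ** m)

def avg(i):
    # subgroup sizes i and k1 - i
    return (i * e2(i) + (k1 - i) * e2(k1 - i)) / k1

best = min(range(1, k1 // 2 + 1), key=avg)
print(best, k1 - best)  # the near-equal split
```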

Case 2 (granulating abnormal groups from the 1st layer to the i-th layer).

Theorem 19. In the i-th layer, if the number of objects in each abnormal group is more than 4, the total classification times can be reduced by continuing to subdivide the abnormal groups into two subgroups of sizes as equal as possible. Namely, if each group contains k_i objects in the i-th layer, then each subgroup contains k_i/2, or (k_i − 1)/2, or (k_i + 1)/2 objects in the (i + 1)-th layer.

Proof. In the multigranulation method, the subgroup sizes in the next layer are determined by the group sizes in the current layer; in other words, the size of each subgroup in the (i + 1)-th layer is determined by the known size of each group in the i-th layer.

By recursion, granulating an abnormal group from the i-th layer into the (i + 1)-th layer is analogous to granulating from the 1st layer into the 2nd layer. It has been shown above that the most efficient way from the 1st layer to the 2nd layer is to subdivide an abnormal group into two subgroups as evenly as possible; therefore the most efficient way is likewise to subdivide each abnormal group in the i-th layer into two near-equal subgroups in the (i + 1)-th layer. The proof is completed.

Based on k_1, the optimum group size in the 1st layer, the optimum number of granulation levels and the corresponding group sizes can be obtained by Theorem 19. That is, k_{i+1} = k_i/2 (or k_{i+1} = (k_i − 1)/2, or k_{i+1} = (k_i + 1)/2), where k_i (k_i > 4) is the size of each abnormal group in the i-th layer (1 ≤ i ≤ s − 1) and s is the optimum number of granulation levels. In this multilevel granulation method, the structure obtained by granulating an abnormal group from the 1st layer to the last layer is similar to a binary tree, and the original space can thus be granulated into a structure containing many binary trees.

According to Theorem 19, the multigranulation strategy can be used to solve the blood analysis case. For the prevalence rates p_1 = 0.01, p_2 = 0.001, p_3 = 0.0001, p_4 = 0.00001, and p_5 = 0.000001, the best searching strategies, namely the group sizes in the different layers, are shown in Table 7 (k_i stands for the size of each group in the i-th layer and E_i stands for the average expectation of classification times per object).

Table 7: The best testing strategy in different layers for different prevalence rates.

p          E_i               (k_1, k_2, ..., k_i)
0.01       0.157649743271    (11, 5, 2)
0.001      0.034610328332    (32, 16, 8, 4)
0.0001     0.010508158027    (101, 50, 25, 12, 6, 3)
0.00001    0.003301655870    (317, 158, 79, 39, 19, 9, 4)
0.000001   0.001041044160    (1001, 500, 250, 125, 62, 31, 15, 7, 3)
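The halving rule of Theorem 19 can be sketched as a short helper (our own code; subdivision stops once the group size is at most 4). Floor halving reproduces every size chain listed in Table 7:

```python
def chain(k1):
    """Group sizes per layer, halving until the size is at most 4."""
    ks = [k1]
    while ks[-1] > 4:
        ks.append(ks[-1] // 2)  # for odd k this is (k - 1)/2
    return ks

print(chain(32))    # the p = 0.001 row of Table 7
print(chain(101))   # the p = 0.0001 row of Table 7
```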

Theorem 20. In the above multilevel granulation method, if p, the prevalence rate of a sickness (or the ratio of negative samples in the domain), tends to 0, the average classification times per object tend to 1/k_1; in other words, the following equation always holds:

\lim_{p \to 0} E_i = \frac{1}{k_1}.  (24)

Proof. According to Definition 12, let q = 1 − p, so q → 1. We have

E_i = \frac{1}{k_1} + \left(1 - q^{k_1}\right)\left(\frac{1}{k_2} + \left(1 - q^{k_2}\right)\left(\frac{1}{k_3} + \left(1 - q^{k_3}\right)\left(\frac{1}{k_4} + \left(1 - q^{k_4}\right)\left(\cdots\left(\frac{1}{k_{i-1}} + \left(1 - q^{k_{i-1}}\right)\left(\frac{1}{k_i} + \left(1 - q^{k_i}\right)\right)\right)\cdots\right)\right)\right)\right).  (25)

According to Lemma 6, k_1 = [1/\sqrt{p + p^2/2}] or k_1 = [1/\sqrt{p + p^2/2}] + 1. Then let

T = \frac{1}{k_2} + \left(1 - q^{k_2}\right)\left(\frac{1}{k_3} + \left(1 - q^{k_3}\right)\left(\frac{1}{k_4} + \left(1 - q^{k_4}\right)\left(\cdots\left(\frac{1}{k_{i-1}} + \left(1 - q^{k_{i-1}}\right)\left(\frac{1}{k_i} + \left(1 - q^{k_i}\right)\right)\right)\cdots\right)\right)\right),  (26)

so that E_i = 1/k_1 + (1 − q^{k_1}) T.

[Figure 3: The changing trend of T and E with q.]

Since T is bounded and 1 − q^{k_1} → 0 as q → 1, we have lim_{q→1} (1 − q^{k_1}) T = 0 and hence lim_{q→1} E_i = 1/k_1. The proof is completed.

Let E = (1/k_1)/E_i. The changing trends of T and E with the variable q are shown in Figure 3.
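Theorem 20 can be illustrated numerically with the nested form (25). This small script (our own) evaluates the p = 0.001 chain of Table 7 and shows E_i approaching 1/k_1 = 0.03125 as p shrinks:

```python
def nested_e(ks, q):
    """E_i evaluated via the nested form (25)."""
    e = 1 / ks[-1] + (1 - q ** ks[-1])
    for k in reversed(ks[:-1]):
        e = 1 / k + (1 - q ** k) * e
    return e

ks = [32, 16, 8, 4]
for p in (1e-2, 1e-4, 1e-6, 1e-8):
    print(p, nested_e(ks, 1 - p))  # approaches 1/32 = 0.03125
```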

3.5. Binary Classification of Multigranulation Searching Algorithm. In this paper, an efficient binary classification of multigranulation searching algorithm is proposed based on the best testing strategy for the blood analysis case. The algorithm is described as follows.

Algorithm 21. Binary classification of multigranulation searching algorithm (BCMSA).

Input: a probability quotient space Q = (U, 2^U, p).
Output: the average expectation of classification times per object, E.
Step 1. Obtain k_1 by Lemma 6; initialize i = 1, j = 0, and searching_numbers = 0.
Step 2. Randomly divide U_ij into s_i subgroups U_i1, U_i2, ..., U_is_i (s_i = N_ij/k_i, where N_ij stands for the number of objects in U_ij, and U_10 = U_1).
Step 3. For i to ⌊log_2 N_ij⌋ (the logarithm rounded down to the nearest integer):
  For j to s_i:
    If Test(U_ij) > 0 and N_ij > 4, then searching_numbers + 1, U_{i+1} = U_ij, i + 1, and go to Step 2 (Test is the group searching method);
    If Test(U_ij) = 0, then searching_numbers + 1 and i + 1;
    If N_ij ≤ 4, go to Step 4.
Step 4. searching_numbers + Σ U_ij (the remaining objects are tested one by one); E = (searching_numbers + U_N)/N.
Step 5. Return E.

The algorithm flowchart is shown in Figure 4.
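The steps above can be sketched as a runnable Python simulation (our own rendering of Algorithm 21, not the authors' code). The recursion, helper names, and the choice of encoding sick samples as 1 (so a group sum of 0 means "all healthy"; the paper's experiment encodes the labels the other way around) are our own assumptions:

```python
import random

def bcmsa_tests(samples, k1):
    """Count classification tests; samples[i] == 1 marks a sick object."""
    tests = 0

    def search(group):
        nonlocal tests
        tests += 1                  # one group test
        if sum(group) == 0:         # group normal: done
            return
        if len(group) <= 4:         # small abnormal group: test one by one
            tests += len(group)
            return
        half = len(group) // 2      # Theorem 19: split into near-equal halves
        search(group[:half])
        search(group[half:])

    for start in range(0, len(samples), k1):
        search(samples[start:start + k1])
    return tests

random.seed(0)
p, n, k1 = 0.0001, 10_000, 101
samples = [1 if random.random() < p else 0 for _ in range(n)]
print(bcmsa_tests(samples, k1) / n)  # average tests per object
```

When every sample is healthy, the cost is exactly one test per first-layer group, i.e., the 1/k_1 floor of Theorem 20.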


[Figure 4: Flowchart of BCMSA.]

Complexity Analysis of Algorithm 21. In the best case, the prevalence rate p tends to 0 and the total classification times N × E_i ≈ N/k_1 approach the number of first-layer groups, so the cost per object tends to a constant and the computational complexity is O(1) per object. In the worst case, p tends to 0.3 and the classification times tend to N, so the computational complexity is O(N).

4. Comparative Analysis on Experimental Results

In order to verify the efficiency of the proposed BCMSA, suppose there are two large domains with N_1 = 1 × 10^4 and N_2 = 100 × 10^4 objects and five different prevalence rates, p_1 = 0.01, p_2 = 0.001, p_3 = 0.0001, p_4 = 0.00001, and p_5 = 0.000001. In the blood analysis experiment, the number "0" stands for a sick (negative) sample and "1" stands for a healthy (positive) sample; N numbers are generated randomly, where the probability of generating "0" is p and the probability of generating "1" is 1 − p, and these numbers stand for all the domain objects. The binary classifier sums all numbers in a group (subgroup): if the sum is less than the number of objects in the group, the group contains sick samples and is tested to be abnormal; if the sum equals the group size, the group is tested to be normal.

The experimental environment is a 2.5 GHz CPU with 4 GB RAM running the Windows 8 system; the programming language is Python. The experimental results are shown in Table 8.

In Table 8, item "p" stands for the prevalence rate, item "Levels" stands for the granulation levels of the different methods, and item "E(X)" stands for the average expectation of classification times per object. Item "k_1" stands for the group size in the 1st layer. Item "ℓ" stands for the degree to which E(X) approaches 1/k_1. N_1 = 1 × 10^4 and N_2 = 1 × 10^6 are the numbers of objects in the two original domains. Items "Method 9" and "Method 10" stand for the efficiency improvement of Method 11 compared with Method 9 and Method 10, respectively.

From Table 8, confirming the testing results of all objects when N_1 = 1 × 10^4 and p = 0.0001 requires 10000 classification times with Method 9 (the traditional method), 201 classification times with Method 10 (the single-level grouping method), and only 113 classification times with Method 11 (the multilevel grouping method). Obviously, the proposed algorithm is more efficient than Methods 9 and 10: the classification times are reduced by 98.89% and 47.33%, respectively. Moreover, as the probability p gradually decreases, BCMSA becomes increasingly more efficient than Method 10, and ℓ tends to 100%; that is, the average classification times per object tend to 1/k_1 in BCMSA. In addition,


Table 8: Comparative results of efficiency among the three kinds of methods.

p         Levels        E(X)            k_1    ℓ        N_1    N_2      Method 9   Method 10
0.01      Single-level  0.19557083665   11     46.48%   1944   195817   81.44%     —
0.01      2 levels      0.15764974327   11     57.67%   1627   164235   83.58%     16.13%
0.001     Single-level  0.06275892424   32     49.79%   633    62674    94.72%     —
0.001     4 levels      0.03461032833   32     90.29%   413    41184    96.82%     34.62%
0.0001    Single-level  0.01995065634   101    49.62%   201    19799    98.00%     —
0.0001    6 levels      0.01050815802   101    94.22%   113    11212    98.89%     47.33%
0.00001   Single-level  0.00631957079   318    49.76%   63     6325     99.37%     —
0.00001   7 levels      0.00330165587   318    95.26%   33     3324     99.67%     47.75%
0.000001  Single-level  0.00199950067   1001   49.96%   15     2001     99.80%     —
0.000001  9 levels      0.00104104416   1001   95.96%   15     1022     99.89%     47.94%

[Figure 5: Comparative analysis between the two kinds of methods (single-level granulation method versus multilevel granulation method).]

the BCMSA can save 0–50% of the classification times compared with Method 10. The efficiency of Method 10 (the single-level granulation method) and Method 11 (the multilevel granulation method) is shown in Figure 5, where the x-axis stands for the prevalence rate (or the ratio of negative samples) and the y-axis stands for the average expectation of classification times per object.

In this paper, BCMSA is proposed, and it can greatly improve searching efficiency when dealing with complex searching problems. If a binary classifier is valid not only for a single object but also for a group of many objects, the efficiency of searching all objects can be enhanced by BCMSA, as in the blood analysis case. At the same time, it may play an important role in promoting the development of granular computing. Of course, this algorithm also has some limitations. For example, if the prevalence rate of a sickness (or the occurrence rate of event A) satisfies p > 0.3, it has no advantage over the traditional method; in other words, the original problem need not be subdivided into many subproblems when p > 0.3. Moreover, when the prevalence rate (or the ratio of negative samples in the domain) is unknown, this algorithm needs to be further improved so that it can adapt to the new environment.

5. Conclusions

With the development of intelligent computation, multigranulation computing has gradually become an important tool for processing complex problems. Especially in the process of knowledge cognition, granulating a huge problem into many small subproblems simplifies the original complex problem, and these subproblems are dealt with in different granularity spaces [64]. This hierarchical computing model is very effective for obtaining a complete or approximate solution of the original problem owing to its idea of divide and conquer. Recently, many scholars have paid attention to efficient searching algorithms based on granular computing theory; for example, an algorithm for dealing with complex networks on the basis of the quotient space model was proposed by L. Zhang and B. Zhang [65]. In this paper, combining the hierarchical multigranulation computing model and the principles of probability statistics, a new efficient binary classification of multigranulation searching algorithm is established on the basis of the mathematical expectation of probability statistics, and this searching algorithm is constructed by a recursive method in multigranulation spaces. Many experimental results have shown that the proposed method is effective and can save a large number of classification times. These results may promote the development of intelligent computation and speed up the application of multigranulation computing.

However, this method also has some shortcomings. On the one hand, it has a strict limitation on the probability value p, namely p < 0.3; if p > 0.3, the proposed searching algorithm is probably not the most effective method, and improved methods need to be found. On the other hand, it needs a binary classifier that is valid not only for a single object but also for a group of many objects. Finally, as the probability value p decreases (even infinitely close to zero), the mathematical expectation of searching times for every object gradually approaches 1/k_1. In our future research, we will focus on how to granulate a huge granule space without any probability value for each object, and we will try to establish an effective searching algorithm for the case in which the probability of negative samples in the domain is unknown. We hope these researches can promote the development of artificial intelligence.


Competing Interests

The authors declare that they have no conflict of interests related to this work.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (no. 61472056) and the Natural Science Foundation of Chongqing of China (no. CSTC2013jjb40003).

References

[1] A. Gacek, "Signal processing and time series description: a perspective of computational intelligence and granular computing," Applied Soft Computing Journal, vol. 27, pp. 590–601, 2015.

[2] O. Hryniewicz and K. Kaczmarek, "Bayesian analysis of time series using granular computing approach," Applied Soft Computing, vol. 47, pp. 644–652, 2016.

[3] C. Liu, "Covering multi-granulation rough sets based on maximal descriptors," Information Technology Journal, vol. 13, no. 7, pp. 1396–1400, 2014.

[4] Z. Y. Li, "Covering-based multi-granulation decision-theoretic rough sets model," Journal of Lanzhou University, no. 2, pp. 245–250, 2014.

[5] Y. Y. Yao and Y. She, "Rough set models in multigranulation spaces," Information Sciences, vol. 327, pp. 40–56, 2016.

[6] J. Xu, Y. Zhang, D. Zhou et al., "Uncertain multi-granulation time series modeling based on granular computing and the clustering practice," Journal of Nanjing University, vol. 50, no. 1, pp. 86–94, 2014.

[7] Y. T. Guo, "Variable precision β multi-granulation rough sets based on limited tolerance relation," Journal of Minnan Normal University, no. 1, pp. 1–11, 2015.

[8] Y. Xu, J. H. Yang, and X. Ji, "Neighborhood multi-granulation rough set model based on double granulate criterion," Control and Decision, vol. 30, no. 8, pp. 1469–1478, 2015.

[9] L. A. Zadeh, "Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic," Fuzzy Sets and Systems, vol. 90, no. 2, pp. 111–127, 1997.

[10] J. R. Hobbs, "Granularity," in Proceedings of the 9th International Joint Conference on Artificial Intelligence, Los Angeles, Calif, USA, 1985.

[11] L. Zhang and B. Zhang, "Theory of fuzzy quotient space (methods of fuzzy granular computing)," Journal of Software, vol. 14, no. 4, pp. 770–776, 2003.

[12] J. Li, C. Mei, W. Xu, and Y. Qian, "Concept learning via granular computing: a cognitive viewpoint," Information Sciences, vol. 298, pp. 447–467, 2015.

[13] X. Hu, W. Pedrycz, and X. Wang, "Comparative analysis of logic operators: a perspective of statistical testing and granular computing," International Journal of Approximate Reasoning, vol. 66, pp. 73–90, 2015.

[14] M. G. C. A. Cimino, B. Lazzerini, F. Marcelloni, and W. Pedrycz, "Genetic interval neural networks for granular data regression," Information Sciences, vol. 257, pp. 313–330, 2014.

[15] P. Honko, "Upgrading a granular computing based data mining framework to a relational case," International Journal of Intelligent Systems, vol. 29, no. 5, pp. 407–438, 2014.

[16] M.-Y. Chen and B.-T. Chen, "A hybrid fuzzy time series model based on granular computing for stock price forecasting," Information Sciences, vol. 294, pp. 227–241, 2015.

[17] R. Al-Hmouz, W. Pedrycz, and A. Balamash, "Description and prediction of time series: a general framework of Granular Computing," Expert Systems with Applications, vol. 42, no. 10, pp. 4830–4839, 2015.

[18] M. Hilbert, "Big data for development: a review of promises and challenges," Social Science Electronic Publishing, vol. 34, no. 1, pp. 135–174, 2016.

[19] T. J. Sejnowski, P. S. Churchland, and J. A. Movshon, "Putting big data to good use in neuroscience," Nature Neuroscience, vol. 17, no. 11, pp. 1440–1441, 2014.

[20] G. George, M. R. Haas, and A. Pentland, "Big data and management," Academy of Management Journal, vol. 30, no. 2, pp. 39–52, 2014.

[21] M. Chen, S. Mao, and Y. Liu, "Big data: a survey," Mobile Networks and Applications, vol. 19, no. 2, pp. 171–209, 2014.

[22] X. Wu, X. Zhu, G.-Q. Wu, and W. Ding, "Data mining with big data," IEEE Transactions on Knowledge & Data Engineering, vol. 26, no. 1, pp. 97–107, 2014.

[23] Y. Shuo and Y. Lin, "Decomposition of decision systems based on granular computing," in Proceedings of the IEEE International Conference on Granular Computing (GrC '11), pp. 590–595, Kaohsiung, Taiwan, 2011.

[24] H. Hu and Z. Zhong, "Perception learning as granular computing," Natural Computation, vol. 3, pp. 272–276, 2008.

[25] Z.-H. Chen, Y. Zhang, and G. Xie, "Mining algorithm for concise decision rules based on granular computing," Control and Decision, vol. 30, no. 1, pp. 143–148, 2015.

[26] K. Kambatla, G. Kollias, V. Kumar, and A. Grama, "Trends in big data analytics," Journal of Parallel & Distributed Computing, vol. 74, no. 7, pp. 2561–2573, 2014.

[27] A. Katal, M. Wazid, and R. H. Goudar, "Big data: issues, challenges, tools and good practices," in Proceedings of the 6th International Conference on Contemporary Computing (IC3 '13), pp. 404–409, IEEE, New Delhi, India, August 2013.

[28] V. Cevher, S. Becker, and M. Schmidt, "Convex optimization for big data: scalable, randomized, and parallel algorithms for big data analytics," IEEE Signal Processing Magazine, vol. 31, no. 5, pp. 32–43, 2014.

[29] J. Fan, F. Han, and H. Liu, "Challenges of big data analysis," National Science Review, vol. 1, no. 2, pp. 293–314, 2014.

[30] Q. H. Zhang, K. Xu, and G. Y. Wang, "Fuzzy equivalence relation and its multigranulation spaces," Information Sciences, vol. 346-347, pp. 44–57, 2016.

[31] Z. Liu and Y. Hu, "Multi-granularity pattern ant colony optimization algorithm and its application in path planning," Journal of Central South University (Science and Technology), vol. 9, pp. 3713–3722, 2013.

[32] Q. H. Zhang, G. Y. Wang, and X. Q. Liu, "Hierarchical structure analysis of fuzzy quotient space," Pattern Recognition and Artificial Intelligence, vol. 21, no. 5, pp. 627–634, 2008.

[33] Z. C. Shi, Y. X. Xia, and J. Z. Zhou, "Discrete algorithm based on granular computing and its application," Computer Science, vol. 40, pp. 133–135, 2013.

[34] Y. P. Zhang, B. Luo, Y. Y. Yao, D. Q. Miao, L. Zhang, and B. Zhang, Quotient Space and Granular Computing: The Theory and Method of Problem Solving on Structured Problems, Science Press, Beijing, China, 2010.

[35] G. Y. Wang, Q. H. Zhang, and J. Hu, "A survey on the granular computing," Transactions on Intelligent Systems, vol. 6, no. 2, pp. 8–26, 2007.

[36] J. Jonnagaddala, T. R. Jue, and H. J. Dai, "Binary classification of Twitter posts for adverse drug reactions," in Proceedings of the Social Media Mining Shared Task Workshop at the Pacific Symposium on Biocomputing, pp. 4–8, Big Island, Hawaii, USA, 2016.

[37] M. Haungs, P. Sallee, and M. Farrens, "Branch transition rate: a new metric for improved branch classification analysis," in Proceedings of the International Symposium on High-Performance Computer Architecture (HPCA '00), pp. 241–250, 2000.

[38] RW Proctor and Y S Cho ldquoPolarity correspondence a generalprinciple for performance of speeded binary classificationtasksrdquo Psychological Bulletin vol 132 no 3 pp 416ndash442 2006

[39] T H Chow P Berkhin E Eneva et al ldquoEvaluating performanceof binary classification systemsrdquo US US 8554622 B2 2013

[40] DG LiDQMiaoDX Zhang andHY Zhang ldquoAnoverviewof granular computingrdquoComputer Science vol 9 pp 1ndash12 2005

[41] X Gang and L Jing ldquoA review of the present studying state andprospect of granular computingrdquo Journal of Software vol 3 pp5ndash10 2011

[42] L X Zhong ldquoThe predication about optimal blood analyzemethodrdquo Academic Forum of Nandu vol 6 pp 70ndash71 1996

[43] X Mingmin and S Junli ldquoThe mathematical proof of methodof group blood test and a new formula in quest of optimumnumber in grouprdquo Journal of Sichuan Institute of BuildingMaterials vol 01 pp 97ndash104 1986

[44] B Zhang and L Zhang ldquoDiscussion on future development ofgranular computingrdquo Journal of Chongqing University of Postsand Telecommunications Natural Science Edition vol 22 no 5pp 538ndash540 2010

[45] A Skowron J Stepaniuk and R Swiniarski ldquoModeling roughgranular computing based on approximation spacesrdquo Informa-tion Sciences vol 184 no 1 pp 20ndash43 2012

[46] J T Yao A V Vasilakos andW Pedrycz ldquoGranular computingperspectives and challengesrdquo IEEE Transactions on Cyberneticsvol 43 no 6 pp 1977ndash1989 2013

[47] Y Y Yao N Zhang D Q Miao and F F Xu ldquoSet-theoreticapproaches to granular computingrdquo Fundamenta Informaticaevol 115 no 2-3 pp 247ndash264 2012

[48] H Li andX PMa ldquoResearch on four-elementmodel of granularcomputingrdquoComputer Engineering andApplications vol 49 no4 pp 9ndash13 2013

[49] J Hu and C Guan ldquoGranular computingmodel based on quan-tum computing theoryrdquo in Proceedings of the 10th InternationalConference on Computational Intelligence and Security pp 156ndash160 November 2014

[50] Y Shuo and Y Lin ldquoDecomposition of decision systems basedon granular computingrdquo inProceedings of the IEEE InternationalConference on Granular Computing (GrC rsquo11) pp 590ndash595IEEE Kaohsiung Taiwan November 2011

[51] F Li J Xie and K Xie ldquoGranular computing theory in theapplicatiorlpf fault diagnosisrdquo in Proceedings of the ChineseControl and Decision Conference (CCDC rsquo08) pp 595ndash597 July2008

[52] Q-H Zhang Y-K Xing and Y-L Zhou ldquoThe incrementalknowledge acquisition algorithm based on granular comput-ingrdquo Journal of Electronics and Information Technology vol 33no 2 pp 435ndash441 2011

[53] Y Zeng Y Y Yao and N Zhong ldquoThe knowledge search baseon the granular structurerdquo Computer Science vol 35 no 3 pp194ndash196 2008

[54] G-Y Wang Q-H Zhang X-A Ma and Q-S Yang ldquoGranularcomputing models for knowledge uncertaintyrdquo Journal of Soft-ware vol 22 no 4 pp 676ndash694 2011

[55] J Li Y Ren C Mei Y Qian and X Yang ldquoA comparativestudy of multigranulation rough sets and concept lattices viarule acquisitionrdquoKnowledge-Based Systems vol 91 pp 152ndash1642016

[56] H-L Yang and Z-L Guo ldquoMultigranulation decision-theoreticrough sets in incomplete information systemsrdquo InternationalJournal of Machine Learning amp Cybernetics vol 6 no 6 pp1005ndash1018 2015

[57] M AWaller and S E Fawcett ldquoData science predictive analyt-ics and big data a revolution that will transform supply chaindesign and managementrdquo Journal of Business Logistics vol34 no 2 pp 77ndash84 2013

[58] R Kitchin ldquoThe real-time city Big data and smart urbanismrdquoGeoJournal vol 79 no 1 pp 1ndash14 2014

[59] X Dong and D Srivastava Big Data Integration Morgan ampClaypool 2015

[60] L Zhang andB ZhangTheory andApplications of ProblemSolv-ing Quotient Space Based Granular Computing (The SecondVersion) Tsinghua University Press Beijing China 2007

[61] L Zhang and B Zhang ldquoThe quotient space theory of problemsolvingrdquo in Rough Sets Fuzzy Sets Data Mining and GranularComputing G Wang Q Liu Y Yao and A Skowron Eds vol2639 of Lecture Notes in Computer Science pp 11ndash15 SpringerBerlin Germany 2003

[62] J Sheng S Q Xie and C Y Pan Probability Theory andMathematical Statistics Higher Education Press Beijing China4th edition 2008

[63] L Z Zhang X Zhao and Y Ma ldquoThe simple math demon-stration and rrecise calculation method of the blood grouptestrdquoMathematics in Practice and Theory vol 22 pp 143ndash146 2010

[64] J Chen S Zhao and Y Zhang ldquoHierarchical covering algo-rithmrdquo Tsinghua Science amp Technology vol 19 no 1 pp 76ndash812014

[65] L Zhang and B Zhang ldquoDynamic quotient space model and itsbasic propertiesrdquo Pattern Recognition and Artificial Intelligencevol 25 no 2 pp 181ndash185 2012


Mathematical Problems in Engineering 7

of $e^{-e^{-1}} < q < 1$ and $1 \le k_2 < k_1$, and according to (3) and (6), the expectation difference $E_1(Y_1) - E_2(Y_2)$ is shown as follows:

$$E_1(Y_1) - E_2(Y_2) = \frac{1}{k_1} + (1 - q^{k_1}) - \left[ \frac{1}{k_1} + (1 - q^{k_1}) \times \left( \frac{1}{k_2} + 1 - q^{k_2} \right) \right] = (1 - q^{k_1}) \times \left[ 1 - \left( \frac{1}{k_2} + 1 - q^{k_2} \right) \right] > 0. \quad (12)$$

According to Lemma 5, $(1 - q^{k_1}) > 0$ and $f_q(k_2) = 1/k_2 + 1 - q^{k_2} < 1$ always hold; then we get $1 - (1/k_2 + 1 - q^{k_2}) > 0$. So the inequality $E_1(Y_1) - E_2(Y_2) > 0$ is proved.

Theorem 13 illustrates that classification times can be reduced by continuing to granulate the abnormal groups into the 2nd layer when $k_1 > 1$. We next attempt to prove that the total number of classification times will be further reduced by continuously granulating the abnormal groups into the $i$th layer until a group's size is no less than 1.
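As a quick numeric illustration of Theorem 13, the one-level and two-level expectations can be compared directly (a sketch with illustrative values $p = 0.004$, $k_1 = 16$, $k_2 = 8$, matching the Table 5 setting; the variable names are ours, not the paper's):

```python
# Compare single-level against two-level granulation, per Theorem 13.
# E1 and E2 follow the expectation expressions appearing in (12).
p = 0.004
q = 1.0 - p
k1, k2 = 16, 8

E1 = 1 / k1 + (1 - q**k1)                          # one level: E1 ~ 0.1246
E2 = 1 / k1 + (1 - q**k1) * (1 / k2 + 1 - q**k2)   # two levels: E2 ~ 0.0722
```

Since $E_1 > E_2$, adding the 2nd layer lowers the expected number of classifications per object, as the theorem asserts.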

Theorem 14. Suppose the prevalence rate $p \in (0, 0.3)$. If a group is tested to be abnormal (namely, this group contains abnormal objects), the average classification times for each object will be reduced by continuously subdividing the abnormal group until the number of objects in its subgroup is no less than 1.

Proof. The expectation difference between $(i-1)$-level granulation $E_{i-1}(Y_{i-1})$ and $i$-level granulation $E_i(Y_i)$ reflects their relative efficiency. On the condition of $e^{-e^{-1}} < q < 1$ and $1 \le k_i < k_{i-1}$, and according to (11), the expectation difference $E_{i-1}(Y_{i-1}) - E_i(Y_i)$ is shown as follows:

$$\begin{aligned} E_{i-1}(Y_{i-1}) - E_i(Y_i) ={}& \frac{1}{k_1} + \sum_{l=2}^{i-1} \left[ \frac{1}{k_l} \times \prod_{j=1}^{l-1} (1 - q^{k_j}) \right] + \prod_{l=1}^{i-1} (1 - q^{k_l}) \\ & - \left\{ \frac{1}{k_1} + \sum_{l=2}^{i} \left[ \frac{1}{k_l} \times \prod_{j=1}^{l-1} (1 - q^{k_j}) \right] + \prod_{l=1}^{i} (1 - q^{k_l}) \right\} \\ ={}& (1 - q^{k_1}) \times \cdots \times (1 - q^{k_{i-1}}) \times \left[ 1 - \left( \frac{1}{k_i} + 1 - q^{k_i} \right) \right] > 0. \end{aligned} \quad (13)$$

Because $(1 - q^{k_1}) \times \cdots \times (1 - q^{k_{i-1}}) > 0$ is known according to Lemma 5, and $k_i \ge 1$, we get $(1/k_i + 1 - q^{k_i}) < 1$, namely $1 - (1/k_i + 1 - q^{k_i}) > 0$. So $E_{i-1}(Y_{i-1}) - E_i(Y_i) > 0$ is proved.

Theorem 14 shows that this method continuously improves the searching efficiency in the process of granulating abnormal groups from the 1st layer to the $i$th layer, because $E_{i-1}(Y_{i-1}) - E_i(Y_i) > 0$ always holds. However, the classification times cannot be reduced further when the number of objects in an abnormal group is less than or equal to 4, so the objects of such an abnormal group should be tested one by one. To achieve the best efficiency, we will next explore how to determine the optimum granulation, namely, how to determine the optimum number of objects in each group and how to obtain the optimum number of granulation levels.

3.4. The Optimum Granulation. Exploring an appropriate granularity space for dealing with a complex problem is both difficult and crucial: it requires us not only to preserve the integrity of the original information but also to simplify the complex problem. We therefore take the blood analysis case as an example to explain how to obtain the optimum granularity space. Suppose the condition $e^{-e^{-1}} < q < 1$ always holds.

Case 1 (granulating abnormal groups from the 1st layer to the 2nd layer). (a) If $k_1$ is an even number, every group which contains $k_1$ objects in the 1st layer will be subdivided into two subgroups in the 2nd layer.

Scheme 15. Suppose one subgroup of the 2nd layer has $i$ ($1 \le i < k_1/2$) objects; according to formula (6), the expectation of classification times for each object is $E_2(i)$. The other subgroup has $(k_1 - i)$ objects, so its expectation of classification times for each object is $E_2(k_1 - i)$. The average expectation of classification times for each object in the 2nd layer is shown as follows:

$$\frac{i \times E_2(i) + (k_1 - i) \times E_2(k_1 - i)}{k_1}. \quad (14)$$

Scheme 16. Suppose every abnormal group in the 1st layer is evenly subdivided into two subgroups; namely, each subgroup has $k_1/2$ objects in the 2nd layer. According to formula (6), the average expectation of classification times for each object in the 2nd layer is shown as follows:

$$\frac{2 \times (k_1/2) \times E_2(k_1/2)}{k_1} = \frac{k_1 \times E_2(k_1/2)}{k_1}. \quad (15)$$

The expectation difference between the above two schemes embodies their relative efficiency. In order to prove that Scheme 16 is more efficient than Scheme 15, we only need to prove that the following inequality holds:

$$\frac{i \times E_2(i) + (k_1 - i) \times E_2(k_1 - i)}{k_1} - \frac{k_1 \times E_2(k_1/2)}{k_1} > 0 \quad (e^{-e^{-1}} < q < 1,\ k_1 > 1). \quad (16)$$


Table 5: The changes of the average expectation with different numbers of objects in the two groups.

$(k_{21}, k_{22})$   (1, 15)   (2, 14)   (3, 13)   (4, 12)   (5, 11)   (6, 10)   (7, 9)    (8, 8)
$E_2$                0.07367   0.07329   0.07297   0.07270   0.07249   0.07234   0.07225   0.07222

Proof. Let $g(x) = xq^x$ ($e^{-e^{-1}} < q < 1$); according to Lemma 7, we have

$$\frac{g(i) + g(c - i)}{2} < g\left(\frac{c}{2}\right) \Longrightarrow \frac{i \times q^i + (k_1 - i) \times q^{k_1 - i}}{2} < \frac{k_1}{2} \times q^{k_1/2} \Longrightarrow i \times q^i + (k_1 - i) \times q^{k_1 - i} < k_1 \times q^{k_1/2} \Longrightarrow$$
$$i \times \left( \frac{1}{k_1} + (1 - q^{k_1}) \times \left( \frac{1}{i} + 1 - q^i \right) \right) + (k_1 - i) \times \left( \frac{1}{k_1} + (1 - q^{k_1}) \times \left( \frac{1}{k_1 - i} + 1 - q^{k_1 - i} \right) \right) > k_1 \times \left( \frac{1}{k_1} + (1 - q^{k_1}) \times \left( \frac{2}{k_1} + 1 - q^{k_1/2} \right) \right) \Longrightarrow$$
$$\frac{i \times E_2(i) + (k_1 - i) \times E_2(k_1 - i)}{k_1} - \frac{k_1 \times E_2(k_1/2)}{k_1} > 0. \quad (17)$$

The proof is completed

Therefore, if every group in the 1st layer has $k_1$ objects ($k_1$ is an even number and $k_1 > 1$) that need to be subdivided into two subgroups, Scheme 16 is more efficient than Scheme 15.

The experimental results in Table 5 verify the above conclusion. Let $p = 0.004$ and $k_1 = 16$. When each subgroup contains 8 objects in the 2nd layer, the expectation of classification times for each object reaches its minimum value, where $k_{21}$ is the number of objects in one subgroup of the 2nd layer, $k_{22}$ is the number of objects in the other subgroup, and $E_2$ is the corresponding expectation of classification times for each object.
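The sweep behind Table 5 can be reproduced in a few lines (a sketch; `E2` is our reading of the formula-(6) expectation for a 2nd-layer subgroup of $m$ objects, and `avg_split` is our helper name):

```python
# Reproduce the Table 5 sweep: split an abnormal 16-object group into (i, 16 - i).
p, k1 = 0.004, 16
q = 1.0 - p

def E2(m):
    """Per-object expectation when a 2nd-layer subgroup holds m objects."""
    return 1 / k1 + (1 - q**k1) * (1 / m + 1 - q**m)

def avg_split(i):
    """Average per-object expectation for the split (i, k1 - i)."""
    return (i * E2(i) + (k1 - i) * E2(k1 - i)) / k1

# The even split (8, 8) minimizes the average expectation, as in Table 5.
best = min(range(1, k1 // 2 + 1), key=avg_split)
```

Running the sweep recovers the Table 5 values (e.g. about 0.07367 for the split (1, 15) and about 0.07222 for (8, 8)), with the minimum at the even split.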

(b) If $k_1$ is an even number, every group which contains $k_1$ objects in the 1st layer will be subdivided into three subgroups in the 2nd layer.

Scheme 17. In the 2nd layer, if the first subgroup has $i$ ($1 \le i < k_1/2$) objects, the average expectation of classification times for each object is $E_2(i)$; if the second subgroup has $j$ ($1 \le j < k_1/2$) objects, the expectation of classification times for each object is $E_2(j)$; then the third subgroup has $(k_1 - i - j)$ objects, and the average expectation of classification times for each object is $E_2(k_1 - i - j)$. So the average expectation of classification times for each object in the 2nd layer is shown as follows:

$$\frac{i \times E_2(i) + j \times E_2(j) + (k_1 - i - j) \times E_2(k_1 - i - j)}{k_1}. \quad (18)$$

Similarly, it is easy to prove that Scheme 16 is also more efficient than Scheme 17. In other words, we only need to prove the following inequality:

$$\frac{i \times E_2(i) + j \times E_2(j) + (k_1 - i - j) \times E_2(k_1 - i - j)}{k_1} - \frac{k_1 \times E_2(k_1/2)}{k_1} > 0 \quad (e^{-e^{-1}} < q < 1,\ k_1 > 1). \quad (19)$$

Proof. Let $g(x) = xq^x$ ($e^{-e^{-1}} < q < 1$); according to Lemma 7, we have

$$\frac{g(i) + g(c - i)}{2} < g\left(\frac{c}{2}\right) \Longrightarrow \frac{t \times q^t + (k_1 - t) \times q^{k_1 - t}}{2} < \frac{k_1}{2} \times q^{k_1/2} \Longrightarrow \frac{i \times q^i + j \times q^j + (k_1 - i - j) \times q^{k_1 - i - j}}{2} < \frac{k_1}{2} \times q^{k_1/2} \Longrightarrow$$
$$i \times q^i + j \times q^j + (k_1 - i - j) \times q^{k_1 - i - j} < k_1 \times q^{k_1/2} \Longrightarrow$$
$$i \times \left( \frac{1}{k_1} + (1 - q^{k_1}) \times \left( \frac{1}{i} + 1 - q^i \right) \right) + j \times \left( \frac{1}{k_1} + (1 - q^{k_1}) \times \left( \frac{1}{j} + 1 - q^j \right) \right) + (k_1 - i - j) \times \left( \frac{1}{k_1} + (1 - q^{k_1}) \times \left( \frac{1}{k_1 - i - j} + 1 - q^{k_1 - i - j} \right) \right) > k_1 \times \left( \frac{1}{k_1} + (1 - q^{k_1}) \times \left( \frac{2}{k_1} + 1 - q^{k_1/2} \right) \right) \Longrightarrow$$
$$\frac{i \times E_2(i) + j \times E_2(j) + (k_1 - i - j) \times E_2(k_1 - i - j)}{k_1} - \frac{k_1 \times E_2(k_1/2)}{k_1} > 0. \quad (20)$$

The proof is completed

Therefore, if every group which contains $k_1$ objects ($k_1$ is an even number and $k_1 > 1$) in the 1st layer needs to be subdivided into three subgroups, Scheme 16 is more efficient than Scheme 17.

The experimental results in Table 6 verify the above conclusion. Let $p = 0.004$ and $k_1 = 16$. When every subgroup contains 8 objects in the 2nd layer, the average expectation of classification times reaches its minimum value for each object. In Table 6, the header row stands for the number of objects in the first group of the 2nd layer, the first column stands for the number of objects in the second group, and each cell holds the corresponding average expectation of classification times. For example, the cell at $(1, 1)$, which equals 7.7143, expresses that the numbers of objects in the three groups are 1, 1, and 14, respectively, and the average classification times for each object are $E_2 = 0.077143$ in the 2nd layer.

Table 6: The changes of the average expectation with different numbers of objects in the three groups (expectation $\times 10^{-2}$).

Second \ First   1        2        3        4        5
1                7.7143   7.6786   7.6488   7.6250   7.6072
2                7.6786   7.6458   7.6189   7.5980   7.5831
3                7.6488   7.6189   7.5950   7.5770   7.5651
4                7.6250   7.5980   7.5770   7.5620   7.5530
5                7.6072   7.5831   7.5651   7.5530   7.5470
6                7.5953   7.5742   7.5591   7.5500   7.5470
7                7.5894   7.5712   7.5591   7.5530   7.5530
8                7.5894   7.5742   7.5651   7.5620   7.5651
9                7.5953   7.5831   7.5770   7.5770   7.5980

(c) When an abnormal group contains $k_1$ (even) objects and needs to be further granulated into the 2nd layer, Scheme 16 is still the most efficient.

There are two granulation schemes: in Scheme 18, the abnormal groups of the 1st layer are arbitrarily subdivided into $s$ ($s < k_1$) subgroups, and in Scheme 16, the abnormal groups of the 1st layer are evenly subdivided into two subgroups.

Scheme 18. Suppose an abnormal group will be subdivided into $s$ ($s < k_1$) subgroups. The first subgroup has $x_1$ ($1 \le x_1 < k_1/2$) objects, and the average expectation of classification times for each object is $E_2(x_1)$; the second subgroup has $x_2$ ($1 \le x_2 < k_1/2$) objects, and the average expectation of classification times for each object is $E_2(x_2)$; ...; the $i$th subgroup has $x_i$ ($1 \le x_i < k_1/2$) objects, and the average expectation of classification times for each object is $E_2(x_i)$; ...; the $s$th subgroup has $x_s$ ($1 \le x_s < k_1/2$) objects, and the average expectation of classification times for each object is $E_2(x_s)$. Hence the average expectation of classification times for each object in the 2nd layer is shown as follows:

$$\frac{1}{k_1} \times \sum_{j=1}^{s} x_j \times E_2(x_j). \quad (21)$$

Similarly, in order to prove that Scheme 16 is more efficient than Scheme 18, we only need to prove the following inequality:

$$\frac{1}{k_1} \times \sum_{j=1}^{s} x_j \times E_2(x_j) - \frac{k_1 \times E_2(k_1/2)}{k_1} > 0 \quad (e^{-e^{-1}} < q < 1,\ k_1 > 1). \quad (22)$$

Proof. Let $g(x) = xq^x$ ($e^{-e^{-1}} < q < 1$); according to Lemma 7, we have

$$\frac{\sum_{i=1}^{s} g(x_i)}{2} < g\left(\frac{c}{2}\right) \Longrightarrow \frac{\sum_{i=1}^{s} x_i \times q^{x_i}}{2} < \frac{k_1}{2} \times q^{k_1/2} \Longrightarrow$$
$$\sum_{j=1}^{s} x_j \times \left( \frac{1}{k_1} + (1 - q^{k_1}) \times \left( \frac{1}{x_j} + 1 - q^{x_j} \right) \right) > k_1 \times \left( \frac{1}{k_1} + (1 - q^{k_1}) \times \left( \frac{2}{k_1} + 1 - q^{k_1/2} \right) \right) \Longrightarrow$$
$$\frac{1}{k_1} \times \sum_{j=1}^{s} x_j \times E_2(x_j) - \frac{k_1 \times E_2(k_1/2)}{k_1} > 0. \quad (23)$$

The proof is completed

Therefore, when every abnormal group which contains $k_1$ objects ($k_1$ is an even number and $k_1 > 1$) in the 1st layer needs to be granulated into many subgroups, Scheme 16 is more efficient than the other schemes.

(d) In a similar way, when every abnormal group which contains $k_1$ objects ($k_1$ is an odd number and $k_1 > 1$) in the 1st layer will be granulated into many subgroups, the best scheme is to subdivide every abnormal group uniformly into two subgroups; namely, each subgroup contains $(k_1 - 1)/2$ or $(k_1 + 1)/2$ objects in the 2nd layer.

Case 2 (granulating abnormal groups from the 1st layer to the $i$th layer).

Theorem 19. In the $i$th layer, if the number of objects in each abnormal group is more than 4, then the total classification times can be reduced by continuing to subdivide the abnormal groups into two subgroups which contain equal numbers of objects as far as possible. Namely, if each group contains $k_i$ objects in the $i$th layer, then each subgroup may contain $k_i/2$ or $(k_i - 1)/2$ or $(k_i + 1)/2$ objects in the $(i+1)$th layer.

Proof. In the multigranulation method, the number of objects in each subgroup of the next layer is determined by the number of objects in a group of the current layer. In other words, the number of objects in each subgroup of the $(i+1)$th layer is determined by the known number of objects in each group of the $i$th layer.

According to the recursive idea, the process of granulating an abnormal group from the $i$th layer into the $(i+1)$th layer is similar to that from the 1st layer into the 2nd layer. It is known that the most efficient way is to subdivide an abnormal group of the current layer as uniformly as possible into two subgroups of the next layer when granulating from the 1st layer into the 2nd layer. Therefore, the most efficient way is also to subdivide each abnormal group of the $i$th layer as uniformly as possible into two subgroups of the $(i+1)$th layer. The proof is completed.

Based on $k_1$, which is the optimum number of objects in each group of the 1st layer, the optimum number of granulation levels and the corresponding number of objects in each group can be obtained by Theorem 19. That is to say, $k_{i+1} = k_i/2$ (or $k_{i+1} = (k_i - 1)/2$, or $k_{i+1} = (k_i + 1)/2$), where $k_i$ ($k_i > 4$) is the number of objects in each abnormal group of the $i$th ($1 \le i \le s - 1$) layer and $s$ is the optimum number of granulation levels. Namely, in this multilevel granulation method, the final structure of granulating an abnormal group from the 1st layer to the last layer is similar to a binary tree, and the original space can be granulated into a structure which contains many binary trees.

Table 7: The best testing strategy in different layers with different prevalence rates.

p          E_i               (k_1, k_2, ..., k_i)
0.01       0.157649743271    (11, 5, 2)
0.001      0.034610328332    (32, 16, 8, 4)
0.0001     0.010508158027    (101, 50, 25, 12, 6, 3)
0.00001    0.003301655870    (317, 158, 79, 39, 19, 9, 4)
0.000001   0.001041044160    (1001, 500, 250, 125, 62, 31, 15, 7, 3)

According to Theorem 19, the multigranulation strategy can be used to solve the blood analysis case. Facing different prevalence rates, such as $p_1 = 0.01$, $p_2 = 0.001$, $p_3 = 0.0001$, $p_4 = 0.00001$, and $p_5 = 0.000001$, the best searching strategy gives the numbers of objects of each group in the different layers shown in Table 7 ($k_i$ stands for the number of objects in each group of the $i$th layer, and $E_i$ stands for the average expectation of classification times for each object).
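The layer-size sequences of Table 7 follow from Theorem 19 by repeated halving; the sketch below uses the floor choice $k_{i+1} = \lfloor k_i/2 \rfloor$, one of the options the theorem allows, and stops once a group holds at most 4 objects (`halving_sequence` is our name, not the paper's):

```python
def halving_sequence(k1):
    """Layer sizes per Theorem 19: halve (rounding down) until a group has <= 4 objects."""
    seq = [k1]
    while seq[-1] > 4:
        seq.append(seq[-1] // 2)
    return seq
```

For example, `halving_sequence(101)` returns `[101, 50, 25, 12, 6, 3]`, matching the Table 7 row for $p = 0.0001$; the other Table 7 rows are reproduced the same way from their $k_1$ values.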

Theorem 20. In the above multilevel granulation method, if $p$, the prevalence rate of a sickness (or the negative sample ratio in the domain), tends to 0, the average classification times for each object tend to $1/k_1$; in other words, the following equation always holds:

$$\lim_{p \to 0} E_i = \frac{1}{k_1}. \quad (24)$$

Proof. According to Definition 12, let $q = 1 - p$, $q \to 1$; we have

$$E_i = \frac{1}{k_1} + (1 - q^{k_1}) \times \left( \frac{1}{k_2} + (1 - q^{k_2}) \times \left( \frac{1}{k_3} + (1 - q^{k_3}) \times \left( \frac{1}{k_4} + (1 - q^{k_4}) \times \left( \cdots \times \left( \frac{1}{k_{i-1}} + (1 - q^{k_{i-1}}) \times \left( \frac{1}{k_i} + (1 - q^{k_i}) \right) \right) \cdots \right) \right) \right) \right). \quad (25)$$

According to Lemma 6, $k_1 = [1/(p + p^2/2)]$ or $k_1 = [1/(p + p^2/2)] + 1$. Then let

$$T = \frac{1}{k_2} + (1 - q^{k_2}) \times \left( \frac{1}{k_3} + (1 - q^{k_3}) \times \left( \frac{1}{k_4} + (1 - q^{k_4}) \times \left( \cdots \times \left( \frac{1}{k_{i-1}} + (1 - q^{k_{i-1}}) \times \left( \frac{1}{k_i} + (1 - q^{k_i}) \right) \right) \cdots \right) \right) \right). \quad (26)$$

Figure 3: The changing trends of $T$ and $E = (1/k_1)/E_i$ with $q$ (horizontal axis: $q$ from 0.75 to 1; vertical axis: 0 to 1).

Then $\lim_{q \to 1} T = 0$ and $\lim_{q \to 1} E_i = 1/k_1$. The proof is completed.

Let $E = (1/k_1)/E_i$. The changing trends of $T$ and $E$ with the variable $q$ are shown in Figure 3.
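Theorem 20 can also be checked numerically by evaluating the nested expectation of formula (25) for a fixed layer-size sequence while $p$ shrinks (a sketch; `expectation` is our name, and the nesting is our reading of (25)):

```python
def expectation(p, ks):
    """Evaluate the nested per-object expectation E_i of formula (25) for layer sizes ks."""
    q = 1.0 - p
    t = 1.0                       # innermost abnormal groups are tested one by one
    for k in reversed(ks):
        t = 1.0 / k + (1.0 - q**k) * t
    return t

ks = [32, 16, 8, 4]               # the Table 7 layer sizes for p = 0.001
# As p -> 0, expectation(p, ks) approaches 1/k1 = 1/32, in line with Theorem 20.
```

With `ks = [k1]` the same loop reduces to the single-level expectation $1/k_1 + (1 - q^{k_1})$, so the helper covers both the one-level and multilevel cases.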

3.5. Binary Classification of Multigranulation Searching Algorithm. In this paper, an efficient binary classification of multigranulation searching algorithm is proposed through discussing the best testing strategy of the blood analysis case. The algorithm is illustrated as follows.

Algorithm 21. Binary classification of multigranulation searching algorithm (BCMSA).

Input: A probability quotient space $Q = (U, 2^U, p)$.
Output: The average expectation of classification times for each object, $E$.

Step 1. Obtain $k_1$ based on Lemma 6; initialize $i = 1$, $j = 0$, and searching_numbers $= 0$.

Step 2. Randomly divide $U_{ij}$ into $s_i$ subgroups $U_{i1}, U_{i2}, \ldots, U_{is_i}$ ($s_i = N_{ij}/k_i$, where $N_{ij}$ stands for the number of objects in $U_{ij}$, and $U_{10} = U_1$).

Step 3. For $i$ to $\lfloor \log_2 N_{ij} \rfloor$ ($\lfloor \cdot \rfloor$ rounds down to the nearest integer):
    For $j$ to $s_i$:
        If Test($U_{ij}$) $> 0$ and $N_{ij} > 4$: searching_numbers $+ 1$; $U_{i+1} = U_{ij}$; $i + 1$; go to Step 2 (Test is the group searching method).
        If Test($U_{ij}$) $= 0$: searching_numbers $+ 1$; $i + 1$.
        If $N_{ij} \le 4$: go to Step 4.

Step 4. searching_numbers $+ \sum U_{ij}$; $E = \text{searching\_numbers}/N$.

Step 5. Return $E$.

The algorithm flowchart is shown in Figure 4.
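The steps above can be sketched as a runnable search loop (our Python rendering of Algorithm 21, not the authors' code: first-layer groups of size $k_1$, binary halving per Theorem 19, individual tests once an abnormal group holds at most 4 objects, and `group_abnormal` standing in for the Test classifier):

```python
def group_abnormal(samples, idxs):
    """Binary group classifier: abnormal iff the group holds at least one '0' (sick) sample."""
    return any(samples[i] == 0 for i in idxs)

def bcmsa(samples, k1):
    """Find all negative samples while counting classifications (sketch of Algorithm 21).
    k1 is the first-layer group size (chosen via Lemma 6 in the paper)."""
    tests = 0
    negatives = []

    def search(idxs):
        nonlocal tests
        tests += 1                            # one group classification
        if not group_abnormal(samples, idxs):
            return
        if len(idxs) <= 4:                    # small abnormal group: test one by one
            for i in idxs:
                tests += 1
                if samples[i] == 0:
                    negatives.append(i)
            return
        mid = len(idxs) // 2                  # halve as evenly as possible (Theorem 19)
        search(idxs[:mid])
        search(idxs[mid:])

    for start in range(0, len(samples), k1):
        search(list(range(start, min(start + k1, len(samples)))))
    return sorted(negatives), tests
```

On 1,000 samples containing two sick objects, for example, the search finds both while spending far fewer than 1,000 classifications.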


Figure 4: Flowchart of BCMSA (input $Q = (U, 2^U, p)$; initialize $k_1$, $i = 1$, $j = 0$, searching_numbers $= 0$; split $U_{ij}$ into $s_i = N_{ij}/k_i$ subgroups; test each subgroup, recursing while $N_{ij} > 4$; output $E$).

Complexity Analysis of Algorithm 21. In this algorithm, the best case is when the prevalence rate $p$ tends to 0: the classification times are $N \times E_i \approx N/k_1 \approx N \times (p + p^2/2)$, which tends to 1, so the time complexity is $O(1)$. The worst case is when $p$ tends to 0.3: the classification times tend to $N$, so the time complexity is $O(N)$.

4. Comparative Analysis on Experimental Results

In order to verify the efficiency of the proposed BCMSA, suppose there are two large domains, $N_1 = 1 \times 10^4$ and $N_2 = 1 \times 10^6$, and five different prevalence rates, $p_1 = 0.01$, $p_2 = 0.001$, $p_3 = 0.0001$, $p_4 = 0.00001$, and $p_5 = 0.000001$. In the experiment of the blood analysis case, the number "0" stands for a sick sample (negative sample) and "1" stands for a healthy sample (positive sample); $N$ numbers are then randomly generated, in which "0" appears with probability $p$ and "1" with probability $1 - p$, standing for all the domain objects. The binary classifier counts the "0" samples in a group (subgroup): if the count is more than 0, the group is tested to be abnormal, and if the count equals 0, the group is tested to be normal.

The experimental environment is 4 GB RAM, 2.5 GHz CPU, and Windows 8; the programming language is Python. The experimental results are shown in Table 8.

In Table 8, item "$p$" stands for the prevalence rate, item "Levels" stands for the granulation levels of the different methods, and item "$E(X)$" stands for the average expectation of classification times for each object. Item "$k_1$" stands for the number of objects in each group of the 1st layer. Item "ℓ" stands for the degree to which $E(X)$ is close to $1/k_1$. $N_1 = 1 \times 10^4$ and $N_2 = 1 \times 10^6$, respectively, stand for the numbers of objects of the two original domains. Items "Method 9" and "Method 10", respectively, stand for the efficiency improvement of Method 11 compared with Method 9 and with Method 10.

From Table 8, diagnosing all objects needs to expend 10000 classification times with Method 9 (the traditional method), 201 classification times with Method 10 (the single-level grouping method), and only 113 classification times with Method 11 (the multilevel grouping method) when $N_1 = 1 \times 10^4$ and $p = 0.0001$. Obviously, the proposed algorithm is more efficient than Methods 9 and 10, reducing the classification times by 98.89% and 47.33%, respectively. At the same time, as the probability $p$ gradually decreases, BCMSA becomes gradually more efficient than Method 10, and ℓ tends to 100%; that is to say, the average classification times for each object tend to $1/k_1$ in the BCMSA.

12 Mathematical Problems in Engineering

Table 8 Comparative result of efficiency among 3 kinds of methods

p          Levels         E(X)             k1     ℓ (%)    N1      N2        vs. Method 9   vs. Method 10
0.01       Single-level   0.19557083665    11     46.48    1944    195817    81.44%         —
0.01       2 levels       0.15764974327    11     57.67    1627    164235    83.58%         16.13%
0.001      Single-level   0.06275892424    32     49.79    633     62674     94.72%         —
0.001      4 levels       0.03461032833    32     90.29    413     41184     96.82%         34.62%
0.0001     Single-level   0.01995065634    101    49.62    201     19799     98.00%         —
0.0001     6 levels       0.01050815802    101    94.22    113     11212     98.89%         47.33%
0.00001    Single-level   0.00631957079    318    49.76    63.3    6325      99.37%         —
0.00001    7 levels       0.00330165587    318    95.26    33.3    3324      99.67%         47.75%
0.000001   Single-level   0.00199950067    1001   49.96    15      2001      99.80%         —
0.000001   9 levels       0.00104104416    1001   95.96    15      1022      99.89%         47.94%

Figure 5: Comparative analysis between the 2 kinds of methods (single-level granulation method versus multilevels granulation method; x-axis: prevalence rate, y-axis: average expectation of classification times for each object).

the BCMSA can save 0~50% of the classification times compared with Method 10. The efficiency of Method 10 (single-level granulation method) and Method 11 (multilevels granulation method) is shown in Figure 5, where the x-axis stands for the prevalence rate (or the negative sample rate) and the y-axis stands for the average expectation of classification times for each object.

In this paper, BCMSA is proposed, and it can greatly improve searching efficiency when dealing with complex searching problems. If there is a binary classifier which is valid not only for a single object but also for a group with many objects, the efficiency of searching all objects will be enhanced by BCMSA, as in the blood analysis case. At the same time, it may play an important role in promoting the development of granular computing. Of course, this algorithm also has some limitations. For example, if the prevalence rate of a sickness (or the occurrence rate of event A) satisfies p > 0.3, it has no advantage compared with the traditional method; in other words, the original problem need not be subdivided into many subproblems when p > 0.3. And when the prevalence rate of a sickness (or the negative sample rate in the domain) is unknown, this algorithm needs to be further improved so that it can adapt to the new environment.

5 Conclusions

With the development of intelligent computation, multigranulation computing has gradually become an important tool for processing complex problems. Specially, in the process of knowledge cognition, granulating a huge problem into lots of small subproblems means simplifying the original complex problem and dealing with these subproblems in different granularity spaces [64]. This hierarchical computing model is very effective for getting a complete solution or an approximate solution of the original problem due to its idea of divide and conquer. Recently, many scholars have paid attention to efficient searching algorithms based on granular computing theory. For example, a kind of algorithm for dealing with complex networks on the basis of the quotient space model was proposed by L. Zhang and B. Zhang [65]. In this paper, combining the hierarchical multigranulation computing model and the principle of probability statistics, a new efficient binary classification of multigranulation searching algorithm is established on the basis of the mathematical expectation of probability statistics, and this searching algorithm is constructed according to a recursive method in multigranulation spaces. Many experimental results have shown that the proposed method is effective and can save a lot of classification times. These results may promote the development of intelligent computation and speed up the application of multigranulation computing. However, this method also has some shortcomings. For example, on the one hand, this method has a strict limitation on the probability value of p, namely, p < 0.3; on the contrary, if p > 0.3, the proposed searching algorithm is probably not the most effective method, and improved methods need to be found. On the other hand, it needs a binary classifier which is valid not only for a single object but also for a group with many objects. In the end, with the decrease of the probability value of p (even as it infinitely approaches zero), the mathematical expectation of searching times for every object will gradually approach 1/k1. In our future research, we will focus on the issue of how to granulate the huge granule space without any probability value of each object, and try our best to establish a kind of effective searching algorithm for the case in which the probability of negative samples in the domain is unknown. We hope these researches can promote the development of artificial intelligence.


Competing Interests

The authors declare that they have no conflict of interests related to this work.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (no. 61472056) and the Natural Science Foundation of Chongqing of China (no. CSTC2013jjb40003).

References

[1] A. Gacek, "Signal processing and time series description: a perspective of computational intelligence and granular computing," Applied Soft Computing Journal, vol. 27, pp. 590–601, 2015.

[2] O. Hryniewicz and K. Kaczmarek, "Bayesian analysis of time series using granular computing approach," Applied Soft Computing, vol. 47, pp. 644–652, 2016.

[3] C. Liu, "Covering multi-granulation rough sets based on maximal descriptors," Information Technology Journal, vol. 13, no. 7, pp. 1396–1400, 2014.

[4] Z. Y. Li, "Covering-based multi-granulation decision-theoretic rough sets model," Journal of Lanzhou University, no. 2, pp. 245–250, 2014.

[5] Y. Y. Yao and Y. She, "Rough set models in multigranulation spaces," Information Sciences, vol. 327, pp. 40–56, 2016.

[6] J. Xu, Y. Zhang, D. Zhou et al., "Uncertain multi-granulation time series modeling based on granular computing and the clustering practice," Journal of Nanjing University, vol. 50, no. 1, pp. 86–94, 2014.

[7] Y. T. Guo, "Variable precision β multi-granulation rough sets based on limited tolerance relation," Journal of Minnan Normal University, no. 1, pp. 1–11, 2015.

[8] Y. Xu, J. H. Yang, and X. Ji, "Neighborhood multi-granulation rough set model based on double granulate criterion," Control and Decision, vol. 30, no. 8, pp. 1469–1478, 2015.

[9] L. A. Zadeh, "Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic," Fuzzy Sets and Systems, vol. 90, pp. 111–127, 1997.

[10] J. R. Hobbs, "Granularity," in Proceedings of the 9th International Joint Conference on Artificial Intelligence, Los Angeles, Calif, USA, 1985.

[11] L. Zhang and B. Zhang, "Theory of fuzzy quotient space (methods of fuzzy granular computing)," Journal of Software, vol. 14, no. 4, pp. 770–776, 2003.

[12] J. Li, C. Mei, W. Xu, and Y. Qian, "Concept learning via granular computing: a cognitive viewpoint," Information Sciences, vol. 298, no. 1, pp. 447–467, 2015.

[13] X. Hu, W. Pedrycz, and X. Wang, "Comparative analysis of logic operators: a perspective of statistical testing and granular computing," International Journal of Approximate Reasoning, vol. 66, pp. 73–90, 2015.

[14] M. G. C. A. Cimino, B. Lazzerini, F. Marcelloni, and W. Pedrycz, "Genetic interval neural networks for granular data regression," Information Sciences, vol. 257, pp. 313–330, 2014.

[15] P. Honko, "Upgrading a granular computing based data mining framework to a relational case," International Journal of Intelligent Systems, vol. 29, no. 5, pp. 407–438, 2014.

[16] M.-Y. Chen and B.-T. Chen, "A hybrid fuzzy time series model based on granular computing for stock price forecasting," Information Sciences, vol. 294, pp. 227–241, 2015.

[17] R. Al-Hmouz, W. Pedrycz, and A. Balamash, "Description and prediction of time series: a general framework of Granular Computing," Expert Systems with Applications, vol. 42, no. 10, pp. 4830–4839, 2015.

[18] M. Hilbert, "Big data for development: a review of promises and challenges," Social Science Electronic Publishing, vol. 34, no. 1, pp. 135–174, 2016.

[19] T. J. Sejnowski, S. P. Churchland, and J. A. Movshon, "Putting big data to good use in neuroscience," Nature Neuroscience, vol. 17, no. 11, pp. 1440–1441, 2014.

[20] G. George, M. R. Haas, and A. Pentland, "Big data and management," Academy of Management Journal, vol. 30, no. 2, pp. 39–52, 2014.

[21] M. Chen, S. Mao, and Y. Liu, "Big data: a survey," Mobile Networks and Applications, vol. 19, no. 2, pp. 171–209, 2014.

[22] X. Wu, X. Zhu, G. Q. Wu, and W. Ding, "Data mining with big data," IEEE Transactions on Knowledge & Data Engineering, vol. 26, no. 1, pp. 97–107, 2014.

[23] Y. Shuo and Y. Lin, "Decomposition of decision systems based on granular computing," in Proceedings of the IEEE International Conference on Granular Computing (GrC '11), pp. 590–595, Garden Villa, Kaohsiung, Taiwan, 2011.

[24] H. Hu and Z. Zhong, "Perception learning as granular computing," Natural Computation, vol. 3, pp. 272–276, 2008.

[25] Z.-H. Chen, Y. Zhang, and G. Xie, "Mining algorithm for concise decision rules based on granular computing," Control and Decision, vol. 30, no. 1, pp. 143–148, 2015.

[26] K. Kambatla, G. Kollias, V. Kumar, and A. Grama, "Trends in big data analytics," Journal of Parallel & Distributed Computing, vol. 74, no. 7, pp. 2561–2573, 2014.

[27] A. Katal, M. Wazid, and R. H. Goudar, "Big data: issues, challenges, tools and good practices," in Proceedings of the 6th International Conference on Contemporary Computing (IC3 '13), pp. 404–409, IEEE, New Delhi, India, August 2013.

[28] V. Cevher, S. Becker, and M. Schmidt, "Convex optimization for big data: scalable, randomized, and parallel algorithms for big data analytics," IEEE Signal Processing Magazine, vol. 31, no. 5, pp. 32–43, 2014.

[29] J. Fan, F. Han, and H. Liu, "Challenges of big data analysis," National Science Review, vol. 1, no. 2, pp. 293–314, 2014.

[30] Q. H. Zhang, K. Xu, and G. Y. Wang, "Fuzzy equivalence relation and its multigranulation spaces," Information Sciences, vol. 346-347, pp. 44–57, 2016.

[31] Z. Liu and Y. Hu, "Multi-granularity pattern ant colony optimization algorithm and its application in path planning," Journal of Central South University (Science and Technology), vol. 9, pp. 3713–3722, 2013.

[32] Q. H. Zhang, G. Y. Wang, and X. Q. Liu, "Hierarchical structure analysis of fuzzy quotient space," Pattern Recognition and Artificial Intelligence, vol. 21, no. 5, pp. 627–634, 2008.

[33] Z. C. Shi, Y. X. Xia, and J. Z. Zhou, "Discrete algorithm based on granular computing and its application," Computer Science, vol. 40, pp. 133–135, 2013.

[34] Y. P. Zhang, B. Luo, Y. Y. Yao, D. Q. Miao, L. Zhang, and B. Zhang, Quotient Space and Granular Computing: The Theory and Method of Problem Solving on Structured Problems, Science Press, Beijing, China, 2010.

[35] G. Y. Wang, Q. H. Zhang, and J. Hu, "A survey on the granular computing," Transactions on Intelligent Systems, vol. 6, no. 2, pp. 8–26, 2007.

[36] J. Jonnagaddala, R. T. Jue, and H. J. Dai, "Binary classification of Twitter posts for adverse drug reactions," in Proceedings of the Social Media Mining Shared Task Workshop at the Pacific Symposium on Biocomputing, pp. 4–8, Big Island, Hawaii, USA, 2016.

[37] M. Haungs, P. Sallee, and M. Farrens, "Branch transition rate: a new metric for improved branch classification analysis," in Proceedings of the International Symposium on High-Performance Computer Architecture (HPCA '00), pp. 241–250, 2000.

[38] R. W. Proctor and Y. S. Cho, "Polarity correspondence: a general principle for performance of speeded binary classification tasks," Psychological Bulletin, vol. 132, no. 3, pp. 416–442, 2006.

[39] T. H. Chow, P. Berkhin, E. Eneva et al., "Evaluating performance of binary classification systems," US Patent US 8554622 B2, 2013.

[40] D. G. Li, D. Q. Miao, D. X. Zhang, and H. Y. Zhang, "An overview of granular computing," Computer Science, vol. 9, pp. 1–12, 2005.

[41] X. Gang and L. Jing, "A review of the present studying state and prospect of granular computing," Journal of Software, vol. 3, pp. 5–10, 2011.

[42] L. X. Zhong, "The predication about optimal blood analyze method," Academic Forum of Nandu, vol. 6, pp. 70–71, 1996.

[43] X. Mingmin and S. Junli, "The mathematical proof of method of group blood test and a new formula in quest of optimum number in group," Journal of Sichuan Institute of Building Materials, vol. 1, pp. 97–104, 1986.

[44] B. Zhang and L. Zhang, "Discussion on future development of granular computing," Journal of Chongqing University of Posts and Telecommunications: Natural Science Edition, vol. 22, no. 5, pp. 538–540, 2010.

[45] A. Skowron, J. Stepaniuk, and R. Swiniarski, "Modeling rough granular computing based on approximation spaces," Information Sciences, vol. 184, no. 1, pp. 20–43, 2012.

[46] J. T. Yao, A. V. Vasilakos, and W. Pedrycz, "Granular computing: perspectives and challenges," IEEE Transactions on Cybernetics, vol. 43, no. 6, pp. 1977–1989, 2013.

[47] Y. Y. Yao, N. Zhang, D. Q. Miao, and F. F. Xu, "Set-theoretic approaches to granular computing," Fundamenta Informaticae, vol. 115, no. 2-3, pp. 247–264, 2012.

[48] H. Li and X. P. Ma, "Research on four-element model of granular computing," Computer Engineering and Applications, vol. 49, no. 4, pp. 9–13, 2013.

[49] J. Hu and C. Guan, "Granular computing model based on quantum computing theory," in Proceedings of the 10th International Conference on Computational Intelligence and Security, pp. 156–160, November 2014.

[50] Y. Shuo and Y. Lin, "Decomposition of decision systems based on granular computing," in Proceedings of the IEEE International Conference on Granular Computing (GrC '11), pp. 590–595, IEEE, Kaohsiung, Taiwan, November 2011.

[51] F. Li, J. Xie, and K. Xie, "Granular computing theory in the application of fault diagnosis," in Proceedings of the Chinese Control and Decision Conference (CCDC '08), pp. 595–597, July 2008.

[52] Q.-H. Zhang, Y.-K. Xing, and Y.-L. Zhou, "The incremental knowledge acquisition algorithm based on granular computing," Journal of Electronics and Information Technology, vol. 33, no. 2, pp. 435–441, 2011.

[53] Y. Zeng, Y. Y. Yao, and N. Zhong, "The knowledge search base on the granular structure," Computer Science, vol. 35, no. 3, pp. 194–196, 2008.

[54] G.-Y. Wang, Q.-H. Zhang, X.-A. Ma, and Q.-S. Yang, "Granular computing models for knowledge uncertainty," Journal of Software, vol. 22, no. 4, pp. 676–694, 2011.

[55] J. Li, Y. Ren, C. Mei, Y. Qian, and X. Yang, "A comparative study of multigranulation rough sets and concept lattices via rule acquisition," Knowledge-Based Systems, vol. 91, pp. 152–164, 2016.

[56] H.-L. Yang and Z.-L. Guo, "Multigranulation decision-theoretic rough sets in incomplete information systems," International Journal of Machine Learning & Cybernetics, vol. 6, no. 6, pp. 1005–1018, 2015.

[57] M. A. Waller and S. E. Fawcett, "Data science, predictive analytics, and big data: a revolution that will transform supply chain design and management," Journal of Business Logistics, vol. 34, no. 2, pp. 77–84, 2013.

[58] R. Kitchin, "The real-time city? Big data and smart urbanism," GeoJournal, vol. 79, no. 1, pp. 1–14, 2014.

[59] X. Dong and D. Srivastava, Big Data Integration, Morgan & Claypool, 2015.

[60] L. Zhang and B. Zhang, Theory and Applications of Problem Solving: Quotient Space Based Granular Computing (The Second Version), Tsinghua University Press, Beijing, China, 2007.

[61] L. Zhang and B. Zhang, "The quotient space theory of problem solving," in Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing, G. Wang, Q. Liu, Y. Yao, and A. Skowron, Eds., vol. 2639 of Lecture Notes in Computer Science, pp. 11–15, Springer, Berlin, Germany, 2003.

[62] J. Sheng, S. Q. Xie, and C. Y. Pan, Probability Theory and Mathematical Statistics, Higher Education Press, Beijing, China, 4th edition, 2008.

[63] L. Z. Zhang, X. Zhao, and Y. Ma, "The simple math demonstration and precise calculation method of the blood group test," Mathematics in Practice and Theory, vol. 22, pp. 143–146, 2010.

[64] J. Chen, S. Zhao, and Y. Zhang, "Hierarchical covering algorithm," Tsinghua Science & Technology, vol. 19, no. 1, pp. 76–81, 2014.

[65] L. Zhang and B. Zhang, "Dynamic quotient space model and its basic properties," Pattern Recognition and Artificial Intelligence, vol. 25, no. 2, pp. 181–185, 2012.




Table 5 The changes of average expectation with different objects number in two groups

(k21, k22)   (1, 15)    (2, 14)    (3, 13)    (4, 12)    (5, 11)    (6, 10)    (7, 9)     (8, 8)
E2           0.07367    0.07329    0.07297    0.07270    0.07249    0.07234    0.07225    0.07222

Proof. Let g(x) = xq^x (e^{-e^{-1}} < q < 1). According to Lemma 7, we have

\[
\begin{aligned}
&\frac{g(i)+g(k_1-i)}{2} < g\!\left(\frac{k_1}{2}\right) \Longrightarrow\\
&\frac{i q^{i}+(k_1-i)\,q^{k_1-i}}{2} < \frac{k_1}{2}\,q^{k_1/2} \Longrightarrow\\
&i q^{i}+(k_1-i)\,q^{k_1-i} < k_1\,q^{k_1/2} \Longrightarrow\\
&i\left(\frac{1}{k_1}+(1-q^{k_1})\left(1+\frac{1}{i}-q^{i}\right)\right)+(k_1-i)\left(\frac{1}{k_1}+(1-q^{k_1})\left(1+\frac{1}{k_1-i}-q^{k_1-i}\right)\right)\\
&\quad> k_1\left(\frac{1}{k_1}+(1-q^{k_1})\left(1+\frac{2}{k_1}-q^{k_1/2}\right)\right) \Longrightarrow\\
&\frac{i\,E_2(i)+(k_1-i)\,E_2(k_1-i)}{k_1}-\frac{k_1\,E_2(k_1/2)}{k_1} > 0.
\end{aligned}
\tag{17}
\]

The proof is completed

Therefore, if every group in the 1st layer has k1 (k1 is an even number and k1 > 1) objects that need to be subdivided into two subgroups, Scheme 16 is more efficient than Scheme 15.

The experimental results in Table 5 verify the above conclusion. Let p = 0.004 and k1 = 16. When each subgroup contains 8 objects in the 2nd layer, the expectation of classification times for each object reaches its minimum value, where k21 is the objects number of one subgroup in the 2nd layer, k22 is the objects number of the other subgroup, and E2 is the corresponding expectation of classification times for each object.
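Table 5 can be reproduced from the per-object two-level expectation used in the proof above, E2(x) = 1/k1 + (1 − q^{k1})(1 + 1/x − q^x) with q = 1 − p, for a subgroup of size x inside a 1st-layer group of size k1. A sketch (function names are ours):

```python
def e2(x, k1, p):
    """Per-object two-level expectation for an object falling in a 2nd-layer
    subgroup of size x of a 1st-layer group of size k1:
    E2(x) = 1/k1 + (1 - q**k1) * (1 + 1/x - q**x), with q = 1 - p."""
    q = 1.0 - p
    return 1.0 / k1 + (1.0 - q ** k1) * (1.0 + 1.0 / x - q ** x)

def e2_split(i, k1, p):
    """Average per-object expectation when the group of k1 objects is split
    into two subgroups of sizes i and k1 - i (the setting of Table 5)."""
    return (i * e2(i, k1, p) + (k1 - i) * e2(k1 - i, k1, p)) / k1

# Reproduce Table 5 (p = 0.004, k1 = 16): the even split (8, 8) is optimal.
vals = {i: e2_split(i, 16, 0.004) for i in range(1, 9)}
```

Running this gives the row of Table 5 to within rounding, with the minimum at the even split (8, 8), as Scheme 16 predicts.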

(b) If k1 is an even number, every group which contains k1 objects in the 1st layer can also be subdivided into three subgroups in the 2nd layer.

Scheme 17. In the 2nd layer, if the first subgroup has i (1 ≤ i < k1/2) objects, the average expectation of classification times for each object is E2(i); if the second subgroup has j (1 ≤ j < k1/2) objects, the average expectation of classification times for each object is E2(j); then the third subgroup has (k1 − i − j) objects, and the average expectation of classification times for each object is E2(k1 − i − j). So the average expectation of classification times for each object in the 2nd layer is as follows:

\[
\frac{i\,E_2(i)+j\,E_2(j)+(k_1-i-j)\,E_2(k_1-i-j)}{k_1}. \tag{18}
\]

Similarly, it is easy to prove that Scheme 16 is also more efficient than Scheme 17. In other words, we only need to prove the following inequality:

\[
\frac{i\,E_2(i)+j\,E_2(j)+(k_1-i-j)\,E_2(k_1-i-j)}{k_1}-\frac{k_1\,E_2(k_1/2)}{k_1} > 0 \quad (e^{-e^{-1}} < q < 1,\ k_1 > 1). \tag{19}
\]

Proof. Let g(x) = xq^x (e^{-e^{-1}} < q < 1). According to Lemma 7, we have

\[
\begin{aligned}
&\frac{g(t)+g(k_1-t)}{2} < g\!\left(\frac{k_1}{2}\right) \Longrightarrow\\
&\frac{t q^{t}+(k_1-t)\,q^{k_1-t}}{2} < \frac{k_1}{2}\,q^{k_1/2} \Longrightarrow\\
&\frac{i q^{i}+j q^{j}+(k_1-i-j)\,q^{k_1-i-j}}{2} < \frac{k_1}{2}\,q^{k_1/2} \Longrightarrow\\
&i q^{i}+j q^{j}+(k_1-i-j)\,q^{k_1-i-j} < k_1\,q^{k_1/2} \Longrightarrow\\
&i\left(\frac{1}{k_1}+(1-q^{k_1})\left(1+\frac{1}{i}-q^{i}\right)\right)+j\left(\frac{1}{k_1}+(1-q^{k_1})\left(1+\frac{1}{j}-q^{j}\right)\right)\\
&\quad+(k_1-i-j)\left(\frac{1}{k_1}+(1-q^{k_1})\left(1+\frac{1}{k_1-i-j}-q^{k_1-i-j}\right)\right)\\
&\quad> k_1\left(\frac{1}{k_1}+(1-q^{k_1})\left(1+\frac{2}{k_1}-q^{k_1/2}\right)\right) \Longrightarrow\\
&\frac{i\,E_2(i)+j\,E_2(j)+(k_1-i-j)\,E_2(k_1-i-j)}{k_1}-\frac{k_1\,E_2(k_1/2)}{k_1} > 0.
\end{aligned}
\tag{20}
\]

The proof is completed

Therefore, if every group which contains k1 (k1 is an even number and k1 > 1) objects needs to be subdivided into three subgroups in the 2nd layer, Scheme 16 is more efficient than Scheme 17.

The experimental results in Table 6 verify the above conclusion. Let p = 0.004 and k1 = 16; the average expectation of classification times reaches its minimum value for each object when the group is evenly subdivided into two subgroups of 8 objects. In Table 6, the first row stands for the objects number of the first group in the 2nd layer, the first column stands for the objects number of the second group, and the data stand for the corresponding average expectation of classification times. For example, (1, 1, 7.7143) expresses that the objects numbers

Mathematical Problems in Engineering 9

Table 6: The changes of average expectation with different objects number of three groups

Expectation (×10^-2); rows: first group size, columns: second group size

      1        2        3        4        5
1   7.7143   7.6786   7.6488   7.6250   7.6072
2   7.6786   7.6458   7.6189   7.5980   7.5831
3   7.6488   7.6189   7.5950   7.5770   7.5651
4   7.6250   7.5980   7.5770   7.5620   7.5530
5   7.6072   7.5831   7.5651   7.5530   7.5470
6   7.5953   7.5742   7.5591   7.5500   7.5470
7   7.5894   7.5712   7.5591   7.5530   7.5530
8   7.5894   7.5742   7.5651   7.5620   7.5651
9   7.5953   7.5831   7.5770   7.5770   7.5980

of the three groups are respectively 1, 1, and 14, and the average classification times for each object are E2 = 0.077143 in the 2nd layer.

(c) When an abnormal group contains k1 (even) objects and needs to be further granulated into the 2nd layer, Scheme 16 is still the most efficient.

There are two granulation schemes: Scheme 18, in which the abnormal groups in the 1st layer are randomly subdivided into s (s < k1) subgroups, and Scheme 16, in which the abnormal groups in the 1st layer are evenly subdivided into two subgroups.

Scheme 18. Suppose an abnormal group will be subdivided into s (s < k1) subgroups. The first subgroup has x1 (1 ≤ x1 < k1/2) objects, and the average expectation of classification times for each object is E2(x1); the second subgroup has x2 (1 ≤ x2 < k1/2) objects, with average expectation E2(x2); ...; the ith subgroup has xi (1 ≤ xi < k1/2) objects, with average expectation E2(xi); ...; the sth subgroup has xs (1 ≤ xs < k1/2) objects, with average expectation E2(xs). Hence the average expectation of classification times for each object in the 2nd layer is as follows:

\[
\frac{1}{k_1}\sum_{j=1}^{s} x_j\,E_2(x_j). \tag{21}
\]

Similarly, in order to prove that Scheme 16 is more efficient than Scheme 18, we only need to prove the following inequality:

\[
\frac{1}{k_1}\sum_{j=1}^{s} x_j\,E_2(x_j)-\frac{k_1\,E_2(k_1/2)}{k_1} > 0 \quad (e^{-e^{-1}} < q < 1,\ k_1 > 1). \tag{22}
\]

Proof. Let g(x) = xq^x (e^{-e^{-1}} < q < 1). According to Lemma 7, we have

\[
\begin{aligned}
&\frac{\sum_{i=1}^{s} g(x_i)}{2} < g\!\left(\frac{k_1}{2}\right) \Longrightarrow\\
&\frac{\sum_{i=1}^{s} x_i q^{x_i}}{2} < \frac{k_1}{2}\,q^{k_1/2} \Longrightarrow\\
&\sum_{j=1}^{s} x_j\left(\frac{1}{k_1}+(1-q^{k_1})\left(1+\frac{1}{x_j}-q^{x_j}\right)\right) > k_1\left(\frac{1}{k_1}+(1-q^{k_1})\left(1+\frac{2}{k_1}-q^{k_1/2}\right)\right) \Longrightarrow\\
&\frac{1}{k_1}\sum_{j=1}^{s} x_j\,E_2(x_j)-\frac{k_1\,E_2(k_1/2)}{k_1} > 0.
\end{aligned}
\tag{23}
\]

The proof is completed

Therefore, when every abnormal group which contains k1 (k1 is an even number and k1 > 1) objects in the 1st layer needs to be granulated into many subgroups, Scheme 16 is more efficient than the other schemes.

(d) In a similar way, when every abnormal group which contains k1 (k1 is an odd number and k1 > 1) objects in the 1st layer is granulated into many subgroups, the best scheme is to uniformly subdivide every abnormal group into two subgroups, namely, each subgroup contains (k1 − 1)/2 or (k1 + 1)/2 objects in the 2nd layer.

Case 2 (granulating abnormal groups from the 1st layer to the ith layer).

Theorem 19. In the ith layer, if the objects number of each abnormal group is more than 4, then the total classification times can be reduced by continuing to subdivide the abnormal groups into two subgroups which contain, as far as possible, equal numbers of objects. Namely, if each group contains k_i objects in the ith layer, then each subgroup may contain k_i/2 or (k_i − 1)/2 or (k_i + 1)/2 objects in the (i + 1)th layer.

Proof. In the multigranulation method, the objects number of each subgroup in the next layer is determined by the objects number of the group in the current layer; in other words, the objects number of each subgroup in the (i + 1)th layer is determined by the known objects number of each group in the ith layer.

According to the recursive idea, the process of granulating an abnormal group from the ith layer into the (i + 1)th layer is similar to that from the 1st layer into the 2nd layer. It is known that the most efficient way, when granulating an abnormal group from the 1st layer into the 2nd layer, is to subdivide it as uniformly as possible into two subgroups in the next layer. Therefore, the most efficient way is also to subdivide, as uniformly as possible, each abnormal group in the ith layer into two subgroups in the (i + 1)th layer. The proof is completed.

Based on k1, which is the optimum objects number of each group in the 1st layer, the optimum granulation levels


Table 7: The best testing strategy in different layers with the different prevalence rates

p           E_i                (k1, k2, ..., k_i)
0.01        0.157649743271     (11, 5, 2)
0.001       0.034610328332     (32, 16, 8, 4)
0.0001      0.010508158027     (101, 50, 25, 12, 6, 3)
0.00001     0.003301655870     (317, 158, 79, 39, 19, 9, 4)
0.000001    0.001041044160     (1001, 500, 250, 125, 62, 31, 15, 7, 3)

and the corresponding objects numbers of each group can be obtained by Theorem 19. That is to say, k_{i+1} = k_i/2 (or k_{i+1} = (k_i − 1)/2, or k_{i+1} = (k_i + 1)/2), where k_i (k_i > 4) is the objects number of each abnormal group in the ith layer (1 ≤ i ≤ s − 1) and s is the optimum number of granulation levels. Namely, in this multilevels granulation method, the final structure obtained by granulating an abnormal group from the 1st layer to the last layer is similar to a binary tree, and the original space can be granulated into a structure which contains many binary trees.

According to Theorem 19, the multigranulation strategy can be used to solve the blood analysis case. For the different prevalence rates p1 = 0.01, p2 = 0.001, p3 = 0.0001, p4 = 0.00001, and p5 = 0.000001, the best searching strategy, namely, the objects number of each group in the different layers, is shown in Table 7 (k_i stands for the objects number of each group in the ith layer and E_i stands for the average expectation of classification times for each object).
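The group-size sequences in Table 7 follow from Theorem 19 by repeatedly halving (rounding down) the optimal first-layer size k1 until a group no longer exceeds 4 objects. A small sketch (the function name is ours):

```python
def granulation_sequence(k1):
    """Group sizes per layer under Theorem 19: starting from the first-layer
    size k1, each abnormal group of size k > 4 is split into two near-equal
    subgroups, so the next layer's group size is floor(k / 2)."""
    ks = [k1]
    while ks[-1] > 4:
        ks.append(ks[-1] // 2)
    return ks
```

For example, `granulation_sequence(101)` returns [101, 50, 25, 12, 6, 3], matching the p = 0.0001 row of Table 7.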

Theorem 20. In the above multilevels granulation method, if p, which is the prevalence rate of a sickness (or the negative sample ratio in the domain), tends to 0, the average classification times for each object tend to 1/k1; in other words, the following equation always holds:

\[
\lim_{p \to 0} E_i = \frac{1}{k_1}. \tag{24}
\]

Proof. According to Definition 12, let q = 1 − p, so q → 1. We have

\[
E_i = \frac{1}{k_1}+(1-q^{k_1})\left(\frac{1}{k_2}+(1-q^{k_2})\left(\frac{1}{k_3}+(1-q^{k_3})\left(\cdots\left(\frac{1}{k_{i-1}}+(1-q^{k_{i-1}})\left(\frac{1}{k_i}+(1-q^{k_i})\right)\right)\cdots\right)\right)\right). \tag{25}
\]

According to Lemma 6, k1 = [1/(p + p²/2)] or k1 = [1/(p + p²/2)] + 1. Then let

\[
T = \frac{1}{k_2}+(1-q^{k_2})\left(\frac{1}{k_3}+(1-q^{k_3})\left(\cdots\left(\frac{1}{k_{i-1}}+(1-q^{k_{i-1}})\left(\frac{1}{k_i}+(1-q^{k_i})\right)\right)\cdots\right)\right). \tag{26}
\]

Figure 3: The changing trend of T and E with q (x-axis: q from 0.75 to 1; y-axis: T and E = (1/k1)/E_i from 0 to 1).

so lim_{q→1} T = 0 and lim_{q→1} E_i = 1/k1. The proof is completed.

Let E = (1/k1)/E_i. The changing trend of T and E with the variable q is shown in Figure 3.
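Theorem 20 can be checked numerically by evaluating formula (25) as printed; the sketch below (the function name is ours) implements the nested expression and illustrates that E_i approaches 1/k1 as p approaches 0:

```python
def multilevel_expectation(ks, p):
    """Average expectation of classification times per object, formula (25):
    E_i = 1/k1 + (1-q^k1)*(1/k2 + (1-q^k2)*( ... (1/ki + (1-q^ki)) ... )),
    with q = 1 - p and ks = (k1, k2, ..., ki). Evaluated inside-out."""
    q = 1.0 - p
    acc = 1.0  # innermost trailing term: 1/ki + (1 - q**ki) * 1
    for k in reversed(ks):
        acc = 1.0 / k + (1.0 - q ** k) * acc
    return acc
```

With a single level, `multilevel_expectation([101], 0.0001)` recovers the single-level value 0.01995065634 of Table 8, and for (32, 16, 8, 4) the result approaches 1/32 as p shrinks, as Theorem 20 states.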

3.5. Binary Classification of Multigranulation Searching Algorithm. In this paper, a kind of efficient binary classification of multigranulation searching algorithm is proposed through discussing the best testing strategy of the blood analysis case. The algorithm is illuminated as follows.

Algorithm 21. Binary classification of multigranulation searching algorithm (BCMSA).

Input: a probability quotient space Q = (U, 2^U, p).
Output: the average expectation of classification times for each object, E.

Step 1. Obtain k1 based on Lemma 6; initialize i = 1, j = 0, and searching_numbers = 0.

Step 2. Randomly divide U_ij into s_i subgroups U_i1, U_i2, ..., U_is_i (s_i = N_ij/k_i, where N_ij stands for the number of objects in U_ij, and U_10 = U_1).

Step 3. For i to ⌊log_2 N_ij⌋ (⌊log_2 N_ij⌋ rounds log_2 N_ij down to the nearest integer):
  For j to s_i:
    if Test(U_ij) > 0 and N_ij > 4, then searching_numbers + 1, U_{i+1} = U_ij, i + 1, and go to Step 2 (Test is the group searching method);
    if Test(U_ij) = 0, then searching_numbers + 1, i + 1;
    if N_ij ≤ 4, go to Step 4.

Step 4. searching_numbers + Σ U_ij; E = (searching_numbers + U_N)/N.

Step 5. Return E.

The algorithm flowchart is shown in Figure 4.

Mathematical Problems in Engineering 11

Figure 4: Flowchart of BCMSA.
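The loop in Steps 2–4 amounts to recursively halving each abnormal group and testing small abnormal groups object by object. A minimal runnable sketch, assuming the count-of-negatives group classifier described in the experiments (names and bookkeeping are ours; it returns the negatives found and the total classification times rather than $E$ itself):

```python
import random

def bcmsa(samples, k1):
    """Sketch of Algorithm 21: recursive binary-split group testing.

    samples -- list of 0/1 values, 0 = negative (sick), 1 = positive (healthy)
    k1      -- first-layer group size (from Lemma 6)
    """
    tests = 0
    found = 0
    # first layer: cut the domain into groups of k1 objects
    stack = [samples[i:i + k1] for i in range(0, len(samples), k1)]
    while stack:
        group = stack.pop()
        tests += 1                 # one call of the group classifier
        if 0 not in group:         # Test(U_ij) = 0: the group is normal
            continue
        if len(group) <= 4:        # small abnormal group: test one by one
            tests += len(group)
            found += group.count(0)
            continue
        mid = len(group) // 2      # split into two near-equal halves
        stack.append(group[:mid])
        stack.append(group[mid:])
    return found, tests

random.seed(0)
p, n = 0.0001, 10_000
domain = [0 if random.random() < p else 1 for _ in range(n)]
negatives, times = bcmsa(domain, 101)
print(negatives, times)
```

With a low prevalence rate, `times` stays close to $n/k_1$ rather than $n$, which is exactly the saving reported in Table 8.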

Complexity Analysis of Algorithm 21. In this algorithm, the best case occurs when the prevalence rate $p$ tends to 0: the classification times are $N \cdot E_i \approx N/k_1 \approx N \cdot (p + p^2/2)$, which tends to 1, so the time complexity is $O(1)$. The worst case occurs when $p$ tends to 0.3: the classification times tend to $N$, so the time complexity is $O(N)$.

4. Comparative Analysis on Experimental Results

In order to verify the efficiency of the proposed BCMSA, suppose there are two large domains, $N_1 = 1 \times 10^4$ and $N_2 = 100 \times 10^4$, and five different prevalence rates, $p_1 = 0.01$, $p_2 = 0.001$, $p_3 = 0.0001$, $p_4 = 0.00001$, and $p_5 = 0.000001$. In the blood analysis experiment, the number "0" stands for a sick sample (negative sample) and "1" stands for a healthy sample (positive sample); $N$ numbers are randomly generated, with "0" generated with probability $p$ and "1" with probability $1 - p$, standing for all the domain objects. The binary classifier counts the negative samples in a group (subgroup): if the count is more than 0, the group is tested to be abnormal, and if it equals 0, the group is tested to be normal.

The experimental environment is 4 GB RAM, a 2.5 GHz CPU, and the Windows 8 system; the programming language is Python. The experimental results are shown in Table 8.

In Table 8, item "$p$" stands for the prevalence rate, item "Levels" stands for the granulation levels of the different methods, and item "$E(X)$" stands for the average expectation of classification times for each object. Item "$k_1$" stands for the number of objects in each group in the 1st layer. Item "ℓ" stands for the degree to which $E(X)$ is close to $1/k_1$. $N_1 = 1 \times 10^4$ and $N_2 = 1 \times 10^6$ respectively stand for the numbers of objects of the two original domains. Items "Method 9" and "Method 10" respectively stand for the efficiency improvement of Method 11 over Method 9 and over Method 10.

From Table 8, diagnosing all objects needs 10000 classification times with Method 9 (the traditional method), 201 classification times with Method 10 (the single-level grouping method), and only 113 classification times with Method 11 (the multilevel grouping method) when $N_1 = 1 \times 10^4$ and $p = 0.0001$. Obviously, the proposed algorithm is more efficient than Method 9 and Method 10: the classification times can respectively be reduced by 98.89% and 47.33%. At the same time, as the probability $p$ gradually decreases, BCMSA becomes progressively more efficient than Method 10, and ℓ tends to 100%; that is to say, the average classification times for each object tend to $1/k_1$ in BCMSA. In addition,


Table 8: Comparative result of efficiency among 3 kinds of methods.

| p        | Levels       | E(X)          | k1   | ℓ (%) | N1   | N2     | Method 9 (%) | Method 10 (%) |
|----------|--------------|---------------|------|-------|------|--------|--------------|---------------|
| 0.01     | Single-level | 0.19557083665 | 11   | 46.48 | 1944 | 195817 | 81.44        | —             |
| 0.01     | 2 levels     | 0.15764974327 | 11   | 57.67 | 1627 | 164235 | 83.58        | 16.13         |
| 0.001    | Single-level | 0.06275892424 | 32   | 49.79 | 633  | 62674  | 94.72        | —             |
| 0.001    | 4 levels     | 0.03461032833 | 32   | 90.29 | 413  | 41184  | 96.82        | 34.62         |
| 0.0001   | Single-level | 0.01995065634 | 101  | 49.62 | 201  | 19799  | 98.00        | —             |
| 0.0001   | 6 levels     | 0.01050815802 | 101  | 94.22 | 113  | 11212  | 98.89        | 47.33         |
| 0.00001  | Single-level | 0.00631957079 | 318  | 49.76 | 63.3 | 6325   | 99.37        | —             |
| 0.00001  | 7 levels     | 0.00330165587 | 318  | 95.26 | 33.3 | 3324   | 99.67        | 47.75         |
| 0.000001 | Single-level | 0.00199950067 | 1001 | 49.96 | 15   | 2001   | 99.80        | —             |
| 0.000001 | 9 levels     | 0.00104104416 | 1001 | 95.96 | 15   | 1022   | 99.89        | 47.94         |
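The ℓ column can be recomputed from the other columns as ℓ = (1/k₁)/E(X), expressed as a percentage. A quick consistency check on a few rows (values copied from Table 8):

```python
# each tuple: (E(X), k1, printed ell in percent) from Table 8
rows = [
    (0.19557083665, 11, 46.48),   # p = 0.01, single-level
    (0.06275892424, 32, 49.79),   # p = 0.001, single-level
    (0.01050815802, 101, 94.22),  # p = 0.0001, 6 levels
]
for e_x, k1, ell in rows:
    # ell = (1/k1) / E(X), as a percentage
    print(round(100.0 / (k1 * e_x), 2), ell)
```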

Figure 5: Comparative analysis between 2 kinds of methods (single-level granulation versus multilevel granulation).

BCMSA can save 0–50% of the classification times compared with Method 10. The efficiency of Method 10 (the single-level granulation method) and Method 11 (the multilevel granulation method) is shown in Figure 5, where the x-axis stands for the prevalence rate (or the negative-sample rate) and the y-axis stands for the average expectation of classification times for each object.

In this paper, BCMSA is proposed, and it can greatly improve searching efficiency when dealing with complex searching problems. If there is a binary classifier which is valid not only for a single object but also for a group of many objects, the efficiency of searching all objects will be enhanced by BCMSA, as in the blood analysis case. At the same time, it may play an important role in promoting the development of granular computing. Of course, this algorithm also has some limitations. For example, if the prevalence rate of a sickness (or the occurrence rate of event $A$) satisfies $p > 0.3$, it has no advantage compared with the traditional method; in other words, the original problem need not be subdivided into many subproblems when $p > 0.3$. And when the prevalence rate of a sickness (or the negative-sample rate in the domain) is unknown, this algorithm needs to be further improved so that it can adapt to the new environment.
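The $p > 0.3$ limitation can be checked numerically from the single-level expectation $E(k) = 1/k + (1 - (1-p)^k)$, the formula behind the single-level rows of Table 8: slightly above $p = 0.3$, no group size $k$ beats one test per object.

```python
def single_level_cost(k, p):
    """Per-object expected tests with one level of groups of size k."""
    return 1.0 / k + (1.0 - (1.0 - p) ** k)

# at p = 0.31 every group size costs more than individual testing ...
print(min(single_level_cost(k, 0.31) for k in range(2, 200)))  # > 1
# ... while at p = 0.01 grouping clearly wins
print(single_level_cost(11, 0.01))  # ~ 0.1956
```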

5. Conclusions

With the development of intelligent computation, multigranulation computing has gradually become an important tool to process complex problems. Especially in the process of knowledge cognition, granulating a huge problem into lots of small subproblems means simplifying the original complex problem and dealing with these subproblems in different granularity spaces [64]. This hierarchical computing model is very effective for getting a complete or approximate solution of the original problem due to its idea of divide and conquer. Recently, many scholars have paid attention to efficient searching algorithms based on granular computing theory. For example, a kind of algorithm for dealing with complex networks on the basis of the quotient space model was proposed by L. Zhang and B. Zhang [65]. In this paper, combining the hierarchical multigranulation computing model and the principle of probability statistics, a new efficient binary classification of multigranulation searching algorithm is established on the basis of the mathematical expectation of probability statistics, and this searching algorithm is built by a recursive method in multigranulation spaces. Many experimental results have shown that the proposed method is effective and can save lots of classification times. These results may promote the development of intelligent computation and speed up the application of multigranulation computing. However, this method also has some shortcomings. On the one hand, it has a strict limitation on the probability value of $p$, namely $p < 0.3$; on the contrary, if $p > 0.3$, the proposed searching algorithm is probably not the most effective method, and improved methods need to be found. On the other hand, it needs a binary classifier which is valid not only for a single object but also for a group of many objects. In the end, with the decrease of the probability value of $p$ (even if it infinitely approaches zero), the mathematical expectation of searching times for every object will gradually approach $1/k_1$. In our future research, we will focus on how to granulate the huge granule space without any probability value of each object, and try our best to establish a kind of effective searching algorithm for the case where the probability of negative samples in the domain is unknown. We hope these researches can promote the development of artificial intelligence.


Competing Interests

The authors declare that they have no conflict of interests related to this work.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (no. 61472056) and the Natural Science Foundation of Chongqing of China (no. CSTC2013jjb40003).

References

[1] A. Gacek, "Signal processing and time series description: a perspective of computational intelligence and granular computing," Applied Soft Computing, vol. 27, pp. 590–601, 2015.

[2] O. Hryniewicz and K. Kaczmarek, "Bayesian analysis of time series using granular computing approach," Applied Soft Computing, vol. 47, pp. 644–652, 2016.

[3] C. Liu, "Covering multi-granulation rough sets based on maximal descriptors," Information Technology Journal, vol. 13, no. 7, pp. 1396–1400, 2014.

[4] Z. Y. Li, "Covering-based multi-granulation decision-theoretic rough sets model," Journal of Lanzhou University, no. 2, pp. 245–250, 2014.

[5] Y. Y. Yao and Y. She, "Rough set models in multigranulation spaces," Information Sciences, vol. 327, pp. 40–56, 2016.

[6] J. Xu, Y. Zhang, D. Zhou, et al., "Uncertain multi-granulation time series modeling based on granular computing and the clustering practice," Journal of Nanjing University, vol. 50, no. 1, pp. 86–94, 2014.

[7] Y. T. Guo, "Variable precision β multi-granulation rough sets based on limited tolerance relation," Journal of Minnan Normal University, no. 1, pp. 1–11, 2015.

[8] X. U. Yi, J. H. Yang, and J. I. Xia, "Neighborhood multi-granulation rough set model based on double granulate criterion," Control and Decision, vol. 30, no. 8, pp. 1469–1478, 2015.

[9] L. A. Zadeh, "Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic," Fuzzy Sets and Systems, vol. 90, pp. 111–127, 1997.

[10] J. R. Hobbs, "Granularity," in Proceedings of the 9th International Joint Conference on Artificial Intelligence, Los Angeles, Calif, USA, 1985.

[11] L. Zhang and B. Zhang, "Theory of fuzzy quotient space (methods of fuzzy granular computing)," Journal of Software, vol. 14, no. 4, pp. 770–776, 2003.

[12] J. Li, C. Mei, W. Xu, and Y. Qian, "Concept learning via granular computing: a cognitive viewpoint," Information Sciences, vol. 298, no. 1, pp. 447–467, 2015.

[13] X. Hu, W. Pedrycz, and X. Wang, "Comparative analysis of logic operators: a perspective of statistical testing and granular computing," International Journal of Approximate Reasoning, vol. 66, pp. 73–90, 2015.

[14] M. G. C. A. Cimino, B. Lazzerini, F. Marcelloni, and W. Pedrycz, "Genetic interval neural networks for granular data regression," Information Sciences, vol. 257, pp. 313–330, 2014.

[15] P. Honko, "Upgrading a granular computing based data mining framework to a relational case," International Journal of Intelligent Systems, vol. 29, no. 5, pp. 407–438, 2014.

[16] M.-Y. Chen and B.-T. Chen, "A hybrid fuzzy time series model based on granular computing for stock price forecasting," Information Sciences, vol. 294, pp. 227–241, 2015.

[17] R. Al-Hmouz, W. Pedrycz, and A. Balamash, "Description and prediction of time series: a general framework of Granular Computing," Expert Systems with Applications, vol. 42, no. 10, pp. 4830–4839, 2015.

[18] M. Hilbert, "Big data for development: a review of promises and challenges," Social Science Electronic Publishing, vol. 34, no. 1, pp. 135–174, 2016.

[19] T. J. Sejnowski, S. P. Churchland, and J. A. Movshon, "Putting big data to good use in neuroscience," Nature Neuroscience, vol. 17, no. 11, pp. 1440–1441, 2014.

[20] G. George, M. R. Haas, and A. Pentland, "Big data and management," Academy of Management Journal, vol. 30, no. 2, pp. 39–52, 2014.

[21] M. Chen, S. Mao, and Y. Liu, "Big data: a survey," Mobile Networks and Applications, vol. 19, no. 2, pp. 171–209, 2014.

[22] X. Wu, X. Zhu, G. Q. Wu, and W. Ding, "Data mining with big data," IEEE Transactions on Knowledge & Data Engineering, vol. 26, no. 1, pp. 97–107, 2014.

[23] Y. Shuo and Y. Lin, "Decomposition of decision systems based on granular computing," in Proceedings of the IEEE International Conference on Granular Computing (GrC '11), pp. 590–595, Garden Villa, Kaohsiung, Taiwan, 2011.

[24] H. Hu and Z. Zhong, "Perception learning as granular computing," Natural Computation, vol. 3, pp. 272–276, 2008.

[25] Z.-H. Chen, Y. Zhang, and G. Xie, "Mining algorithm for concise decision rules based on granular computing," Control and Decision, vol. 30, no. 1, pp. 143–148, 2015.

[26] K. Kambatla, G. Kollias, V. Kumar, and A. Grama, "Trends in big data analytics," Journal of Parallel & Distributed Computing, vol. 74, no. 7, pp. 2561–2573, 2014.

[27] A. Katal, M. Wazid, and R. H. Goudar, "Big data: issues, challenges, tools and good practices," in Proceedings of the 6th International Conference on Contemporary Computing (IC3 '13), pp. 404–409, IEEE, New Delhi, India, August 2013.

[28] V. Cevher, S. Becker, and M. Schmidt, "Convex optimization for big data: scalable, randomized, and parallel algorithms for big data analytics," IEEE Signal Processing Magazine, vol. 31, no. 5, pp. 32–43, 2014.

[29] J. Fan, F. Han, and H. Liu, "Challenges of big data analysis," National Science Review, vol. 1, no. 2, pp. 293–314, 2014.

[30] Q. H. Zhang, K. Xu, and G. Y. Wang, "Fuzzy equivalence relation and its multigranulation spaces," Information Sciences, vol. 346-347, pp. 44–57, 2016.

[31] Z. Liu and Y. Hu, "Multi-granularity pattern ant colony optimization algorithm and its application in path planning," Journal of Central South University (Science and Technology), vol. 9, pp. 3713–3722, 2013.

[32] Q. H. Zhang, G. Y. Wang, and X. Q. Liu, "Hierarchical structure analysis of fuzzy quotient space," Pattern Recognition and Artificial Intelligence, vol. 21, no. 5, pp. 627–634, 2008.

[33] Z. C. Shi, Y. X. Xia, and J. Z. Zhou, "Discrete algorithm based on granular computing and its application," Computer Science, vol. 40, pp. 133–135, 2013.

[34] Y. P. Zhang, B. Luo, Y. Y. Yao, D. Q. Miao, L. Zhang, and B. Zhang, Quotient Space and Granular Computing: The Theory and Method of Problem Solving on Structured Problems, Science Press, Beijing, China, 2010.

[35] G. Y. Wang, Q. H. Zhang, and J. Hu, "A survey on the granular computing," Transactions on Intelligent Systems, vol. 6, no. 2, pp. 8–26, 2007.

[36] J. Jonnagaddala, R. T. Jue, and H. J. Dai, "Binary classification of Twitter posts for adverse drug reactions," in Proceedings of the Social Media Mining Shared Task Workshop at the Pacific Symposium on Biocomputing, pp. 4–8, Big Island, Hawaii, USA, 2016.

[37] M. Haungs, P. Sallee, and M. Farrens, "Branch transition rate: a new metric for improved branch classification analysis," in Proceedings of the International Symposium on High-Performance Computer Architecture (HPCA '00), pp. 241–250, 2000.

[38] R. W. Proctor and Y. S. Cho, "Polarity correspondence: a general principle for performance of speeded binary classification tasks," Psychological Bulletin, vol. 132, no. 3, pp. 416–442, 2006.

[39] T. H. Chow, P. Berkhin, E. Eneva, et al., "Evaluating performance of binary classification systems," US Patent 8554622 B2, 2013.

[40] D. G. Li, D. Q. Miao, D. X. Zhang, and H. Y. Zhang, "An overview of granular computing," Computer Science, vol. 9, pp. 1–12, 2005.

[41] X. Gang and L. Jing, "A review of the present studying state and prospect of granular computing," Journal of Software, vol. 3, pp. 5–10, 2011.

[42] L. X. Zhong, "The predication about optimal blood analyze method," Academic Forum of Nandu, vol. 6, pp. 70–71, 1996.

[43] X. Mingmin and S. Junli, "The mathematical proof of method of group blood test and a new formula in quest of optimum number in group," Journal of Sichuan Institute of Building Materials, vol. 1, pp. 97–104, 1986.

[44] B. Zhang and L. Zhang, "Discussion on future development of granular computing," Journal of Chongqing University of Posts and Telecommunications: Natural Science Edition, vol. 22, no. 5, pp. 538–540, 2010.

[45] A. Skowron, J. Stepaniuk, and R. Swiniarski, "Modeling rough granular computing based on approximation spaces," Information Sciences, vol. 184, no. 1, pp. 20–43, 2012.

[46] J. T. Yao, A. V. Vasilakos, and W. Pedrycz, "Granular computing: perspectives and challenges," IEEE Transactions on Cybernetics, vol. 43, no. 6, pp. 1977–1989, 2013.

[47] Y. Y. Yao, N. Zhang, D. Q. Miao, and F. F. Xu, "Set-theoretic approaches to granular computing," Fundamenta Informaticae, vol. 115, no. 2-3, pp. 247–264, 2012.

[48] H. Li and X. P. Ma, "Research on four-element model of granular computing," Computer Engineering and Applications, vol. 49, no. 4, pp. 9–13, 2013.

[49] J. Hu and C. Guan, "Granular computing model based on quantum computing theory," in Proceedings of the 10th International Conference on Computational Intelligence and Security, pp. 156–160, November 2014.

[50] Y. Shuo and Y. Lin, "Decomposition of decision systems based on granular computing," in Proceedings of the IEEE International Conference on Granular Computing (GrC '11), pp. 590–595, IEEE, Kaohsiung, Taiwan, November 2011.

[51] F. Li, J. Xie, and K. Xie, "Granular computing theory in the application of fault diagnosis," in Proceedings of the Chinese Control and Decision Conference (CCDC '08), pp. 595–597, July 2008.

[52] Q.-H. Zhang, Y.-K. Xing, and Y.-L. Zhou, "The incremental knowledge acquisition algorithm based on granular computing," Journal of Electronics and Information Technology, vol. 33, no. 2, pp. 435–441, 2011.

[53] Y. Zeng, Y. Y. Yao, and N. Zhong, "The knowledge search base on the granular structure," Computer Science, vol. 35, no. 3, pp. 194–196, 2008.

[54] G.-Y. Wang, Q.-H. Zhang, X.-A. Ma, and Q.-S. Yang, "Granular computing models for knowledge uncertainty," Journal of Software, vol. 22, no. 4, pp. 676–694, 2011.

[55] J. Li, Y. Ren, C. Mei, Y. Qian, and X. Yang, "A comparative study of multigranulation rough sets and concept lattices via rule acquisition," Knowledge-Based Systems, vol. 91, pp. 152–164, 2016.

[56] H.-L. Yang and Z.-L. Guo, "Multigranulation decision-theoretic rough sets in incomplete information systems," International Journal of Machine Learning & Cybernetics, vol. 6, no. 6, pp. 1005–1018, 2015.

[57] M. A. Waller and S. E. Fawcett, "Data science, predictive analytics, and big data: a revolution that will transform supply chain design and management," Journal of Business Logistics, vol. 34, no. 2, pp. 77–84, 2013.

[58] R. Kitchin, "The real-time city? Big data and smart urbanism," GeoJournal, vol. 79, no. 1, pp. 1–14, 2014.

[59] X. Dong and D. Srivastava, Big Data Integration, Morgan & Claypool, 2015.

[60] L. Zhang and B. Zhang, Theory and Applications of Problem Solving: Quotient Space Based Granular Computing, 2nd edition, Tsinghua University Press, Beijing, China, 2007.

[61] L. Zhang and B. Zhang, "The quotient space theory of problem solving," in Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing, G. Wang, Q. Liu, Y. Yao, and A. Skowron, Eds., vol. 2639 of Lecture Notes in Computer Science, pp. 11–15, Springer, Berlin, Germany, 2003.

[62] J. Sheng, S. Q. Xie, and C. Y. Pan, Probability Theory and Mathematical Statistics, 4th edition, Higher Education Press, Beijing, China, 2008.

[63] L. Z. Zhang, X. Zhao, and Y. Ma, "The simple math demonstration and precise calculation method of the blood group test," Mathematics in Practice and Theory, vol. 22, pp. 143–146, 2010.

[64] J. Chen, S. Zhao, and Y. Zhang, "Hierarchical covering algorithm," Tsinghua Science & Technology, vol. 19, no. 1, pp. 76–81, 2014.

[65] L. Zhang and B. Zhang, "Dynamic quotient space model and its basic properties," Pattern Recognition and Artificial Intelligence, vol. 25, no. 2, pp. 181–185, 2012.




119864119894 = 11198961 + (1 minus 1199021198961) times ( 11198962 + (1 minus 1199021198962) times ( 11198963 + (1 minus 1199021198963)times ( 11198964 + (1 minus 1199021198964)times (sdot sdot sdot times ( 1119896119894minus1 + (1 minus 119902119896119894minus1) times ( 1119896119894 + (1 minus 119902119896119894))) sdot sdot sdot))))

(25)

According to Lemma 6 1198961 = [1(119901+ (11990122))] or 1198961 = [1(119901+(11990122))] + 1 And then let

119879 = 11198962 + (1 minus 1199021198962) times ( 11198963 + (1 minus 1199021198963) times ( 11198964 + (1minus 1199021198964) times (sdot sdot sdottimes ( 1119896119894minus1 + (1 minus 119902119896119894minus1) times ( 1119896119894 + (1 minus 119902119896119894)))sdot sdot sdot)))

(26)

q

T

E

080 085 090 095 1075

02

04

06

08

1

1k

_1E

_i(

)

Figure 3 The changing trend about 119879 and 119864 with 119902

so lim119902rarr1119879119894 = 0 and lim119902rarr1119864119894 = 11198961 The proof iscompleted

Let 119864 = (11198961)119864119894 The changing trend between 119879 and 119864with the variable 119902 is shown in Figure 3

35 Binary Classification of Multigranulation Searching Algo-rithm In this paper a kind of efficient binary classificationof multigranulation searching algorithm is proposed throughdiscussing the best testing strategy of the blood analysis caseThe algorithm is illuminated as follows

Algorithm 21 Binary classification of multigranulationsearching algorithm (BCMSA)

Input A probability quotient space 119876 = (119880 2119880 119901)Output The average classification times expectation of eachobject 119864Step 1 1198961 will be obtained based on Lemma 6 119894 = 1 119895 = 0and searching numbers = 0 will be initializedStep 2 Randomly dividing 119880119894119895 into 119904119894 subgroups1198801198941 1198801198942 119880119894119904119894 (119904119894 = 119873119894119895119896119894 where 119873119894119895 stands for thenumber of objects in 119880119894119895 11988010 = 1198801)Step 3 For 119894 to lfloorlog1198731198941198952 rfloor (lfloorlog1198731198941198952 rfloor stands for log1198731198941198952 and willround down to the nearest integer which is less than it )

For 119895 to 119904119894If Test(119880119894119895) gt 0 and 119873119894119895 gt 4 Thensearching numbers + 1 119880119894+1 = 119880119894119895 119894 + 1 go toStep 2 (Test function is a searching method)If Test(119880119894119895) = 0 Then searching numbers + 1 119894 + 1If119873119894119895 le 4 Go to Step 4

Step 4 searching numbers+sum119880119894119895 119864 = (searching numbers+119880119873)119873

Step 5 Return 119864The algorithm flowchart is shown in Figure 4

Mathematical Problems in Engineering 11

Begin

Output E

End

TrueFalse

False

True

Input Q = (U 2U p)

k1 i = 1 j = 0

searching_numbers = 0

si = NijkiUij rarr Ui1 Ui2 middot middot middot Uis119894

Nij gt 4 Test(Uij) = 0 earching_numbers + 1

i + 1

earching_numbers + sumUij

E+ UN

N

earching_numbers=

s

s

s

Figure 4 Flowchart of BCMSA

Complexity Analysis of Algorithm 21. In this algorithm, the best case occurs when the prevalence rate p tends to 0: the classification times are N·E_i ≈ N/k1 ≈ N·(p + p²/2), which tends to 1, so the time complexity of computing is O(1). The worst case occurs when p tends to 0.3: the classification times tend to N, so the time complexity of computing is O(N).

4. Comparative Analysis of Experimental Results

In order to verify the efficiency of the proposed BCMSA, suppose there are two large domains, N1 = 1×10^4 and N2 = 100×10^4, and five different prevalence rates, p1 = 0.01, p2 = 0.001, p3 = 0.0001, p4 = 0.00001, and p5 = 0.000001. In the experiment of the blood analysis case, the number "0" stands for a sick sample (negative sample) and "1" stands for a healthy sample (positive sample); N numbers are then randomly generated, in which the probability of generating "0", denoted as p, and the probability of generating "1", denoted as 1 − p, stand for all the domain objects. The binary classifier sums all the numbers in a group (subgroup): if the sum is less than the number of objects in the group, the group contains at least one negative sample and is tested to be abnormal; if the sum equals the number of objects, the group is tested to be normal.
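The sampling scheme and sum-based group classifier just described can be mocked up as follows (a sketch: the names and the fixed seed are mine; a group is read as abnormal when its sum falls short of its size, i.e. when it contains at least one "0"):

```python
import random

def generate_domain(N, p, seed=42):
    """Each object is 0 (sick, negative sample) with probability p, else 1 (healthy)."""
    rng = random.Random(seed)
    return [0 if rng.random() < p else 1 for _ in range(N)]

def is_abnormal(group):
    """Group-level binary classifier: abnormal iff the sum is less than the group size."""
    return sum(group) < len(group)

N, p, k1 = 10**4, 0.0001, 101
domain = generate_domain(N, p)
groups = [domain[i:i + k1] for i in range(0, N, k1)]
# Method 10 (single-level) cost: one test per group, plus one retest per object
# of every abnormal group.
tests = len(groups) + sum(len(g) for g in groups if is_abnormal(g))
```

With these parameters the domain holds about one negative sample on average, so `tests` usually lands near the 201 classification times that Table 8 reports for Method 10 at p = 0.0001.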

The experimental environment is 4 GB RAM, a 2.5 GHz CPU, and the Windows 8 operating system; the programming language is Python, and the experimental results are shown in Table 8.

In Table 8, item "p" stands for the prevalence rate, item "Levels" stands for the granulation levels of the different methods, and item "E(X)" stands for the average expectation of classification times for each object. Item "k1" stands for the number of objects in each group in the 1st layer. Item "ℓ" stands for the degree to which E(X) is close to 1/k1. N1 = 1×10^4 and N2 = 1×10^6 respectively stand for the object numbers of the two original domains (the corresponding columns list the classification times needed on each). Items "Method 9" and "Method 10" respectively stand for the efficiency improvement of Method 11 compared with Method 9 and with Method 10.

From Table 8, when N1 = 1×10^4 and p = 0.0001, diagnosing all objects needs 10000 classification times with Method 9 (the traditional method), 201 classification times with Method 10 (the single-level grouping method), and only 113 classification times with Method 11 (the multilevel grouping method). Obviously, the proposed algorithm is more efficient than Methods 9 and 10, reducing the classification times by 98.89% and 47.33%, respectively. At the same time, as the probability p gradually decreases, BCMSA becomes progressively more efficient than Method 10 and ℓ tends to 100%; that is to say, the average classification times for each object tend to 1/k1 in the BCMSA.


Table 8: Comparative results of efficiency among 3 kinds of methods.

| p | Levels | E(X) | k1 | ℓ (%) | N1 = 10^4 | N2 = 10^6 | vs Method 9 (%) | vs Method 10 (%) |
|---|---|---|---|---|---|---|---|---|
| 0.01 | Single-level | 0.19557083665 | 11 | 46.48 | 1944 | 195817 | 81.44 | — |
| 0.01 | 2 levels | 0.15764974327 | 11 | 57.67 | 1627 | 164235 | 83.58 | 16.13 |
| 0.001 | Single-level | 0.06275892424 | 32 | 49.79 | 633 | 62674 | 94.72 | — |
| 0.001 | 4 levels | 0.03461032833 | 32 | 90.29 | 413 | 41184 | 96.82 | 34.62 |
| 0.0001 | Single-level | 0.01995065634 | 101 | 49.62 | 201 | 19799 | 98.00 | — |
| 0.0001 | 6 levels | 0.01050815802 | 101 | 94.22 | 113 | 11212 | 98.89 | 47.33 |
| 0.00001 | Single-level | 0.00631957079 | 318 | 49.76 | 63.3 | 6325 | 99.37 | — |
| 0.00001 | 7 levels | 0.00330165587 | 318 | 95.26 | 33.3 | 3324 | 99.67 | 47.75 |
| 0.000001 | Single-level | 0.00199950067 | 1001 | 49.96 | 15 | 2001 | 99.80 | — |
| 0.000001 | 9 levels | 0.00104104416 | 1001 | 95.96 | 15 | 1022 | 99.89 | 47.94 |
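The single-level rows of Table 8 can be reproduced from the classical two-stage expectation E(k) = 1/k + 1 − (1 − p)^k: one group test shared by k objects, plus an individual retest of everyone in an abnormal group, which happens with probability 1 − (1 − p)^k. Minimizing this numerically (a sketch; the helper names are mine) recovers the table's k1 and E(X) columns:

```python
def expected_tests(p, k):
    """Expected classification times per object for single-level groups of size k."""
    return 1.0 / k + 1.0 - (1.0 - p) ** k

def best_group_size(p, k_max=5000):
    """Group size minimizing the per-object expectation."""
    return min(range(2, k_max), key=lambda k: expected_tests(p, k))

assert best_group_size(0.01) == 11
assert abs(expected_tests(0.01, 11) - 0.19557083665) < 1e-9
assert best_group_size(0.0001) == 101
assert abs(expected_tests(0.0001, 101) - 0.01995065634) < 1e-9
```

For p = 0.01 this gives k1 = 11 and E(X) ≈ 0.19557, exactly the first single-level row of the table.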

Figure 5: Comparative analysis between 2 kinds of methods (single-level granulation method versus multilevels granulation method).

In addition, the BCMSA can save 0–50% of the classification times compared with Method 10. The efficiency of Method 10 (single-level granulation method) and Method 11 (multilevels granulation method) is shown in Figure 5, where the x-axis stands for the prevalence rate (or the negative sample rate) and the y-axis stands for the average expectation of classification times for each object.

In this paper, BCMSA is proposed, and it can greatly improve searching efficiency when dealing with complex searching problems. If there is a binary classifier which is valid not only for a single object but also for a group of many objects, the efficiency of searching all objects will be enhanced by BCMSA, as in the blood analysis case. At the same time, it may play an important role in promoting the development of granular computing. Of course, this algorithm also has some limitations. For example, if the prevalence rate of a sickness (or the occurrence rate of event A) satisfies p > 0.3, it has no advantage over the traditional method; in other words, the original problem need not be subdivided into many subproblems when p > 0.3. And when the prevalence rate of a sickness (or the negative sample rate in the domain) is unknown, this algorithm needs to be further improved so that it can adapt to the new environment.
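The p < 0.3 cutoff quoted above can be checked against the same single-level expectation E(k) = 1/k + 1 − (1 − p)^k: grouping only pays off when some group size k needs fewer than one test per object on average, and under this model that stops being possible just above p = 0.3. A quick check (helper names are mine):

```python
def expected_tests(p, k):
    """Expected classification times per object for single-level groups of size k."""
    return 1.0 / k + 1.0 - (1.0 - p) ** k

def grouping_helps(p, k_max=200):
    """True iff some group size k >= 2 needs fewer than one test per object on average."""
    return any(expected_tests(p, k) < 1.0 for k in range(2, k_max))

assert grouping_helps(0.30)      # just below the threshold: k = 3 gives E ≈ 0.990
assert not grouping_helps(0.31)  # just above it, no group size beats one-by-one testing
```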

5. Conclusions

With the development of intelligent computation, multigranulation computing has gradually become an important tool for processing complex problems. Especially in the process of knowledge cognition, granulating a huge problem into lots of small subproblems means simplifying the original complex problem and dealing with these subproblems in different granularity spaces [64]. This hierarchical computing model is very effective for getting a complete or approximate solution of the original problem due to its idea of divide and conquer. Recently, many scholars have paid attention to efficient searching algorithms based on granular computing theory; for example, a kind of algorithm for dealing with complex networks on the basis of the quotient space model was proposed by L. Zhang and B. Zhang [65]. In this paper, combining the hierarchical multigranulation computing model and the principle of probability statistics, a new efficient binary classification multigranulation searching algorithm is established on the basis of the mathematical expectation of probability statistics, and this searching algorithm is formulated as a recursive method in multigranulation spaces. Many experimental results have shown that the proposed method is effective and can save lots of classification times. These results may promote the development of intelligent computation and speed up the application of multigranulation computing. However, this method also has some shortcomings. On the one hand, it places a strict limitation on the probability value of p, namely p < 0.3; if p > 0.3, the proposed searching algorithm is probably not the most effective method, and improved methods need to be found. On the other hand, it needs a binary classifier which is valid not only for a single object but also for a group of many objects. In the end, as the probability value of p decreases (even infinitely close to zero), the mathematical expectation of searching times for every object gradually approaches 1/k1. In our future research, we will focus on the issue of how to granulate the huge granule space without any probability value for each object, and try our best to establish a kind of effective searching algorithm for the setting in which the probability of negative samples in the domain is unknown. We hope these researches can promote the development of artificial intelligence.


Competing Interests

The authors declare that they have no conflict of interests related to this work.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (no. 61472056) and the Natural Science Foundation of Chongqing of China (no. CSTC2013jjb40003).

References

[1] A. Gacek, "Signal processing and time series description: a perspective of computational intelligence and granular computing," Applied Soft Computing Journal, vol. 27, pp. 590–601, 2015.

[2] O. Hryniewicz and K. Kaczmarek, "Bayesian analysis of time series using granular computing approach," Applied Soft Computing, vol. 47, pp. 644–652, 2016.

[3] C. Liu, "Covering multi-granulation rough sets based on maximal descriptors," Information Technology Journal, vol. 13, no. 7, pp. 1396–1400, 2014.

[4] Z. Y. Li, "Covering-based multi-granulation decision-theoretic rough sets model," Journal of Lanzhou University, no. 2, pp. 245–250, 2014.

[5] Y. Y. Yao and Y. She, "Rough set models in multigranulation spaces," Information Sciences, vol. 327, pp. 40–56, 2016.

[6] J. Xu, Y. Zhang, D. Zhou et al., "Uncertain multi-granulation time series modeling based on granular computing and the clustering practice," Journal of Nanjing University, vol. 50, no. 1, pp. 86–94, 2014.

[7] Y. T. Guo, "Variable precision β multi-granulation rough sets based on limited tolerance relation," Journal of Minnan Normal University, no. 1, pp. 1–11, 2015.

[8] X. U. Yi, J. H. Yang, and J. I. Xia, "Neighborhood multi-granulation rough set model based on double granulate criterion," Control and Decision, vol. 30, no. 8, pp. 1469–1478, 2015.

[9] L. A. Zadeh, "Towards a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic," Fuzzy Sets and Systems, vol. 19, pp. 111–127, 1997.

[10] J. R. Hobbs, "Granularity," in Proceedings of the 9th International Joint Conference on Artificial Intelligence, Los Angeles, Calif, USA, 1985.

[11] L. Zhang and B. Zhang, "Theory of fuzzy quotient space (methods of fuzzy granular computing)," Journal of Software, vol. 14, no. 4, pp. 770–776, 2003.

[12] J. Li, C. Mei, W. Xu, and Y. Qian, "Concept learning via granular computing: a cognitive viewpoint," Information Sciences, vol. 298, no. 1, pp. 447–467, 2015.

[13] X. Hu, W. Pedrycz, and X. Wang, "Comparative analysis of logic operators: a perspective of statistical testing and granular computing," International Journal of Approximate Reasoning, vol. 66, pp. 73–90, 2015.

[14] M. G. C. A. Cimino, B. Lazzerini, F. Marcelloni, and W. Pedrycz, "Genetic interval neural networks for granular data regression," Information Sciences, vol. 257, pp. 313–330, 2014.

[15] P. Honko, "Upgrading a granular computing based data mining framework to a relational case," International Journal of Intelligent Systems, vol. 29, no. 5, pp. 407–438, 2014.

[16] M.-Y. Chen and B.-T. Chen, "A hybrid fuzzy time series model based on granular computing for stock price forecasting," Information Sciences, vol. 294, pp. 227–241, 2015.

[17] R. Al-Hmouz, W. Pedrycz, and A. Balamash, "Description and prediction of time series: a general framework of Granular Computing," Expert Systems with Applications, vol. 42, no. 10, pp. 4830–4839, 2015.

[18] M. Hilbert, "Big data for development: a review of promises and challenges," Social Science Electronic Publishing, vol. 34, no. 1, pp. 135–174, 2016.

[19] T. J. Sejnowski, S. P. Churchland, and J. A. Movshon, "Putting big data to good use in neuroscience," Nature Neuroscience, vol. 17, no. 11, pp. 1440–1441, 2014.

[20] G. George, M. R. Haas, and A. Pentland, "Big data and management," Academy of Management Journal, vol. 30, no. 2, pp. 39–52, 2014.

[21] M. Chen, S. Mao, and Y. Liu, "Big data: a survey," Mobile Networks and Applications, vol. 19, no. 2, pp. 171–209, 2014.

[22] X. Wu, X. Zhu, G. Q. Wu, and W. Ding, "Data mining with big data," IEEE Transactions on Knowledge & Data Engineering, vol. 26, no. 1, pp. 97–107, 2014.

[23] Y. Shuo and Y. Lin, "Decomposition of decision systems based on granular computing," in Proceedings of the IEEE International Conference on Granular Computing (GrC '11), pp. 590–595, Garden Villa, Kaohsiung, Taiwan, 2011.

[24] H. Hu and Z. Zhong, "Perception learning as granular computing," Natural Computation, vol. 3, pp. 272–276, 2008.

[25] Z.-H. Chen, Y. Zhang, and G. Xie, "Mining algorithm for concise decision rules based on granular computing," Control and Decision, vol. 30, no. 1, pp. 143–148, 2015.

[26] K. Kambatla, G. Kollias, V. Kumar, and A. Grama, "Trends in big data analytics," Journal of Parallel & Distributed Computing, vol. 74, no. 7, pp. 2561–2573, 2014.

[27] A. Katal, M. Wazid, and R. H. Goudar, "Big data: issues, challenges, tools and good practices," in Proceedings of the 6th International Conference on Contemporary Computing (IC3 '13), pp. 404–409, IEEE, New Delhi, India, August 2013.

[28] V. Cevher, S. Becker, and M. Schmidt, "Convex optimization for big data: scalable, randomized, and parallel algorithms for big data analytics," IEEE Signal Processing Magazine, vol. 31, no. 5, pp. 32–43, 2014.

[29] J. Fan, F. Han, and H. Liu, "Challenges of big data analysis," National Science Review, vol. 1, no. 2, pp. 293–314, 2014.

[30] Q. H. Zhang, K. Xu, and G. Y. Wang, "Fuzzy equivalence relation and its multigranulation spaces," Information Sciences, vol. 346-347, pp. 44–57, 2016.

[31] Z. Liu and Y. Hu, "Multi-granularity pattern ant colony optimization algorithm and its application in path planning," Journal of Central South University (Science and Technology), vol. 9, pp. 3713–3722, 2013.

[32] Q. H. Zhang, G. Y. Wang, and X. Q. Liu, "Hierarchical structure analysis of fuzzy quotient space," Pattern Recognition and Artificial Intelligence, vol. 21, no. 5, pp. 627–634, 2008.

[33] Z. C. Shi, Y. X. Xia, and J. Z. Zhou, "Discrete algorithm based on granular computing and its application," Computer Science, vol. 40, pp. 133–135, 2013.

[34] Y. P. Zhang, B. Luo, Y. Y. Yao, D. Q. Miao, L. Zhang, and B. Zhang, Quotient Space and Granular Computing: The Theory and Method of Problem Solving on Structured Problems, Science Press, Beijing, China, 2010.


[35] G. Y. Wang, Q. H. Zhang, and J. Hu, "A survey on the granular computing," Transactions on Intelligent Systems, vol. 6, no. 2, pp. 8–26, 2007.

[36] J. Jonnagaddala, R. T. Jue, and H. J. Dai, "Binary classification of Twitter posts for adverse drug reactions," in Proceedings of the Social Media Mining Shared Task Workshop at the Pacific Symposium on Biocomputing, pp. 4–8, Big Island, Hawaii, USA, 2016.

[37] M. Haungs, P. Sallee, and M. Farrens, "Branch transition rate: a new metric for improved branch classification analysis," in Proceedings of the International Symposium on High-Performance Computer Architecture (HPCA '00), pp. 241–250, 2000.

[38] R. W. Proctor and Y. S. Cho, "Polarity correspondence: a general principle for performance of speeded binary classification tasks," Psychological Bulletin, vol. 132, no. 3, pp. 416–442, 2006.

[39] T. H. Chow, P. Berkhin, E. Eneva et al., "Evaluating performance of binary classification systems," US Patent US 8554622 B2, 2013.

[40] D. G. Li, D. Q. Miao, D. X. Zhang, and H. Y. Zhang, "An overview of granular computing," Computer Science, vol. 9, pp. 1–12, 2005.

[41] X. Gang and L. Jing, "A review of the present studying state and prospect of granular computing," Journal of Software, vol. 3, pp. 5–10, 2011.

[42] L. X. Zhong, "The predication about optimal blood analyze method," Academic Forum of Nandu, vol. 6, pp. 70–71, 1996.

[43] X. Mingmin and S. Junli, "The mathematical proof of method of group blood test and a new formula in quest of optimum number in group," Journal of Sichuan Institute of Building Materials, vol. 1, pp. 97–104, 1986.

[44] B. Zhang and L. Zhang, "Discussion on future development of granular computing," Journal of Chongqing University of Posts and Telecommunications: Natural Science Edition, vol. 22, no. 5, pp. 538–540, 2010.

[45] A. Skowron, J. Stepaniuk, and R. Swiniarski, "Modeling rough granular computing based on approximation spaces," Information Sciences, vol. 184, no. 1, pp. 20–43, 2012.

[46] J. T. Yao, A. V. Vasilakos, and W. Pedrycz, "Granular computing: perspectives and challenges," IEEE Transactions on Cybernetics, vol. 43, no. 6, pp. 1977–1989, 2013.

[47] Y. Y. Yao, N. Zhang, D. Q. Miao, and F. F. Xu, "Set-theoretic approaches to granular computing," Fundamenta Informaticae, vol. 115, no. 2-3, pp. 247–264, 2012.

[48] H. Li and X. P. Ma, "Research on four-element model of granular computing," Computer Engineering and Applications, vol. 49, no. 4, pp. 9–13, 2013.

[49] J. Hu and C. Guan, "Granular computing model based on quantum computing theory," in Proceedings of the 10th International Conference on Computational Intelligence and Security, pp. 156–160, November 2014.

[50] Y. Shuo and Y. Lin, "Decomposition of decision systems based on granular computing," in Proceedings of the IEEE International Conference on Granular Computing (GrC '11), pp. 590–595, IEEE, Kaohsiung, Taiwan, November 2011.

[51] F. Li, J. Xie, and K. Xie, "Granular computing theory in the application of fault diagnosis," in Proceedings of the Chinese Control and Decision Conference (CCDC '08), pp. 595–597, July 2008.

[52] Q.-H. Zhang, Y.-K. Xing, and Y.-L. Zhou, "The incremental knowledge acquisition algorithm based on granular computing," Journal of Electronics and Information Technology, vol. 33, no. 2, pp. 435–441, 2011.

[53] Y. Zeng, Y. Y. Yao, and N. Zhong, "The knowledge search base on the granular structure," Computer Science, vol. 35, no. 3, pp. 194–196, 2008.

[54] G.-Y. Wang, Q.-H. Zhang, X.-A. Ma, and Q.-S. Yang, "Granular computing models for knowledge uncertainty," Journal of Software, vol. 22, no. 4, pp. 676–694, 2011.

[55] J. Li, Y. Ren, C. Mei, Y. Qian, and X. Yang, "A comparative study of multigranulation rough sets and concept lattices via rule acquisition," Knowledge-Based Systems, vol. 91, pp. 152–164, 2016.

[56] H.-L. Yang and Z.-L. Guo, "Multigranulation decision-theoretic rough sets in incomplete information systems," International Journal of Machine Learning & Cybernetics, vol. 6, no. 6, pp. 1005–1018, 2015.

[57] M. A. Waller and S. E. Fawcett, "Data science, predictive analytics, and big data: a revolution that will transform supply chain design and management," Journal of Business Logistics, vol. 34, no. 2, pp. 77–84, 2013.

[58] R. Kitchin, "The real-time city? Big data and smart urbanism," GeoJournal, vol. 79, no. 1, pp. 1–14, 2014.

[59] X. Dong and D. Srivastava, Big Data Integration, Morgan & Claypool, 2015.

[60] L. Zhang and B. Zhang, Theory and Applications of Problem Solving: Quotient Space Based Granular Computing (The Second Version), Tsinghua University Press, Beijing, China, 2007.

[61] L. Zhang and B. Zhang, "The quotient space theory of problem solving," in Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing, G. Wang, Q. Liu, Y. Yao, and A. Skowron, Eds., vol. 2639 of Lecture Notes in Computer Science, pp. 11–15, Springer, Berlin, Germany, 2003.

[62] J. Sheng, S. Q. Xie, and C. Y. Pan, Probability Theory and Mathematical Statistics, Higher Education Press, Beijing, China, 4th edition, 2008.

[63] L. Z. Zhang, X. Zhao, and Y. Ma, "The simple math demonstration and precise calculation method of the blood group test," Mathematics in Practice and Theory, vol. 22, pp. 143–146, 2010.

[64] J. Chen, S. Zhao, and Y. Zhang, "Hierarchical covering algorithm," Tsinghua Science & Technology, vol. 19, no. 1, pp. 76–81, 2014.

[65] L. Zhang and B. Zhang, "Dynamic quotient space model and its basic properties," Pattern Recognition and Artificial Intelligence, vol. 25, no. 2, pp. 181–185, 2012.



[10] J R Hobbs ldquoGranularityrdquo in Proceedings of the 9th InternationalJoint Conference on Artificial Intelligence Los Angeles CalifUSA 1985

[11] L Zhang and B Zhang ldquoTheory of fuzzy quotient space (meth-ods of fuzzy granular computing)rdquo Journal of Software vol14 no 4 pp 770ndash776 2003

[12] J Li C MeiW Xu and Y Qian ldquoConcept learning via granularcomputing a cognitive viewpointrdquo Information Sciences vol298 no 1 pp 447ndash467 2015

[13] X HuW Pedrycz and XWang ldquoComparative analysis of logicoperators a perspective of statistical testing and granular com-putingrdquo International Journal of Approximate Reasoning vol66 pp 73ndash90 2015

[14] M G C A Cimino B Lazzerini FMarcelloni andW PedryczldquoGenetic interval neural networks for granular data regressionrdquoInformation Sciences vol 257 pp 313ndash330 2014

[15] P Honko ldquoUpgrading a granular computing based data miningframework to a relational caserdquo International Journal of Intelli-gent Systems vol 29 no 5 pp 407ndash438 2014

[16] M-Y Chen and B-T Chen ldquoA hybrid fuzzy time series modelbased on granular computing for stock price forecastingrdquoInformation Sciences vol 294 pp 227ndash241 2015

[17] R Al-Hmouz W Pedrycz and A Balamash ldquoDescription andprediction of time series a general framework of GranularComputingrdquo Expert Systems with Applications vol 42 no 10pp 4830ndash4839 2015

[18] M Hilbert ldquoBig data for development a review of promises andchallengesrdquo Social Science Electronic Publishing vol 34 no 1 pp135ndash174 2016

[19] T J Sejnowski S P Churchland and J A Movshon ldquoPuttingbig data to good use in neurosciencerdquoNature Neuroscience vol17 no 11 pp 1440ndash1441 2014

[20] G George M R Haas and A Pentland ldquoBIG DATA andmanagementrdquo Academy of Management Journal vol 30 no 2pp 39ndash52 2014

[21] M Chen S Mao and Y Liu ldquoBig data a surveyrdquo MobileNetworks and Applications vol 19 no 2 pp 171ndash209 2014

[22] X Wu X Zhu G Q Wu and W Ding ldquoData mining with bigdatardquo IEEE Transactions on Knowledge ampData Engineering vol26 no 1 pp 97ndash107 2014

[23] Y Shuo and Y Lin ldquoDecomposition of decision systems basedon granular computingrdquo inProceedings of the IEEE InternationalConference on Granular Computing (GrC rsquo11) pp 590ndash595Garden Villa Kaohsiung Taiwan 2011

[24] H Hu and Z Zhong ldquoPerception learning as granular comput-ingrdquo Natural Computation vol 3 pp 272ndash276 2008

[25] Z-H Chen Y Zhang and G Xie ldquoMining algorithm forconcise decision rules based on granular computingrdquo Controland Decision vol 30 no 1 pp 143ndash148 2015

[26] K Kambatla G Kollias V Kumar and A Grama ldquoTrends inbig data analyticsrdquo Journal of Parallel amp Distributed Computingvol 74 no 7 pp 2561ndash2573 2014

[27] A Katal M Wazid and R H Goudar ldquoBig data issueschallenges tools and good practicesrdquo in Proceedings of the 6thInternational Conference on Contemporary Computing (IC3 rsquo13)pp 404ndash409 IEEE New Delhi India August 2013

[28] V Cevher S Becker andM Schmidt ldquoConvex optimization forbig data scalable randomized and parallel algorithms for bigdata analyticsrdquo IEEE Signal Processing Magazine vol 31 no 5pp 32ndash43 2014

[29] J Fan F Han and H Liu ldquoChallenges of big data analysisrdquoNational Science Review vol 1 no 2 pp 293ndash314 2014

[30] QH Zhang K Xu andGYWang ldquoFuzzy equivalence relationand its multigranulation spacesrdquo Information Sciences vol 346-347 pp 44ndash57 2016

[31] Z Liu and Y Hu ldquoMulti-granularity pattern ant colony opti-mization algorithm and its application in path planningrdquoJournal of Central South University (Science and Technology)vol 9 pp 3713ndash3722 2013

[32] Q H Zhang G YWang and X Q Liu ldquoHierarchical structureanalysis of fuzzy quotient spacerdquo Pattern Recognition andArtificial Intelligence vol 21 no 5 pp 627ndash634 2008

[33] Z C Shi Y X Xia and J Z Zhou ldquoDiscrete algorithm basedon granular computing and its applicationrdquo Computer Sciencevol 40 pp 133ndash135 2013

[34] Y P Zhang B luo Y Y Yao D Q Miao L Zhang and BZhang Quotient Space and Granular Computing The Theoryand Method of Problem Solving on Structured Problems SciencePress Beijing China 2010

14 Mathematical Problems in Engineering

[35] G Y Wang Q H Zhang and J Hu ldquoA survey on the granularcomputingrdquo Transactions on Intelligent Systems vol 6 no 2 pp8ndash26 2007

[36] J Jonnagaddala R T Jue and H J Dai ldquoBinary classificationof Twitter posts for adverse drug reactionsrdquo in Proceedings ofthe Social Media Mining Shared Task Workshop at the PacificSymposium on Biocomputing pp 4ndash8 Big Island Hawaii USA2016

[37] M Haungs P Sallee and M Farrens ldquoBranch transition rate anew metric for improved branch classification analysisrdquo in Pro-ceedings of the International Symposium on High-PerformanceComputer Architecture (HPCA rsquo00) pp 241ndash250 2000

[38] RW Proctor and Y S Cho ldquoPolarity correspondence a generalprinciple for performance of speeded binary classificationtasksrdquo Psychological Bulletin vol 132 no 3 pp 416ndash442 2006

[39] T H Chow P Berkhin E Eneva et al ldquoEvaluating performanceof binary classification systemsrdquo US US 8554622 B2 2013

[40] DG LiDQMiaoDX Zhang andHY Zhang ldquoAnoverviewof granular computingrdquoComputer Science vol 9 pp 1ndash12 2005

[41] X Gang and L Jing ldquoA review of the present studying state andprospect of granular computingrdquo Journal of Software vol 3 pp5ndash10 2011

[42] L X Zhong ldquoThe predication about optimal blood analyzemethodrdquo Academic Forum of Nandu vol 6 pp 70ndash71 1996

[43] X Mingmin and S Junli ldquoThe mathematical proof of methodof group blood test and a new formula in quest of optimumnumber in grouprdquo Journal of Sichuan Institute of BuildingMaterials vol 01 pp 97ndash104 1986

[44] B Zhang and L Zhang ldquoDiscussion on future development ofgranular computingrdquo Journal of Chongqing University of Postsand Telecommunications Natural Science Edition vol 22 no 5pp 538ndash540 2010

[45] A Skowron J Stepaniuk and R Swiniarski ldquoModeling roughgranular computing based on approximation spacesrdquo Informa-tion Sciences vol 184 no 1 pp 20ndash43 2012

[46] J T Yao A V Vasilakos andW Pedrycz ldquoGranular computingperspectives and challengesrdquo IEEE Transactions on Cyberneticsvol 43 no 6 pp 1977ndash1989 2013

[47] Y Y Yao N Zhang D Q Miao and F F Xu ldquoSet-theoreticapproaches to granular computingrdquo Fundamenta Informaticaevol 115 no 2-3 pp 247ndash264 2012

[48] H Li andX PMa ldquoResearch on four-elementmodel of granularcomputingrdquoComputer Engineering andApplications vol 49 no4 pp 9ndash13 2013

[49] J Hu and C Guan ldquoGranular computingmodel based on quan-tum computing theoryrdquo in Proceedings of the 10th InternationalConference on Computational Intelligence and Security pp 156ndash160 November 2014

[50] Y Shuo and Y Lin ldquoDecomposition of decision systems basedon granular computingrdquo inProceedings of the IEEE InternationalConference on Granular Computing (GrC rsquo11) pp 590ndash595IEEE Kaohsiung Taiwan November 2011

[51] F Li J Xie and K Xie ldquoGranular computing theory in theapplicatiorlpf fault diagnosisrdquo in Proceedings of the ChineseControl and Decision Conference (CCDC rsquo08) pp 595ndash597 July2008

[52] Q-H Zhang Y-K Xing and Y-L Zhou ldquoThe incrementalknowledge acquisition algorithm based on granular comput-ingrdquo Journal of Electronics and Information Technology vol 33no 2 pp 435ndash441 2011

[53] Y Zeng Y Y Yao and N Zhong ldquoThe knowledge search baseon the granular structurerdquo Computer Science vol 35 no 3 pp194ndash196 2008

[54] G-Y Wang Q-H Zhang X-A Ma and Q-S Yang ldquoGranularcomputing models for knowledge uncertaintyrdquo Journal of Soft-ware vol 22 no 4 pp 676ndash694 2011

[55] J Li Y Ren C Mei Y Qian and X Yang ldquoA comparativestudy of multigranulation rough sets and concept lattices viarule acquisitionrdquoKnowledge-Based Systems vol 91 pp 152ndash1642016

[56] H-L Yang and Z-L Guo ldquoMultigranulation decision-theoreticrough sets in incomplete information systemsrdquo InternationalJournal of Machine Learning amp Cybernetics vol 6 no 6 pp1005ndash1018 2015

[57] M AWaller and S E Fawcett ldquoData science predictive analyt-ics and big data a revolution that will transform supply chaindesign and managementrdquo Journal of Business Logistics vol34 no 2 pp 77ndash84 2013

[58] R Kitchin ldquoThe real-time city Big data and smart urbanismrdquoGeoJournal vol 79 no 1 pp 1ndash14 2014

[59] X Dong and D Srivastava Big Data Integration Morgan ampClaypool 2015

[60] L Zhang andB ZhangTheory andApplications of ProblemSolv-ing Quotient Space Based Granular Computing (The SecondVersion) Tsinghua University Press Beijing China 2007

[61] L Zhang and B Zhang ldquoThe quotient space theory of problemsolvingrdquo in Rough Sets Fuzzy Sets Data Mining and GranularComputing G Wang Q Liu Y Yao and A Skowron Eds vol2639 of Lecture Notes in Computer Science pp 11ndash15 SpringerBerlin Germany 2003

[62] J Sheng S Q Xie and C Y Pan Probability Theory andMathematical Statistics Higher Education Press Beijing China4th edition 2008

[63] L Z Zhang X Zhao and Y Ma ldquoThe simple math demon-stration and rrecise calculation method of the blood grouptestrdquoMathematics in Practice and Theory vol 22 pp 143ndash146 2010

[64] J Chen S Zhao and Y Zhang ldquoHierarchical covering algo-rithmrdquo Tsinghua Science amp Technology vol 19 no 1 pp 76ndash812014

[65] L Zhang and B Zhang ldquoDynamic quotient space model and itsbasic propertiesrdquo Pattern Recognition and Artificial Intelligencevol 25 no 2 pp 181ndash185 2012

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 11: Research Article Binary Classification of Multigranulation Searching Algorithm …downloads.hindawi.com/journals/mpe/2016/9329812.pdf · 2019. 7. 30. · Research Article Binary Classification

Mathematical Problems in Engineering 11

Figure 4: Flowchart of BCMSA. Input: Q = (U, 2^U, p); output: E. While N_ij > 4, a granule U_ij is split into s_i subgranules U_i1, U_i2, ..., U_is_i, and each subgranule is classified in turn.

Complexity Analysis of Algorithm 2.1. In this algorithm, the best case is that the prevalence rate p tends to 0, where the classification time is N * E_i ≈ N/k1 and the time complexity of computing tends to O(1). The worst case is that p tends to 0.3, where the classification times tend to N, so the time complexity of computing is O(N).

4. Comparative Analysis on Experimental Results

In order to verify the efficiency of the proposed BCMSA, suppose there are two large domains, N1 = 1 x 10^4 and N2 = 100 x 10^4, and five different prevalence rates, p1 = 0.01, p2 = 0.001, p3 = 0.0001, p4 = 0.00001, and p5 = 0.000001. In the blood analysis experiment, the number "0" stands for a sick sample (negative sample) and "1" stands for a healthy sample (positive sample); N numbers are randomly generated, in which "0" is generated with probability p and "1" with probability 1 - p, and these numbers stand for all the domain objects. The binary classifier sums all numbers in a group (subgroup): if the sum is less than the group size (i.e., the group contains at least one "0"), the group is tested to be abnormal, and if the sum equals the group size, the group is tested to be normal.

The experimental environment is 4 GB RAM, a 2.5 GHz CPU, and the Windows 8 system; the programming language is Python. The experimental results are shown in Table 8.
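The protocol just described can be sketched in a few lines of Python. The paper does not list its driver code, so the function name and the retesting scheme below are illustrative assumptions for the single-level grouping method (one test per group, then one test per sample inside every abnormal group):

```python
import random

def simulate_single_level(N, p, k, seed=0):
    """Single-level grouping method for the blood analysis case:
    '0' = sick (negative) sample, '1' = healthy (positive) sample.
    Each group of k samples gets one classification; an abnormal
    group is then retested sample by sample."""
    rng = random.Random(seed)
    samples = [0 if rng.random() < p else 1 for _ in range(N)]
    tests, found = 0, 0
    for i in range(0, N, k):
        group = samples[i:i + k]
        tests += 1                    # one classification for the whole group
        if sum(group) < len(group):   # abnormal: at least one '0' in the group
            tests += len(group)       # retest every sample individually
            found += group.count(0)
    return tests, found, samples.count(0)

tests, found, sick = simulate_single_level(10_000, 0.01, 11, seed=42)
assert found == sick     # every negative sample is located
assert tests < 10_000    # far fewer classifications than one-by-one testing
```

With k = 11 (the first-layer group size Table 8 reports for p = 0.01), a seeded run lands near E(X) * N ≈ 1956 classifications rather than 10000.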

In Table 8, item "p" stands for the prevalence rate, item "Levels" stands for the granulation levels of the different methods, and item "E(X)" stands for the average expectation of classification times for each object. Item "k1" stands for the number of objects in each group in the 1st layer. Item "l" stands for the degree to which E(X) approaches 1/k1. N1 = 1 x 10^4 and N2 = 1 x 10^6 respectively stand for the numbers of objects in the two original domains. Items "Method 9" and "Method 10" respectively stand for the efficiency improvement of Method 11 compared with Method 9 and with Method 10.

From Table 8, diagnosing all objects needs 10000 classification times with Method 9 (the traditional method), 201 classification times with Method 10 (the single-level grouping method), and only 113 classification times with Method 11 (the multilevel grouping method) to confirm the testing results of all objects when N1 = 1 x 10^4 and p = 0.0001. Obviously, the proposed algorithm is more efficient than Method 9 and Method 10, and the classification times can be reduced by 98.89% and 47.33%, respectively. At the same time, as the probability p is gradually reduced, BCMSA gradually becomes more efficient than Method 10, and l tends to 100%; that is to say, the average classification time for each object tends to 1/k1 in the BCMSA. In addition,


Table 8: Comparative result of efficiency among 3 kinds of methods.

p | Levels | E(X) | k1 | l (%) | N1 | N2 | Method 9 | Method 10
0.01 | Single-level | 0.19557083665 | 11 | 46.48 | 1944 | 195817 | 81.44% | —
0.01 | 2 levels | 0.15764974327 | | 57.67 | 1627 | 164235 | 83.58% | 16.13%
0.001 | Single-level | 0.06275892424 | 32 | 49.79 | 633 | 62674 | 94.72% | —
0.001 | 4 levels | 0.03461032833 | | 90.29 | 413 | 41184 | 96.82% | 34.62%
0.0001 | Single-level | 0.01995065634 | 101 | 49.62 | 201 | 19799 | 98.00% | —
0.0001 | 6 levels | 0.01050815802 | | 94.22 | 113 | 11212 | 98.89% | 47.33%
0.00001 | Single-level | 0.00631957079 | 318 | 49.76 | 63.3 | 6325 | 99.37% | —
0.00001 | 7 levels | 0.00330165587 | | 95.26 | 33.3 | 3324 | 99.67% | 47.75%
0.000001 | Single-level | 0.00199950067 | 1001 | 49.96 | 15 | 2001 | 99.80% | —
0.000001 | 9 levels | 0.00104104416 | | 95.96 | 15 | 1022 | 99.89% | 47.94%
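The single-level E(X) values in Table 8 are consistent with the classical group-testing expectation E(X) = 1/k1 + 1 - (1 - p)^k1 (one group test shared by k1 objects, plus an individual retest of every member whenever the group is abnormal), which can be checked directly:

```python
def expected_tests_per_object(p, k):
    # Expected classifications per object for single-level grouping:
    # the shared group test (1/k) plus the probability the group is
    # abnormal (1 - (1 - p)**k), in which case each member is retested.
    return 1 / k + 1 - (1 - p) ** k

# (p, k1, E(X)) triples taken from the single-level rows of Table 8
for p, k, e in [(0.01, 11, 0.19557083665),
                (0.001, 32, 0.06275892424),
                (0.0001, 101, 0.01995065634),
                (0.000001, 1001, 0.00199950067)]:
    assert abs(expected_tests_per_object(p, k) - e) < 1e-8
```

The l column follows as (1/k1)/E(X); for example, for p = 0.01 and k1 = 11, (1/11)/0.19557 ≈ 46.48%.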

Figure 5: Comparative analysis between 2 kinds of methods (single-level granulation method versus multilevels granulation method).

the BCMSA can save 0~50% of the classification times compared with Method 10. The efficiency of Method 10 (the single-level granulation method) and Method 11 (the multilevels granulation method) is shown in Figure 5, where the x-axis stands for the prevalence rate (or the negative sample rate) and the y-axis stands for the average expectation of classification times for each object.

In this paper, BCMSA is proposed, and it can greatly improve searching efficiency when dealing with complex searching problems. If there is a binary classifier which is valid not only for a single object but also for a group with many objects, the efficiency of searching all objects will be enhanced by BCMSA, as in the blood analysis case. At the same time, it may play an important role in promoting the development of granular computing. Of course, this algorithm also has some limitations. For example, if the prevalence rate of a sickness (or the occurrence rate of event A) satisfies p > 0.3, it has no advantage compared with the traditional method; in other words, the original problem need not be subdivided into many subproblems when p > 0.3. And when the prevalence rate of a sickness (or the negative sample rate in the domain) is unknown, this algorithm needs to be further improved so that it can adapt to the new environment.
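When such a group-valid classifier exists, the multilevel idea can be sketched as a recursive screening procedure. The sketch below is an illustrative simplification, not the exact BCMSA: it splits each abnormal granule two ways instead of using the paper's optimal s_i-way partition, but it shows how recursion over granularity spaces locates every negative sample with far fewer classifications than testing one by one:

```python
import random

def group_is_normal(group):
    # Binary classifier from the blood analysis case: a group is normal
    # iff it contains no '0' (sick) sample, i.e., its sum equals its size.
    return sum(group) == len(group)

def screen(group, counter):
    """Recursively locate all '0' samples, counting classifier calls."""
    counter[0] += 1
    if group_is_normal(group):
        return [0] * len(group)          # no negatives in this granule
    if len(group) == 1:
        return [1]                       # isolated negative sample
    mid = len(group) // 2                # split the granule into subgranules
    return screen(group[:mid], counter) + screen(group[mid:], counter)

rng = random.Random(7)
N, p, k1 = 10_000, 0.0001, 101
samples = [0 if rng.random() < p else 1 for _ in range(N)]
counter, flags = [0], []
for i in range(0, N, k1):
    flags += screen(samples[i:i + k1], counter)
assert flags == [1 - s for s in samples]  # every negative sample is found
assert counter[0] < N                     # far fewer than N classifications
```

For p = 0.0001 a seeded run needs on the order of a hundred classifications for 10^4 objects, the same regime as the multilevel results reported in Table 8.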

5. Conclusions

With the development of intelligence computation, multigranulation computing has gradually become an important tool for processing complex problems. Especially in the process of knowledge cognition, granulating a huge problem into lots of small subproblems means simplifying the original complex problem and dealing with these subproblems in different granularity spaces [64]. This hierarchical computing model is very effective for getting a complete or approximate solution of the original problem owing to its idea of divide and conquer. Recently, many scholars have paid attention to efficient searching algorithms based on granular computing theory. For example, a kind of algorithm for dealing with complex networks on the basis of the quotient space model was proposed by L. Zhang and B. Zhang [65]. In this paper, combining the hierarchical multigranulation computing model and the principle of probability statistics, a new efficient binary classification of multigranulation searching algorithm is established on the basis of the mathematical expectation of probability statistics, and this searching algorithm is constructed by a recursive method in multigranulation spaces. Many experimental results have shown that the proposed method is effective and can save lots of classification times. These results may promote the development of intelligent computation and speed up the application of multigranulation computing.

However, this method also has some shortcomings. On the one hand, it places a strict limitation on the probability value p, namely, p < 0.3; conversely, if p > 0.3, the proposed searching algorithm is probably not the most effective method, and improved methods need to be found. On the other hand, it needs a binary classifier which is valid not only for a single object but also for a group with many objects. In the end, with the decrease of the probability value p (even as it infinitely approaches zero), the mathematical expectation of searching times for every object will gradually approach 1/k1. In our future research, we will focus on the issue of how to granulate the huge granule space without any probability value for each object, and we will try our best to establish an effective searching algorithm for the case in which the probability of negative samples in the domain is unknown. We hope these researches can promote the development of artificial intelligence.


Competing Interests

The authors declare that they have no competing interests related to this work.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (No. 61472056) and the Natural Science Foundation of Chongqing of China (No. CSTC2013jjb40003).

References

[1] A. Gacek, "Signal processing and time series description: a perspective of computational intelligence and granular computing," Applied Soft Computing Journal, vol. 27, pp. 590–601, 2015.
[2] O. Hryniewicz and K. Kaczmarek, "Bayesian analysis of time series using granular computing approach," Applied Soft Computing, vol. 47, pp. 644–652, 2016.
[3] C. Liu, "Covering multi-granulation rough sets based on maximal descriptors," Information Technology Journal, vol. 13, no. 7, pp. 1396–1400, 2014.
[4] Z. Y. Li, "Covering-based multi-granulation decision-theoretic rough sets model," Journal of Lanzhou University, no. 2, pp. 245–250, 2014.
[5] Y. Y. Yao and Y. She, "Rough set models in multigranulation spaces," Information Sciences, vol. 327, pp. 40–56, 2016.
[6] J. Xu, Y. Zhang, D. Zhou et al., "Uncertain multi-granulation time series modeling based on granular computing and the clustering practice," Journal of Nanjing University, vol. 50, no. 1, pp. 86–94, 2014.
[7] Y. T. Guo, "Variable precision β multi-granulation rough sets based on limited tolerance relation," Journal of Minnan Normal University, no. 1, pp. 1–11, 2015.
[8] X. U. Yi, J. H. Yang, and J. I. Xia, "Neighborhood multi-granulation rough set model based on double granulate criterion," Control and Decision, vol. 30, no. 8, pp. 1469–1478, 2015.
[9] L. A. Zadeh, "Towards a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic," Fuzzy Sets and Systems, vol. 19, pp. 111–127, 1997.
[10] J. R. Hobbs, "Granularity," in Proceedings of the 9th International Joint Conference on Artificial Intelligence, Los Angeles, Calif, USA, 1985.
[11] L. Zhang and B. Zhang, "Theory of fuzzy quotient space (methods of fuzzy granular computing)," Journal of Software, vol. 14, no. 4, pp. 770–776, 2003.
[12] J. Li, C. Mei, W. Xu, and Y. Qian, "Concept learning via granular computing: a cognitive viewpoint," Information Sciences, vol. 298, no. 1, pp. 447–467, 2015.
[13] X. Hu, W. Pedrycz, and X. Wang, "Comparative analysis of logic operators: a perspective of statistical testing and granular computing," International Journal of Approximate Reasoning, vol. 66, pp. 73–90, 2015.
[14] M. G. C. A. Cimino, B. Lazzerini, F. Marcelloni, and W. Pedrycz, "Genetic interval neural networks for granular data regression," Information Sciences, vol. 257, pp. 313–330, 2014.
[15] P. Honko, "Upgrading a granular computing based data mining framework to a relational case," International Journal of Intelligent Systems, vol. 29, no. 5, pp. 407–438, 2014.
[16] M.-Y. Chen and B.-T. Chen, "A hybrid fuzzy time series model based on granular computing for stock price forecasting," Information Sciences, vol. 294, pp. 227–241, 2015.
[17] R. Al-Hmouz, W. Pedrycz, and A. Balamash, "Description and prediction of time series: a general framework of Granular Computing," Expert Systems with Applications, vol. 42, no. 10, pp. 4830–4839, 2015.
[18] M. Hilbert, "Big data for development: a review of promises and challenges," Social Science Electronic Publishing, vol. 34, no. 1, pp. 135–174, 2016.
[19] T. J. Sejnowski, S. P. Churchland, and J. A. Movshon, "Putting big data to good use in neuroscience," Nature Neuroscience, vol. 17, no. 11, pp. 1440–1441, 2014.
[20] G. George, M. R. Haas, and A. Pentland, "Big data and management," Academy of Management Journal, vol. 30, no. 2, pp. 39–52, 2014.
[21] M. Chen, S. Mao, and Y. Liu, "Big data: a survey," Mobile Networks and Applications, vol. 19, no. 2, pp. 171–209, 2014.
[22] X. Wu, X. Zhu, G. Q. Wu, and W. Ding, "Data mining with big data," IEEE Transactions on Knowledge & Data Engineering, vol. 26, no. 1, pp. 97–107, 2014.
[23] Y. Shuo and Y. Lin, "Decomposition of decision systems based on granular computing," in Proceedings of the IEEE International Conference on Granular Computing (GrC '11), pp. 590–595, Garden Villa, Kaohsiung, Taiwan, 2011.
[24] H. Hu and Z. Zhong, "Perception learning as granular computing," Natural Computation, vol. 3, pp. 272–276, 2008.
[25] Z.-H. Chen, Y. Zhang, and G. Xie, "Mining algorithm for concise decision rules based on granular computing," Control and Decision, vol. 30, no. 1, pp. 143–148, 2015.
[26] K. Kambatla, G. Kollias, V. Kumar, and A. Grama, "Trends in big data analytics," Journal of Parallel & Distributed Computing, vol. 74, no. 7, pp. 2561–2573, 2014.
[27] A. Katal, M. Wazid, and R. H. Goudar, "Big data: issues, challenges, tools and good practices," in Proceedings of the 6th International Conference on Contemporary Computing (IC3 '13), pp. 404–409, IEEE, New Delhi, India, August 2013.
[28] V. Cevher, S. Becker, and M. Schmidt, "Convex optimization for big data: scalable, randomized, and parallel algorithms for big data analytics," IEEE Signal Processing Magazine, vol. 31, no. 5, pp. 32–43, 2014.
[29] J. Fan, F. Han, and H. Liu, "Challenges of big data analysis," National Science Review, vol. 1, no. 2, pp. 293–314, 2014.
[30] Q. H. Zhang, K. Xu, and G. Y. Wang, "Fuzzy equivalence relation and its multigranulation spaces," Information Sciences, vol. 346-347, pp. 44–57, 2016.
[31] Z. Liu and Y. Hu, "Multi-granularity pattern ant colony optimization algorithm and its application in path planning," Journal of Central South University (Science and Technology), vol. 9, pp. 3713–3722, 2013.
[32] Q. H. Zhang, G. Y. Wang, and X. Q. Liu, "Hierarchical structure analysis of fuzzy quotient space," Pattern Recognition and Artificial Intelligence, vol. 21, no. 5, pp. 627–634, 2008.
[33] Z. C. Shi, Y. X. Xia, and J. Z. Zhou, "Discrete algorithm based on granular computing and its application," Computer Science, vol. 40, pp. 133–135, 2013.
[34] Y. P. Zhang, B. Luo, Y. Y. Yao, D. Q. Miao, L. Zhang, and B. Zhang, Quotient Space and Granular Computing: The Theory and Method of Problem Solving on Structured Problems, Science Press, Beijing, China, 2010.
[35] G. Y. Wang, Q. H. Zhang, and J. Hu, "A survey on the granular computing," Transactions on Intelligent Systems, vol. 6, no. 2, pp. 8–26, 2007.
[36] J. Jonnagaddala, R. T. Jue, and H. J. Dai, "Binary classification of Twitter posts for adverse drug reactions," in Proceedings of the Social Media Mining Shared Task Workshop at the Pacific Symposium on Biocomputing, pp. 4–8, Big Island, Hawaii, USA, 2016.
[37] M. Haungs, P. Sallee, and M. Farrens, "Branch transition rate: a new metric for improved branch classification analysis," in Proceedings of the International Symposium on High-Performance Computer Architecture (HPCA '00), pp. 241–250, 2000.
[38] R. W. Proctor and Y. S. Cho, "Polarity correspondence: a general principle for performance of speeded binary classification tasks," Psychological Bulletin, vol. 132, no. 3, pp. 416–442, 2006.
[39] T. H. Chow, P. Berkhin, E. Eneva et al., "Evaluating performance of binary classification systems," US Patent US 8554622 B2, 2013.
[40] D. G. Li, D. Q. Miao, D. X. Zhang, and H. Y. Zhang, "An overview of granular computing," Computer Science, vol. 9, pp. 1–12, 2005.
[41] X. Gang and L. Jing, "A review of the present studying state and prospect of granular computing," Journal of Software, vol. 3, pp. 5–10, 2011.
[42] L. X. Zhong, "The predication about optimal blood analyze method," Academic Forum of Nandu, vol. 6, pp. 70–71, 1996.
[43] X. Mingmin and S. Junli, "The mathematical proof of method of group blood test and a new formula in quest of optimum number in group," Journal of Sichuan Institute of Building Materials, vol. 01, pp. 97–104, 1986.
[44] B. Zhang and L. Zhang, "Discussion on future development of granular computing," Journal of Chongqing University of Posts and Telecommunications: Natural Science Edition, vol. 22, no. 5, pp. 538–540, 2010.
[45] A. Skowron, J. Stepaniuk, and R. Swiniarski, "Modeling rough granular computing based on approximation spaces," Information Sciences, vol. 184, no. 1, pp. 20–43, 2012.
[46] J. T. Yao, A. V. Vasilakos, and W. Pedrycz, "Granular computing: perspectives and challenges," IEEE Transactions on Cybernetics, vol. 43, no. 6, pp. 1977–1989, 2013.
[47] Y. Y. Yao, N. Zhang, D. Q. Miao, and F. F. Xu, "Set-theoretic approaches to granular computing," Fundamenta Informaticae, vol. 115, no. 2-3, pp. 247–264, 2012.
[48] H. Li and X. P. Ma, "Research on four-element model of granular computing," Computer Engineering and Applications, vol. 49, no. 4, pp. 9–13, 2013.
[49] J. Hu and C. Guan, "Granular computing model based on quantum computing theory," in Proceedings of the 10th International Conference on Computational Intelligence and Security, pp. 156–160, November 2014.
[50] Y. Shuo and Y. Lin, "Decomposition of decision systems based on granular computing," in Proceedings of the IEEE International Conference on Granular Computing (GrC '11), pp. 590–595, IEEE, Kaohsiung, Taiwan, November 2011.
[51] F. Li, J. Xie, and K. Xie, "Granular computing theory in the application of fault diagnosis," in Proceedings of the Chinese Control and Decision Conference (CCDC '08), pp. 595–597, July 2008.
[52] Q.-H. Zhang, Y.-K. Xing, and Y.-L. Zhou, "The incremental knowledge acquisition algorithm based on granular computing," Journal of Electronics and Information Technology, vol. 33, no. 2, pp. 435–441, 2011.
[53] Y. Zeng, Y. Y. Yao, and N. Zhong, "The knowledge search base on the granular structure," Computer Science, vol. 35, no. 3, pp. 194–196, 2008.
[54] G.-Y. Wang, Q.-H. Zhang, X.-A. Ma, and Q.-S. Yang, "Granular computing models for knowledge uncertainty," Journal of Software, vol. 22, no. 4, pp. 676–694, 2011.
[55] J. Li, Y. Ren, C. Mei, Y. Qian, and X. Yang, "A comparative study of multigranulation rough sets and concept lattices via rule acquisition," Knowledge-Based Systems, vol. 91, pp. 152–164, 2016.
[56] H.-L. Yang and Z.-L. Guo, "Multigranulation decision-theoretic rough sets in incomplete information systems," International Journal of Machine Learning & Cybernetics, vol. 6, no. 6, pp. 1005–1018, 2015.
[57] M. A. Waller and S. E. Fawcett, "Data science, predictive analytics, and big data: a revolution that will transform supply chain design and management," Journal of Business Logistics, vol. 34, no. 2, pp. 77–84, 2013.
[58] R. Kitchin, "The real-time city? Big data and smart urbanism," GeoJournal, vol. 79, no. 1, pp. 1–14, 2014.
[59] X. Dong and D. Srivastava, Big Data Integration, Morgan & Claypool, 2015.
[60] L. Zhang and B. Zhang, Theory and Applications of Problem Solving: Quotient Space Based Granular Computing (The Second Version), Tsinghua University Press, Beijing, China, 2007.
[61] L. Zhang and B. Zhang, "The quotient space theory of problem solving," in Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing, G. Wang, Q. Liu, Y. Yao, and A. Skowron, Eds., vol. 2639 of Lecture Notes in Computer Science, pp. 11–15, Springer, Berlin, Germany, 2003.
[62] J. Sheng, S. Q. Xie, and C. Y. Pan, Probability Theory and Mathematical Statistics, Higher Education Press, Beijing, China, 4th edition, 2008.
[63] L. Z. Zhang, X. Zhao, and Y. Ma, "The simple math demonstration and precise calculation method of the blood group test," Mathematics in Practice and Theory, vol. 22, pp. 143–146, 2010.
[64] J. Chen, S. Zhao, and Y. Zhang, "Hierarchical covering algorithm," Tsinghua Science & Technology, vol. 19, no. 1, pp. 76–81, 2014.
[65] L. Zhang and B. Zhang, "Dynamic quotient space model and its basic properties," Pattern Recognition and Artificial Intelligence, vol. 25, no. 2, pp. 181–185, 2012.


12 Mathematical Problems in Engineering

Table 8: Comparative result of efficiency among 3 kinds of methods (columns ℓ, N1, N2, Method 9, and Method 10 of the original table are not reliably recoverable and are omitted).

p          Levels         E(X)            k1
0.01       Single-level   0.19557083665   11
0.01       2 levels       0.15764974327
0.001      Single-level   0.06275892424   32
0.001      4 levels       0.03461032833
0.0001     Single-level   0.01995065634   101
0.0001     6 levels       0.01050815802
0.00001    Single-level   0.00631957079   318
0.00001    7 levels       0.00330165587
0.000001   Single-level   0.00199950067   1001
0.000001   9 levels       0.00104104416

Figure 5: Comparative analysis between 2 kinds of methods: the average expectation of classification times per object for the single-level granulation method versus the multilevel granulation method, for p ranging from 0.00001 to 0.01.

the BCMSA can save 0∼50% of the classification times compared with Method 10. The efficiency of Method 10 (the single-level granulation method) and Method 11 (the multilevel granulation method) is shown in Figure 5, where the x-axis stands for the prevalence rate (or the negative sample rate) and the y-axis stands for the average expectation of classification times for each object.
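The single-level rows of Table 8 can be reproduced from the classical group-testing expectation used in the blood analysis case: with negative sample rate p and group size k, each object costs 1/k of a shared group test, plus one individual test whenever its group contains at least one negative sample. A minimal sketch (the multilevel rows require the paper's recursive expectation):

```python
def single_level_expectation(p, k):
    """Average classification times per object for one-level grouping:
    1/k for the shared group test, plus one individual test when the
    group of k objects contains at least one negative sample."""
    return 1.0 / k + 1.0 - (1.0 - p) ** k

def optimal_group_size(p, k_max=2000):
    """Group size k minimizing the per-object expectation."""
    return min(range(2, k_max + 1), key=lambda k: single_level_expectation(p, k))

k1 = optimal_group_size(0.01)            # 11, as in Table 8
ex = single_level_expectation(0.01, k1)  # ≈ 0.19557083665, as in Table 8
```

For p = 0.001 the same search returns k = 32 with E(X) ≈ 0.0628, again matching the corresponding single-level row of Table 8.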

In this paper, BCMSA is proposed, and it can greatly improve searching efficiency when dealing with complex searching problems. If there is a binary classifier which is not only valid for a single object but also valid for a group of many objects, the efficiency of searching all objects will be enhanced by BCMSA, as in the blood analysis case. At the same time, it may play an important role in promoting the development of granular computing. Of course, this algorithm also has some limitations. For example, if the prevalence rate of a sickness (or the occurrence rate of event A) satisfies p > 0.3, the algorithm has no advantage compared with the traditional method; in other words, the original problem need not be subdivided into many subproblems when p > 0.3. And when the prevalence rate of a sickness (or the negative sample rate in the domain) is unknown, this algorithm needs to be further improved so that it can adapt to the new environment.
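The p > 0.3 boundary quoted above is consistent with the classical group-testing threshold 1 − 3^(−1/3) ≈ 0.307, above which no group size beats simply testing every object once (an expectation of 1 per object). A quick numerical check, using the standard single-level expectation as an illustration:

```python
def single_level_expectation(p, k):
    # 1/k for the shared group test, plus one individual test when
    # the group of k objects contains at least one negative sample
    return 1.0 / k + 1.0 - (1.0 - p) ** k

def best_expectation(p, k_max=200):
    """Smallest per-object expectation achievable by one-level grouping."""
    return min(single_level_expectation(p, k) for k in range(2, k_max + 1))

# Around p = 0.3 grouping stops paying off:
saves_at_030 = best_expectation(0.30) < 1.0  # True: still cheaper than 1 test/object
saves_at_031 = best_expectation(0.31) < 1.0  # False: individual testing already wins
```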

5. Conclusions

With the development of intelligent computation, multigranulation computing has gradually become an important tool for processing complex problems. Especially in the process of knowledge cognition, granulating a huge problem into lots of small subproblems means simplifying the original complex problem and dealing with these subproblems in different granularity spaces [64]. This hierarchical computing model is very effective for getting a complete solution or an approximate solution of the original problem due to its idea of divide and conquer. Recently, many scholars have paid attention to efficient searching algorithms based on granular computing theory. For example, a kind of algorithm for dealing with complex networks on the basis of the quotient space model was proposed by L. Zhang and B. Zhang [65]. In this paper, combining the hierarchical multigranulation computing model and the principles of probability statistics, a new efficient binary classification of multigranulation searching algorithm is established on the basis of the mathematical expectation of probability statistics, and this searching algorithm is designed according to a recursive method in multigranulation spaces. Many experimental results have shown that the proposed method is effective and can save lots of classification times. These results may promote the development of intelligent computation and speed up the application of multigranulation computing. However, this method also has some shortcomings. On the one hand, it has a strict limitation on the probability value p, namely, p < 0.3; if p > 0.3, the proposed searching algorithm is probably not the most effective method, and improved methods need to be found. On the other hand, it needs a binary classifier which is valid not only for a single object but also for a group of many objects. In the end, with the decrease of the probability value p (even as it infinitely approaches zero), the mathematical expectation of searching times for every object will gradually approach 1/k1. In our future research, we will focus on how to granulate the huge granule space without knowing the probability value of each object, and we will try our best to establish an effective searching algorithm for the case where the probability of negative samples in the domain is unknown. We hope this research can promote the development of artificial intelligence.
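The limiting behavior mentioned above (the per-object expectation approaching 1/k1 as p tends to zero) can be checked numerically with a two-level sketch. The recursion below is an illustrative reconstruction, assuming a subgroup of size k2 is tested only when its parent group of size k1 is positive; it is not necessarily the paper's exact BCMSA expectation:

```python
def two_level_expectation(p, k1, k2):
    """Per-object expectation for a two-level granulation: a shared test of a
    size-k1 group, a shared test of a size-k2 subgroup when the parent group
    is positive, and an individual test when the subgroup is positive
    (a positive subgroup implies a positive parent group)."""
    q = 1.0 - p
    return 1.0 / k1 + (1.0 - q ** k1) / k2 + (1.0 - q ** k2)

# As p -> 0 the positive-group terms vanish, leaving only the first-level
# group test, so the expectation tends to 1/k1 (here 1/50 = 0.02).
for p in (1e-2, 1e-4, 1e-6, 1e-8):
    print(f"p={p:.0e}  E={two_level_expectation(p, 50, 10):.6f}")
```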


Competing Interests

The authors declare that they have no conflict of interests related to this work.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (no. 61472056) and the Natural Science Foundation of Chongqing of China (no. CSTC2013jjb40003).

References

[1] A. Gacek, "Signal processing and time series description: a perspective of computational intelligence and granular computing," Applied Soft Computing Journal, vol. 27, pp. 590–601, 2015.

[2] O. Hryniewicz and K. Kaczmarek, "Bayesian analysis of time series using granular computing approach," Applied Soft Computing, vol. 47, pp. 644–652, 2016.

[3] C. Liu, "Covering multi-granulation rough sets based on maximal descriptors," Information Technology Journal, vol. 13, no. 7, pp. 1396–1400, 2014.

[4] Z. Y. Li, "Covering-based multi-granulation decision-theoretic rough sets model," Journal of Lanzhou University, no. 2, pp. 245–250, 2014.

[5] Y. Y. Yao and Y. She, "Rough set models in multigranulation spaces," Information Sciences, vol. 327, pp. 40–56, 2016.

[6] J. Xu, Y. Zhang, D. Zhou et al., "Uncertain multi-granulation time series modeling based on granular computing and the clustering practice," Journal of Nanjing University, vol. 50, no. 1, pp. 86–94, 2014.

[7] Y. T. Guo, "Variable precision β multi-granulation rough sets based on limited tolerance relation," Journal of Minnan Normal University, no. 1, pp. 1–11, 2015.

[8] X. U. Yi, J. H. Yang, and J. I. Xia, "Neighborhood multi-granulation rough set model based on double granulate criterion," Control and Decision, vol. 30, no. 8, pp. 1469–1478, 2015.

[9] L. A. Zadeh, "Towards a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic," Fuzzy Sets and Systems, vol. 19, pp. 111–127, 1997.

[10] J. R. Hobbs, "Granularity," in Proceedings of the 9th International Joint Conference on Artificial Intelligence, Los Angeles, Calif, USA, 1985.

[11] L. Zhang and B. Zhang, "Theory of fuzzy quotient space (methods of fuzzy granular computing)," Journal of Software, vol. 14, no. 4, pp. 770–776, 2003.

[12] J. Li, C. Mei, W. Xu, and Y. Qian, "Concept learning via granular computing: a cognitive viewpoint," Information Sciences, vol. 298, no. 1, pp. 447–467, 2015.

[13] X. Hu, W. Pedrycz, and X. Wang, "Comparative analysis of logic operators: a perspective of statistical testing and granular computing," International Journal of Approximate Reasoning, vol. 66, pp. 73–90, 2015.

[14] M. G. C. A. Cimino, B. Lazzerini, F. Marcelloni, and W. Pedrycz, "Genetic interval neural networks for granular data regression," Information Sciences, vol. 257, pp. 313–330, 2014.

[15] P. Honko, "Upgrading a granular computing based data mining framework to a relational case," International Journal of Intelligent Systems, vol. 29, no. 5, pp. 407–438, 2014.

[16] M.-Y. Chen and B.-T. Chen, "A hybrid fuzzy time series model based on granular computing for stock price forecasting," Information Sciences, vol. 294, pp. 227–241, 2015.

[17] R. Al-Hmouz, W. Pedrycz, and A. Balamash, "Description and prediction of time series: a general framework of Granular Computing," Expert Systems with Applications, vol. 42, no. 10, pp. 4830–4839, 2015.

[18] M. Hilbert, "Big data for development: a review of promises and challenges," Social Science Electronic Publishing, vol. 34, no. 1, pp. 135–174, 2016.

[19] T. J. Sejnowski, S. P. Churchland, and J. A. Movshon, "Putting big data to good use in neuroscience," Nature Neuroscience, vol. 17, no. 11, pp. 1440–1441, 2014.

[20] G. George, M. R. Haas, and A. Pentland, "Big data and management," Academy of Management Journal, vol. 30, no. 2, pp. 39–52, 2014.

[21] M. Chen, S. Mao, and Y. Liu, "Big data: a survey," Mobile Networks and Applications, vol. 19, no. 2, pp. 171–209, 2014.

[22] X. Wu, X. Zhu, G. Q. Wu, and W. Ding, "Data mining with big data," IEEE Transactions on Knowledge & Data Engineering, vol. 26, no. 1, pp. 97–107, 2014.

[23] Y. Shuo and Y. Lin, "Decomposition of decision systems based on granular computing," in Proceedings of the IEEE International Conference on Granular Computing (GrC '11), pp. 590–595, Garden Villa, Kaohsiung, Taiwan, 2011.

[24] H. Hu and Z. Zhong, "Perception learning as granular computing," Natural Computation, vol. 3, pp. 272–276, 2008.

[25] Z.-H. Chen, Y. Zhang, and G. Xie, "Mining algorithm for concise decision rules based on granular computing," Control and Decision, vol. 30, no. 1, pp. 143–148, 2015.

[26] K. Kambatla, G. Kollias, V. Kumar, and A. Grama, "Trends in big data analytics," Journal of Parallel & Distributed Computing, vol. 74, no. 7, pp. 2561–2573, 2014.

[27] A. Katal, M. Wazid, and R. H. Goudar, "Big data: issues, challenges, tools and good practices," in Proceedings of the 6th International Conference on Contemporary Computing (IC3 '13), pp. 404–409, IEEE, New Delhi, India, August 2013.

[28] V. Cevher, S. Becker, and M. Schmidt, "Convex optimization for big data: scalable, randomized, and parallel algorithms for big data analytics," IEEE Signal Processing Magazine, vol. 31, no. 5, pp. 32–43, 2014.

[29] J. Fan, F. Han, and H. Liu, "Challenges of big data analysis," National Science Review, vol. 1, no. 2, pp. 293–314, 2014.

[30] Q. H. Zhang, K. Xu, and G. Y. Wang, "Fuzzy equivalence relation and its multigranulation spaces," Information Sciences, vol. 346-347, pp. 44–57, 2016.

[31] Z. Liu and Y. Hu, "Multi-granularity pattern ant colony optimization algorithm and its application in path planning," Journal of Central South University (Science and Technology), vol. 9, pp. 3713–3722, 2013.

[32] Q. H. Zhang, G. Y. Wang, and X. Q. Liu, "Hierarchical structure analysis of fuzzy quotient space," Pattern Recognition and Artificial Intelligence, vol. 21, no. 5, pp. 627–634, 2008.

[33] Z. C. Shi, Y. X. Xia, and J. Z. Zhou, "Discrete algorithm based on granular computing and its application," Computer Science, vol. 40, pp. 133–135, 2013.

[34] Y. P. Zhang, B. Luo, Y. Y. Yao, D. Q. Miao, L. Zhang, and B. Zhang, Quotient Space and Granular Computing: The Theory and Method of Problem Solving on Structured Problems, Science Press, Beijing, China, 2010.


[35] G. Y. Wang, Q. H. Zhang, and J. Hu, "A survey on the granular computing," Transactions on Intelligent Systems, vol. 6, no. 2, pp. 8–26, 2007.

[36] J. Jonnagaddala, R. T. Jue, and H. J. Dai, "Binary classification of Twitter posts for adverse drug reactions," in Proceedings of the Social Media Mining Shared Task Workshop at the Pacific Symposium on Biocomputing, pp. 4–8, Big Island, Hawaii, USA, 2016.

[37] M. Haungs, P. Sallee, and M. Farrens, "Branch transition rate: a new metric for improved branch classification analysis," in Proceedings of the International Symposium on High-Performance Computer Architecture (HPCA '00), pp. 241–250, 2000.

[38] R. W. Proctor and Y. S. Cho, "Polarity correspondence: a general principle for performance of speeded binary classification tasks," Psychological Bulletin, vol. 132, no. 3, pp. 416–442, 2006.

[39] T. H. Chow, P. Berkhin, E. Eneva et al., "Evaluating performance of binary classification systems," US Patent 8554622 B2, 2013.

[40] D. G. Li, D. Q. Miao, D. X. Zhang, and H. Y. Zhang, "An overview of granular computing," Computer Science, vol. 9, pp. 1–12, 2005.

[41] X. Gang and L. Jing, "A review of the present studying state and prospect of granular computing," Journal of Software, vol. 3, pp. 5–10, 2011.

[42] L. X. Zhong, "The predication about optimal blood analyze method," Academic Forum of Nandu, vol. 6, pp. 70–71, 1996.

[43] X. Mingmin and S. Junli, "The mathematical proof of method of group blood test and a new formula in quest of optimum number in group," Journal of Sichuan Institute of Building Materials, vol. 1, pp. 97–104, 1986.

[44] B. Zhang and L. Zhang, "Discussion on future development of granular computing," Journal of Chongqing University of Posts and Telecommunications: Natural Science Edition, vol. 22, no. 5, pp. 538–540, 2010.

[45] A. Skowron, J. Stepaniuk, and R. Swiniarski, "Modeling rough granular computing based on approximation spaces," Information Sciences, vol. 184, no. 1, pp. 20–43, 2012.

[46] J. T. Yao, A. V. Vasilakos, and W. Pedrycz, "Granular computing: perspectives and challenges," IEEE Transactions on Cybernetics, vol. 43, no. 6, pp. 1977–1989, 2013.

[47] Y. Y. Yao, N. Zhang, D. Q. Miao, and F. F. Xu, "Set-theoretic approaches to granular computing," Fundamenta Informaticae, vol. 115, no. 2-3, pp. 247–264, 2012.

[48] H. Li and X. P. Ma, "Research on four-element model of granular computing," Computer Engineering and Applications, vol. 49, no. 4, pp. 9–13, 2013.

[49] J. Hu and C. Guan, "Granular computing model based on quantum computing theory," in Proceedings of the 10th International Conference on Computational Intelligence and Security, pp. 156–160, November 2014.

[50] Y. Shuo and Y. Lin, "Decomposition of decision systems based on granular computing," in Proceedings of the IEEE International Conference on Granular Computing (GrC '11), pp. 590–595, IEEE, Kaohsiung, Taiwan, November 2011.

[51] F. Li, J. Xie, and K. Xie, "Granular computing theory in the application of fault diagnosis," in Proceedings of the Chinese Control and Decision Conference (CCDC '08), pp. 595–597, July 2008.

[52] Q.-H. Zhang, Y.-K. Xing, and Y.-L. Zhou, "The incremental knowledge acquisition algorithm based on granular computing," Journal of Electronics and Information Technology, vol. 33, no. 2, pp. 435–441, 2011.

[53] Y. Zeng, Y. Y. Yao, and N. Zhong, "The knowledge search base on the granular structure," Computer Science, vol. 35, no. 3, pp. 194–196, 2008.

[54] G.-Y. Wang, Q.-H. Zhang, X.-A. Ma, and Q.-S. Yang, "Granular computing models for knowledge uncertainty," Journal of Software, vol. 22, no. 4, pp. 676–694, 2011.

[55] J. Li, Y. Ren, C. Mei, Y. Qian, and X. Yang, "A comparative study of multigranulation rough sets and concept lattices via rule acquisition," Knowledge-Based Systems, vol. 91, pp. 152–164, 2016.

[56] H.-L. Yang and Z.-L. Guo, "Multigranulation decision-theoretic rough sets in incomplete information systems," International Journal of Machine Learning & Cybernetics, vol. 6, no. 6, pp. 1005–1018, 2015.

[57] M. A. Waller and S. E. Fawcett, "Data science, predictive analytics, and big data: a revolution that will transform supply chain design and management," Journal of Business Logistics, vol. 34, no. 2, pp. 77–84, 2013.

[58] R. Kitchin, "The real-time city? Big data and smart urbanism," GeoJournal, vol. 79, no. 1, pp. 1–14, 2014.

[59] X. Dong and D. Srivastava, Big Data Integration, Morgan & Claypool, 2015.

[60] L. Zhang and B. Zhang, Theory and Applications of Problem Solving: Quotient Space Based Granular Computing (The Second Version), Tsinghua University Press, Beijing, China, 2007.

[61] L. Zhang and B. Zhang, "The quotient space theory of problem solving," in Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing, G. Wang, Q. Liu, Y. Yao, and A. Skowron, Eds., vol. 2639 of Lecture Notes in Computer Science, pp. 11–15, Springer, Berlin, Germany, 2003.

[62] J. Sheng, S. Q. Xie, and C. Y. Pan, Probability Theory and Mathematical Statistics, Higher Education Press, Beijing, China, 4th edition, 2008.

[63] L. Z. Zhang, X. Zhao, and Y. Ma, "The simple math demonstration and precise calculation method of the blood group test," Mathematics in Practice and Theory, vol. 22, pp. 143–146, 2010.

[64] J. Chen, S. Zhao, and Y. Zhang, "Hierarchical covering algorithm," Tsinghua Science & Technology, vol. 19, no. 1, pp. 76–81, 2014.

[65] L. Zhang and B. Zhang, "Dynamic quotient space model and its basic properties," Pattern Recognition and Artificial Intelligence, vol. 25, no. 2, pp. 181–185, 2012.


Mathematical Problems in Engineering 13

Competing Interests

The authors declared that they have no conflict of interestsrelated to this work

Acknowledgments

This work is supported by the National Natural ScienceFoundation of China (no 61472056) and the Natural Sci-ence Foundation of Chongqing of China (no CSTC2013jjb40003)

References

[1] A Gacek ldquoSignal processing and time series description aperspective of computational intelligence and granular comput-ingrdquo Applied Soft Computing Journal vol 27 pp 590ndash601 2015

[2] O Hryniewicz and K Kaczmarek ldquoBayesian analysis of timeseries using granular computing approachrdquo Applied Soft Com-puting vol 47 pp 644ndash652 2016

[3] C Liu ldquoCovering multi-granulation rough sets based on max-imal descriptorsrdquo Information Technology Journal vol 13 no 7pp 1396ndash1400 2014

[4] Z Y Li ldquoCovering-based multi-granulation decision-theoreticrough setsmodelrdquo Journal of LanzhouUniversity no 2 pp 245ndash250 2014

[5] Y Y Yao and Y She ldquoRough set models in multigranulationspacesrdquo Information Sciences vol 327 pp 40ndash56 2016

[6] J Xu Y Zhang D Zhou et al ldquoUncertain multi-granulationtime series modeling based on granular computing and theclustering practicerdquo Journal of Nanjing University vol 50 no1 pp 86ndash94 2014

[7] Y T Guo ldquoVariable precision 120573 multi-granulation rough setsbased on limited tolerance relationrdquo Journal of Minnan NormalUniversity no 1 pp 1ndash11 2015

[8] X U Yi J H Yang and J I Xia ldquoNeighborhood multi-granulation rough set model based on double granulate crite-rionrdquo Control and Decision vol 30 no 8 pp 1469ndash1478 2015

[9] L A Zadeh ldquoTowards a theory of fuzzy information granu-lation and its centrality in human reasoning and fuzzy logicrdquoFuzzy Sets and Systems vol 19 pp 111ndash127 1997

[10] J R Hobbs ldquoGranularityrdquo in Proceedings of the 9th InternationalJoint Conference on Artificial Intelligence Los Angeles CalifUSA 1985

[11] L Zhang and B Zhang ldquoTheory of fuzzy quotient space (meth-ods of fuzzy granular computing)rdquo Journal of Software vol14 no 4 pp 770ndash776 2003

[12] J Li C MeiW Xu and Y Qian ldquoConcept learning via granularcomputing a cognitive viewpointrdquo Information Sciences vol298 no 1 pp 447ndash467 2015

[13] X HuW Pedrycz and XWang ldquoComparative analysis of logicoperators a perspective of statistical testing and granular com-putingrdquo International Journal of Approximate Reasoning vol66 pp 73ndash90 2015

[14] M G C A Cimino B Lazzerini FMarcelloni andW PedryczldquoGenetic interval neural networks for granular data regressionrdquoInformation Sciences vol 257 pp 313ndash330 2014

[15] P Honko ldquoUpgrading a granular computing based data miningframework to a relational caserdquo International Journal of Intelli-gent Systems vol 29 no 5 pp 407ndash438 2014

[16] M-Y Chen and B-T Chen ldquoA hybrid fuzzy time series modelbased on granular computing for stock price forecastingrdquoInformation Sciences vol 294 pp 227ndash241 2015

[17] R Al-Hmouz W Pedrycz and A Balamash ldquoDescription andprediction of time series a general framework of GranularComputingrdquo Expert Systems with Applications vol 42 no 10pp 4830ndash4839 2015

[18] M Hilbert ldquoBig data for development a review of promises andchallengesrdquo Social Science Electronic Publishing vol 34 no 1 pp135ndash174 2016

[19] T J Sejnowski S P Churchland and J A Movshon ldquoPuttingbig data to good use in neurosciencerdquoNature Neuroscience vol17 no 11 pp 1440ndash1441 2014

[20] G George M R Haas and A Pentland ldquoBIG DATA andmanagementrdquo Academy of Management Journal vol 30 no 2pp 39ndash52 2014

[21] M Chen S Mao and Y Liu ldquoBig data a surveyrdquo MobileNetworks and Applications vol 19 no 2 pp 171ndash209 2014

[22] X Wu X Zhu G Q Wu and W Ding ldquoData mining with bigdatardquo IEEE Transactions on Knowledge ampData Engineering vol26 no 1 pp 97ndash107 2014

[23] Y Shuo and Y Lin ldquoDecomposition of decision systems basedon granular computingrdquo inProceedings of the IEEE InternationalConference on Granular Computing (GrC rsquo11) pp 590ndash595Garden Villa Kaohsiung Taiwan 2011

[24] H Hu and Z Zhong ldquoPerception learning as granular comput-ingrdquo Natural Computation vol 3 pp 272ndash276 2008

[25] Z-H Chen Y Zhang and G Xie ldquoMining algorithm forconcise decision rules based on granular computingrdquo Controland Decision vol 30 no 1 pp 143ndash148 2015

[26] K Kambatla G Kollias V Kumar and A Grama ldquoTrends inbig data analyticsrdquo Journal of Parallel amp Distributed Computingvol 74 no 7 pp 2561ndash2573 2014

[27] A Katal M Wazid and R H Goudar ldquoBig data issueschallenges tools and good practicesrdquo in Proceedings of the 6thInternational Conference on Contemporary Computing (IC3 rsquo13)pp 404ndash409 IEEE New Delhi India August 2013

[28] V Cevher S Becker andM Schmidt ldquoConvex optimization forbig data scalable randomized and parallel algorithms for bigdata analyticsrdquo IEEE Signal Processing Magazine vol 31 no 5pp 32ndash43 2014

[29] J Fan F Han and H Liu ldquoChallenges of big data analysisrdquoNational Science Review vol 1 no 2 pp 293ndash314 2014

[30] QH Zhang K Xu andGYWang ldquoFuzzy equivalence relationand its multigranulation spacesrdquo Information Sciences vol 346-347 pp 44ndash57 2016

[31] Z Liu and Y Hu ldquoMulti-granularity pattern ant colony opti-mization algorithm and its application in path planningrdquoJournal of Central South University (Science and Technology)vol 9 pp 3713ndash3722 2013

[32] Q H Zhang G YWang and X Q Liu ldquoHierarchical structureanalysis of fuzzy quotient spacerdquo Pattern Recognition andArtificial Intelligence vol 21 no 5 pp 627ndash634 2008

[33] Z C Shi Y X Xia and J Z Zhou ldquoDiscrete algorithm basedon granular computing and its applicationrdquo Computer Sciencevol 40 pp 133ndash135 2013

[34] Y P Zhang B luo Y Y Yao D Q Miao L Zhang and BZhang Quotient Space and Granular Computing The Theoryand Method of Problem Solving on Structured Problems SciencePress Beijing China 2010

14 Mathematical Problems in Engineering

[35] G Y Wang Q H Zhang and J Hu ldquoA survey on the granularcomputingrdquo Transactions on Intelligent Systems vol 6 no 2 pp8ndash26 2007

[36] J Jonnagaddala R T Jue and H J Dai ldquoBinary classificationof Twitter posts for adverse drug reactionsrdquo in Proceedings ofthe Social Media Mining Shared Task Workshop at the PacificSymposium on Biocomputing pp 4ndash8 Big Island Hawaii USA2016

[37] M Haungs P Sallee and M Farrens ldquoBranch transition rate anew metric for improved branch classification analysisrdquo in Pro-ceedings of the International Symposium on High-PerformanceComputer Architecture (HPCA rsquo00) pp 241ndash250 2000

[38] RW Proctor and Y S Cho ldquoPolarity correspondence a generalprinciple for performance of speeded binary classificationtasksrdquo Psychological Bulletin vol 132 no 3 pp 416ndash442 2006

[39] T H Chow P Berkhin E Eneva et al ldquoEvaluating performanceof binary classification systemsrdquo US US 8554622 B2 2013

[40] DG LiDQMiaoDX Zhang andHY Zhang ldquoAnoverviewof granular computingrdquoComputer Science vol 9 pp 1ndash12 2005

[41] X Gang and L Jing ldquoA review of the present studying state andprospect of granular computingrdquo Journal of Software vol 3 pp5ndash10 2011

[42] L X Zhong ldquoThe predication about optimal blood analyzemethodrdquo Academic Forum of Nandu vol 6 pp 70ndash71 1996

[43] X Mingmin and S Junli ldquoThe mathematical proof of methodof group blood test and a new formula in quest of optimumnumber in grouprdquo Journal of Sichuan Institute of BuildingMaterials vol 01 pp 97ndash104 1986

[44] B Zhang and L Zhang ldquoDiscussion on future development ofgranular computingrdquo Journal of Chongqing University of Postsand Telecommunications Natural Science Edition vol 22 no 5pp 538ndash540 2010

[45] A Skowron J Stepaniuk and R Swiniarski ldquoModeling roughgranular computing based on approximation spacesrdquo Informa-tion Sciences vol 184 no 1 pp 20ndash43 2012

[46] J T Yao A V Vasilakos andW Pedrycz ldquoGranular computingperspectives and challengesrdquo IEEE Transactions on Cyberneticsvol 43 no 6 pp 1977ndash1989 2013

[47] Y Y Yao N Zhang D Q Miao and F F Xu ldquoSet-theoreticapproaches to granular computingrdquo Fundamenta Informaticaevol 115 no 2-3 pp 247ndash264 2012

[48] H. Li and X. P. Ma, "Research on four-element model of granular computing," Computer Engineering and Applications, vol. 49, no. 4, pp. 9–13, 2013.

[49] J. Hu and C. Guan, "Granular computing model based on quantum computing theory," in Proceedings of the 10th International Conference on Computational Intelligence and Security, pp. 156–160, November 2014.

[50] Y. Shuo and Y. Lin, "Decomposition of decision systems based on granular computing," in Proceedings of the IEEE International Conference on Granular Computing (GrC '11), pp. 590–595, IEEE, Kaohsiung, Taiwan, November 2011.

[51] F. Li, J. Xie, and K. Xie, "Granular computing theory in the application of fault diagnosis," in Proceedings of the Chinese Control and Decision Conference (CCDC '08), pp. 595–597, July 2008.

[52] Q.-H. Zhang, Y.-K. Xing, and Y.-L. Zhou, "The incremental knowledge acquisition algorithm based on granular computing," Journal of Electronics and Information Technology, vol. 33, no. 2, pp. 435–441, 2011.

[53] Y. Zeng, Y. Y. Yao, and N. Zhong, "The knowledge search based on the granular structure," Computer Science, vol. 35, no. 3, pp. 194–196, 2008.

[54] G.-Y. Wang, Q.-H. Zhang, X.-A. Ma, and Q.-S. Yang, "Granular computing models for knowledge uncertainty," Journal of Software, vol. 22, no. 4, pp. 676–694, 2011.

[55] J. Li, Y. Ren, C. Mei, Y. Qian, and X. Yang, "A comparative study of multigranulation rough sets and concept lattices via rule acquisition," Knowledge-Based Systems, vol. 91, pp. 152–164, 2016.

[56] H.-L. Yang and Z.-L. Guo, "Multigranulation decision-theoretic rough sets in incomplete information systems," International Journal of Machine Learning & Cybernetics, vol. 6, no. 6, pp. 1005–1018, 2015.

[57] M. A. Waller and S. E. Fawcett, "Data science, predictive analytics, and big data: a revolution that will transform supply chain design and management," Journal of Business Logistics, vol. 34, no. 2, pp. 77–84, 2013.

[58] R. Kitchin, "The real-time city? Big data and smart urbanism," GeoJournal, vol. 79, no. 1, pp. 1–14, 2014.

[59] X. Dong and D. Srivastava, Big Data Integration, Morgan & Claypool, 2015.

[60] L. Zhang and B. Zhang, Theory and Applications of Problem Solving: Quotient Space Based Granular Computing (The Second Version), Tsinghua University Press, Beijing, China, 2007.

[61] L. Zhang and B. Zhang, "The quotient space theory of problem solving," in Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing, G. Wang, Q. Liu, Y. Yao, and A. Skowron, Eds., vol. 2639 of Lecture Notes in Computer Science, pp. 11–15, Springer, Berlin, Germany, 2003.

[62] J. Sheng, S. Q. Xie, and C. Y. Pan, Probability Theory and Mathematical Statistics, Higher Education Press, Beijing, China, 4th edition, 2008.

[63] L. Z. Zhang, X. Zhao, and Y. Ma, "The simple math demonstration and precise calculation method of the blood group test," Mathematics in Practice and Theory, vol. 22, pp. 143–146, 2010.

[64] J. Chen, S. Zhao, and Y. Zhang, "Hierarchical covering algorithm," Tsinghua Science & Technology, vol. 19, no. 1, pp. 76–81, 2014.

[65] L. Zhang and B. Zhang, "Dynamic quotient space model and its basic properties," Pattern Recognition and Artificial Intelligence, vol. 25, no. 2, pp. 181–185, 2012.