Integrated Study in Incomplete Information System

ZHANG Rui
Computer Center, Institute of Information Technology
Yangzhou University, Yangzhou, China
e-mail: zhangrui@yzu.edu.cn

Abstract: Both rough set theory and D-S evidence theory are important methods in uncertainty reasoning, and each has its own advantages and disadvantages. Incomplete information systems exist widely in real life. In this paper, the two theories are used in combination to study the incomplete information system. First, a reduction algorithm for the incomplete information system is put forward based on rough set theory; then D-S evidence theory is used to optimize the obtained rules. The results are verified by an example.

    Keywords-Incomplete Information System; Decision Table; Rough Set Theory; D-S Evidence Theory; Reduction

I. INTRODUCTION

Rough set theory (RST) was introduced by Z. Pawlak [1] in 1982. It is a theory for the study of intelligent systems characterized by inexact, uncertain, or vague information. RST has been successfully applied to problems of vague and uncertain information, and it has provided many exciting results in a remarkably wide range of fields, such as expert systems, machine learning, pattern recognition, decision analysis, process control, and knowledge discovery in databases.

Dempster-Shafer [2] evidence theory, also called D-S evidence theory, originated from upper and lower probabilities; it was first proposed by Dempster and further developed and refined by Shafer. The method has been applied in a wide range of fields. D-S evidence theory can deal with both uncertainty and ignorance; it takes the belief function, instead of probability, as the measure of uncertainty.

RST differs greatly from other theories dealing with uncertainty because it requires no prior information; in addition, it is strongly complementary to other theories. D-S evidence theory has a great advantage in expressing ignorance. However, each theory also has its own disadvantages. This paper combines the two theories and applies them to rule extraction in incomplete information systems.


A. General Description and Definition of Information System

An information system is a database of objects and attributes which implies the relationship between them. The knowledge pattern is ultimately expressed by attributes; it has explicit intuitive meaning and can be understood. Today's information systems take computers and modern communication technology as the basic information processing means and apply mathematical methods, providing information services for administrative decisions.

Definition 1 [3]. Let S = (U, AT) be an information system, where U = {x1, x2, ..., xn} is the non-empty finite set of objects, generally called the universe of discourse, and AT = {a1, a2, ..., am} is the non-empty finite set of attributes, i.e., a: U -> Va for any a ∈ AT, with a(x) ∈ Va for each x ∈ U. Here, Va is called the domain of attribute a. If there is at least one attribute a ∈ AT in S such that Va has a missing value, then S is called an incomplete information system (IIS), and the missing value is written as *.

Definition 2 [3]. An incomplete decision table (IDT) is an incomplete information system IDT = {U, AT ∪ {d}}, where d ∉ AT and d is a complete attribute (* ∉ Vd), called the decision attribute; AT is called the set of condition attributes.

B. Rough Set Theory in Incomplete Information System

Corresponding to the indiscernibility relation in a complete information system, a similarity relation is defined for the incomplete information system.

Definition 3 [3]. In an IIS S = (U, AT), for each non-empty subset A ⊆ AT, the similarity relation is defined as:

SIM(A) = {(x, y) ∈ U × U | ∀a ∈ A, a(x) = a(y) or a(x) = * or a(y) = *}

Property 1. SIM(A) is a compatibility relation, and SIM(A) = ∩_{a ∈ A} SIM({a}).


Let S_A(x) = {y ∈ U | (x, y) ∈ SIM(A)}. For A ⊆ AT, S_A(x) is the largest set of objects which may be indiscernible from x. S_A(x) is called the similarity class of x with respect to A in S, and the family of all similarity classes is denoted by U/SIM(A):

U/SIM(A) = {S_A(x) | x ∈ U}.

In general, U/SIM(A) constitutes a covering of U, instead of a partition of U.
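As an illustration of Definition 3, the similarity relation and similarity classes can be computed directly. The sketch below uses an invented four-object system with attributes a and b (not the paper's Table I), with * marking a missing value:

```python
# Toy incomplete information system: '*' marks a missing value.
U = ["x1", "x2", "x3", "x4"]
table = {
    "x1": {"a": 1,   "b": "*"},
    "x2": {"a": 1,   "b": 0},
    "x3": {"a": 2,   "b": 0},
    "x4": {"a": "*", "b": 0},
}

def similar(x, y, A):
    """(x, y) in SIM(A): for every a in A the values agree or one is missing."""
    return all(table[x][a] == table[y][a] or "*" in (table[x][a], table[y][a])
               for a in A)

def S(x, A):
    """Similarity class S_A(x) = {y in U | (x, y) in SIM(A)}."""
    return {y for y in U if similar(x, y, A)}

print(sorted(S("x1", ["a", "b"])))  # ['x1', 'x2', 'x4']
print(sorted(S("x3", ["a", "b"])))  # ['x3', 'x4']
```

Note that S_A(x1) and S_A(x3) both contain x4, so the similarity classes overlap: they form a covering of U rather than a partition, as stated above.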

Define R_d = {(x, y) ∈ U × U | d(x) = d(y)}; this constitutes a partition U/R_d = {D1, D2, ..., Dk} of U into decision classes. Here, Vd = (w1, w2, ..., wk), Dl = {x ∈ U | d(x) = wl}, 1 ≤ l ≤ k.

2009 International Conference on Artificial Intelligence and Computational Intelligence
978-0-7695-3816-7/09 $26.00 © 2009 IEEE
DOI 10.1109/AICI.2009.454

Definition 4 [3]. For an IIS S = (U, AT), A ⊆ AT and X ⊆ U, X can be characterized by a pair of lower and upper approximations:
    approximations: })(|{)( = XxSUxXA A

    })(|{)( XxSUxXA A = Objects in )(XA can certainly be classified as the

    elements of X , while objects in )(XA can only be possibly classified as the elements of X . Definition 5[3].Generalized decision function

    ATAVPU dA ),(: in IDT is defined as: )}(),(|{)( xSyydiix AA == ,

    Here, )( dVP is the power set of dV .
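Continuing with the same invented toy system (the decision attribute d below is also illustrative, not the paper's data), Definitions 4 and 5 can be sketched as:

```python
# Toy incomplete information system with a complete decision attribute d.
U = ["x1", "x2", "x3", "x4"]
table = {
    "x1": {"a": 1,   "b": "*"},
    "x2": {"a": 1,   "b": 0},
    "x3": {"a": 2,   "b": 0},
    "x4": {"a": "*", "b": 0},
}
d = {"x1": "good", "x2": "good", "x3": "poor", "x4": "good"}

def S(x, A):
    """Similarity class S_A(x), tolerating missing values '*'."""
    return {y for y in U
            if all(table[x][a] == table[y][a] or "*" in (table[x][a], table[y][a])
                   for a in A)}

def lower(X, A):
    """A-lower approximation: objects certainly in X."""
    return {x for x in U if S(x, A) <= X}

def upper(X, A):
    """A-upper approximation: objects possibly in X."""
    return {x for x in U if S(x, A) & X}

def delta(x, A):
    """Generalized decision: decision values seen in S_A(x)."""
    return {d[y] for y in S(x, A)}

good = {x for x in U if d[x] == "good"}
print(sorted(lower(good, ["a", "b"])))  # ['x1', 'x2']
print(sorted(upper(good, ["a", "b"])))  # ['x1', 'x2', 'x3', 'x4']
print(delta("x3", ["a", "b"]))          # {'good', 'poor'}
```

Here x3 lies only in the upper approximation of the "good" class, and its generalized decision contains two values because its similarity class mixes both decisions.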

Definition 6 [3]. If ∂_A = ∂_AT and ∂_B ≠ ∂_AT for every proper subset B ⊂ A, then A is a reduction of AT.

An IDT can be seen as a decision rule set with rules of the following form:

(c1, v1) ∧ (c2, v2) ∧ ... ∧ (cn, vn) -> (d, w1) ∨ (d, w2) ∨ ... ∨ (d, wm)

or ∧(ci, vi) -> ∨(d, wj), where ci ∈ AT, vi ∈ Vci, i = 1, 2, ..., n, and wj ∈ Vd, j = 1, 2, ..., m. ∧(ci, vi) is the condition part of the rule, and ∨(d, wj) is the decision part.

The decision rule r: ∧(ci, vi) -> ∨(d, wj) is true if and only if C̄(X) ⊆ Y, where C = {ci | ci ∈ AT}, X = {x ∈ U | ci(x) = vi}, and Y = {y ∈ U | d(y) = wj}.

Rule r is optimal if and only if r is true and any other rule constituted by the proper subsets of the conjunction and disjunction in r is false.
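The truth condition C̄(X) ⊆ Y can be checked mechanically for a single-attribute rule. The sketch below reuses the invented toy system from earlier (not the paper's Table I):

```python
# Checking the truth of a rule (c, v) -> (d, w) on a toy IDT.
U = ["x1", "x2", "x3", "x4"]
table = {
    "x1": {"a": 1,   "b": "*"},
    "x2": {"a": 1,   "b": 0},
    "x3": {"a": 2,   "b": 0},
    "x4": {"a": "*", "b": 0},
}
d = {"x1": "good", "x2": "good", "x3": "poor", "x4": "good"}

def S(x, A):
    """Similarity class S_A(x), tolerating missing values '*'."""
    return {y for y in U
            if all(table[x][a] == table[y][a] or "*" in (table[x][a], table[y][a])
                   for a in A)}

def rule_true(c, v, w):
    """(c, v) -> (d, w) is true iff the upper approximation of
    X = {x | c(x) = v} w.r.t. {c} is contained in Y = {y | d(y) = w}."""
    X = {x for x in U if table[x][c] == v}
    Y = {y for y in U if d[y] == w}
    upper = {x for x in U if S(x, [c]) & X}
    return upper <= Y

print(rule_true("a", 1, "good"))  # True
print(rule_true("b", 0, "good"))  # False: x3 has b = 0 but d = poor
```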

C. D-S Evidence Theory in Incomplete Information System

Data in the IDT can be taken as evidence of the existence of knowledge and is represented in the form of data mass functions. Missing values in the IDT are represented as ignorance [4].

Definition 7. For an attribute a ∈ AT, the data mass function is defined as:

m_a(v) = card(X) / card(U), v ∈ Va, X = {x ∈ U | a(x) = v, v ≠ *}

m_a(Va) = card(X_*) / card(U), X_* = {x ∈ U | a(x) = *}

The mass assigned to the whole domain Va expresses the ignorance caused by missing values.

Definition 8. For a rule r: ∧(ci, vi) -> ∨(d, wj), the rule mass function is defined as:

m_r(∧(ci, vi) -> ∨(d, wj)) = card(Y) / card(X)

Y = {y ∈ U | (∧ci(y) = vi) ∧ (∨d(y) = wj)}, X = {x ∈ U | ∧ci(x) = vi}, ci ∈ AT, vi ∈ Vci.

m_r measures the degree to which the decision part occurs at the same time as the condition part.
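On the same invented toy data used above, the data mass function of Definition 7 and the rule mass function of Definition 8 can be sketched as:

```python
# Data mass (Definition 7) and rule mass (Definition 8) on a toy IDT.
U = ["x1", "x2", "x3", "x4"]
table = {
    "x1": {"a": 1,   "b": "*"},
    "x2": {"a": 1,   "b": 0},
    "x3": {"a": 2,   "b": 0},
    "x4": {"a": "*", "b": 0},
}
d = {"x1": "good", "x2": "good", "x3": "poor", "x4": "good"}

def m_data(a, v):
    """m_a(v) = card({x | a(x) = v, v != '*'}) / card(U)."""
    if v == "*":
        raise ValueError("data mass is defined for concrete values only")
    return sum(1 for x in U if table[x][a] == v) / len(U)

def m_ignorance(a):
    """Mass on the whole domain V_a: the share of missing values."""
    return sum(1 for x in U if table[x][a] == "*") / len(U)

def m_rule(c, v, w):
    """m_r((c, v) -> (d, w)) = card(Y) / card(X)."""
    X = [x for x in U if table[x][c] == v]
    Y = [y for y in X if d[y] == w]
    return len(Y) / len(X)

print(m_data("a", 1))          # 0.5  (two of four objects have a = 1)
print(m_ignorance("a"))        # 0.25 (x4 has a = *)
print(m_rule("b", 0, "good"))  # 2/3  (of x2, x3, x4, two are 'good')
```

For each attribute, the concrete-value masses plus the ignorance mass sum to 1, so m_a is a valid mass function on the frame Va.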


A. Significance of Attributes in IDT

Definition 9. For IDT = {U, AT ∪ {d}}, the dependence of d on A is:

γ_A(d) = Σ_l card(A̲(Dl)) / Σ_l card(Ā(Dl)), A ⊆ AT, Dl ∈ U/R_d

In order to measure the significance of an attribute, the corresponding attribute (or attribute set) is removed from the IDT, and the change of classification without that attribute is observed. A big change indicates high significance; otherwise, a small change indicates low significance.

Definition 10. The significance of A is:

σ_AT(A) = γ_AT(d) - γ_{AT-A}(d), A ⊆ AT

Definition 11. The core of AT relative to d in the IDT is defined as:

core_d(AT) = {a ∈ AT | σ_AT({a}) > 0}.
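Definitions 9-11 can be sketched on the invented toy system used above (AT = {a, b} and the data are illustrative, not Table I):

```python
# Dependence (Def. 9), significance (Def. 10), and core (Def. 11) on a toy IDT.
U = ["x1", "x2", "x3", "x4"]
table = {
    "x1": {"a": 1,   "b": "*"},
    "x2": {"a": 1,   "b": 0},
    "x3": {"a": 2,   "b": 0},
    "x4": {"a": "*", "b": 0},
}
d = {"x1": "good", "x2": "good", "x3": "poor", "x4": "good"}
AT = ["a", "b"]

def S(x, A):
    """Similarity class S_A(x), tolerating missing values '*'."""
    return {y for y in U
            if all(table[x][a] == table[y][a] or "*" in (table[x][a], table[y][a])
                   for a in A)}

def gamma(A):
    """gamma_A(d): sum of lower-approximation sizes of the decision
    classes over the sum of their upper-approximation sizes."""
    classes = [{x for x in U if d[x] == w} for w in set(d.values())]
    num = sum(len({x for x in U if S(x, A) <= D}) for D in classes)
    den = sum(len({x for x in U if S(x, A) & D}) for D in classes)
    return num / den

def sigma(A):
    """Significance of A: drop A from AT and compare the dependence."""
    return gamma(AT) - gamma([c for c in AT if c not in A])

core = [a for a in AT if sigma([a]) > 0]
print(gamma(AT))                  # about 0.333
print(sigma(["a"]), sigma(["b"]))
print(core)                       # ['a']
```

Here attribute b can be removed without changing the dependence (σ = 0), so only a belongs to the core.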

B. Algorithm for Attribute Reduction in IDT

Input: an IDT. Output: a reduction B.

Step 1. Compute core_d(AT) in the IDT.
Step 2. Let B = core_d(AT); if B = AT, go to Step 5.
Step 3. For each attribute ci ∈ AT - B, compute γ_{B∪{ci}}(d), and take the attribute c with the largest value.
Step 4. Let B = B ∪ {c}; if γ_B(d) = γ_AT(d), go to Step 5; otherwise, go to Step 3.
Step 5. End the algorithm; B is the reduction in question.

It can be seen from the definitions that, for any x ∈ U, the decision part of the optimal rule is (d, w1) ∨ ... ∨ (d, wm) with {w1, w2, ..., wm} = ∂_AT(x). Therefore, the problem of extracting an optimal rule actually turns into finding a reduction of the attributes.
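The five steps above can be sketched as a greedy search. The toy IDT below is invented for illustration, and the loop implements Steps 3-4 (repeatedly add the attribute that raises the dependence most, until γ_B = γ_AT):

```python
# A runnable sketch of the reduction algorithm on an invented toy IDT.
U = ["x1", "x2", "x3", "x4"]
table = {
    "x1": {"a": 1,   "b": "*"},
    "x2": {"a": 1,   "b": 0},
    "x3": {"a": 2,   "b": 0},
    "x4": {"a": "*", "b": 0},
}
d = {"x1": "good", "x2": "good", "x3": "poor", "x4": "good"}
AT = ["a", "b"]

def S(x, A):
    """Similarity class S_A(x), tolerating missing values '*'."""
    return {y for y in U
            if all(table[x][a] == table[y][a] or "*" in (table[x][a], table[y][a])
                   for a in A)}

def gamma(A):
    """Dependence of d on A (Definition 9)."""
    classes = [{x for x in U if d[x] == w} for w in set(d.values())]
    num = sum(len({x for x in U if S(x, A) <= D}) for D in classes)
    den = sum(len({x for x in U if S(x, A) & D}) for D in classes)
    return num / den

def reduction():
    # Steps 1-2: start from the core (attributes with positive significance).
    B = [c for c in AT
         if gamma(AT) - gamma([x for x in AT if x != c]) > 0]
    # Steps 3-4: greedily add the attribute that raises gamma most.
    while gamma(B) != gamma(AT):
        best = max((c for c in AT if c not in B), key=lambda c: gamma(B + [c]))
        B.append(best)
    return B  # Step 5

print(reduction())  # ['a']
```

On this toy system the core {a} already reaches γ_AT, so the loop body never runs and B = {a} is returned directly.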


Both rough set theory and D-S evidence theory are applied; the procedure of rule extraction is as follows:

1) Use the algorithm above to obtain the reduction B of the IDT.

2) List the corresponding rules in B for each attribute.

3) Evaluate m_r of each rule, and select the rule into the result set R if m_r = 1.

4) For each b ∈ B and t ∈ Vb, define w = {wj ∈ Vd | m_r(t -> wj) > 0}, and select the rule r: t -> w into the result set R if w ∈ ∂_AT(x) for some x ∈ U.
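Steps 2)-4) can be sketched as follows. The data are invented, B = {a} is taken from the earlier toy reduction, and the test "w ∈ ∂_AT" of step 4 is simplified here to requiring more than one decision value with positive rule mass:

```python
# A sketch of steps 2)-4) of the rule-extraction procedure on a toy IDT.
U = ["x1", "x2", "x3", "x4"]
table = {
    "x1": {"a": 1,   "b": "*"},
    "x2": {"a": 1,   "b": 0},
    "x3": {"a": 2,   "b": 0},
    "x4": {"a": "*", "b": 0},
}
d = {"x1": "good", "x2": "good", "x3": "poor", "x4": "good"}
B = ["a"]  # reduction found previously

def m_rule(c, v, w):
    """Rule mass of (c, v) -> (d, w), Definition 8."""
    X = [x for x in U if table[x][c] == v]
    Y = [y for y in X if d[y] == w]
    return len(Y) / len(X) if X else 0.0

R = []
for b in B:
    for v in sorted({table[x][b] for x in U} - {"*"}, key=str):
        # Step 3: keep certain rules (m_r = 1).
        for w in sorted(set(d.values())):
            if m_rule(b, v, w) == 1.0:
                R.append((b, v, (w,)))
        # Step 4 (simplified): a disjunctive rule over all decisions
        # with positive mass, kept only if it is genuinely disjunctive.
        ws = tuple(sorted(w for w in set(d.values()) if m_rule(b, v, w) > 0))
        if len(ws) > 1:
            R.append((b, v, ws))

print(R)  # [('a', 1, ('good',)), ('a', 2, ('poor',))]
```

On this toy data every value of a determines the decision, so only certain rules survive and step 4 adds nothing.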

V. EXAMPLE

Extract the optimal rules according to Table I:

1) The significance of each attribute is (the detailed process is omitted here for length):

σ(Price) = (2+1)/(1+3+5) - (2+1)/(1+3+5) = 0
σ(Mileage) = (2+1)/(1+3+5) - (2+1)/(1+3+5) = 0
σ(Size) = (2+1)/(1+3+5) - 2/(4+4+6) = 4/21





Thus B = {Size, Max_Speed} and γ_B(d) = γ_AT(d), so B is the reduction.

2) List of rules:

R1: Size = full => d = good
R2: Size = full => d = excel
R3: Size = compact => d = poor
R4: Max_Speed = low => d = good
R5: Max_Speed = high => d = good
R6: Max_Speed = high => d = excel
R7: Max_Speed = high => d = poor

3) m_r of each rule is:

m_R1 = m_R2 = 1/2, m_R3 = 1
m_R4 = 1, m_R5 = m_R6 = 1/3
m_R7 = 1/3

Thus, two optimal rules can be obtained:



R1': Size = compact => d = poor
R2': Max_Speed = low => d = good

4) When Size = full, w = {good, excel} and w ∈ ∂_AT(x) for some x ∈ U, so there is the third rule:

R3': Size = full => d = good ∨ d = excel

It can be seen from the definition of op

