
# Integrated Study in Incomplete Information System

ZHANG Rui
Computer Center, Institute of Information Technology
Yangzhou University, Yangzhou, China
e-mail: zhangrui@yzu.edu.cn

Abstract: Both rough set theory and D-S evidence theory are important methods in uncertainty reasoning, and each has its own advantages and disadvantages. Incomplete information systems exist widely in real life. In this paper, the two theories are used in combination to study the incomplete information system. First, a reduction algorithm for the incomplete information system is put forward based on rough set theory; then D-S evidence theory is used to optimize the obtained rules, and the results are verified by an example.

Keywords: Incomplete Information System; Decision Table; Rough Set Theory; D-S Evidence Theory; Reduction

I. INTRODUCTION

Rough set theory (RST) was introduced by Z. Pawlak [1] in 1982. It is a theory for the study of intelligent systems characterized by inexact, uncertain or vague information. RST has been successfully applied to problems of vague and uncertain information and has produced exciting results in a remarkably wide range of fields, such as expert systems, machine learning, pattern recognition, decision analysis, process control, and knowledge discovery in databases.

Dempster-Shafer [2] evidence theory, also called D-S evidence theory, originated from upper and lower probabilities; it was first proposed by Dempster and further developed and refined by Shafer. The method has been applied in a wide range of fields. D-S evidence theory can deal with both uncertainty and ignorance; it takes the belief function, instead of probability, as the measure of uncertainty.

RST differs greatly from other theories dealing with uncertainty in that it requires no prior information. In addition, it is strongly complementary to other theories. D-S evidence theory has a great advantage in expressing ignorance. However, each theory also has its own disadvantages. This paper combines the two theories and applies them to rule extraction in incomplete information systems.

II. INCOMPLETE INFORMATION SYSTEM

A. General Description and Definition of Information System

An information system is a database of objects and attributes which implies the relationship between them. The knowledge pattern is ultimately expressed by attributes; it has an explicit, intuitive meaning and is easy to understand. Today's information systems take computers and modern communication technology as the basic information-processing means and apply mathematical methods, providing information services for administrative decisions.

Definition 1 [3]. Let $S = (U, AT)$ be an information system, where $U = \{x_1, x_2, \dots, x_n\}$ is the non-empty finite set of objects, generally called the universe of discourse, and $AT = \{a_1, a_2, \dots, a_m\}$ is the non-empty finite set of attributes, i.e., $a: U \to V_a$ for any $a \in AT$, with $a(x) \in V_a$ for all $x \in U$. Here $V_a$ is called the domain of attribute $a$. If there is at least one attribute $a \in AT$ in $S$ such that $V_a$ contains a missing value, then $S$ is called an incomplete information system (IIS), and the missing value is denoted by $*$.

Definition 2 [3]. An incomplete decision table (IDT) is an incomplete information system $IDT = (U, AT \cup \{d\})$, where $d \notin AT$ and $* \notin V_d$ ($d$ is a complete attribute), called the decision attribute; $AT$ is called the set of condition attributes.

B. Rough Set Theory in Incomplete Information System

Corresponding to the indiscernibility relation in a complete information system, a similarity relation is defined here for the incomplete information system.

Definition 3 [3]. In an IIS $S = (U, AT)$, for each non-empty subset $A \subseteq AT$, the similarity relation is defined as:

$$SIM(A) = \{(x, y) \in U \times U \mid \forall a \in A,\ a(x) = a(y) \ \text{or}\ a(x) = * \ \text{or}\ a(y) = *\}$$

Property 1. $SIM(A)$ is a compatibility relation: $SIM(A) = \bigcap_{a \in A} SIM(\{a\})$.

Let $S_A(x) = \{y \in U \mid (x, y) \in SIM(A)\}$. For $A \neq \emptyset$, $S_A(x)$ is the largest set of objects which may be indiscernible from $x$. $S_A(x)$ is called the similarity class of $A$ in $S$, and the family of all similarity classes is denoted by $U/SIM(A) = \{S_A(x) \mid x \in U\}$.

In general, $U/SIM(A)$ constitutes a covering of $U$, rather than a partition of $U$.
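As an illustration, the similarity relation and similarity classes can be sketched in a few lines of Python. The four-object table below is an invented example (not the paper's Table I), with missing values encoded as `None`:

```python
# Sketch of SIM(A) and the similarity classes S_A(x) for an incomplete
# information system. Toy table: objects x1..x4 over attributes a, b;
# None stands for the paper's missing value "*".
MISSING = None

table = {
    "x1": {"a": 1, "b": 0},
    "x2": {"a": 1, "b": MISSING},
    "x3": {"a": 2, "b": 0},
    "x4": {"a": MISSING, "b": 1},
}

def similar(x, y, A):
    """(x, y) in SIM(A): for every a in A, values agree or one is missing."""
    return all(
        table[x][a] == table[y][a]
        or table[x][a] is MISSING
        or table[y][a] is MISSING
        for a in A
    )

def similarity_class(x, A):
    """S_A(x) = { y in U | (x, y) in SIM(A) }."""
    return {y for y in table if similar(x, y, A)}

cover = {x: similarity_class(x, ("a", "b")) for x in table}
print(cover)
```

Note that the classes overlap without coinciding (for instance $S_A(x_1) \neq S_A(x_2)$ even though $x_2 \in S_A(x_1)$), which is exactly why $U/SIM(A)$ is a covering rather than a partition.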

Define $R_d = \{(x, y) \in U \times U \mid d(x) = d(y)\}$; this constitutes a partition $U/R_d = \{D_1, D_2, \dots, D_k\}$ of $U$ into

2009 International Conference on Artificial Intelligence and Computational Intelligence

978-0-7695-3816-7/09 $26.00 2009 IEEEDOI 10.1109/AICI.2009.454


decision classes. Here $V_d = \{w_1, w_2, \dots, w_k\}$ and $D_l = \{x \in U \mid d(x) = w_l\}$, $1 \le l \le k$.

Definition 4 [3]. For an IIS $S = (U, AT)$, $A \subseteq AT$, $X \subseteq U$, $X$ can be characterized by a pair of lower and upper approximations:

$$\underline{A}(X) = \{x \in U \mid S_A(x) \subseteq X\}$$

$$\overline{A}(X) = \{x \in U \mid S_A(x) \cap X \neq \emptyset\}$$

Objects in $\underline{A}(X)$ can certainly be classified as elements of $X$, while objects in $\overline{A}(X)$ can only possibly be classified as elements of $X$.

Definition 5 [3]. The generalized decision function $\partial_A: U \to P(V_d)$, $A \subseteq AT$, in an IDT is defined as:

$$\partial_A(x) = \{i \mid i = d(y),\ y \in S_A(x)\}$$

Here $P(V_d)$ is the power set of $V_d$.
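A minimal sketch of Definitions 4 and 5, again on an invented four-object table (the attribute names and values are assumptions for illustration, not the paper's data):

```python
# Lower/upper approximations (Definition 4) and the generalized decision
# function (Definition 5) on a toy incomplete decision table.
MISSING = None  # the paper's "*"

table = {
    "x1": {"a": 1, "b": 0, "d": "yes"},
    "x2": {"a": 1, "b": MISSING, "d": "yes"},
    "x3": {"a": 2, "b": 0, "d": "no"},
    "x4": {"a": MISSING, "b": 1, "d": "no"},
}
COND = ("a", "b")  # condition attributes

def S(x, A):
    """Similarity class S_A(x); a missing value is compatible with anything."""
    return {
        y for y in table
        if all(table[x][a] == table[y][a]
               or table[x][a] is MISSING or table[y][a] is MISSING
               for a in A)
    }

def lower(X, A):
    """A-lower approximation: objects whose similarity class lies inside X."""
    return {x for x in table if S(x, A) <= X}

def upper(X, A):
    """A-upper approximation: objects whose similarity class meets X."""
    return {x for x in table if S(x, A) & X}

def gen_decision(x, A):
    """Generalized decision: all decision values seen in S_A(x)."""
    return {table[y]["d"] for y in S(x, A)}

X = {x for x in table if table[x]["d"] == "yes"}  # decision class for "yes"
print(lower(X, COND), upper(X, COND), gen_decision("x2", COND))
```

On this table the "yes" class has lower approximation $\{x_1\}$ but upper approximation $\{x_1, x_2, x_4\}$: the missing values make $x_2$ and $x_4$ only possibly classifiable.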

Definition 6 [3]. If $\partial_A = \partial_{AT}$ and $\partial_B \neq \partial_{AT}$ for every proper subset $B \subset A$, then $A$ is a reduction of $AT$.

An IDT can be seen as a set of decision rules of the following form:

$$(c_1, v_1) \wedge (c_2, v_2) \wedge \dots \wedge (c_n, v_n) \to (d, w_1) \vee \dots \vee (d, w_m)$$

or $\wedge_i (c_i, v_i) \to \vee_j (d, w_j)$, where $i = 1, 2, \dots, n$, $c_i \in AT$, $v_i \in V_{c_i}$, $j = 1, 2, \dots, m$, $w_j \in V_d$.

$\wedge_i (c_i, v_i)$ is the condition part of the rule, and $\vee_j (d, w_j)$ is the decision part.

The decision rule $r: \wedge_i (c_i, v_i) \to \vee_j (d, w_j)$ is true if and only if $\underline{C}(X) \subseteq Y$, where $C = \{c_i \mid c_i \in AT\}$, $X = \{x \in U \mid c_i(x) = v_i\}$ and $Y = \{y \in U \mid d(y) = w_j\}$.

Rule $r$ is optimal if and only if $r$ is true and every other rule formed from proper subsets of the conjunction and disjunction in $r$ is false.

C. D-S Evidence Theory in Incomplete Information System

Data in the IDT can be taken as evidence of the existence of knowledge and are represented in the form of data mass functions. Missing values in the IDT are represented as ignorance [4].

Definition 7. For an attribute $a \in AT$, the data mass function is defined as:

$$m(a, v) = \frac{card(X_v)}{card(U)}, \quad X_v = \{x \in U \mid a(x) = v\},\ v \in V_a,\ v \neq *$$

$$m(a, *) = \frac{card(U - \bigcup_{v \neq *} X_v)}{card(U)}$$

Definition 8. For a rule $r: \wedge_i (c_i, v_i) \to \vee_j (d, w_j)$, the rule mass function is defined as:

$$m_r(\wedge_i (c_i, v_i) \to \vee_j (d, w_j)) = \frac{card(Y)}{card(X)}$$

$$Y = \{y \in U \mid (\wedge_i\, c_i(y) = v_i) \wedge (\vee_j\, d(y) = w_j)\}, \quad X = \{x \in U \mid c_i(x) = v_i\}, \quad c_i \in AT,\ v_i \in V_{c_i}$$

$m_r$ measures the degree of certainty that the decision part occurs when the condition part occurs.
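Definitions 7 and 8 can be sketched as follows; the toy column and the rules are invented illustrations (not the paper's Table I), and the mass assigned to `*` goes to the whole frame as ignorance:

```python
# Data mass function (Definition 7) and rule mass function (Definition 8)
# on a toy single-attribute incomplete decision table.
MISSING = None  # the paper's "*"

table = {
    "x1": {"size": "full",    "d": "good"},
    "x2": {"size": "full",    "d": "excel"},
    "x3": {"size": "compact", "d": "poor"},
    "x4": {"size": MISSING,   "d": "good"},
}

def data_mass(attr, value):
    """m(attr, v) = card({x | attr(x) = v}) / card(U)."""
    hits = sum(1 for row in table.values() if row[attr] == value)
    return hits / len(table)

def data_mass_missing(attr):
    """m(attr, *): the share of objects with a missing value,
    i.e. the mass assigned to ignorance."""
    hits = sum(1 for row in table.values() if row[attr] is MISSING)
    return hits / len(table)

def rule_mass(cond, decisions):
    """m_r(cond => d in decisions) = card(Y) / card(X): X satisfies the
    condition part, Y additionally satisfies the decision part."""
    attr, value = cond
    X = [row for row in table.values() if row[attr] == value]
    Y = [row for row in X if row["d"] in decisions]
    return len(Y) / len(X) if X else 0.0

print(data_mass("size", "full"))                  # 2 of 4 objects
print(data_mass_missing("size"))                  # 1 of 4 objects
print(rule_mass(("size", "compact"), {"poor"}))   # a certain rule
```

A rule with $m_r = 1$ is certain on the observed data; smaller values quantify how often the decision part fails when the condition holds.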

III. ATTRIBUTE REDUCTION IN IDT

A. Significance of Attributes in IDT

Definition 9. For $IDT = (U, AT \cup \{d\})$, the dependence of $d$ on $A$ is:

$$\gamma_A(d) = \frac{\sum_l card(\underline{A}(D_l))}{\sum_l card(\overline{A}(D_l))}, \quad A \subseteq AT,\ D_l \in U/R_d$$

To measure the significance of an attribute (or attribute set), the attribute is removed from the IDT and the resulting change in classification is observed: a big change indicates high significance, while a small change indicates low significance.

Definition 10. The significance of $A$ is:

$$\sigma_{AT}(A) = \gamma_{AT}(d) - \gamma_{AT - A}(d), \quad A \subseteq AT$$

Definition 11. The core of $AT$ relative to $d$ in the IDT is defined as:

$$core_d(AT) = \{a \in AT \mid \sigma_{AT}(\{a\}) > 0\}$$
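Definitions 9-11 can be sketched as follows. The four-object table is an invented illustration, and $\gamma_A(d)$ is computed as the sum of lower-approximation sizes over the sum of upper-approximation sizes of the decision classes, matching the arithmetic of the worked example in Section V:

```python
# Dependence degree gamma_A(d) (Definition 9), attribute significance
# (Definition 10) and core (Definition 11) on a toy incomplete table.
MISSING = None  # the paper's "*"

table = {
    "x1": {"a": 1, "b": 0, "d": "yes"},
    "x2": {"a": 1, "b": MISSING, "d": "yes"},
    "x3": {"a": 2, "b": 0, "d": "no"},
    "x4": {"a": MISSING, "b": 1, "d": "no"},
}
AT = ("a", "b")

def S(x, A):
    """Similarity class of x under SIM(A); * is compatible with anything."""
    return {
        y for y in table
        if all(table[x][a] == table[y][a]
               or table[x][a] is MISSING or table[y][a] is MISSING
               for a in A)
    }

def gamma(A):
    """Sum of lower-approximation sizes of the decision classes over the
    sum of their upper-approximation sizes."""
    classes = {}
    for x, row in table.items():
        classes.setdefault(row["d"], set()).add(x)
    low = sum(len({x for x in table if S(x, A) <= D}) for D in classes.values())
    upp = sum(len({x for x in table if S(x, A) & D}) for D in classes.values())
    return low / upp

def significance(A):
    """sigma_AT(A) = gamma_AT(d) - gamma_{AT-A}(d)."""
    rest = tuple(a for a in AT if a not in A)
    return gamma(AT) - gamma(rest)

core = {a for a in AT if significance((a,)) > 0}
print(gamma(AT), core)
```

On this table both attributes have positive significance, so both belong to the core.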

B. Algorithm for Attribute Reduction in IDT

Input: an IDT. Output: a reduction $B$.

Step 1. Compute $core_d(AT)$ in the IDT.

Step 2. Let $B = core_d(AT)$; if $\partial_B = \partial_{AT}$, go to Step 5.

Step 3. For each attribute $c \in AT - B$, compute $\gamma_{B \cup \{c\}}(d)$, and select an attribute $c^*$ for which this value is largest.

Step 4. Let $B = B \cup \{c^*\}$ and compute $\partial_B$; if $\partial_B \neq \partial_{AT}$, go to Step 3.

Step 5. End the algorithm; $B$ is the reduction in question.

It can be seen from the definitions that, for each $x \in U$, the decision part of the optimal rule is $(d, w_1) \vee \dots \vee (d, w_m)$ with $\{w_1, w_2, \dots, w_m\} = \partial_{AT}(x)$. Therefore, the problem of extracting an optimal rule actually turns into finding a reduction of the attributes.
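A runnable sketch of the reduction algorithm (Steps 1-5) follows. The table, which adds a redundant constant attribute `c`, is an invented illustration; the algorithm starts from the core and greedily adds the attribute that maximizes the dependence degree until the generalized decision of $B$ equals that of $AT$:

```python
# Greedy attribute reduction for an incomplete decision table, following
# Steps 1-5 above, on a toy table with a redundant attribute "c".
MISSING = None  # the paper's "*"

table = {
    "x1": {"a": 1, "b": 0, "c": 0, "d": "yes"},
    "x2": {"a": 1, "b": MISSING, "c": 0, "d": "yes"},
    "x3": {"a": 2, "b": 0, "c": 0, "d": "no"},
    "x4": {"a": MISSING, "b": 1, "c": 0, "d": "no"},
}
AT = ("a", "b", "c")

def S(x, A):
    """Similarity class of x under SIM(A)."""
    return {
        y for y in table
        if all(table[x][a] == table[y][a]
               or table[x][a] is MISSING or table[y][a] is MISSING
               for a in A)
    }

def gamma(A):
    """Dependence degree of d on A (Definition 9)."""
    classes = {}
    for x, row in table.items():
        classes.setdefault(row["d"], set()).add(x)
    low = sum(len({x for x in table if S(x, A) <= D}) for D in classes.values())
    upp = sum(len({x for x in table if S(x, A) & D}) for D in classes.values())
    return low / upp

def delta(A):
    """Generalized decision of every object (Definition 5)."""
    return {x: frozenset(table[y]["d"] for y in S(x, A)) for x in table}

def reduce_idt():
    # Steps 1-2: start from the core (attributes with positive significance)
    B = {a for a in AT
         if gamma(AT) - gamma(tuple(x for x in AT if x != a)) > 0}
    # Steps 3-4: add the attribute maximizing gamma until delta_B = delta_AT
    while delta(tuple(B)) != delta(AT):
        c = max((a for a in AT if a not in B),
                key=lambda a: gamma(tuple(B) + (a,)))
        B.add(c)
    return B

print(reduce_idt())  # the constant attribute "c" is dropped
```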

IV. RULE EXTRACTION IN INCOMPLETE DECISION TABLE

Both rough set theory and D-S evidence theory are applied; the procedure of rule extraction is as follows:

1) Use the algorithm above to obtain a reduction $B$ of the IDT.

2) List the corresponding rules in $B$ for each attribute.

3) Evaluate $m_r$ for each rule, and select a rule into the result set $R$ if $m_r = 1$.

4) For each $b \in B$ and $t \in V_b$, define $w = \{w_j \in V_d \mid m_r(t \to w_j) > 0\}$, and select the rule $r: t \to w$ into the result set $R$ if $w \in \partial_{AT}$.
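Steps 2-4 can be sketched as follows. The single-attribute table is an invented stand-in for Table I, and the condition $w \in \partial_{AT}$ in step 4 is read here as "$w$ occurs as a generalized decision value of some object" (an assumption about the intended meaning):

```python
# Rule extraction per steps 2-4: keep certain rules (m_r = 1), then form
# disjunctive rules from the remaining decisions with nonzero mass.
MISSING = None  # the paper's "*"

table = {
    "x1": {"size": "full",    "d": "good"},
    "x2": {"size": "full",    "d": "excel"},
    "x3": {"size": "compact", "d": "poor"},
    "x4": {"size": MISSING,   "d": "good"},
}
AT = ("size",)

def S(x, A):
    """Similarity class of x under SIM(A)."""
    return {
        y for y in table
        if all(table[x][a] == table[y][a]
               or table[x][a] is MISSING or table[y][a] is MISSING
               for a in A)
    }

def rule_mass(attr, value, decisions):
    """m_r for the rule (attr = value) => (d in decisions)."""
    X = [row for row in table.values() if row[attr] == value]
    Y = [row for row in X if row["d"] in decisions]
    return len(Y) / len(X) if X else 0.0

V_d = {row["d"] for row in table.values()}
gen = {frozenset(table[y]["d"] for y in S(x, AT)) for x in table}
R = []
for attr in AT:
    values = {row[attr] for row in table.values() if row[attr] is not MISSING}
    # step 3: certain rules go straight into the result set
    for v in values:
        for w in V_d:
            if rule_mass(attr, v, {w}) == 1.0:
                R.append((attr, v, frozenset({w})))
    # step 4: bundle the decisions with nonzero mass into one disjunctive
    # rule, kept if the bundle occurs as a generalized decision value
    for v in values:
        if not any(t[0] == attr and t[1] == v for t in R):
            w = frozenset(wj for wj in V_d if rule_mass(attr, v, {wj}) > 0)
            if w in gen:
                R.append((attr, v, w))
print(R)
```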

V. EXAMPLE

The optimal rules are extracted according to Table I:

1) The significance of each attribute is (the detailed process is omitted here for reasons of length):

$$\sigma(Price) = (2+1)/(5+3+1) - (2+1)/(5+3+1) = 0$$

$$\sigma(Mileage) = (2+1)/(5+3+1) - (2+1)/(5+3+1) = 0$$

$$\sigma(Size) = (2+1)/(5+3+1) - 2/(6+4+4) = 4/21$$

$$\sigma(Max\_Speed) = (2+1)/(5+3+1) - 1/(5+5+1) = 8/33$$

$B = \{Size, Max\_Speed\}$; since $\partial_B = \partial_{AT}$, $B$ is the reduction.

2) The list of rules:

$R_1$: $Size = full \Rightarrow d = good$
$R_2$: $Size = full \Rightarrow d = excel$
$R_3$: $Size = compact \Rightarrow d = poor$
$R_4$: $Max\_Speed = low \Rightarrow d = good$
$R_5$: $Max\_Speed = high \Rightarrow d = good$
$R_6$: $Max\_Speed = high \Rightarrow d = excel$
$R_7$: $Max\_Speed = high \Rightarrow d = poor$

3) The $m_r$ of each rule is:

$$m_{R_1} = 4/5, \quad m_{R_2} = 1/5, \quad m_{R_3} = 1$$

$$m_{R_4} = 1, \quad m_{R_5} = 1/3, \quad m_{R_6} = 1/3, \quad m_{R_7} = 1/3$$

Thus, two optimal rules can be obtained:

(Figure: flowcharts detailing Steps 3 and 4 of the reduction algorithm in Section III. Step 3 iterates over $c_i \in AT - B$, computing $\gamma_{B \cup \{c_i\}}(d)$ and recording the attribute $m$ with the largest value $n$; Step 4 sets $B = B \cup \{m\}$, computes $\partial_B$, and terminates when $\partial_B = \partial_{AT}$.)

$R_1'$: $Size = compact \Rightarrow d = poor$
$R_2'$: $Max\_Speed = low \Rightarrow d = good$

4) When $Size = full$, $w = \{good, excel\}$ and $w \in \partial_{AT}$, so there is a third rule:

$R_3'$: $Size = full \Rightarrow d = good \vee d = excel$

It can be seen from the definition of op
