

Page 1: [IEEE 2009 International Conference on Artificial Intelligence and Computational Intelligence - Shanghai, China (2009.11.7-2009.11.8)] 2009 International Conference on Artificial Intelligence

Integrated Study in Incomplete Information System

ZHANG Rui

Computer Center, Institute of Information Technology

Yangzhou University

Yangzhou, China

e-mail: [email protected]

Abstract-Rough set theory and D-S evidence theory are both important methods for reasoning under uncertainty, and each has its own advantages and disadvantages. Incomplete information systems are common in real life. In this paper, the two theories are combined to study incomplete information systems. First, an attribute reduction algorithm for incomplete information systems is put forward based on rough set theory; D-S evidence theory is then used to optimize the rules obtained, and the results are verified by an example.

Keywords-Incomplete Information System; Decision Table; Rough Set Theory; D-S Evidence Theory; Reduction

I. INTRODUCTION

Rough set theory (RST) was introduced by Z. Pawlak [1] in 1982. It is a theory for the study of intelligent systems characterized by inexact, uncertain or vague information. RST has been applied successfully to problems involving vague and uncertain information, and it has produced results in a remarkably wide range of fields, such as expert systems, machine learning, pattern recognition, decision analysis, process control, and knowledge discovery in databases.

Dempster–Shafer evidence theory [2], also called D–S evidence theory, originated from upper and lower probabilities; it was first proposed by Dempster and further developed and refined by Shafer. The method has been applied in a wide range of fields. D–S evidence theory can deal with both uncertainty and ignorance; it takes the belief function, instead of probability, as the measure of uncertainty.

RST differs greatly from other theories that deal with uncertainty because it requires no prior information. In addition, it is strongly complementary to other theories. D–S evidence theory has a great advantage in expressing ignorance. However, every theory also has its own disadvantages. This paper combines the two theories and applies them to rule extraction in incomplete information systems.

II. INCOMPLETE INFORMATION SYSTEM

A. General Description and Definition of Information System

An information system is a database of objects and attributes that captures the relationships between them. Knowledge patterns are ultimately expressed through attributes, which have an explicit, intuitive meaning and are easy to understand. Today's information systems use computers and modern communication technology as their basic means of processing information and apply mathematical methods to provide information services for administrative decisions.

Definition 1 [3]. Let $S = (U, AT)$ be an information system, where $U = \{X_1, X_2, \ldots, X_n\}$ is a non-empty finite set of objects, generally called the universe of discourse, and $AT = \{a_1, a_2, \ldots, a_m\}$ is a non-empty finite set of attributes, i.e., $a: U \to V_a$ for any $a \in AT$, with $a(x) \in V_a$, $x \in U$. Here, $V_a$ is called the domain of attribute $a$. If there is at least one attribute $a \in AT$ in $S$ whose domain $V_a$ contains a missing value, then $S$ is called an incomplete information system (IIS), and the missing value is denoted by "*".

Definition 2 [3]. An incomplete decision table (IDT) is an incomplete information system $IDT = \{U, AT \cup \{d\}\}$, where $d$ ($d \notin AT$, $* \notin V_d$) is a complete attribute called the decision attribute, and $AT$ is called the set of conditional attributes.

B. Rough Set Theory in Incomplete Information System

Corresponding to the indiscernibility relation in a complete information system, a similarity relation is defined for an incomplete information system.

Definition 3 [3]. In an IIS $S = (U, AT)$, for each non-empty subset $A \subseteq AT$, the similarity relation is defined as:

$$SIM(A) = \{(x, y) \in U \times U \mid \forall a \in A,\ a(x) = a(y) \lor a(x) = * \lor a(y) = *\}$$

Property 1. $SIM(A)$ is a compatibility relation (reflexive and symmetric, but not necessarily transitive):

$$SIM(A) = \bigcap_{a \in A} SIM(\{a\})$$

Let $S_A(x) = \{y \in U \mid (x, y) \in SIM(A)\}$. For $A$, $S_A(x)$ is the largest set of objects which may be indiscernible from $x$. $S_A(x)$ is called the similarity class of $x$ with respect to $A$ in $S$, and the family of all similarity classes is denoted by $U/SIM(A)$: $U/SIM(A) = \{S_A(x) \mid x \in U\}$.

In general, $U/SIM(A)$ constitutes a covering of $U$ rather than a partition of $U$.
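The similarity classes of Definition 3 can be computed directly from a table. The following is a minimal Python sketch, assuming the condition attributes of the car data in Table I are encoded as a dictionary; the names `TABLE`, `similar` and `similarity_class` are illustrative and do not come from the paper.

```python
# Condition attributes of Table I as {object: {attribute: value}}.
MISSING = "*"  # marker for a missing value
AT = ["Price", "Mileage", "Size", "Max_Speed"]
ROWS = [
    ("High", "High", "Full", "Low"),      # car 1
    ("Low", "*", "Full", "Low"),          # car 2
    ("*", "*", "Compact", "High"),        # car 3
    ("High", "*", "Full", "High"),        # car 4
    ("*", "*", "Full", "High"),           # car 5
    ("Low", "High", "Full", "High"),      # car 6
]
TABLE = {i + 1: dict(zip(AT, row)) for i, row in enumerate(ROWS)}


def similar(x, y, attrs):
    """True if (x, y) is in SIM(attrs): values agree or one of them is missing."""
    return all(
        TABLE[x][a] == TABLE[y][a] or MISSING in (TABLE[x][a], TABLE[y][a])
        for a in attrs
    )


def similarity_class(x, attrs):
    """S_A(x): all objects that may be indiscernible from x on attrs."""
    return {y for y in TABLE if similar(x, y, attrs)}


classes = {x: similarity_class(x, AT) for x in TABLE}
for x, cls in classes.items():
    print(x, sorted(cls))
```

For Table I this gives $S_{AT}(4) = \{4, 5\}$, $S_{AT}(5) = \{4, 5, 6\}$ and $S_{AT}(6) = \{5, 6\}$. The classes overlap, which is why they form a covering rather than a partition.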

2009 International Conference on Artificial Intelligence and Computational Intelligence

978-0-7695-3816-7/09 $26.00 © 2009 IEEE

DOI 10.1109/AICI.2009.454

Define $R_d = \{(x, y) \in U \times U \mid d(x) = d(y)\}$; this constitutes a partition $U/R_d = \{D_1, D_2, \ldots, D_k\}$ of $U$ into decision classes. Here, $V_d = (w_1, w_2, \ldots, w_k)$, $D_l = \{x \in U \mid d(x) = w_l\}$, and $l \le k$.

Definition 4 [3]. For the IIS $S = (U, AT)$, $A \subseteq AT$, $X \subseteq U$, $X$ can be characterized by a pair of lower and upper approximations:

$$\overline{A}(X) = \{x \in U \mid S_A(x) \cap X \ne \Phi\}$$

$$\underline{A}(X) = \{x \in U \mid S_A(x) \subseteq X\}$$

Objects in $\underline{A}(X)$ can certainly be classified as elements of $X$, while objects in $\overline{A}(X)$ can only possibly be classified as elements of $X$.

Definition 5 [3]. The generalized decision function $\partial_A: U \to P(V_d)$, $A \subseteq AT$, in an IDT is defined as:

$$\partial_A(x) = \{i \mid i = d(y),\ y \in S_A(x)\}$$

Here, $P(V_d)$ is the power set of $V_d$.

Definition 6 [3]. If $\partial_A = \partial_{AT}$, $A \subseteq AT$, and $\forall B \subset A \Rightarrow \partial_B \ne \partial_{AT}$, then $A$ is a reduction of $AT$.
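Definitions 4 to 6 translate almost literally into set operations. The sketch below is one possible Python rendering on the data of Table I; the helper names (`sim_class`, `lower`, `upper`, `gen_decision`, `is_reduction`) are hypothetical, and the reduction test is a brute-force check over all proper subsets, not an efficient method.

```python
from itertools import combinations

MISSING = "*"
AT = ["Price", "Mileage", "Size", "Max_Speed"]
ROWS = [  # (Price, Mileage, Size, Max_Speed, d) for cars 1..6 of Table I
    ("High", "High", "Full", "Low", "Good"),
    ("Low", "*", "Full", "Low", "Good"),
    ("*", "*", "Compact", "High", "Poor"),
    ("High", "*", "Full", "High", "Good"),
    ("*", "*", "Full", "High", "Excel"),
    ("Low", "High", "Full", "High", "Good"),
]
TABLE = {i + 1: dict(zip(AT + ["d"], row)) for i, row in enumerate(ROWS)}
U = set(TABLE)


def sim_class(x, attrs):
    """S_A(x) from Definition 3."""
    return {
        y for y in U
        if all(TABLE[x][a] == TABLE[y][a] or MISSING in (TABLE[x][a], TABLE[y][a])
               for a in attrs)
    }


def lower(X, attrs):
    """Lower approximation: objects whose similarity class lies inside X."""
    return {x for x in U if sim_class(x, attrs) <= X}


def upper(X, attrs):
    """Upper approximation: objects whose similarity class meets X."""
    return {x for x in U if sim_class(x, attrs) & X}


def gen_decision(attrs):
    """Generalized decision: x -> set of decisions found in S_A(x)."""
    return {x: frozenset(TABLE[y]["d"] for y in sim_class(x, attrs)) for x in U}


def is_reduction(attrs):
    """Definition 6, checked by brute force over all proper subsets of attrs."""
    target = gen_decision(AT)
    if gen_decision(attrs) != target:
        return False
    return all(
        gen_decision(list(sub)) != target
        for k in range(len(attrs))
        for sub in combinations(attrs, k)
    )


good = {x for x in U if TABLE[x]["d"] == "Good"}
print(sorted(lower(good, AT)), sorted(upper(good, AT)))
print(is_reduction(["Size", "Max_Speed"]))
```

On Table I the generalized decision computed this way agrees with Table II, and `{Size, Max_Speed}` passes the reduction test while the full attribute set does not.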

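An IDT can be seen as a set of decision rules of the following form:

$$(c_1, v_1) \land (c_2, v_2) \land \ldots \land (c_n, v_n) \to (d, w_1) \lor \ldots \lor (d, w_m)$$

or, more compactly, $\land (c_i, v_i) \to \lor (d, w_j)$, where $i = 1, 2, \ldots, n$, $c_i \in AT$, $v_i \in V_{c_i}$, $j = 1, 2, \ldots, m$, $w_j \in V_d$. Here $\land (c_i, v_i)$ is the condition part of the rule, and $\lor (d, w_j)$ is the decision part of the rule.

The decision rule $r: \land (c_i, v_i) \to \lor (d, w_j)$ is true if and only if $\overline{C}(X) \subseteq Y$, where $C = \{c_i \mid c_i \in AT\}$ is the set of attributes in the condition part, $X = \{x \in U \mid \land\, c_i(x) = v_i\}$, and $Y = \{y \in U \mid \lor\, d(y) = w_j\}$.

Rule $r$ is optimal if and only if $r$ is true and every other rule formed from proper subsets of the conjunction and disjunction in $r$ is false.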

C. D-S Evidence Theory in Incomplete Information System

Data in the IDT can be taken as evidence for the existence of knowledge and is represented in the form of data mass functions. Missing values in the IDT are represented as ignorance [4].

Definition 7. For an attribute $a \in AT$, the data mass function is defined as:

$$m(a, v) = \frac{card(X)}{card(U)}$$

$$m(a, *) = \frac{card(U - X)}{card(U)}$$

where $v \in V_a$ and $X = \{x \in U \mid a(x) = v,\ v \ne *\}$.

Definition 8. For a rule $r: \land (c_i, v_i) \to \lor (d, w_j)$, the rule mass function is defined as:

$$m_r(\land (c_i, v_i) \to \lor (d, w_j)) = \frac{card(Y)}{card(X)}$$

where $Y = \{y \in U \mid (\land\, c_i(y) = v_i) \land (\lor\, d(y) = w_j)\}$, $X = \{x \in U \mid \land\, c_i(x) = v_i\}$, $c_i \in AT$, $v_i \in V_{c_i}$.

$m_r$ measures how certain it is that the decision part holds whenever the condition part holds.
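Both mass functions are simple counting ratios, which the following Python sketch makes concrete on the data of Table I. The function names are hypothetical. The treatment of $m(a, *)$ as the share of objects whose value of $a$ is missing is an assumed reading of Definition 7, not something the text states explicitly.

```python
from fractions import Fraction

AT = ["Price", "Mileage", "Size", "Max_Speed"]
ROWS = [  # (Price, Mileage, Size, Max_Speed, d) for cars 1..6 of Table I
    ("High", "High", "Full", "Low", "Good"),
    ("Low", "*", "Full", "Low", "Good"),
    ("*", "*", "Compact", "High", "Poor"),
    ("High", "*", "Full", "High", "Good"),
    ("*", "*", "Full", "High", "Excel"),
    ("Low", "High", "Full", "High", "Good"),
]
TABLE = {i + 1: dict(zip(AT + ["d"], row)) for i, row in enumerate(ROWS)}
U = set(TABLE)


def data_mass(a, v):
    """m(a, v): share of objects whose value of attribute a equals v.

    For v == "*" this is the share of objects with a missing value of a
    (assumption: this is how m(a, *) in Definition 7 is meant).
    """
    return Fraction(sum(1 for x in U if TABLE[x][a] == v), len(U))


def rule_mass(condition, decisions):
    """m_r for the rule AND(c_i = v_i) -> OR(d = w_j) from Definition 8.

    condition: dict {attribute: value}; decisions: set of decision values.
    """
    X = {x for x in U if all(TABLE[x][c] == v for c, v in condition.items())}
    if not X:
        raise ValueError("no object matches the condition part")
    Y = {y for y in X if TABLE[y]["d"] in decisions}
    return Fraction(len(Y), len(X))


print(data_mass("Mileage", "High"), data_mass("Mileage", "*"))
print(rule_mass({"Size": "Full"}, {"Good"}))
print(rule_mass({"Size": "Full"}, {"Excel"}))
print(rule_mass({"Size": "Compact"}, {"Poor"}))
```

Exact fractions are used so that the results can be compared with hand calculations; for the three Size rules the sketch yields 4/5, 1/5 and 1.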

III. ATTRIBUTE REDUCTION IN IDT

A. Significance of Attributes in IDT

Definition 9. For $IDT = \{U, AT \cup \{d\}\}$, the dependence of $d$ on $A$ is:

$$\gamma_A(d) = \frac{\sum_l card(\underline{A}(D_l))}{\sum_l card(\overline{A}(D_l))}, \quad A \subseteq AT,\ D_l \in U/R_d$$

To measure the significance of an attribute (or attribute set), it is removed from the IDT and the resulting change in classification is observed. A large change indicates high significance; a small change indicates low significance.

Definition 10. The significance of $A$ is:

$$\sigma_{AT}(A) = \gamma_{AT}(d) - \gamma_{AT - A}(d), \quad A \subseteq AT$$

Definition 11. The core of $AT$ relative to $d$ in the IDT is defined as:

$$core_d(AT) = \{a \in AT \mid \sigma_{AT}(\{a\}) > 0\}$$
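Definitions 9 to 11 can be checked numerically. The following Python sketch, with hypothetical function names, computes the dependence, the significance of single attributes and the core for the data of Table I, using exact fractions.

```python
from fractions import Fraction

MISSING = "*"
AT = ["Price", "Mileage", "Size", "Max_Speed"]
ROWS = [  # (Price, Mileage, Size, Max_Speed, d) for cars 1..6 of Table I
    ("High", "High", "Full", "Low", "Good"),
    ("Low", "*", "Full", "Low", "Good"),
    ("*", "*", "Compact", "High", "Poor"),
    ("High", "*", "Full", "High", "Good"),
    ("*", "*", "Full", "High", "Excel"),
    ("Low", "High", "Full", "High", "Good"),
]
TABLE = {i + 1: dict(zip(AT + ["d"], row)) for i, row in enumerate(ROWS)}
U = set(TABLE)


def sim_class(x, attrs):
    """S_A(x) from Definition 3."""
    return {
        y for y in U
        if all(TABLE[x][a] == TABLE[y][a] or MISSING in (TABLE[x][a], TABLE[y][a])
               for a in attrs)
    }


def dependence(attrs):
    """gamma_A(d): total lower-approximation size over total upper-approximation size."""
    classes = {x: sim_class(x, attrs) for x in U}
    low = up = 0
    for w in {TABLE[x]["d"] for x in U}:
        D = {x for x in U if TABLE[x]["d"] == w}        # decision class D_l
        low += sum(1 for x in U if classes[x] <= D)     # card of lower approximation
        up += sum(1 for x in U if classes[x] & D)       # card of upper approximation
    return Fraction(low, up)


def significance(attrs):
    """sigma_AT(A) = gamma_AT(d) - gamma_{AT - A}(d)."""
    rest = [a for a in AT if a not in attrs]
    return dependence(AT) - dependence(rest)


def core():
    """Attributes whose individual removal lowers the dependence."""
    return {a for a in AT if significance([a]) > 0}


for a in AT:
    print(a, significance([a]))
print(sorted(core()))
```

For Table I this reproduces $\gamma_{AT}(d) = 3/9$ and the significances 0, 0, 4/21 and 8/33 for Price, Mileage, Size and Max_Speed, so the core is {Size, Max_Speed}.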

B. Algorithm for Attribute Reduction in IDT

Input: an IDT

Output: a reduction B

Step 1. Compute $core_d(AT)$ in the IDT.

Step 2. Let $B = core_d(AT)$; if $\partial_B = \partial_{AT}$, go to Step 5.

Step 3. (flowchart)

Step 4. (flowchart)

Step 5. End the algorithm; B is the required reduction.
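The algorithm can be sketched in Python as follows. Steps 3 and 4 survive only as flowchart fragments, so their content here is an assumption: Step 3 is taken to be a greedy loop that adds the attribute with the largest dependence gain until $\partial_B = \partial_{AT}$, and Step 4 a loop that drops non-core attributes that have become redundant. All function names are illustrative.

```python
from fractions import Fraction

MISSING = "*"
AT = ["Price", "Mileage", "Size", "Max_Speed"]
ROWS = [  # (Price, Mileage, Size, Max_Speed, d) for cars 1..6 of Table I
    ("High", "High", "Full", "Low", "Good"),
    ("Low", "*", "Full", "Low", "Good"),
    ("*", "*", "Compact", "High", "Poor"),
    ("High", "*", "Full", "High", "Good"),
    ("*", "*", "Full", "High", "Excel"),
    ("Low", "High", "Full", "High", "Good"),
]
TABLE = {i + 1: dict(zip(AT + ["d"], row)) for i, row in enumerate(ROWS)}
U = set(TABLE)


def sim_class(x, attrs):
    return {
        y for y in U
        if all(TABLE[x][a] == TABLE[y][a] or MISSING in (TABLE[x][a], TABLE[y][a])
               for a in attrs)
    }


def gen_decision(attrs):
    return {x: frozenset(TABLE[y]["d"] for y in sim_class(x, attrs)) for x in U}


def dependence(attrs):
    classes = {x: sim_class(x, attrs) for x in U}
    low = up = 0
    for w in {TABLE[x]["d"] for x in U}:
        D = {x for x in U if TABLE[x]["d"] == w}
        low += sum(1 for x in U if classes[x] <= D)
        up += sum(1 for x in U if classes[x] & D)
    return Fraction(low, up)


def reduce_attributes():
    full = gen_decision(AT)
    gamma_at = dependence(AT)
    # Step 1: core = attributes whose removal lowers the dependence.
    core = [a for a in AT if gamma_at - dependence([b for b in AT if b != a]) > 0]
    # Step 2: start from the core.
    B = list(core)
    # Step 3 (assumed): add the attribute with the largest dependence gain
    # until the generalized decision of B matches that of AT.
    while gen_decision(B) != full:
        candidates = [a for a in AT if a not in B]
        best = max(candidates, key=lambda a: dependence(B + [a]) - dependence(B))
        B.append(best)
    # Step 4 (assumed): drop non-core attributes that are redundant.
    for a in [b for b in B if b not in core]:
        without = [b for b in B if b != a]
        if gen_decision(without) == full:
            B = without
    # Step 5: B is the reduction.
    return B


print(reduce_attributes())
```

On Table I the core already satisfies $\partial_B = \partial_{AT}$, so the sketch returns the core {Size, Max_Speed} without entering either loop. Using `max` in Step 3 guarantees progress even when every gain is zero, which is a design choice of this sketch.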

It follows from the definitions that, for $x \in U$, the decision part of the optimal rule is $(d, w_1) \lor \ldots \lor (d, w_m)$ with $\{w_1, w_2, \ldots, w_m\} = \partial_{AT}(x)$. Therefore, the problem of extracting optimal rules reduces to finding a reduction of the attributes.

IV. RULE EXTRACTION IN INCOMPLETE DECISION TABLE

Both rough set theory and D-S evidence theory are applied. The procedure of rule extraction is as follows:

1) Use the algorithm above to obtain the reduction B of the IDT.

2) List the corresponding rules for each attribute in B.

3) Evaluate $m_r$ for each rule, and add the rule to the result set $R'$ if $m_r = 1$.

4) For all $b \in B$, $t \in V_b$, define $w = \{w_j \in V_d \mid m_r(t \to w_j) > 0\}$, and add $r: t \to w$ to the result set $R'$ if $w \subseteq \partial_{AT}$.
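Steps 2 to 4 of the procedure can be sketched in Python as follows, taking the reduction B as given. The names are hypothetical. The condition $w \subseteq \partial_{AT}$ is read here as "$w$ is contained in $\partial_{AT}(x)$ for at least one object $x$ matching the condition"; this is an assumption, since the text does not say which objects the comparison ranges over.

```python
from fractions import Fraction

MISSING = "*"
AT = ["Price", "Mileage", "Size", "Max_Speed"]
ROWS = [  # (Price, Mileage, Size, Max_Speed, d) for cars 1..6 of Table I
    ("High", "High", "Full", "Low", "Good"),
    ("Low", "*", "Full", "Low", "Good"),
    ("*", "*", "Compact", "High", "Poor"),
    ("High", "*", "Full", "High", "Good"),
    ("*", "*", "Full", "High", "Excel"),
    ("Low", "High", "Full", "High", "Good"),
]
TABLE = {i + 1: dict(zip(AT + ["d"], row)) for i, row in enumerate(ROWS)}
U = set(TABLE)
V_D = {TABLE[x]["d"] for x in U}


def sim_class(x, attrs):
    return {
        y for y in U
        if all(TABLE[x][a] == TABLE[y][a] or MISSING in (TABLE[x][a], TABLE[y][a])
               for a in attrs)
    }


def gen_decision(attrs):
    return {x: frozenset(TABLE[y]["d"] for y in sim_class(x, attrs)) for x in U}


def rule_mass(attr, value, decision):
    """m_r of the single-attribute rule (attr = value) -> (d = decision)."""
    X = {x for x in U if TABLE[x][attr] == value}
    Y = {y for y in X if TABLE[y]["d"] == decision}
    return Fraction(len(Y), len(X))


def extract_rules(B):
    """Return {(attribute, value): set of decisions} for the selected rules."""
    full = gen_decision(AT)
    rules = {}
    for b in B:
        for t in sorted({TABLE[x][b] for x in U} - {MISSING}):
            matching = {x for x in U if TABLE[x][b] == t}
            # Step 2: one candidate rule per decision value, with its mass.
            masses = {w: rule_mass(b, t, w) for w in V_D}
            # Step 3: keep rules whose mass is exactly 1.
            for w, m in masses.items():
                if m == 1:
                    rules[(b, t)] = frozenset({w})
            # Step 4: w_set collects every decision with positive mass; keep
            # t -> w_set if it fits the generalized decision of a matching
            # object (assumed reading of "w is a subset of the generalized decision").
            w_set = frozenset(w for w, m in masses.items() if m > 0)
            if any(w_set <= full[x] for x in matching):
                rules[(b, t)] = w_set
    return rules


for (attr, value), decisions in extract_rules(["Size", "Max_Speed"]).items():
    print(f"{attr} = {value} => d in {sorted(decisions)}")
```

Keying the result by the condition avoids listing a rule twice when Step 4 re-derives a rule already selected in Step 3.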

V. EXAMPLE

The optimal rules are extracted from Table I as follows:

1) The significance of each attribute is (the detailed process is omitted for reasons of length):

$$\sigma(Price) = (2+1)/(5+3+1) - (2+1)/(5+3+1) = 0$$

$$\sigma(Mileage) = (2+1)/(5+3+1) - (2+1)/(5+3+1) = 0$$

$$\sigma(Size) = (2+1)/(5+3+1) - 2/(6+4+4) = \frac{4}{21}$$

$$\sigma(Max\_Speed) = (2+1)/(5+3+1) - 1/(5+5+1) = \frac{8}{33}$$

$B = \{Size, Max\_Speed\}$, and $\partial_B = \partial_{AT} \Rightarrow B$ is the reduction.

2) List of rules:

R1: Size = full => d = good
R2: Size = full => d = excel
R3: Size = compact => d = poor
R4: Max_Speed = low => d = good
R5: Max_Speed = high => d = good
R6: Max_Speed = high => d = excel
R7: Max_Speed = high => d = poor

3) $m_r$ of each rule is:

$$m_{R1} = \frac{4}{5},\quad m_{R2} = \frac{1}{5},\quad m_{R3} = 1$$

$$m_{R4} = 1,\quad m_{R5} = \frac{1}{3},\quad m_{R6} = \frac{1}{3},\quad m_{R7} = \frac{1}{3}$$

Thus, two optimal rules are obtained:

R'1: Size = compact => d = poor
R'2: Max_Speed = low => d = good

4) When Size = full, $w = \{good, excel\} \Rightarrow w \subseteq \partial_{AT}$, so there is a third rule:

R'3: Size = full => d = good ∨ d = excel

It can be seen from the definition of an optimal rule that the three rules above are all optimal.

[Flowchart: attribute-removal loop of the reduction algorithm. begin; $C = B - core$, $i = 1$; while $i \le card(C)$: take $c_i \in C$ and compute $\partial_{B - c_i}$; if $\partial_B = \partial_{B - c_i}$ then $B = B - c_i$; $i = i + 1$; end.]

[Flowchart: attribute-addition loop of the reduction algorithm. begin; $m = \Phi$, $n = 0$, $i = 1$; while $i \le card(AT - B)$: take $c_i \in (AT - B)$; if $\gamma_{B \cup c_i} - \gamma_B > n$ then $n = \gamma_{B \cup c_i} - \gamma_B$, $m = c_i$; $i = i + 1$; afterwards $B = B + \{m\}$ and compute $\partial_B$; repeat while $\partial_B \ne \partial_{AT}$; end.]

CONCLUSIONS

Uncertainty is widespread in real life, so research on knowledge discovery and rule extraction in incomplete information systems is of considerable practical significance. This paper uses two theories to study rule extraction in incomplete information systems and verifies the approach on an example. However, it is only a preliminary investigation and needs further improvement. Other approaches to attribute reduction and rule extraction will also be discussed in future work.

TABLE I. INCOMPLETE INFORMATION SYSTEM

Car   Price   Mileage   Size      Max_Speed   d
1     High    High      Full      Low         Good
2     Low     *         Full      Low         Good
3     *       *         Compact   High        Poor
4     High    *         Full      High        Good
5     *       *         Full      High        Excel
6     Low     High      Full      High        Good

TABLE II. GENERALIZED DECISION FUNCTION

Car   $\partial_{AT}$
1     {Good}
2     {Good}
3     {Poor}
4     {Good, Excel}
5     {Good, Excel}
6     {Good, Excel}

ACKNOWLEDGMENT

My deepest gratitude goes to my leaders and colleagues for the advice and help they gave during the conception and writing of this paper, and also to all my family and friends who care about me!

REFERENCES

[1] Z. Pawlak. Rough Sets: Theoretical Aspects of Reasoning about Data [M]. Dordrecht: Kluwer Academic Publishers, 1991: 89-95.

[2] A. P. Dempster. Upper and lower probability inferences based on a sample from a finite univariate population [J]. Biometrika, 1967, 54(3): 515-528.

[3] W. X. Zhang, W. Z. Wu, J. Y. Liang, D. Y. Li. Rough Set Theory & Method [M]. Peking: Science Press, 2001.

[4] Sarabjot S. Anand, David A. Bell, John G. Hughes. EDM: A general framework for Data Mining based on Evidence Theory [J]. Data & Knowledge Engineering, 1996, 18: 189-223.

[5] K. Huang, S. F. Chen, Z. G. Zhou, W. H. Zhang. Multi-source Information Fusion Method Based on Rough Sets Theory and Evidence Theory. Information and Control, 2004, 33(4): 422-425.

[6] J. Y. Liang, Z. B. Xu. The Algorithm on Knowledge Reduction in Incomplete Information Systems [J].

[7] G. Shafer. A Mathematical Theory of Evidence [M]. Princeton University Press, 1976.

[8] Z. Pawlak. Rough sets. International Journal of Computer and Information Sciences, 1982, 11: 341–356.

[9] Z. Pawlak. Rough Sets: Theoretical Aspects of Reasoning about Data. Kluwer Academic Publishers, Boston, 1991.

[10] Z. Pawlak, A. Skowron. Rudiments of rough sets. Information Sciences, 2007, 177: 3–27.

[11] Z. Pawlak, A. Skowron. Rough sets: some extensions. Information Sciences, 2007, 177: 28–40.

[12] L. Polkowski, S. Tsumoto, T. Y. Lin (Eds.). Rough Set Methods and Applications. Physica-Verlag, Berlin, 2000.

[13] M. Quafafou. α-RST: a generalization of rough set theory. Information Sciences, 2000, 124: 301–316.

[14] Pan Wei, Wang Yangsheng, Yang Hongji. Decision Rule Analysis of Dempster-Shafer Theory of Evidence. Computer Engineering and Application, 2004, 14: 14-17.

[15] Yue Chaoyuan. Theories and Methods of Decision Making. Beijing: Science Press, 2003: 194-221.
