# [IEEE 2009 International Conference on Artificial Intelligence and Computational Intelligence - Shanghai, China (2009.11.7-2009.11.8)] 2009 International Conference on Artificial Intelligence and Computational Intelligence - Integrated Study in Incomplete Information System

Post on 16-Apr-2017


<ul><li><p>Integrated Study in Incomplete Information System</p>
<p>ZHANG Rui, Computer Center, Institute of Information Technology</p>
<p>Yangzhou University, Yangzhou, China</p>
<p>e-mail: zhangrui@yzu.edu.cn</p>
<p>Abstract: Rough set theory and D-S evidence theory are both important methods for reasoning under uncertainty, and each has its own advantages and disadvantages. Incomplete information systems are common in real life. This paper combines the two theories to study the incomplete information system. First, a reduction algorithm for the incomplete information system is put forward based on rough set theory; then D-S evidence theory is used to optimize the obtained rules, and the results are verified by an example.</p>
<p>Keywords: Incomplete Information System; Decision Table; Rough Set Theory; D-S Evidence Theory; Reduction</p>
<p>I. INTRODUCTION</p>
<p>Rough set theory (RST) was introduced by Z. Pawlak [1] in 1982. It is a theory for the study of intelligent systems characterized by inexact, uncertain or vague information. RST has been applied successfully to problems involving vague and uncertain information in a wide range of fields, such as expert systems, machine learning, pattern recognition, decision analysis, process control, and knowledge discovery in databases.</p>
<p>Dempster-Shafer evidence theory [2], also called D-S evidence theory, originated from the notion of upper and lower probabilities; it was first proposed by Dempster and further developed and refined by Shafer. The method has been applied in a wide range of fields. D-S evidence theory can deal with uncertainty and ignorance; it takes the belief function, instead of probability, as the measure of uncertainty.</p>
<p>RST differs greatly from other theories dealing with uncertainty because it does not require any prior information. In addition, it is strongly complementary to other theories. 
D-S evidence theory has a great advantage in expressing ignorance. However, each theory also has its own disadvantages. This paper combines the two theories and applies them to rule extraction in the incomplete information system.</p>
<p>II. INCOMPLETE INFORMATION SYSTEM</p>
<p>A. General Description and Definition of Information System</p>
<p>An information system is a database of objects and attributes that captures the relationships between them. Knowledge patterns are ultimately expressed through attributes, so they have an explicit, intuitive meaning and can be understood. Today's information systems take computers and modern communication technology as the basic means of information processing and apply mathematical methods to provide information services for administrative decisions.</p>
<p>Definition 1 [3]. Let \(S = (U, AT)\) be an information system, where \(U = \{X_1, X_2, \dots, X_n\}\) is a non-empty finite set of objects, generally called the universe of discourse, and \(AT = \{a_1, a_2, \dots, a_m\}\) is a non-empty finite set of attributes, i.e., \(a: U \to V_a\) for any \(a \in AT\), with \(a(x) \in V_a\) for \(x \in U\). Here \(V_a\) is called the domain of attribute \(a\). If there is at least one attribute \(a \in AT\) in \(S\) such that \(V_a\) contains a missing value, then \(S\) is called an incomplete information system (IIS), and the missing value is denoted by \(*\).</p>
<p>Definition 2 [3]. An incomplete decision table (IDT) is an incomplete information system \(IDT = \{U, AT \cup \{d\}\}\), where \(d\) (\(d \notin AT\), \(* \notin V_d\)) is a complete attribute called the decision attribute, and \(AT\) is called the set of conditional attributes.</p>
<p>B. Rough Set Theory in Incomplete Information System</p>
<p>Corresponding to the indiscernibility relation in a complete information system, a similarity relation in the incomplete information system is defined in this paper. 
Definition 3 [3]. In an IIS \(S = (U, AT)\), for each non-empty subset \(A \subseteq AT\), the similarity relation is defined as:</p>
<p>\[SIM(A) = \{(x, y) \in U \times U \mid \forall a \in A,\ a(x) = a(y) \lor a(x) = * \lor a(y) = *\}\]</p>
<p>Property 1. \(SIM(A)\) is a compatibility relation: \(SIM(A) = \bigcap_{a \in A} SIM(\{a\})\).</p>
<p>Let \(S_A(x) = \{y \in U \mid (x, y) \in SIM(A)\}\). For a given \(A\), \(S_A(x)\) is the largest set of objects that may be indiscernible from \(x\).</p>
<p>\(S_A(x)\) is called the similarity class of \(x\) under \(A\) in \(S\), and the family of all similarity classes is denoted by \(U/SIM(A)\): \(U/SIM(A) = \{S_A(x) \mid x \in U\}\).</p>
<p>In general, \(U/SIM(A)\) constitutes a covering of \(U\) rather than a partition of \(U\).</p>
<p>Define \(R_d = \{(x, y) \in U \times U \mid d(x) = d(y)\}\); this constitutes a partition \(U/R_d = \{D_1, D_2, \dots, D_k\}\) of \(U\) into decision classes. Here \(V_d = \{w_1, w_2, \dots, w_k\}\), \(D_l = \{x \in U \mid d(x) = w_l\}\), and \(l \le k\).</p>
<p>2009 International Conference on Artificial Intelligence and Computational Intelligence. 978-0-7695-3816-7/09 $26.00 © 2009 IEEE. DOI 10.1109/AICI.2009.454</p></li><li><p>Definition 4 [3]. For the IIS \(S = (U, AT)\) with \(A \subseteq AT\) and \(X \subseteq U\), \(X\) can be characterized by a pair of lower and upper approximations:</p>
<p>\[\underline{A}(X) = \{x \in U \mid S_A(x) \subseteq X\}\]</p>
<p>\[\overline{A}(X) = \{x \in U \mid S_A(x) \cap X \neq \emptyset\}\]</p>
<p>Objects in \(\underline{A}(X)\) can certainly be classified as elements of \(X\), while objects in \(\overline{A}(X)\) can only possibly be classified as elements of \(X\).</p>
<p>Definition 5 [3]. The generalized decision function \(\partial_A: U \to P(V_d)\), \(A \subseteq AT\), in an IDT is defined as:</p>
<p>\[\partial_A(x) = \{i \mid i = d(y),\ y \in S_A(x)\}\]</p>
<p>Here, \(P(V_d)\) is the power set of \(V_d\).</p>
<p>Definition 6 [3]. If \(A \subseteq AT\), \(\partial_A = \partial_{AT}\), and \(\partial_B \neq \partial_{AT}\) for every \(B \subset A\), then \(A\) is a reduction of \(AT\).</p>
<p>An IDT can be seen as a set of decision rules of the following forms:</p>
<p>\[(c_1, v_1) \land (c_2, v_2) \land \dots \land (c_n, v_n) \to (d, w_1) \lor \dots \lor (d, w_m)\]</p>
<p>or \(\land (c_i, v_i) \to \lor (d, w_j)\), where \(i = 1, 2, \dots, n\), \(c_i \in AT\), \(v_i \in V_{c_i}\), \(j = 1, 2, \dots, m\), and \(w_j \in V_d\).</p>
<p>\(\land (c_i, v_i)\) is the condition part of the rule, and \(\lor (d, w_j)\) is the decision part of the rule.</p>
<p>The decision rule \(r: \land (c_i, v_i) \to \lor (d, w_j)\) is true if and only if \(\overline{C}(X) \subseteq Y\), where \(C = \{c_i \mid c_i \in AT\}\), \(X = \{x \in U \mid \land\, c_i(x) = v_i\}\), and \(Y = \{y \in U \mid \lor\, d(y) = w_j\}\). 
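</p>
<p>The similarity classes, approximations, generalized decision function and reduction of Definitions 3 to 6 can be sketched in a few lines of Python. This is a minimal illustration on the car data of Table I as printed, not the author's implementation; the names ATTRS, TABLE, similar, sim_class, lower, upper, gen_decision and is_reduction are assumptions introduced here, and is_reduction checks Definition 6 by brute force over subsets rather than by any algorithm from the paper.</p>

```python
from itertools import combinations

ATTRS = ["Price", "Mileage", "Size", "Max_Speed"]
# Table I as printed; "*" is a missing value, the last entry is the decision d.
TABLE = {
    1: ("High", "High", "Full", "Low", "Good"),
    2: ("Low", "*", "Full", "Low", "Good"),
    3: ("*", "*", "Compact", "High", "Poor"),
    4: ("High", "*", "Full", "High", "Good"),
    5: ("*", "*", "Full", "High", "Excel"),
    6: ("Low", "High", "Full", "High", "Good"),
}


def similar(x, y, attrs):
    """(x, y) in SIM(attrs): each attribute agrees or is missing on one side."""
    for a in attrs:
        vx, vy = TABLE[x][ATTRS.index(a)], TABLE[y][ATTRS.index(a)]
        if vx != vy and "*" not in (vx, vy):
            return False
    return True


def sim_class(x, attrs):
    """S_A(x): all objects that may be indiscernible from x."""
    return {y for y in TABLE if similar(x, y, attrs)}


def lower(attrs, target):
    """Lower approximation: similarity class entirely inside target."""
    return {x for x in TABLE if sim_class(x, attrs) <= target}


def upper(attrs, target):
    """Upper approximation: similarity class overlaps target."""
    return {x for x in TABLE if sim_class(x, attrs) & target}


def gen_decision(x, attrs):
    """Generalized decision: decision values occurring in S_A(x)."""
    return {TABLE[y][-1] for y in sim_class(x, attrs)}


def is_reduction(attrs):
    """Definition 6, checked by brute force over non-empty proper subsets."""
    def same(b):
        return all(gen_decision(x, b) == gen_decision(x, ATTRS) for x in TABLE)
    proper = (c for k in range(1, len(attrs)) for c in combinations(attrs, k))
    return same(attrs) and not any(same(b) for b in proper)


if __name__ == "__main__":
    for x in TABLE:
        print(x, sorted(sim_class(x, ATTRS)), sorted(gen_decision(x, ATTRS)))
    print(is_reduction(["Size", "Max_Speed"]))
```

<p>On this data the similarity classes of cars 4 and 5 are {4, 5} and {4, 5, 6}, which overlap without being equal, so the classes form a covering rather than a partition. The generalized decisions agree with Table II, and {Size, Max_Speed} passes the reduction check.</p>
<p>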
Rule \(r\) is optimal if and only if \(r\) is true and every other rule formed from proper subsets of the conjunction and disjunction in \(r\) is false.</p>
<p>C. D-S Evidence Theory in Incomplete Information System</p>
<p>Data in the IDT can be taken as evidence for the existence of knowledge and are represented in the form of data mass functions. Missing values in the IDT are represented as ignorance [4].</p>
<p>Definition 7. For an attribute \(a \in AT\), the data mass function is defined as:</p>
<p>\[m(a, v) = \frac{card(X)}{card(U)}, \qquad m(a, *) = \frac{card(U - X)}{card(U)},\]</p>
<p>where \(v \in V_a\) and \(X = \{x \in U \mid a(x) = v,\ v \neq *\}\).</p>
<p>Definition 8. For a rule \(r: \land (c_i, v_i) \to \lor (d, w_j)\), the rule mass function is defined as:</p>
<p>\[m_r(\land (c_i, v_i) \to \lor (d, w_j)) = \frac{card(Y)}{card(X)},\]</p>
<p>where \(Y = \{y \in U \mid (\land\, c_i(y) = v_i) \land (\lor\, d(y) = w_j)\}\), \(X = \{x \in U \mid \land\, c_i(x) = v_i\}\), \(c_i \in AT\), and \(v_i \in V_{c_i}\).</p>
<p>\(m_r\) measures the uncertainty with which the decision part occurs when the condition part occurs.</p>
<p>III. ATTRIBUTE REDUCTION IN IDT</p>
<p>A. Significance of Attributes in IDT</p>
<p>Definition 9. For \(IDT = \{U, AT \cup \{d\}\}\), the dependence of \(d\) on \(A\) is:</p>
<p>\[\gamma_A(d) = \frac{\sum_{l} card(\underline{A}(D_l))}{\sum_{l} card(\overline{A}(D_l))}, \qquad A \subseteq AT,\ D_l \in U/R_d.\]</p>
<p>To measure the significance of an attribute (or attribute set), it is removed from the IDT and the resulting change in classification is observed. A large change indicates high significance; a small change indicates low significance.</p>
<p>Definition 10. The significance of \(A\) is:</p>
<p>\[\sigma(A) = \gamma_{AT}(d) - \gamma_{AT - A}(d), \qquad A \subseteq AT.\]</p>
<p>Definition 11. The core of \(AT\) relative to \(d\) in the IDT is defined as:</p>
<p>\[core_d(AT) = \{a \in AT \mid \sigma(\{a\}) > 0\}.\]</p>
<p>B. Algorithm for Attribute Reduction in IDT</p>
<p>Input: an IDT</p>
<p>Output: a reduction \(B\)</p>
<p>Step 1. Compute \(core_d(AT)\) in the IDT.</p>
<p>Step 2. Let \(B = core_d(AT)\); if \(\partial_B = \partial_{AT}\), go to Step 5.</p>
<p>Step 3. (Given as a flowchart in the original paper.)</p></li><li><p>Step 4. (Given as a flowchart in the original paper.)</p>
<p>Step 5. End the algorithm; \(B\) is the required reduction. 
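</p>
<p>The dependence, significance, core and rule mass function (Definitions 8 to 11) can be checked numerically on the car data of Table I as printed. The following Python sketch is only an illustration under that assumption: the names ROWS, sim_class, dependence, significance, core and rule_mass are introduced here, rule_mass handles single-condition rules only, and Steps 3 and 4 of the algorithm are not implemented because they are not available in this transcript.</p>

```python
from fractions import Fraction

ATTRS = ("Price", "Mileage", "Size", "Max_Speed")
ROWS = {  # Table I as printed; "*" is a missing value, last entry is d
    1: ("High", "High", "Full", "Low", "Good"),
    2: ("Low", "*", "Full", "Low", "Good"),
    3: ("*", "*", "Compact", "High", "Poor"),
    4: ("High", "*", "Full", "High", "Good"),
    5: ("*", "*", "Full", "High", "Excel"),
    6: ("Low", "High", "Full", "High", "Good"),
}


def sim_class(x, attrs):
    """S_A(x): objects matching x on every attribute in attrs, up to '*'."""
    idx = [ATTRS.index(a) for a in attrs]
    return {
        y for y in ROWS
        if all(ROWS[x][i] == ROWS[y][i] or "*" in (ROWS[x][i], ROWS[y][i])
               for i in idx)
    }


def dependence(attrs):
    """Definition 9: summed lower over summed upper approximation sizes."""
    lower_total = upper_total = 0
    for w in {row[-1] for row in ROWS.values()}:
        d_class = {x for x, row in ROWS.items() if row[-1] == w}
        lower_total += sum(1 for x in ROWS if sim_class(x, attrs) <= d_class)
        upper_total += sum(1 for x in ROWS if sim_class(x, attrs) & d_class)
    return Fraction(lower_total, upper_total)


def significance(a):
    """Definition 10 for a single attribute a."""
    rest = [b for b in ATTRS if b != a]
    return dependence(ATTRS) - dependence(rest)


def core():
    """Definition 11: attributes with positive significance."""
    return [a for a in ATTRS if significance(a) > 0]


def rule_mass(attr, value, decision):
    """Definition 8 for a single-condition rule (attr, value) -> (d, decision)."""
    i = ATTRS.index(attr)
    x_set = [x for x, row in ROWS.items() if row[i] == value]
    y_set = [x for x in x_set if ROWS[x][-1] == decision]
    return Fraction(len(y_set), len(x_set))


if __name__ == "__main__":
    for a in ATTRS:
        print(a, significance(a))
    print(core(), rule_mass("Size", "Full", "Good"))
```

<p>Using Fraction keeps the results exact, so they can be compared directly with the hand computation: the significances come out as 0, 0, 4/21 and 8/33, the core is {Size, Max_Speed}, and the rule Size = full to d = good has mass 4/5.</p>
<p>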
</p><p>It can be seen from the definition that, for any \(x \in U\), the decision part of the optimal rule is \((d, w_1) \lor \dots \lor (d, w_m)\) with \(\{w_1, w_2, \dots, w_m\} = \partial_{AT}(x)\). Therefore, the problem of extracting an optimal rule reduces to finding a reduction of the attributes.</p>
<p>IV. RULE EXTRACTION IN INCOMPLETE DECISION TABLE</p>
<p>Both rough set theory and D-S evidence theory are applied; the procedure of rule extraction is as follows:</p>
<p>1) Use the algorithm above to obtain the reduction \(B\) of the IDT.</p>
<p>2) List the corresponding rules for each attribute in \(B\).</p>
<p>3) Evaluate \(m_r\) for each rule, and add the rule to the result set \(R\) if \(m_r = 1\).</p>
<p>4) For each \(b \in B\) and \(t \in V_b\), define \(w = \{w_j \in V_d \mid m_r(t \to w_j) > 0\}\), and add the rule \(r: t \to w\) to the result set \(R\) if \(w \in \partial_{AT}\).</p>
<p>V. EXAMPLE</p>
<p>The optimal rules are extracted from Table I as follows:</p>
<p>1) The significance of each attribute is (the detailed process is omitted because of the length limit):</p>
<p>\[\sigma(\text{Price}) = \frac{2+1}{5+3+1} - \frac{2+1}{5+3+1} = 0\]</p>
<p>\[\sigma(\text{Mileage}) = \frac{2+1}{5+3+1} - \frac{2+1}{5+3+1} = 0\]</p>
<p>\[\sigma(\text{Size}) = \frac{2+1}{5+3+1} - \frac{2}{6+4+4} = \frac{4}{21}\]</p>
<p>\[\sigma(\text{Max\_Speed}) = \frac{2+1}{5+3+1} - \frac{1}{5+5+1} = \frac{8}{33}\]</p>
<p>\(B = \{\text{Size}, \text{Max\_Speed}\}\) and \(\partial_B = \partial_{AT}\), so \(B\) is the reduction. 
</p><p>2) List of rules:</p>
<p>\(R_1\): Size = full \(\Rightarrow\) d = good</p>
<p>\(R_2\): Size = full \(\Rightarrow\) d = excel</p>
<p>\(R_3\): Size = compact \(\Rightarrow\) d = poor</p>
<p>\(R_4\): Max_Speed = low \(\Rightarrow\) d = good</p>
<p>\(R_5\): Max_Speed = high \(\Rightarrow\) d = good</p>
<p>\(R_6\): Max_Speed = high \(\Rightarrow\) d = excel</p>
<p>\(R_7\): Max_Speed = high \(\Rightarrow\) d = poor</p>
<p>3) \(m_r\) of each rule is:</p>
<p>\[m_{R_1} = \frac{4}{5}, \quad m_{R_2} = \frac{1}{5}, \quad m_{R_3} = 1\]</p>
<p>\[m_{R_4} = 1, \quad m_{R_5} = \frac{1}{3}, \quad m_{R_6} = \frac{1}{3}\]</p>
<p>\[m_{R_7} = \frac{1}{3}\]</p>
<p>[Figure: flowcharts for Step 3 and Step 4 of the attribute reduction algorithm in Section III.B; only fragments of the node labels survive in this transcript.]</p></li><li><p>Thus, two optimal rules are obtained:</p>
<p>\(R'_1\): Size = compact \(\Rightarrow\) d = poor</p>
<p>\(R'_2\): Max_Speed = low \(\Rightarrow\) d = good</p>
<p>4) When Size = full, \(w = \{good, excel\} \in \partial_{AT}\), so there is a third rule:</p>
<p>\(R'_3\): Size = full \(\Rightarrow\) d = good \(\lor\) d = excel</p>
<p>It can be seen from the definition of the optimal rule that the three rules above are all optimal.</p>
<p>CONCLUSIONS</p>
<p>Uncertainty is widespread in real life, so research on knowledge discovery and rule extraction in the incomplete information system is of considerable practical significance. This paper uses two theories together to study rule extraction in the incomplete information system and obtains useful results. However, it is only a preliminary investigation that needs further improvement. Other approaches to attribute reduction and rule extraction will also be discussed.</p>
<p>TABLE I. INCOMPLETE INFORMATION SYSTEM</p>
<table>
<tr><th>Car</th><th>Price</th><th>Mileage</th><th>Size</th><th>Max_Speed</th><th>d</th></tr>
<tr><td>1</td><td>High</td><td>High</td><td>Full</td><td>Low</td><td>Good</td></tr>
<tr><td>2</td><td>Low</td><td>*</td><td>Full</td><td>Low</td><td>Good</td></tr>
<tr><td>3</td><td>*</td><td>*</td><td>Compact</td><td>High</td><td>Poor</td></tr>
<tr><td>4</td><td>High</td><td>*</td><td>Full</td><td>High</td><td>Good</td></tr>
<tr><td>5</td><td>*</td><td>*</td><td>Full</td><td>High</td><td>Excel</td></tr>
<tr><td>6</td><td>Low</td><td>High</td><td>Full</td><td>High</td><td>Good</td></tr>
</table>
<p>TABLE II. 
GENERALIZED DECISION FUNCTION</p>
<table>
<tr><th>Car</th><th>\(\partial_{AT}\)</th></tr>
<tr><td>1</td><td>{Good}</td></tr>
<tr><td>2</td><td>{Good}</td></tr>
<tr><td>3</td><td>{Poor}</td></tr>
<tr><td>4</td><td>{Good, Excel}</td></tr>
<tr><td>5</td><td>{Good, Excel}</td></tr>
<tr><td>6</td><td>{Good, Excel}</td></tr>
</table>
<p>ACKNOWLEDGMENT</p>
<p>My deepest gratitude goes to my leaders and colleagues for the advice and help they have given during the conception and writing of this paper, and also to all my family and friends who care about me.</p>
<p>REFERENCES</p>
<p>[1] Z. Pawlak. Rough Sets: Theoretical Aspects of Reasoning about Data [M]. Dordrecht: Kluwer Academic Publishers, 1991: 89-95.</p>
<p>[2] A. P. Dempster. Upper and lower probability inferences based on a sample from a finite univariate population [J]. Biometrika, 1967, 54(3): 515-528.</p>
<p>[3] W. X. Zhang, W. Z. Wu, J. Y. Liang, D. Y. Li. Rough Set Theory & Method [M]. Peking: Science Press, 2001.</p>
<p>[4] Sarabjot S. Anand, David A. Bell, John G. Hughes. EDM: A general framework for Data Mining based on Evidence Theory [J]. Data & Knowledge Engineering, 1996, 18: 189-223.</p>
<p>[5] K. Huang, S. F. Chen, Z. G. Zhou, W. H. Zhang. Multi-source Information Fusion Method Based on Rough Sets Theory and Evidence Theory. Information and Control, 2004, 33(4): 422-425.</p>
<p>[6] J. Y. Liang, Z. B. Xu. The Algorithm on Knowledge Reduction in Incomplete Information Systems [J].</p>
<p>[7] G. Shafer. A Mathematical Theory of Evidence [M]. Princeton University Press, 1976.</p>
<p>[8] Z. Pawlak. Rough sets. International Journal of Computer and Information Sciences, 1982, 11: 341-356.</p>
<p>[9] Z. Pawlak. Rough Sets: Theoretical Aspects of Reasoning about Data. Kluwer Academic Publishers, Boston, 1991.</p>
<p>[10] Z. Pawlak, A. Skowron. Rudiments of rough sets. Information Sciences, 2007, 177: 3-27.</p>
<p>[11] Z. Pawlak, A. Skowron. Rough sets: some extensions. Information Sciences, 2007, 177: 28-40.</p>
<p>[12] L. Polkowski, S. Tsumoto, T. Y. Lin (Eds.). Rough Set Methods and Applications. Physica-Verlag, Berlin, 2000.</p>
<p>[13] M. 
Quafafou. α-RST: a generalization of rough set theory. Information Sciences, 2000, 124: 301-316.</p>
<p>[14] Pan Wei, Wang Yangsheng, Yang Hongji. Decision Rule Analysis of Dempster-Shafer Theory of Evidence. Computer Engineering and Application, 2004, 14: 14-17.</p>
<p>[15] Yue Chaoyuan. Theories and Methods of Decision Making. Beijing: Science Press, 2003: 194-221.</p></li></ul>