2009 International Conference on Artificial Intelligence and Computational Intelligence
Integrated Study in Incomplete Information System
ZHANG Rui
Computer Center, Institute of Information Technology
Yangzhou University, Yangzhou, China
e-mail: [email protected]
Abstract—Rough set theory and D-S evidence theory are both important methods of uncertainty reasoning, and each has its own advantages and disadvantages. Incomplete information systems are widespread in real life. In this paper, the two theories are used in combination to study the incomplete information system. First, a reduction algorithm for the incomplete information system is put forward based on rough set theory; then D-S evidence theory is used to optimize the obtained rules. The results are verified by an example.
Keywords-Incomplete Information System; Decision Table; Rough Set Theory; D-S Evidence Theory; Reduction
I. INTRODUCTION

Rough set theory (RST) was introduced by Z. Pawlak [1] in 1982. It is a theory for the study of intelligent systems characterized by inexact, uncertain, or vague information. RST has been successfully applied to problems of vague and uncertain information, and it has produced exciting results in a remarkably wide range of fields, such as expert systems, machine learning, pattern recognition, decision analysis, process control, and knowledge discovery in databases.
Dempster–Shafer [2] evidence theory, also called D–S evidence theory, originated from upper and lower probabilities; it was first proposed by Dempster and further developed and refined by Shafer. The method has been applied in a wide range of fields. D–S evidence theory can deal with both uncertainty and ignorance; it takes the belief function, instead of probability, as the measure of uncertainty.
RST differs greatly from other theories dealing with uncertainty because it requires no prior information, and it is strongly complementary to them. D–S evidence theory has a great advantage in expressing ignorance. However, each theory also has its own disadvantages. This paper combines the two theories and applies them to rule extraction in the incomplete information system.
II. INCOMPLETE INFORMATION SYSTEM
A. General Description and Definition of Information System

An information system is a database of objects and attributes which implies the relationship between them. The knowledge pattern is ultimately expressed by attributes; it has an explicit, intuitive meaning and is easy to understand. Today's information systems take computers and modern communication technology as the basic means of information processing and apply mathematical methods to provide information services for administrative decisions.

Definition 1 [3]. Let S = (U, AT) be an information system, where U = {x_1, x_2, ..., x_n} is the non-empty finite set of objects, generally called the universe of discourse, and AT = {a_1, a_2, ..., a_m} is the non-empty finite set of attributes, i.e., a: U → V_a for any a ∈ AT, with a(x) ∈ V_a for x ∈ U. Here V_a is called the domain of attribute a. If at least one attribute a ∈ AT has a missing value in S, then S is called an incomplete information system (IIS), and the missing value is written "*".

Definition 2 [3]. An incomplete decision table (IDT) is an incomplete information system IDT = (U, AT ∪ {d}), where d (d ∉ AT, * ∉ V_d) is a complete attribute, called the decision attribute; AT is called the set of condition attributes.
B. Rough Set Theory in Incomplete Information System

Corresponding to the indiscernibility relation in a complete information system, a similarity relation is defined for the incomplete information system.

Definition 3 [3]. In an IIS S = (U, AT), for each non-empty subset A ⊆ AT, the similarity relation is defined as:

SIM(A) = {(x, y) ∈ U × U | ∀a ∈ A, a(x) = a(y) ∨ a(x) = * ∨ a(y) = *}

Property 1. SIM(A) is a compatibility relation: SIM(A) = ∩_{a∈A} SIM({a}).

Let S_A(x) = {y ∈ U | (x, y) ∈ SIM(A)}. For a given A, S_A(x) is the largest set of objects that may be indiscernible from x; S_A(x) is called the similarity class of x with respect to A, and the family of all similarity classes is denoted by U/SIM(A) = {S_A(x) | x ∈ U}.

In general, U/SIM(A) constitutes a covering of U, instead of a partition of U.
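As a concrete check, the similarity classes of Table I can be computed directly from Definition 3. A minimal Python sketch (the data is transcribed from Table I; the helper names are illustrative):

```python
# Table I of the paper: six cars; "*" marks a missing value.
CARS = {
    1: {"Price": "high", "Mileage": "high", "Size": "full",    "Max_Speed": "low"},
    2: {"Price": "low",  "Mileage": "*",    "Size": "full",    "Max_Speed": "low"},
    3: {"Price": "*",    "Mileage": "*",    "Size": "compact", "Max_Speed": "high"},
    4: {"Price": "high", "Mileage": "*",    "Size": "full",    "Max_Speed": "high"},
    5: {"Price": "*",    "Mileage": "*",    "Size": "full",    "Max_Speed": "high"},
    6: {"Price": "low",  "Mileage": "high", "Size": "full",    "Max_Speed": "high"},
}
AT = ["Price", "Mileage", "Size", "Max_Speed"]

def similar(x, y, attrs):
    # (x, y) ∈ SIM(A): on every attribute the values agree or one is missing.
    return all(CARS[x][a] == CARS[y][a] or "*" in (CARS[x][a], CARS[y][a])
               for a in attrs)

def sim_class(x, attrs):
    # S_A(x): the largest set of objects possibly indiscernible from x.
    return {y for y in CARS if similar(x, y, attrs)}

# U/SIM(AT) is a covering, not a partition: the classes overlap.
print([sorted(sim_class(x, AT)) for x in CARS])
# → [[1], [2], [3], [4, 5], [4, 5, 6], [5, 6]]
```

Note that S_AT(4) = {4, 5} and S_AT(6) = {5, 6} overlap without coinciding, which is exactly why U/SIM(AT) is only a covering.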
Define R_d = {(x, y) ∈ U × U | d(x) = d(y)}; this constitutes a partition U/R_d = {D_1, D_2, ..., D_k} of U into decision classes. Here V_d = {w_1, w_2, ..., w_k}, D_l = {x ∈ U | d(x) = w_l}, and l ≤ k.

978-0-7695-3816-7/09 $26.00 © 2009 IEEE. DOI 10.1109/AICI.2009.454

Definition 4 [3]. For an IIS S = (U, AT), A ⊆ AT, X ⊆ U, X can be characterized by a pair of lower and upper
approximations: })(|{)( Φ≠∈= XxSUxXA A ∩
})(|{)( XxSUxXA A ⊆∈= Objects in )(XA can certainly be classified as the
elements of X , while objects in )(XA can only be possibly classified as the elements of X . Definition 5[3].Generalized decision function
ATAVPU dA ⊆→∂ ),(: in IDT is defined as: )}(),(|{)( xSyydiix AA ∈==∂ ,
Here, )( dVP is the power set of dV .
Definition 6[3].If ATAATA ⊆∂=∂ , and ATBAB ∂≠∂⇒⊂∀ , then A is the reduction of AT.
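Definitions 4 and 5 translate directly into set operations over the similarity classes. The following Python sketch recomputes them on the Table I data (helper names are illustrative); its generalized decisions agree with Table II:

```python
# Table I rows: (Price, Mileage, Size, Max_Speed, d); "*" = missing.
ROWS = {
    1: ("high", "high", "full", "low", "good"),
    2: ("low", "*", "full", "low", "good"),
    3: ("*", "*", "compact", "high", "poor"),
    4: ("high", "*", "full", "high", "good"),
    5: ("*", "*", "full", "high", "excel"),
    6: ("low", "high", "full", "high", "good"),
}
IDX = {"Price": 0, "Mileage": 1, "Size": 2, "Max_Speed": 3}

def sim_class(x, attrs):
    # S_A(x) under the similarity relation of Definition 3.
    return {y for y in ROWS
            if all(ROWS[x][IDX[a]] == ROWS[y][IDX[a]]
                   or "*" in (ROWS[x][IDX[a]], ROWS[y][IDX[a]]) for a in attrs)}

def lower(X, attrs):
    # Lower approximation: objects whose similarity class lies entirely inside X.
    return {x for x in ROWS if sim_class(x, attrs) <= X}

def upper(X, attrs):
    # Upper approximation: objects whose similarity class meets X.
    return {x for x in ROWS if sim_class(x, attrs) & X}

def delta(x, attrs):
    # Generalized decision ∂_A(x): decision values seen in S_A(x).
    return {ROWS[y][4] for y in sim_class(x, attrs)}

AT = list(IDX)
D_good = {x for x in ROWS if ROWS[x][4] == "good"}
print(lower(D_good, AT), upper(D_good, AT))  # {1, 2} {1, 2, 4, 5, 6}
print(delta(5, AT))  # ∂_AT(5) = {good, excel}, matching Table II
```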
An IDT can be seen as a set of decision rules of the following form:

(c_1, v_1) ∧ (c_2, v_2) ∧ ... ∧ (c_n, v_n) → (d, w_1) ∨ ... ∨ (d, w_m),

or briefly ∧(c_i, v_i) → ∨(d, w_j), where i = 1, 2, ..., n, c_i ∈ AT, v_i ∈ V_{c_i}, j = 1, 2, ..., m, w_j ∈ V_d. Here ∧(c_i, v_i) is the condition part of the rule, and ∨(d, w_j) is the decision part.

The decision rule r: ∧(c_i, v_i) → ∨(d, w_j) is true if and only if C(X) ⊆ Y, where C = {c_i | c_i ∈ AT}, X = {x ∈ U | ∧ c_i(x) = v_i}, and Y = {y ∈ U | ∨ d(y) = w_j}. Rule r is optimal if and only if r is true and every rule constituted by proper subsets of the conjunction and disjunction in r is false.
C. D-S Evidence Theory in Incomplete Information System

Data in the IDT can be taken as evidence of the existence of knowledge and are represented in the form of data mass functions. Missing values in the IDT are represented as ignorance [4].

Definition 7. For attribute a ∈ AT, the data mass function is defined as:

m(a, v) = card(X) / card(U)
m(a, *) = card(U − X) / card(U)

where v ∈ V_a and X = {x ∈ U | a(x) = v, v ≠ *}.

Definition 8. For a rule r: ∧(c_i, v_i) → ∨(d, w_j), the rule mass function is defined as:

m_r(∧(c_i, v_i) → ∨(d, w_j)) = card(Y) / card(X),

where Y = {y ∈ U | (∧ c_i(y) = v_i) ∧ (∨ d(y) = w_j)}, X = {x ∈ U | ∧ c_i(x) = v_i}, c_i ∈ AT, v_i ∈ V_{c_i}.

m_r measures the uncertainty that the decision part occurs when the condition part occurs.
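Both mass functions are simple frequency ratios and can be sketched in a few lines of Python over the Table I data (reading m(a, *) as the fraction of objects whose value of a is missing is an interpretation of the definition above; exact fractions are used so the ratios stay readable):

```python
from fractions import Fraction

# Table I rows: (Price, Mileage, Size, Max_Speed, d); "*" = missing.
ROWS = {
    1: ("high", "high", "full", "low", "good"),
    2: ("low", "*", "full", "low", "good"),
    3: ("*", "*", "compact", "high", "poor"),
    4: ("high", "*", "full", "high", "good"),
    5: ("*", "*", "full", "high", "excel"),
    6: ("low", "high", "full", "high", "good"),
}

def m_data(i, v):
    # Definition 7: mass of value v of attribute column i; v = "*" collects the ignorance.
    X = [x for x in ROWS if ROWS[x][i] == v]
    return Fraction(len(X), len(ROWS))

def m_rule(cond, w_set):
    # Definition 8: cond maps attribute column -> value (the condition part);
    # w_set is the set of decision values in the disjunction.
    X = [x for x in ROWS if all(ROWS[x][i] == v for i, v in cond.items())]
    Y = [x for x in X if ROWS[x][4] in w_set]
    return Fraction(len(Y), len(X))

print(m_data(1, "*"))                 # Mileage is missing for 4 of 6 cars: 2/3
print(m_rule({2: "full"}, {"good"}))  # Size = full -> d = good: 4/5
```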
III. ATTRIBUTE REDUCTION IN IDT
A. Significance of Attributes in IDT

Definition 9. For IDT = (U, AT ∪ {d}), the dependence of d on A is:

γ_A(d) = Σ_l card(A̲(D_l)) / Σ_l card(Ā(D_l)), A ⊆ AT, D_l ∈ U/R_d.

To measure the significance of an attribute, the corresponding attribute (or attribute set) is removed from the IDT, and the resulting change in classification is observed: a big change indicates high significance; a small change indicates low significance.

Definition 10. The significance of A is:

σ_AT(A) = γ_AT(d) − γ_{AT−A}(d), A ⊆ AT.

Definition 11. The core of AT relative to d in the IDT is defined as:

core_d(AT) = {a ∈ AT | σ_AT({a}) > 0}.
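Definitions 9–11 can be checked mechanically. The Python sketch below recomputes γ and σ on Table I with exact fractions (helper names are illustrative); its output can be compared with the worked example in Section V:

```python
from fractions import Fraction

# Table I rows: (Price, Mileage, Size, Max_Speed, d); "*" = missing.
ROWS = {
    1: ("high", "high", "full", "low", "good"),
    2: ("low", "*", "full", "low", "good"),
    3: ("*", "*", "compact", "high", "poor"),
    4: ("high", "*", "full", "high", "good"),
    5: ("*", "*", "full", "high", "excel"),
    6: ("low", "high", "full", "high", "good"),
}
IDX = {"Price": 0, "Mileage": 1, "Size": 2, "Max_Speed": 3}
AT = list(IDX)

def sim_class(x, attrs):
    # S_A(x) under the similarity relation of Definition 3.
    return {y for y in ROWS
            if all(ROWS[x][IDX[a]] == ROWS[y][IDX[a]]
                   or "*" in (ROWS[x][IDX[a]], ROWS[y][IDX[a]]) for a in attrs)}

def gamma(attrs):
    # Definition 9: Σ card(lower(D_l)) / Σ card(upper(D_l)) over decision classes.
    classes = {}
    for x in ROWS:
        classes.setdefault(ROWS[x][4], set()).add(x)
    low = sum(sum(1 for x in ROWS if sim_class(x, attrs) <= D) for D in classes.values())
    up = sum(sum(1 for x in ROWS if sim_class(x, attrs) & D) for D in classes.values())
    return Fraction(low, up)

def sigma(a):
    # Definition 10 for a single attribute: γ_AT − γ_{AT−{a}}.
    return gamma(AT) - gamma([b for b in AT if b != a])

print({a: str(sigma(a)) for a in AT})
# core_d(AT) = attributes with σ > 0, i.e. {Size, Max_Speed}
```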
B. Algorithm for Attribute Reduction in IDT

Input: an IDT. Output: a reduction B.
Step 1. Compute core_d(AT) in the IDT.
Step 2. Let B = core_d(AT); if ∂_B = ∂_AT, go to Step 5.
Step 3. For each c_i ∈ AT − B, compute γ_{B∪{c_i}}(d) − γ_B(d); add to B the attribute giving the largest increase; repeat this step until ∂_B = ∂_AT.
Step 4. For each c_i ∈ B − core_d(AT), if ∂_{B−{c_i}} = ∂_B, let B = B − {c_i}.
Step 5. End the algorithm; B is the reduction in question.
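The steps above can be sketched as a short Python function (a minimal implementation, self-contained over the Table I data; the greedy growth and pruning loops follow Steps 3 and 4, and on Table I it returns {Size, Max_Speed}):

```python
from fractions import Fraction

# Table I rows: (Price, Mileage, Size, Max_Speed, d); "*" = missing.
ROWS = {
    1: ("high", "high", "full", "low", "good"),
    2: ("low", "*", "full", "low", "good"),
    3: ("*", "*", "compact", "high", "poor"),
    4: ("high", "*", "full", "high", "good"),
    5: ("*", "*", "full", "high", "excel"),
    6: ("low", "high", "full", "high", "good"),
}
IDX = {"Price": 0, "Mileage": 1, "Size": 2, "Max_Speed": 3}
AT = list(IDX)

def sim_class(x, attrs):
    return {y for y in ROWS
            if all(ROWS[x][IDX[a]] == ROWS[y][IDX[a]]
                   or "*" in (ROWS[x][IDX[a]], ROWS[y][IDX[a]]) for a in attrs)}

def delta(attrs):
    # Generalized decision ∂_A listed over all objects.
    return [frozenset(ROWS[y][4] for y in sim_class(x, attrs)) for x in sorted(ROWS)]

def gamma(attrs):
    classes = {}
    for x in ROWS:
        classes.setdefault(ROWS[x][4], set()).add(x)
    low = sum(sum(1 for x in ROWS if sim_class(x, attrs) <= D) for D in classes.values())
    up = sum(sum(1 for x in ROWS if sim_class(x, attrs) & D) for D in classes.values())
    return Fraction(low, up)

def reduce_idt():
    core = [a for a in AT if gamma(AT) - gamma([b for b in AT if b != a]) > 0]
    B = list(core)                               # Steps 1-2: start from the core
    while delta(B) != delta(AT):                 # Step 3: greedy growth
        B.append(max((a for a in AT if a not in B),
                     key=lambda a: gamma(B + [a]) - gamma(B)))
    for a in [a for a in B if a not in core]:    # Step 4: prune redundant non-core attrs
        if delta([b for b in B if b != a]) == delta(B):
            B.remove(a)
    return B                                     # Step 5

print(reduce_idt())  # ['Size', 'Max_Speed']
```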
It can be seen from the definitions that, for each x ∈ U, the decision part of the optimal rule is (d, w_1) ∨ ... ∨ (d, w_m) with {w_1, w_2, ..., w_m} = ∂_AT(x). Therefore, the problem of extracting an optimal rule actually turns into finding a reduction of the attributes.
IV. RULE EXTRACTION IN INCOMPLETE DECISION TABLE
Both rough set theory and D-S evidence theory are applied; the procedure of rule extraction is as follows:

1) Use the algorithm above to obtain the reduction B of the IDT.
2) List the corresponding rules in B for each attribute.
3) Evaluate m_r for each rule, and select a rule into the result set R′ if m_r = 1.
4) For every b ∈ B and t ∈ V_b, define w = {w_j ∈ V_d | m_r(t → w_j) > 0}, and select r: t → w into the result set R′ if w ⊆ ∂_AT.
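On Table I, these steps reduce to scanning the value/decision pairs of each attribute in the reduction. A compact Python sketch (B is taken as {Size, Max_Speed}, the reduction found in the example of Section V; the ∂_AT values are those of Table II; helper names are illustrative):

```python
from fractions import Fraction

# Table I rows: (Price, Mileage, Size, Max_Speed, d); "*" = missing.
ROWS = {
    1: ("high", "high", "full", "low", "good"),
    2: ("low", "*", "full", "low", "good"),
    3: ("*", "*", "compact", "high", "poor"),
    4: ("high", "*", "full", "high", "good"),
    5: ("*", "*", "full", "high", "excel"),
    6: ("low", "high", "full", "high", "good"),
}
B = {"Size": 2, "Max_Speed": 3}           # the reduction and its column indices
DEC = sorted({ROWS[x][4] for x in ROWS})  # decision values

def m_rule(i, v, w_set):
    # Definition 8 for a one-attribute condition part.
    X = [x for x in ROWS if ROWS[x][i] == v]
    Y = [x for x in X if ROWS[x][4] in w_set]
    return Fraction(len(Y), len(X))

result = []
for name, i in B.items():
    for v in sorted({ROWS[x][i] for x in ROWS} - {"*"}):
        # step 3): keep certain single-decision rules (m_r = 1)
        for w in DEC:
            if m_rule(i, v, {w}) == 1:
                result.append((name, v, {w}))
        # step 4): otherwise keep the disjunction of all decisions with m_r > 0,
        # provided it occurs as a value of the generalized decision ∂_AT
        w_all = {w for w in DEC if m_rule(i, v, {w}) > 0}
        delta_values = [{"good"}, {"poor"}, {"good", "excel"}]  # from Table II
        if len(w_all) > 1 and w_all in delta_values:
            result.append((name, v, w_all))

print(result)
```

Running this yields exactly the three optimal rules of Section V: Size = compact → poor, Size = full → good ∨ excel, and Max_Speed = low → good.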
V. EXAMPLE

Extract the optimal rules according to Table I:
1) The significance of each attribute (the detailed process is omitted here for length):

σ(Price) = (2+1)/(5+3+1) − (2+1)/(5+3+1) = 0
σ(Mileage) = (2+1)/(5+3+1) − (2+1)/(5+3+1) = 0
σ(Size) = (2+1)/(5+3+1) − 2/(6+4+4) = 4/21
σ(Max_Speed) = (2+1)/(5+3+1) − 1/(5+5+1) = 8/33

So B = {Size, Max_Speed}; since ∂_B = ∂_AT, B is the reduction.
2) List of rules:

R1: Size = full → d = good
R2: Size = full → d = excel
R3: Size = compact → d = poor
R4: Max_Speed = low → d = good
R5: Max_Speed = high → d = good
R6: Max_Speed = high → d = excel
R7: Max_Speed = high → d = poor
3) m_r of each rule:

m_R1 = 4/5, m_R2 = 1/5, m_R3 = 1
m_R4 = 1, m_R5 = 1/2, m_R6 = 1/4, m_R7 = 1/4
Thus, two optimal rules are obtained:
R′1: Size = compact → d = poor
R′2: Max_Speed = low → d = good
4) When Size = full, w = {good, excel} ⇒ w ⊆ ∂_AT, so there is a third rule:

R′3: Size = full → d = good ∨ d = excel

It can be seen from the definition of the optimal rule that the three rules above are all optimal.
CONCLUSIONS

Uncertainty is widespread in real life, so research on knowledge discovery and rule extraction in the incomplete information system is of prominent practical significance. This paper uses two theories to study rule extraction in the incomplete information system and obtains valuable results. However, it is only a preliminary investigation which needs further improvement. Other approaches to attribute reduction and rule extraction will also be discussed in future work.
TABLE I. INCOMPLETE INFORMATION SYSTEM

Car Price Mileage Size Max_Speed d
1 High High Full Low Good
2 Low * Full Low Good
3 * * Compact High Poor
4 High * Full High Good
5 * * Full High Excel
6 Low High Full High Good
TABLE II. GENERALIZED DECISION FUNCTION
Car ∂_AT
1 {Good}
2 {Good}
3 {Poor}
4 {Good,Excel}
5 {Good,Excel}
6 {Good,Excel}
ACKNOWLEDGMENT

My deepest gratitude goes to my leaders and colleagues for the advice and help they gave during the conception and writing of this paper, and also to all my family and friends who care about me.
REFERENCES
[1] Z. Pawlak. Rough Sets: Theoretical Aspects of Reasoning about Data [M]. Dordrecht: Kluwer Academic Publishers, 1991: 89-95.
[2] A. P. Dempster. Upper and lower probability inferences based on a sample from a finite univariate population [J]. Biometrika, 1967, 54(3): 515-528.
[3] W. X. Zhang, W. Z. Wu, J. Y. Liang, D. Y. Li. Rough Set Theory & Method [M]. Beijing: Science Press, 2001.
[4] Sarabjot S. Anand, David A. Bell, John G. Hughes. EDM: A general framework for Data Mining based on Evidence Theory [J]. Data & Knowledge Engineering, 1996, 18: 189-223.
[5] K. Huang, S. F. Chen, Z. G. Zhou, W. H. Zhang. Multi-source Information Fusion Method Based on Rough Sets Theory and Evidence Theory [J]. Information and Control, 2004, 33(4): 422-425.
[6] J. Y. Liang, Z. B. Xu. The Algorithm on Knowledge Reduction in Incomplete Information Systems [J].
[7] G. Shafer. A Mathematical Theory of Evidence [M]. Princeton: Princeton University Press, 1976.
[8] Z. Pawlak. Rough sets [J]. International Journal of Computer and Information Sciences, 1982, 11: 341-356.
[9] Z. Pawlak. Rough Sets: Theoretical Aspects of Reasoning about Data [M]. Boston: Kluwer Academic Publishers, 1991.
[10] Z. Pawlak, A. Skowron. Rudiments of rough sets [J]. Information Sciences, 2007, 177: 3-27.
[11] Z. Pawlak, A. Skowron. Rough sets: some extensions [J]. Information Sciences, 2007, 177: 28-40.
[12] L. Polkowski, S. Tsumoto, T. Y. Lin (Eds.). Rough Set Methods and Applications [M]. Berlin: Physica-Verlag, 2000.
[13] M. Quafatou. a-RST: a generalization of rough set theory [J]. Information Sciences, 2000, 124: 301-316.
[14] Pan Wei, Wang Yangsheng, Yang Hongji. Decision Rule Analysis of Dempster-Shafer Theory of Evidence [J]. Computer Engineering and Applications, 2004, 14: 14-17.
[15] Yue Chaoyuan. Theories and Methods of Decision Making [M]. Beijing: Science Press, 2003: 194-221.