Identifying attributes
TMRA 2009 open space session
Peter-Paul Kruijsen, Morpheus
Problem statement
- Domain: merge external data into a topic map
- Solution: add PSIs in both topic maps to enable merging
- Consequence: a PSI has to be added to almost every topic
  - Cumbersome
  - Tricky for customers to grasp
- Solution: merge without hand-coded PSIs
Hand-coded PSIs
- PSIs are usually added by a Topic Maps expert, based on identifying attributes
  - http://example.org/people/ssn/12345789
  - http://example.org/keywords/topic_maps
  - http://example.org/system/IPK719
- Not everyone is able to define perfect PSIs
  - Unique
  - Stable
Solution
- Compare topics based on fingerprints
  - SSN
  - Codes
  - Topic name
- Auto-generate PSIs based on these uniquely identifying attributes
  - http://psi.mssm.nl/random/1258041512117–030586nsZN5Gs6Tq
- Apply these PSIs to topics before merge
- Configuration can be stored in the topic map
  - k:identifying-attribute(i:person : k:topic-type, i:ssn : k:attribute)
  - k:identifying-attribute(i:system : k:topic-type, i:code : k:attribute)
  - k:identifying-attribute(i:keyword : k:topic-type, k:untyped-name : k:attribute)
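A minimal Python sketch of the fingerprint idea, not the implementation behind the slides. The dictionary form of the configuration, the topic representation (a dict with `type` and `attributes`), the function names, the `http://example.org/random/` base and the timestamp-plus-token layout of the generated PSI are all assumptions made for illustration.

```python
import secrets
import time

# Hypothetical in-memory form of the k:identifying-attribute configuration:
# topic type -> attribute that uniquely identifies topics of that type.
IDENTIFYING_ATTRIBUTES = {
    "i:person": "i:ssn",
    "i:system": "i:code",
    "i:keyword": "k:untyped-name",
}


def fingerprint(topic):
    """Return (topic type, identifying value), or None if no fingerprint exists."""
    attribute = IDENTIFYING_ATTRIBUTES.get(topic["type"])
    if attribute is None:
        return None  # no identifying attribute configured for this type
    value = topic["attributes"].get(attribute)
    if value is None:
        return None  # the topic lacks the identifying attribute
    return (topic["type"], value)


def random_psi(base="http://example.org/random/"):
    """Generate an opaque PSI: millisecond timestamp plus random token (assumed layout)."""
    return f"{base}{int(time.time() * 1000)}-{secrets.token_hex(8)}"
```

Two topics with equal fingerprints are candidates for sharing one generated PSI; topics without a fingerprint are left alone.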
Example

Source topics:

  <i:person id="123">
    <iso:topic-name>
      <tm:value>John Doe</tm:value>
    </iso:topic-name>
    <i:ssn>123456789</i:ssn>
    <i:date-of-birth>1969-08-17</i:date-of-birth>
  </i:person>

  <i:person id="456">
    <iso:topic-name>
      <tm:value>Doe, John</tm:value>
    </iso:topic-name>
    <i:ssn>123456789</i:ssn>
    <i:phonenumber>0800-555-1234</i:phonenumber>
  </i:person>

Source topics with the generated PSI applied:

  <i:person id="123">
    <tm:identifier>http://psi.mssm.nl/random/1258041512117–030586nsZN5Gs6Tq</tm:identifier>
    <iso:topic-name>
      <tm:value>John Doe</tm:value>
    </iso:topic-name>
    <i:ssn>123456789</i:ssn>
    <i:date-of-birth>1969-08-17</i:date-of-birth>
  </i:person>

  <i:person id="456">
    <tm:identifier>http://psi.mssm.nl/random/1258041512117–030586nsZN5Gs6Tq</tm:identifier>
    <iso:topic-name>
      <tm:value>Doe, John</tm:value>
    </iso:topic-name>
    <i:ssn>123456789</i:ssn>
    <i:phonenumber>0800-555-1234</i:phonenumber>
  </i:person>

Merged topic:

  <i:person id="789">
    <tm:identifier>http://psi.mssm.nl/random/1258041512117–030586nsZN5Gs6Tq</tm:identifier>
    <iso:topic-name><tm:value>Doe, John</tm:value></iso:topic-name>
    <iso:topic-name><tm:value>John Doe</tm:value></iso:topic-name>
    <i:ssn>123456789</i:ssn>
    <i:date-of-birth>1969-08-17</i:date-of-birth>
    <i:phonenumber>0800-555-1234</i:phonenumber>
  </i:person>
Algorithm
- For two topic maps and a configuration
  - For each topic in the source topic map
    - For each identifying attribute for the topic type
      - Look up the attribute value in the target topic map
      - If no PSI is present: randomly generate a PSI
      - Apply the PSIs from one topic to the other
  - After this loop: merge the topic maps
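The loop above can be sketched in Python as follows. This is an illustration under stated assumptions, not the presenter's code: topics are plain dicts with `type`, `attributes` and a `psis` set, `config` maps a topic type to a list of identifying attribute names, and `apply_shared_psis` and the `http://example.org/random/` base are hypothetical names. The final merge is left to the topic map engine, which merges topics that share a subject identifier.

```python
import uuid


def apply_shared_psis(source, target, config, base="http://example.org/random/"):
    """Give source and target topics with equal identifying values a common PSI."""
    for topic in source:
        for attribute in config.get(topic["type"], []):
            value = topic["attributes"].get(attribute)
            if value is None:
                continue  # nothing to look up for this attribute
            # Look up the attribute value in the target topic map (linear scan).
            for candidate in target:
                if candidate["type"] != topic["type"]:
                    continue
                if candidate["attributes"].get(attribute) != value:
                    continue
                shared = topic["psis"] | candidate["psis"]
                if not shared:
                    # Neither topic has a PSI yet: generate a random one.
                    shared = {base + uuid.uuid4().hex}
                # Apply the PSIs from one topic to the other, in both directions.
                topic["psis"].update(shared)
                candidate["psis"].update(shared)


# Sample data (illustrative values only).
config = {"i:person": ["i:ssn"]}
source = [{"type": "i:person", "attributes": {"i:ssn": "123456789"}, "psis": set()}]
target = [
    {"type": "i:person", "attributes": {"i:ssn": "123456789"}, "psis": set()},
    {"type": "i:person", "attributes": {"i:ssn": "999999999"}, "psis": set()},
]
apply_shared_psis(source, target, config)
```

After the call, the two topics with the same SSN carry the same generated PSI, and the third topic is untouched. The linear scan keeps the sketch close to the slide; for large topic maps an index keyed on (type, attribute, value) would avoid rescanning the target for every source topic.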
Demo
[Screenshots: topic map before and after the merge]
Ups/Downs
- Benefits
  - Merging no longer requires mastering PSIs, only describing uniquely identifying attributes
    - Customers write their own XSLT to generate TM/XML
  - Applicable even after large imports
    - Merge locally based on fingerprints
- Downsides
  - Randomly generated PSIs are unreadable
    - Possibility to ‘correct’ afterwards
    - Enhancement: remove random PSI after merge
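The suggested enhancement could look roughly like this in Python. The sketch assumes that every generated PSI starts with a recognisable base URL and that merged topics are dicts holding a `psis` set; `strip_random_psis` and the `http://example.org/random/` base are hypothetical names, not part of the presented tool.

```python
def strip_random_psis(topics, random_base="http://example.org/random/"):
    """Drop generated PSIs from already merged topics; return how many were removed."""
    removed = 0
    for topic in topics:
        kept = {psi for psi in topic["psis"] if not psi.startswith(random_base)}
        removed += len(topic["psis"]) - len(kept)
        topic["psis"] = kept
    return removed
```

Running this only after the merge matters: before the merge, the generated PSI is the only thing that ties the matching topics together.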