KBDS - A Knowledge based concept for user-friendly selection of data in decision support systems

Søren Selmar, Institute of Production, University of Aalborg, Denmark

Keywords: Information Retrieval, Decision Support Systems, Relational Data Bases, Distributed Databases, Application Generation, End-user Computing, Knowledge Based Systems, Thesauri, Data Modelling and Prototyping.

The KBDS-concept (Knowledge Based Data Selection) provides a new and more user-friendly approach to the retrieval of data from databases. A system based on the KBDS-concept can act as an intermediary system between decision support tools and one or more connected databases. Such a system makes it possible for the manager himself, in a few minutes, to carry out any data selection across one or more databases.

The first part of this paper deals with the problems related to data selection from databases. Only relational databases are considered.

The next part describes the KBDS system. The system exists as a tested prototype, implemented in an ORACLE Fourth-Generation System. In this paper, however, it is described as it would appear in a window-based interface connected to distributed data bases.

The third part of the paper gives examples of the use of the KBDS system, emphasizing the ease of use and the generality of the system. Conclusions are drawn at the end of the paper.

Context of the KBDS task

Data selection is a critical and problematic task, no matter whether your decision support is simply based on a traditional Fourth-Generation System or your company uses more advanced integrated DSS or EIS systems.


Increased needs for on-line data and ad hoc analysis can be expected in the future. Very large distributed data bases will also increasingly be used as data sources for decision support. At the same time, great efforts are being made to move decision support tools from the EDP professional's desk to the desks of managers and other end-users. That is why more advanced and intelligent tools are needed for selecting the valuable decision support information.

Managers must be given advanced data selection tools if they are to meet the challenges of the future. They will need tools that can combine data from one or more databases in a matter of minutes. It will often be preferred that the task can be performed interactively by the manager himself.

The KBDS system meets these future demands. Before documenting how the KBDS system works, some essential problems concerning data selection from data bases will be discussed.

Data Selection Problems

When a manager needs data for decision support and no standard application proves appropriate, he has a problem. The company's data related to a specific object - for instance a specific employee - will often be spread across several tables and perhaps across more than one database. An exhaustive search can easily involve opening and combining (join operations) several tables. If you have only a few minutes to select such a composite set of data, you will normally be advised to give up. Even data base professionals need hours or days to perform advanced data selections of that kind, especially if an appropriate interactive search application is to be built at the same time.

A fundamental problem for the end-user is that it is normally impossible to remember in which tables and in which databases data is stored. Data dictionary data will only be helpful if you already know what is hidden behind the often meaningless abbreviations used as table names and column names in the data bases.

An Entity/Relationship data model for data bases contains valuable information if you are going to perform advanced data selection. However, data models of that kind presume that you also know the complex relations between the data models and the actual data base designs related to them. Unfortunately, data models are quite difficult to understand and they are normally not integrated with the data base. That makes it almost impossible for end-users to make use of the information kept in the data models when performing data selection.


Another problem is that the standardized query language SQL (Structured Query Language) is too complicated for managers and other ordinary end-users. If you need to select data across more than one data base table or across more than one data base, you normally need people with special SQL skills to assist you.
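To make the difficulty concrete, the following is a minimal sketch of the kind of query an end-user would have to write by hand. The schema, the abbreviated table and column names (EMP, PRJ, EMPPRJ and so on) and the sample row are invented for illustration and are not taken from the KBDS prototype.

```python
import sqlite3

# Hypothetical schema: the terse names mimic the abbreviations that the
# paper says are common in real data bases. Nothing here comes from KBDS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMP    (ENO INTEGER PRIMARY KEY, SNAME TEXT, FNAME TEXT, TEL TEXT);
    CREATE TABLE PRJ    (PNO INTEGER PRIMARY KEY, PNAME TEXT);
    CREATE TABLE EMPPRJ (ENO INTEGER, PNO INTEGER);
    INSERT INTO EMP    VALUES (1, 'Jensen', 'Anna', '555-0100');
    INSERT INTO PRJ    VALUES (10, 'Project A');
    INSERT INTO EMPPRJ VALUES (1, 10);
""")

# Selecting an employee together with his or her projects already needs
# two joins and knowledge of every table name, column name and key.
query = """
    SELECT e.SNAME, e.FNAME, e.TEL, p.PNAME
    FROM EMP e
    JOIN EMPPRJ ep ON ep.ENO = e.ENO
    JOIN PRJ p     ON p.PNO  = ep.PNO
    ORDER BY e.SNAME
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Even in this small case the user must know three table names, six column names and two join conditions, which is the knowledge the paper argues an end-user cannot be expected to hold.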

The problems mentioned above are only some of the problems you will have to deal with if you want to select data directly from one or more databases. Other problems occur if you need to create, in a hurry, an application that can help you select the needed data set interactively. In fact, it will very often be too expensive to carry out such data selections, especially if they are only needed in an ad hoc situation.

Figure 1 illustrates some of the problems in selecting data from data bases. As the figure shows, the user's head must hold and operate on a great deal of knowledge if he wants to carry out advanced data selection without help from professionals.

[Figure 1 labels: User's Models; Data Models; Data Base Design Models; Data Base Allocations; SQL-language or Tool Skills; Application Design Skills]

Figure 1. The user's head must hold a great deal of knowledge and skills if he wants to select data directly from one or more relational data bases.

The KBDS system

The KBDS system solves the data selection problems discussed above. It can be seen as "the missing link" between users and data bases.

The KBDS system consists of 2 subsystems, which may be connected only by a data network. The first one is the interface, through which the user interacts, based on his own well-known models. The second one is the knowledge base, which holds all the knowledge necessary for supporting the users. Figure 2 illustrates the principles of the KBDS system.

[Figure 2 labels: User's Models; KBDS Knowledge Base; KBDS System]

Figure 2. The interface and the knowledge base of the KBDS system. The KBDS system serves as an intermediary system between the manager and one or more relational databases.

The KBDS interface is a dynamic and database-neutral interface, which automatically produces a customized data selection application as the user asks for information on a specific object. At the same time, the application selects and merges the needed set of data.

The KBDS knowledge base is implemented in a relational data base, and it represents and integrates all the information and knowledge necessary for supporting the user through the search sessions. The knowledge base keeps the major part of the knowledge that was shown inside the head of the manager in figure 1. This means that the knowledge base keeps information about how to search, where to search and how to combine and merge data. The user only needs to specify, in one of his own models, which of the objects he is interested in.

The interface and the knowledge base are discussed in detail in the following. (If you are only interested in how to use the KBDS system, you can continue your reading at the headline "The Use of Models in the KBDS System".)

The KBDS Interface

The best way to implement the KBDS interface is as a graphical, window-based interface, and the best way to interact with the system is to use a mouse. In this way you will be able to take full advantage of the KBDS system.


The KBDS-interface is unique, because it is general, self-generative, dynamic and database-neutral. These characteristics will be explained in the following.

The interface is general because it gives you access to any data set in all of the data bases connected to the KBDS system.

The interface is self-generative because the system automatically builds up a customized data selection application during the search session. Thanks to this application, the user is able to select and extract the needed data sets interactively. You do not need to write one line of code to get that far. A few clicks with the mouse and the system will do the work for you.

The user (and possibly the programmer) will also be able to use the KBDS interface as a prototyping tool when trying to make a standard application suitable for selecting a specific set of data.

The interface is dynamic because data models, data base designs and data in all the data bases connected to the KBDS system are allowed to change without affecting the functionality of the KBDS interface. You only need to update some descriptions in the knowledge base every time data models or design models are changed in connected data bases.

The interface is database-neutral because the interface speaks SQL and because it will act in the same way no matter which relational data base is used for implementation of the knowledge base. If a certain data base speaks ANSI-SQL, and you are able to connect it by network, then it can be connected to the KBDS system. If you want the knowledge base of the KBDS system to be stored in such a data base, you are free to do that, too.

Later in this paper, some examples illustrating the facilities of the interface will be described.

Most of the special characteristics of the KBDS system are caused by the ad­vanced, integrated knowledge base, which will be described in the following.

The KBDS Knowledge Base

The KBDS knowledge base is a data base which captures much more meaning than you see in traditional data bases. The KBDS knowledge base supports inheritance and it supports a very high level of integrity. More details on these characteristics are given below.

Inheritance means that you operate with superterms and subterms in a tree-like structure, often called a thesaurus. The ISO Standard 2788 "Documentation - Guidelines for the establishment and development of monolingual thesauri" is used when building thesauri in the KBDS system. The use of inheritance enables the KBDS system to deal with generalization and specialization in dialogues with the users, and it supports a powerful data selection method used by the system.

The integrity of the knowledge base is a crucial feature of the system. The knowledge base is designed according to a fully integrated meta-model, shown in figure 3.

[Figure 3 labels: Entity-type level (Thesaurus); Model levels; Entity level. Signatures: Entity-type, Entity, Relationship, Data, SQL-statement]

Figure 3. The meta-model of the KBDS knowledge base.

The meta-model shown in figure 3 is coherent from entity-type level to data level. The entity-types constitute the inheritance structure, which is also the system's thesaurus. It serves as an effective data selection tool for users, and every user is allowed to customize his part and version of the system's thesaurus for use in his personal KBDS interface.

The entity-types from the system's thesaurus are at the same time suitable for term indexing of all sorts of documents, files, etc. The thesaurus can then be used as a controlled indexing vocabulary, which gives the user an effective tool for retrieval of all sorts of large knowledge representations like documents, computer programs, etc. That is indeed what thesauri were originally meant for, according to ISO 2788.

At the bottom of the meta-model in figure 3 you can choose between saving data directly in the knowledge base and saving a SQL-statement to be used on the data network when you need contact with external data bases. These SQL-statements can be used to bring on-line data to the KBDS system, or they can be used to download data from an external data base to the KBDS knowledge base.


If you want to download data or carry out an on-line search from other relational data bases, you have to tell the knowledge base where the data sets are logically stored. The KBDS system will then be able to use such data sets as if they were stored in the knowledge base itself.
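The choice between stored data and a stored SQL-statement can be sketched as follows. This is only an illustration under stated assumptions: the class `DataElement`, the function `resolve`, the logical source name `payroll_db` and the sample values are hypothetical and do not describe the internal design of the KBDS prototype.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataElement:
    """Hypothetical bottom-level element: a stored value OR a SQL-statement."""
    value: Optional[str] = None   # data saved directly in the knowledge base
    sql: Optional[str] = None     # statement to run against an external data base
    source: Optional[str] = None  # logical name of that external data base


def resolve(element, connections):
    """Return the values of an element, wherever they are logically stored."""
    if element.sql is None:
        return [element.value]
    conn = connections[element.source]
    return [row[0] for row in conn.execute(element.sql)]


# A stand-in for an external relational data base (invented content).
external = sqlite3.connect(":memory:")
external.execute("CREATE TABLE salaries (employee TEXT, amount INTEGER)")
external.execute("INSERT INTO salaries VALUES ('Jensen', 42000)")
connections = {"payroll_db": external}

local = DataElement(value="555-0100")
remote = DataElement(
    sql="SELECT amount FROM salaries WHERE employee = 'Jensen'",
    source="payroll_db",
)

print(resolve(local, connections))
print(resolve(remote, connections))
```

The caller of `resolve` does not need to know whether a value was stored locally or fetched on-line, which mirrors the idea that external data sets can be used as if they were stored in the knowledge base itself.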

Besides the meta-model shown in figure 3, some meta-rules are used to control and maintain the KBDS knowledge base. Some of these rules are shown in fig­ure 4. Together, the meta-model and the meta-rules make it possible to build up and maintain a very high degree of integrity and consistency in the knowledge base.

A. Every entity-type must be included in the same coherent thesaurus
B. Every individual entity must be connected to at least one entity-type
C. Every knowledge statement must include exactly one relationship

Figure 4. Some of the most important meta-rules of the KBDS knowledge base.

Meta-rule A in figure 4 ensures that you will always find one coherent system thesaurus holding all the entity-types of the knowledge base. Typically this thesaurus could be a shared company thesaurus, and you will normally find most of the needed entity-types if you look for them in the company's Entity/Relationship models.

Multiplicity in the thesaurus causes no problems, as long as you avoid inconsistency in the inheritance structure. Loops are not allowed.
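A check for loops and for a single coherent thesaurus could be sketched as below. The representation (a dictionary from each entity-type to its list of superterms), the function names and the reading of "coherent" as "exactly one top term" are assumptions made for this sketch, not details given for the KBDS system.

```python
def has_loop(broader):
    """Return True if following superterm links ever leads back to a term."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {}

    def visit(term):
        state[term] = GREY
        for parent in broader.get(term, []):
            parent_state = state.get(parent, WHITE)
            if parent_state == GREY:
                return True  # parent is already on the path: a loop
            if parent_state == WHITE and visit(parent):
                return True
        state[term] = BLACK
        return False

    return any(state.get(t, WHITE) == WHITE and visit(t) for t in broader)


def top_terms(broader):
    """Entity-types without any superterm."""
    terms = set(broader) | {p for parents in broader.values() for p in parents}
    return sorted(t for t in terms if not broader.get(t))


# Hypothetical staff thesaurus; "Sales Project Managers" has two superterms,
# which is multiplicity without inconsistency.
staff = {
    "Employees": [],
    "Managers": ["Employees"],
    "Sales Managers": ["Managers"],
    "Project Managers": ["Managers"],
    "Sales Project Managers": ["Sales Managers", "Project Managers"],
}

print(has_loop(staff))
print(top_terms(staff))
```

With no loops and exactly one top term, every entity-type reaches the same root by following superterm links, which is one simple way to read meta-rule A.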

Meta-rule B ensures that all individual entities are classified according to the thesaurus. Of course, many entities belong to more than one entity-type. For example, a specific employee can belong to both "Sales Managers" and "Project Managers". But if you specify that he belongs to "Sales Managers", you do not need to tell the system that he also belongs to "Managers". Because of the built-in inheritance structure, the system already knows that all "Sales Managers" are "Managers". That is one of the ways inheritance is used in the KBDS system.
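The "Sales Managers" example can be sketched in a few lines. The dictionaries, the function names `superterms` and `entities_of`, and the employee names are hypothetical; the sketch only illustrates the principle that membership of a subterm implies membership of all its superterms.

```python
def superterms(term, broader):
    """All entity-types a term inherits from, nearest first, no duplicates."""
    found = []
    queue = list(broader.get(term, []))
    while queue:
        current = queue.pop(0)
        if current not in found:
            found.append(current)
            queue.extend(broader.get(current, []))
    return found


def entities_of(entity_type, classified, broader):
    """Entities that belong to an entity-type directly or by inheritance."""
    result = []
    for entity, types in classified.items():
        all_types = set(types)
        for t in types:
            all_types.update(superterms(t, broader))
        if entity_type in all_types:
            result.append(entity)
    return sorted(result)


# Hypothetical thesaurus fragment and classifications.
broader = {
    "Employees": [],
    "Managers": ["Employees"],
    "Sales Managers": ["Managers"],
    "Project Managers": ["Managers"],
}
classified = {
    "Jensen": ["Sales Managers", "Project Managers"],
    "Hansen": ["Employees"],
}

print(superterms("Sales Managers", broader))
print(entities_of("Managers", classified, broader))
```

Jensen is never classified as "Managers" explicitly, yet a selection on "Managers" finds him, because the superterms are derived from the thesaurus.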

Meta-rule C in figure 4 states how to formulate any "knowledge representation element" kept in the KBDS knowledge base. In fact, every knowledge representation element is formulated in a readable natural language in the knowledge base. That makes it easier to read and manipulate the knowledge representation directly.

On the lowest logical level in the knowledge base, surrogates are used intensively for performance purposes. However, the user never has to use these internal identification numbers.


The KBDS knowledge base also has some other remarkable characteristics. The system's knowledge base can keep itself free of redundancy, and it is automatically kept fully normalized and fully indexed. These features will not be discussed further in this paper.

The Use of Models in the KBDS System

The KBDS system is model-based, which means that the knowledge base contains and operates with different sorts of models or knowledge structures defined by the users.

Tree-structures or tree-like structures can be dealt with dynamically by the KBDS system, which means that the system uses such models graphically in the interface, and the system is able to generate and maintain such structures. Figure 5 shows some examples of models, which can be used by the user.

[Figure 5 panels: Inheritance Structure; Activity Network; Plant Lay-out]

Figure 5. All types of models can be used in the KBDS system. Users define their own models and use them for object identification and as starting points for the data selections.

In figure 5 you can see an inheritance structure in the upper left corner. Referring to figure 3, this kind of model belongs to the entity-type level, because it is made of pure entity-types.

In figure 5 you can also see an organizational structure, which is quite another sort of tree-structure. This structure is built up only of individual entities and you will find no inheritance in such a structure. Because of that, it corresponds to the entity level in figure 3.


The activity network and the plant lay-out in figure 5 are made of individual entities too, but as they are not tree-like structures, they cannot be generated and maintained automatically by the system. However, they can be used like other entity-level models for identifying entities in the interface of the KBDS system.

How to Select Data by the KBDS system

Thanks to the integrity and coherence of the KBDS system's meta-model (see figure 3), you will be able to start your search wherever you like.

As you can see in the top menu bar of the KBDS interface lay-out in figure 6, you can choose different starting points for your search session. If you click on MODELS, the system lets you choose between Thesauri, Trees and ~ models.

Traditional data search in the data tables of the KBDS knowledge base is pos­sible too, but it is not recommended.

Data Selection Based on the User's Thesaurus. An example.

The following example shows how the KBDS interface supports the search strategy in a user-friendly and effective way. The Staff Thesaurus from figure 5 is used as a starting point. See figure 6.

[Figure 6 list of thesauri: Company Thesaurus; Private Thesaurus; Staff Thesaurus; Product Thesaurus]

Figure 6. Search based on the user's Staff Thesaurus. The tree-structured thesaurus is ready for a mouse-click on the actual object.

The screen in figure 6 illustrates how MODELS and afterwards Thesauri have been chosen. The system then shows a list of different thesauri. The user then chooses Staff Thesaurus, and the thesaurus is shown, ready for a click on one of the 6 types of Employees.

When you make a choice in a thesaurus, you are recommended to make your choice as specific as possible. That makes it possible for the system to optimize the search. To be specific, you must choose a suitable entity-type as low as possible in the tree-structure.

If you click on Managers, the next screen automatically shows a distinct list of column names and similar names of relationships. These names of columns and relationships correspond partly to columns in different data base tables. See the list of columns and relationships in figure 7.

[Figure 7 screen: list of columns and relationships for Managers]

no.  name
4    Staff
5    Projects
1    Surnames
     Salary
3    Telephone
2    Name

Figure 7. All columns and relationships connected to Managers in the knowledge base and in connected data bases are shown. The user is free to click on whichever columns he wants data from.

To get the data output from the system, you have to click on the columns from which you want data included. As you click, you determine at the same time the ordering of the columns in the following data table. When you have finished pointing out columns, you click on DATATABLE in the top menu bar. See the screen in figure 7.

The KBDS system then selects and merges data from one or more connected databases, or possibly only from the knowledge base itself. Some seconds later the system shows your data set in a data table, as illustrated in figure 8.
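How a sequence of clicks could be turned into a merged data table can be sketched as follows. The mapping `COLUMN_MAP`, the shared key `eno`, the table names and the sample rows are assumptions for this illustration; the paper does not describe how the prototype generates its SQL.

```python
import sqlite3

# Hypothetical knowledge-base entries: user-facing column name -> (table, column).
COLUMN_MAP = {
    "Surnames": ("emp", "sname"),
    "Name": ("emp", "fname"),
    "Telephone": ("phone", "tel"),
}
KEY = "eno"  # assumed key column shared by all tables in this sketch


def build_query(clicked):
    """Build a SELECT whose column order follows the order of the clicks."""
    tables = []
    for name in clicked:
        table = COLUMN_MAP[name][0]
        if table not in tables:
            tables.append(table)
    # Identifiers come only from the trusted COLUMN_MAP, never from user input.
    select = ", ".join(
        f'{COLUMN_MAP[n][0]}.{COLUMN_MAP[n][1]} AS "{n}"' for n in clicked
    )
    sql = f"SELECT {select} FROM {tables[0]}"
    for table in tables[1:]:
        sql += f" JOIN {table} ON {table}.{KEY} = {tables[0]}.{KEY}"
    return sql


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emp   (eno INTEGER PRIMARY KEY, sname TEXT, fname TEXT);
    CREATE TABLE phone (eno INTEGER, tel TEXT);
    INSERT INTO emp   VALUES (1, 'Jensen', 'Anna');
    INSERT INTO phone VALUES (1, '555-0100');
""")

clicks = ["Surnames", "Telephone"]  # order of the mouse clicks
sql = build_query(clicks)
rows = conn.execute(sql).fetchall()
print(sql)
print(rows)
```

The user supplies only the clicked names; the table names, column names and join condition all come from the stored mapping, which is the division of labour the paper describes between interface and knowledge base.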


[Figure 8 screen. Menu bar: MODELS | RELATIONSHIPS | DATATABLE | LANGUAGE | MACROES | DATAEXC. Data on: Managers. Columns: Surnames, Name, Telephone, Staff, Projects. The cell values are illegible in the source.]

Figure 8. The KBDS system shows a virtual data table as data output. This data table consists of data selected and merged from one or more data bases or from the knowl­edge base itself.

Now the most critical and problematic part of the data selection has been carried out. The user has been able to perform an advanced data selection, possibly based on several tables and several data bases.

Some of the data elements in figure 8 are prefixed by small squares, which indicate that only one of the data elements in the field is shown. If you click on the square, the system will show you a list of all the values in the data field.
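The behaviour of such multi-valued fields can be sketched as a simple grouping step. The function names, the "[+]" marker standing in for the small square, and the sample names are all hypothetical.

```python
def collapse(pairs):
    """Group (record, value) pairs so that each record keeps all its values."""
    grouped = {}
    for key, value in pairs:
        grouped.setdefault(key, []).append(value)
    return grouped


def display_cell(values):
    """Show the first value; mark the cell if more values are hidden."""
    marker = "[+] " if len(values) > 1 else ""
    return marker + values[0]


# Hypothetical (manager, staff member) pairs as they might come out of a join.
pairs = [("Jensen", "Black"), ("Jensen", "Wayne"), ("Hansen", "Kimber")]
staff_by_manager = collapse(pairs)

print(display_cell(staff_by_manager["Jensen"]))
print(staff_by_manager["Jensen"])  # what a click on the marker would list
```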

If you only need a part of the data set shown in figure 8, you can use traditional Query By Example to reduce the number of records.

As you see, any user will be able to deal with data selection if given an intelligent and user-friendly tool like the KBDS system. There is no need for the user to learn anything about data models, data base design, data base languages and so on. The system ensures that he gets exactly the data he needs for further analysis and decision support.

For each column you want to add to the output data table, you only have to click on one more column in the screen shown in figure 7.

It is very hard to imagine, how to perform advanced data selection in a more simple and user-friendly way.


If you had recorded your mouse clicks from the beginning, you would afterwards be able to run the application again as a macro application. The application would then automatically select the needed data set, based on on-line data, every time you need it.
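Recording and replaying clicks can be sketched with a very small class. `MacroRecorder` and the handler function are hypothetical names; in a real system the handler would trigger the corresponding interface action and run the selection against on-line data.

```python
class MacroRecorder:
    """Hypothetical recorder: stores clicks so a session can be re-run."""

    def __init__(self):
        self.steps = []

    def click(self, target):
        """Remember one click, identified by the name of what was clicked."""
        self.steps.append(target)

    def replay(self, handler):
        """Re-run every recorded click, in order, through the given handler."""
        return [handler(step) for step in self.steps]


recorder = MacroRecorder()
for target in ["MODELS", "Thesauri", "Staff Thesaurus", "Managers",
               "Surnames", "Name", "DATATABLE"]:
    recorder.click(target)


def describe(step):
    # Stand-in for the real action behind a click.
    return f"clicked {step}"


log = recorder.replay(describe)
print(log)
```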

Search Based on Non-Thesaurus Models

If you choose a model which is not a thesaurus, you will see that the interface looks and acts exactly the same way as before. The only difference is that the system will not use inheritance, because inheritance does not exist between individual entities.

There is no need to show another data selection session. Any of the user models shown in figure 5 could be used as the starting point for a data selection session, and the KBDS interface would support the data selections in exactly the same way.

Conclusion

If knowledge based techniques are used together with graphical, window-based interfaces, almost every user can be enabled to carry out advanced data selection across one or more data bases. At the same time, the user can develop an interactive data selection application for future use.

Prototypes of the KBDS system have shown, that it is possible to develop and use a data selection system like the one described in this paper.

The next step for the KBDS system must be implementation and tests in indus­trial environments.

That is why we are looking for co-operation with companies or software houses to help us bring this interesting tool from the laboratories to exploitation in real life.

Please contact the author for questions, or if you are interested in co-operation.

Søren Selmar, Ph.D., M. of Science
University of Aalborg
Fibigerstraede 13
DK-9220 Aalborg Ø
Denmark
Phone: 45-98-158522
Fax: 45-98-153030
