1
Intelligent Databases
An Overview of Ideas and Developments
Jenny Carter, De Montfort University, UK
Web ref: www.jennycarter.com
2
Introduction
Overview of developments & research w.r.t. Database/AI integration:
Active databases
Overview of KBS (Knowledge Based Systems)
Deductive Databases
Coupling of KBS and standard DBMS
State of the Art
Other developments
3
1. Active Databases
Traditional databases are passive: i.e. queries, updates, transactions executed only when requested.
Certain applications, e.g. inventory control, factory automation, etc., are not well supported by passive DBMS.
Capabilities such as automatic monitoring of conditions & the ability to take actions (e.g. re: timing) require an ACTIVE DBMS.
Uses the idea of TRIGGERS.
4
1. Active Databases
Two initial approaches:
Specific code in application programs to perform these tasks. Problems: maintenance can be difficult because conditions/actions might be spread over several application programs, and such code fragments can be hard to understand.
Building special application software that periodically polls the DB to detect relevant events. This is generally all coded in one application program, but the frequency of polling is an issue.
Because of the problems with both of these methods, many systems were extended with a built-in centralised sub-system to provide active capabilities (i.e. active rules, or triggers).
5
Basic Concepts of Active Rules
Active rules are in the ECA form:
On Event If Condition Do Action
e.g. On update of an employee's salary, If the new salary < 10000, Do roll back the update
Example events: insertions/deletions/updates on columns; temporal events (the time when a rule should be activated); application-defined events, which can be external to the database, e.g. temperature as measured by a sensor. The application needs to tell the DBMS about these.
Example conditions: like a WHERE clause in an SQL statement, or even a complete query. Or the result of a procedure written in the host language, possibly with embedded database queries. Can be related to special system variables, e.g. Current User etc.
Actions: data updates, further queries, other DB operations (commit, abort, etc.), calls to application procedures.
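The ECA idea can be sketched outside any DBMS. The following is a minimal illustration in Python, not how a real active DBMS is built; the names Rule, signal, update_salary and the employee data are all hypothetical. It encodes the example rule above (reject a salary update below 10000).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    event: str                         # name of the event the rule listens for
    condition: Callable[[dict], bool]  # evaluated on the event payload
    action: Callable[[dict], None]     # run only if the condition holds


class RollbackUpdate(Exception):
    """Raised by an action to veto the triggering update."""


def reject_low_salary(payload: dict) -> None:
    raise RollbackUpdate(f"salary {payload['new_salary']} is below 10000")


# Hypothetical rule base holding the single example rule.
RULES = [
    Rule("update_salary", lambda p: p["new_salary"] < 10000, reject_low_salary),
]


def signal(event: str, payload: dict) -> None:
    """Fire every rule registered for this event whose condition holds."""
    for rule in RULES:
        if rule.event == event and rule.condition(payload):
            rule.action(payload)


def update_salary(employees: dict, name: str, new_salary: int) -> bool:
    """Apply the update unless an active rule rolls it back."""
    try:
        signal("update_salary", {"name": name, "new_salary": new_salary})
    except RollbackUpdate:
        return False
    employees[name] = new_salary
    return True


employees = {"smith": 20000}
accepted = update_salary(employees, "smith", 25000)  # condition false, update applied
rejected = update_salary(employees, "smith", 5000)   # condition true, update vetoed
```

Keeping the rules in one central list, rather than scattered through application code, is the point of the centralised sub-system mentioned earlier.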
6
Active Rules
The format for writing a trigger in Oracle is:
Create trigger name
{before | after} {insert | delete | update [of list-of column-names]}
on table-name
[referencing references]
[for each row]
[when condition]
PL/SQL block;
7
Example of Trigger in Oracle
create trigger salary_check
before insert or update of salary, job
on employee
for each row
when (new.job <> 'PRESIDENT')
declare /* start of PL/SQL block */
  minsal number;
  maxsal number;
begin
  /* look up the permitted salary range for the new job */
  select minsal, maxsal into minsal, maxsal
  from sal_guide
  where job = :new.job;
  /* reject the row if the new salary falls outside that range */
  if (:new.salary < minsal or :new.salary > maxsal) then
    raise_application_error(-20601, 'salary ' || :new.salary || ' out of range for job ' || :new.job || ' for employee ' || :new.name);
  end if;
end; /* end of PL/SQL block */
Oracle provides commands for trigger management. E.g. alter trigger, drop trigger, enable, disable etc.
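For experimenting without an Oracle installation, a similar check can be approximated in SQLite through Python's sqlite3 module. This is a rough analogue only, not the Oracle trigger itself: SQLite has no PL/SQL, needs one trigger per event, and signals failure with RAISE(ABORT, ...). The table contents and the helper try_sql are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sal_guide (job TEXT PRIMARY KEY, minsal INTEGER, maxsal INTEGER);
CREATE TABLE employee (name TEXT, job TEXT, salary INTEGER);
INSERT INTO sal_guide VALUES ('CLERK', 10000, 20000);

-- SQLite needs one trigger per event, so insert and update are separate.
CREATE TRIGGER salary_check_insert
BEFORE INSERT ON employee
FOR EACH ROW WHEN NEW.job <> 'PRESIDENT'
BEGIN
    SELECT RAISE(ABORT, 'salary out of range for job')
    FROM sal_guide
    WHERE sal_guide.job = NEW.job
      AND (NEW.salary < sal_guide.minsal OR NEW.salary > sal_guide.maxsal);
END;

CREATE TRIGGER salary_check_update
BEFORE UPDATE OF salary, job ON employee
FOR EACH ROW WHEN NEW.job <> 'PRESIDENT'
BEGIN
    SELECT RAISE(ABORT, 'salary out of range for job')
    FROM sal_guide
    WHERE sal_guide.job = NEW.job
      AND (NEW.salary < sal_guide.minsal OR NEW.salary > sal_guide.maxsal);
END;
""")


def try_sql(sql: str, params: tuple) -> bool:
    """Run one statement; return False if a trigger (or other DB error) aborts it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.DatabaseError:
        return False


insert_row = "INSERT INTO employee VALUES (?, ?, ?)"
ok_insert = try_sql(insert_row, ("Jones", "CLERK", 15000))         # within range
bad_insert = try_sql(insert_row, ("Brown", "CLERK", 50000))        # above maxsal
bad_update = try_sql("UPDATE employee SET salary = ? WHERE name = ?", (5000, "Jones"))
president_insert = try_sql(insert_row, ("Chief", "PRESIDENT", 900000))  # WHEN clause skips the check
```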
8
Active Databases
Most work on active DBs is associated with RDBMS rather than OODBs.
Partly because of OODBs having methods incorporated as well as data
Also because of complexity that including rules would cause – scope issues due to inheritance/ overriding features etc.
Many attempts have taken the approach that rules apply to whole class.
There are a number of research prototypes, & work in this area is ongoing.
9
2. Brief overview of Knowledge Based Systems (KBS)
KBSs differ substantially from traditional DBs. They contain rules (as well as simple facts) and they have an inference engine.
Two main types of representation for KBS are:
Rule based – supports inference by resolution (could be in Expert System form, or logic programming form)
Frame Based – supports inference by inheritance.
11
KBS – Rule Based
The MYCIN system is probably the best-known (and historically important) example of an Expert System. Built in the mid 70s & containing about 500 rules, it was designed to perform medical diagnosis in the field of bacterial infections.
Example rule from MYCIN:
IF
The infection type is primary-bacteremia, and
The site of the culture is one of the sterile sites, and
The suspected portal of entry of the organism is the gastro-intestinal tract
THEN
There is suggestive evidence (0.7) that the identity of the organism is bacteroides.
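A rule of this kind can be held as data and matched against known findings. The sketch below is only an illustration of the idea and is not MYCIN's actual representation; the attribute names and the apply_rule helper are hypothetical, and the test for "one of the sterile sites" is simplified to a boolean finding.

```python
from typing import Optional

# The example rule, encoded as premises, a conclusion and a certainty factor.
RULE = {
    "premises": {
        "infection_type": "primary-bacteremia",
        "culture_site_is_sterile": True,
        "portal_of_entry": "gastrointestinal-tract",
    },
    "conclusion": ("organism_identity", "bacteroides"),
    "certainty": 0.7,
}


def apply_rule(rule: dict, findings: dict) -> Optional[tuple]:
    """Return (attribute, value, certainty) if every premise matches, else None."""
    if all(findings.get(key) == value for key, value in rule["premises"].items()):
        attribute, value = rule["conclusion"]
        return attribute, value, rule["certainty"]
    return None


matching_case = {
    "infection_type": "primary-bacteremia",
    "culture_site_is_sterile": True,
    "portal_of_entry": "gastrointestinal-tract",
}
other_case = dict(matching_case, portal_of_entry="unknown")

fired = apply_rule(RULE, matching_case)
not_fired = apply_rule(RULE, other_case)
```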
12
KBS – Frame Based
Inheritance is one of the most powerful and popular concepts used in AI.
allows grouping of similar notions into classes, economise on descriptions of some of the attributes;
allows deductions to be made about properties of lower level entities;
allows definition of new classes as variants of existing ones.
13
A simple inheritance hierarchy
Need to incorporate possibilities for overriding where necessary (e.g. all elephants are grey, except for a particular known instance, Nellie, who is pink).
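Inference by inheritance with overriding can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical Frame class; the frames animal and clyde and the slot names are invented for the example, while Nellie follows the case above.

```python
class Frame:
    """A frame with named slots and an optional parent frame."""

    def __init__(self, name, parent=None, **slots):
        self.name = name
        self.parent = parent
        self.slots = slots

    def get(self, slot):
        """Look the slot up locally first, then walk up the hierarchy."""
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.parent
        raise KeyError(slot)


animal = Frame("animal", alive=True)
elephant = Frame("elephant", parent=animal, colour="grey", legs=4)
clyde = Frame("clyde", parent=elephant)                   # inherits everything
nellie = Frame("nellie", parent=elephant, colour="pink")  # overrides colour only

clyde_colour = clyde.get("colour")    # found on the elephant frame
nellie_colour = nellie.get("colour")  # found on nellie herself
nellie_legs = nellie.get("legs")      # still inherited from elephant
```

Because lookup starts at the most specific frame, a local slot value wins over an inherited one, which is all that overriding needs here.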
14
3. Deductive Database Systems
DBs need to store & manage data from which users can extract relevant information
Difficult where there are large amounts of complex data
More difficult when information must be derived according to some complex rules
An approach to this might be to code rules into application programs
15
3. Deductive Databases
Deductive databases attempt to solve the problem by storing explicit data and deductive rules that enable inferences to be made from stored data.
Data obtained via action of deductive rules on stored data is known as Derived Data
Deductive databases are therefore the result of combining logic programming with traditional databases.
Characterised by handling large amounts of data as well as performing reasoning based on that information.
16
Basic Concepts of Deductive DBs
Includes set of data – FACTS (sometimes known as the extensional database)
Includes set of inference rules – RULES (sometimes known as the intensional database)
The DATALOG language offers an approach to this.
It is a combination of a database and the logic language Prolog.
It allows definition of both tables & rules.
Includes facilities for defining integrity constraints etc.
Easier to store facts than with a logic programming language.
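The split between stored facts and derived data can be illustrated with a small bottom-up evaluation. The sketch below is not DATALOG syntax or a DATALOG engine; it hand-codes, in Python, the two rules ancestor(X, Y) :- parent(X, Y) and ancestor(X, Y) :- parent(X, Z), ancestor(Z, Y), and the parent facts are invented for the example.

```python
# Extensional database: stored facts parent(X, Y).
parent = {("ann", "bob"), ("bob", "cat"), ("cat", "dan")}


def ancestors(parent_facts: set) -> set:
    """Derive ancestor(X, Y) by applying the rules until nothing new appears."""
    derived = set(parent_facts)  # rule 1: every parent is an ancestor
    while True:
        # rule 2: join parent(X, Z) with ancestor(Z, Y)
        new = {
            (x, y)
            for (x, z) in parent_facts
            for (z2, y) in derived
            if z == z2
        }
        if new <= derived:  # fixed point reached
            return derived
        derived |= new


ancestor = ancestors(parent)  # derived data, never stored explicitly
```

Only the three parent facts are stored; the remaining ancestor pairs exist only as derived data.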
19
Deductive DB Architectures
An example of a heterogeneous system is NAIL! (Not Another Implementation of Logic), developed at Stanford University. It links DATALOG to a conventional SQL DB system. DDBs are especially useful for problems involving temporal and/or spatial aspects.
Also see ProDBI:
www.sics.se/isl/quintus/prodbi/db.htm
20
4. Coupling of KBS and ‘standard’ DBMSs.
These types of system often use KBS as a front end, with a DBMS as the back end.
Some people say this leads to a fundamental mismatch due to:
Knowledge representation (KR) – flat DB tables are not compatible with some of the advanced KR techniques used
ESs often have fact base developed in an ad-hoc way - can result in performance problems that a traditional DB system would not have.
Often end up with use of redundant data descriptions in order to make data exchange possible
21
4. Coupling of KBS and ‘standard’ DBMSs.
Use of static inference processing in AI versus dynamic queries in DBs:
DBMS uses operational knowledge from information in applications programs e.g. embedded SQL, stored procedures etc. The operational part of a KBS is represented by declarative knowledge in the rule base.
Granularity mis-match – KBS can’t handle set optimisation that is a benefit in DBMS:
KBS works with a row at a time instead of sets of rows, hence lose effect of optimisation on sets of data.
Implementations already exist that suit the purpose & that are not seriously affected by these problems.
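The granularity mismatch can be made concrete with a small comparison, sketched here with SQLite through Python's sqlite3 module. The employee table, its rows and both function names are assumptions for the illustration: the first function mimics a KBS fetching one row per request, the second hands the whole set to the DBMS in a single query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("a", "sales", 100), ("b", "sales", 200), ("c", "it", 300)],
)


def total_row_at_a_time(dept: str) -> int:
    """One query per employee: the DBMS never sees the whole set."""
    names = [row[0] for row in conn.execute(
        "SELECT name FROM employee WHERE dept = ?", (dept,))]
    total = 0
    for name in names:
        (salary,) = conn.execute(
            "SELECT salary FROM employee WHERE name = ?", (name,)).fetchone()
        total += salary
    return total


def total_set_oriented(dept: str) -> int:
    """A single set-level query that the DBMS can optimise as a whole."""
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(salary), 0) FROM employee WHERE dept = ?",
        (dept,)).fetchone()
    return total


row_total = total_row_at_a_time("sales")
set_total = total_set_oriented("sales")
```

Both give the same answer, but the first issues one query per row, so the DBMS has no chance to optimise over the set.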
22
Coupling of DBMS & KBS
Can adopt different levels of coupling the two types of system.
Communication channel between two subsystems
Extract data from the DBMS, store & use the snapshot in the KBS (problems here: snapshots soon become obsolete as DBs are updated frequently; may need snapshots from a few sources at once; slow).
23
Architectural solutions for KBS/DBMS integration
The first architecture in the diagram was implemented by Trinity College, Dublin - system known as DIFEAD (Dictionary interface for ES and DB). One of the first systems to do this and also first to base the interface functionalities on the data dictionary concept. An earlier similar example is KADBASE. Uses a network data access manager to provide central interface between different components.
(KESE = Knowledge Engineering Software Environment)
24
Extending KBS with DB components
This is the solution adopted by ES tool vendors, so that their systems can use information extracted from a database.
A well known product that operates in this way is KBMS. Written in C
Uses idea of forward & backward chaining
Incorporates NL facility by allowing developers to write rules in English
Includes its own relational DB storage facility.
Uses If-Then type rules
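Forward and backward chaining over If-Then rules can be sketched generically. The code below is not taken from KBMS; the rule base and fact names are invented, and it only contrasts the two directions of inference.

```python
# Hypothetical rule base: (set of premises, conclusion).
RULES = [
    ({"has_fur", "gives_milk"}, "mammal"),
    ({"mammal", "eats_meat"}, "carnivore"),
]


def forward_chain(facts: set) -> set:
    """Data-driven: keep firing rules until no new conclusions are added."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def backward_chain(goal: str, facts: set, seen: frozenset = frozenset()) -> bool:
    """Goal-driven: prove the goal from facts or from a rule that concludes it."""
    if goal in facts:
        return True
    if goal in seen:  # guard against circular rule chains
        return False
    for premises, conclusion in RULES:
        if conclusion == goal and all(
            backward_chain(p, facts, seen | {goal}) for p in premises
        ):
            return True
    return False


facts = {"has_fur", "gives_milk", "eats_meat"}
all_known = forward_chain(facts)
proved = backward_chain("carnivore", facts)
not_proved = backward_chain("carnivore", {"has_fur"})
```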
25
5. State of The Art
KRDB (Knowledge Representation meets Databases) Workshops
A series of annual workshops takes place that aims to go beyond the classical KBS/DB connection. These are international and the first one was held in 1994. The proceedings can be found at: http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-54/
26
CYC project
•Launched in late 1984 as an MCC (Microelectronics and Computer Technology Corporation, Texas) project.
•A very large KB built with a huge amount of common sense knowledge. Includes ideas to do with time, space, substances, contradiction, causality, emotions, beliefs, etc.
•Contains more than a million hand-inserted assertions, made up of facts and rules.
•Includes interface tools, runs on various platforms. Currently developing another interface so that the general public can insert facts and rules.
See the website at: http://www.CYC.com
27
CYC Project
Attempts to redress ‘narrowness’ problem of domains addressed by KBSs. It is being used in concrete applications now.
29
6. Other Developments
Temporal databases
Ontologies
Semi-structured & unstructured data
Internet indexing & retrieval
Data Mining
30
Temporal DB example - Temibase
Integrates AI rules with a temporal database
Can handle incomplete temporal information
Supports temporal reasoning
Supports learning through derivation performed on data and rules
Supports both active and passive rules depending on purpose of system
Currently developing NL interface
See web page for links
31
7. Personal interest – music representation
Symbolic approaches have proved useful
Vocabulary of symbols used to represent concepts or objects
Programmer uses the vocabulary to say in its terms how the programs can achieve results
Level of abstraction for music? No right answer for everyone
Jackendoff’s idea of “musical surface”: “lowest level of representation which has musical significance”
32
Music representation
Wiggins & Smaill propose 2 dimensions on which to compare music representations:
Expressive completeness
Structural generality
33
Music representation
They aim for “a representation with an explicit but not too restrictive musical surface, within which the widest possible range of data can be represented”
Enables sharing of data between researchers
Better means of expressing and exchanging new and difficult ideas
Propose the CHARM system
Language independent, but most implementations have been in Prolog (very good for symbol manipulation)