

SEED Database Requirements

Version of July 1994

Prepared by the SEED team:

Primary Authors
James Snyder, Ulrich Flemming

with
Robert Coyne, Robert Woodbury, Shang-Chia Chiou,
Bongjin Choi, Han Kiliccote, Teng-Weng Chang, Sheng-Fen Chien

Contact:
Jim Snyder
Carnegie Mellon/Building Industry Computer-Aided Design Consortium (CBCC)
Carnegie Mellon University
Pittsburgh, PA 15213
[email protected]

(412) 268-6271

Abstract

SEED is an acronym for Software Environment to Support the Early Phases in Building Design. The overall architecture of SEED is based on a division of the preliminary design process into phases, each of which addresses a specific task. SEED intends to support each phase by an individual support module based on a shared logic and architecture. The modules envisioned for the first SEED prototype support the following phases: architectural programming, schematic layout design, and schematic configuration design. This document presents a requirements analysis of database related issues for the above and anticipated modules.

SEED Database Requirements 7/19/94 Page ii

Table of Contents

1 Introduction
    1.1 Purpose
    1.2 Scope
    1.3 Definitions
        1.3.1 Client Program
        1.3.2 Project
        1.3.3 Architectural Program
        1.3.4 Design Unit
        1.3.5 Functional Unit
        1.3.6 Problem Statement
        1.3.7 Solution
        1.3.8 Case

2 Requirements
    2.1 Problem Statements, Functional Units, Design Units, and Solutions
    2.2 Common Information Storage and Retrieval
    2.3 Common Vocabulary
    2.4 Inheritance and Subclass/Subtype Reasoning
        2.4.1 Subclass and Subtype Differences
        2.4.2 Required Inheritance Mechanisms
    2.5 Object and Attribute Indexing for Case-Based Retrieval
    2.6 Generalized Constraint Mechanisms
    2.7 Module and Agent Communication
    2.8 Geometric Relationships and Topologies
    2.9 Coordination and Communication of Constraint Violations
    2.10 New Modules and Agents
    2.11 Information System Management
        2.11.1 Media Presentation
        2.11.2 Media Formats
        2.11.3 Media Access Methods
        2.11.4 Media Hardware Support
    2.12 Requirements Summary
        2.12.1 Database Management System Support
        2.12.2 Agent Communication and Coordination Framework
        2.12.3 Knowledge Representation Support

3 Design and Implementation Issues
    3.1 Knowledge Representation
        3.1.1 Metaobjects or Database Objects
        3.1.2 Common Vocabulary
        3.1.3 Consistency Maintenance
        3.1.4 Constraint Mechanisms
        3.1.5 Partial Matching of Case Indices and Index Adaptation
        3.1.6 Case Indexing of Classes, Instances, and Attributes
        3.1.7 The Impact of Standards
    3.2 Distributed Communication and Coordination
        3.2.1 Common Vocabulary and Translation Issues
        3.2.2 Information Routing, Notification, and Coordination
        3.2.3 Collaboration
        3.2.4 Constraints and Violations
        3.2.5 Impact of Standards
    3.3 Information Storage and Retrieval
        3.3.1 External Database Connections
        3.3.2 Multimedia Manipulations
        3.3.3 Module Specific Information Storage
        3.3.4 The Impact of Standards
    3.4 Software Tools and Programming Environments
        3.4.1 Software Engineering Tools
        3.4.2 Knowledge Engineering Tools
        3.4.3 The Impact of Standards

4 Recommendations
    4.1 Evaluation of Commercial Systems
        4.1.1 UniSQL
        4.1.2 Versant
        4.1.3 Matisse
    4.2 Evaluation of Research Systems
        4.2.1 Exodus and E
        4.2.2 ODE and O++
    4.3 Similar Research Efforts
    4.4 Ranked Recommendations

5 Conclusions

6 Bibliography


1 Introduction

The use of database management systems within computational design is critical. However, the issues of database management within a distributed computing environment cannot be discussed in isolation. The communication, coordination, and knowledge representation issues must be discussed in addition to the “pure” database issues, primarily because database systems are a supporting facility for these capabilities.

The SEED project (Flemming et al. 1992) focuses on the early stages of the design process where many types of diverse information are created, evaluated, stored, and recalled. A robust information management system is needed to facilitate this information intensive process.

1.1 Purpose

This document serves four basic purposes. The remainder of Section 1 is a collection of relevant definitions that have an impact on the database implementations. Section 2 presents the important requirements surrounding the database environment, which are presented independently of any computer implementation. Section 3 focuses on the design and implementation issues in the light of these requirements. Section 4 evaluates several database systems and makes a ranked recommendation based on the requirements, design, and implementation issues. An annotated bibliography on knowledge representation, database management systems, distributed computing, and computational design systems is also included.

1.2 Scope

Generally, two categories of information must be stored and represented persistently. Project-independent information is stored for reuse across projects, while project-specific information is stored for reuse across module sessions. A module session has been defined to have run time memory which is separate from database storage (see Flemming et al. 1992, p. 10). This definition has two important consequences. First, it allows for the creation and deletion of transient information without involving database facilities. Second, it allows for design information within a module to be unsynchronized with database storage. It can be argued that memory which is synchronized with the database will also be needed. The alternative solutions lie between nothing written to the database (i.e. nothing stored persistently) and everything immediately written to the database. Neither of these extremes is desirable. Therefore, a critical feature of the database environment is the ease with which these two memory systems interact.[1]

1. This creates the need for parallel class structures in database and run time memory.


1.3 Definitions

This section is intended to provide the necessary background and definitions for terms used within this document. The following definitions are taken from other SEED related documents, and their references can be found in (Flemming et al. 1992). Each of the following definitions has been selected because it has a bearing on the database management system.

1.3.1 Client Program

A client program specifies the client’s goals in terms of what is to be built, where, at what cost, and under what expectations. An explicit program should state at least the following: site and site-related restrictions, building type and overall indication of size, budget, and references to applicable codes, standards, and regulations.

1.3.2 Project

A project is initiated by a client program and includes every process executed and every product produced during the development of a design that satisfies the client’s goals as expressed in the client program.

1.3.3 Architectural Program

An architectural program elaborates the client program in terms of the following major parts: specification of context, specification of main functional units and components and requirements needed to achieve the client’s goals, references to the applicable building codes and regulations.

1.3.4 Design Unit

A design unit is a part of the spatial or physical structure of a building generated and directly manipulated by a designer. A design unit consists of geometric as well as other attributes. Examples might be width, height, material or color. Design units can be composed of other design units to form hierarchical compositions of space or structure. Functional units house the constraints imposed on design units, and each design unit is associated with one functional unit.

1.3.5 Functional Unit

A functional unit collects functional requirements and constraints on a design unit, which is intended to perform a specific set of functions. Functional units may also have constituent functional components; components contained within a functional unit are constituent units.


1.3.6 Problem Statement

Within the context of a module session, the term problem refers to the problem to be solved in the session. A problem statement is the representation of a problem in a module session. An active problem statement is a problem statement in a module session on which the designer can operate directly from the screen. There is never more than one active statement in a session.

1.3.7 Solution

In the context of a module session, the term solution refers to a solution to a problem to be solved in the session. The solution of one phase may become part of the problem of another phase.

1.3.8 Case

A case is a problem-solution tuple that is stored within the database. Currently, a minimal definition of an index is the problem portion of the first problem-solution pair. In the programming phase, a client program (the problem) and an architectural program (the solution) may be a case. In the schematic layout phase, an architectural program (the problem) stored with a schematic layout (the solution) may be a case. However, a case may extend across design phases.

2 Requirements

The following sections are intended to present the major issues related to the database environment for SEED. Issues which are affected by database design and requirements decisions are presented as well as potential interactions and conflicts between the identified database requirements.

2.1 Problem Statements, Functional Units, Design Units, and Solutions

Problem statements, functional units, design units, and solutions are all referenced within the context of a module session. Because a module session can be suspended (saved and restored), it must be possible to store the module’s state and reload it at a later time. Each object type requires support for browsing the database contents and selecting an item together with all of its associated components. For example, a layout, which is a solution, is a collection of design units, and if a solution is selected for retrieval, all of the associated design units must also be retrieved.

Functional units are an example of an object which can have multiple relationships with other objects. For example, the add constituent use case requires the existence of a set of functional units which exist independently of any other objects; these functional units act as template objects (Flemming et al. 1992, p. 30). The representation of functional units must support class hierarchies and inheritance mechanisms, which aid in the creation of new functional units, as well as create the potential for efficiencies in implementation. Browsing the class hierarchy before the selection of a particular functional unit class or instance implies the need for a graphical user-interface for navigating any set of class hierarchies.

Two issues particular to functional units in SEED-Layout, and potentially to other modules, are constraints and tests. Constraints (which are really value domain constraints) require some type of inequality on an attribute value such as min() or max(), and they can be expressed as ranges of values (e.g. x > 20 and x < 100). The tests are constraints on relationships between functional units and other objects such as design units; an adjacency test is an example. These types of constraints cannot be viewed as value constraints and require a different type of description.[2]
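The difference between the two kinds of description can be illustrated with a minimal Python sketch. The names `RangeConstraint`, `Rect`, and `adjacent` are hypothetical and are not taken from SEED-Layout; the sketch only assumes that design units can be approximated by axis-aligned rectangles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeConstraint:
    """Value domain constraint: low < value < high on a single attribute."""
    attribute: str
    low: float
    high: float

    def holds(self, unit: dict) -> bool:
        return self.low < unit[self.attribute] < self.high


@dataclass(frozen=True)
class Rect:
    """Hypothetical design unit footprint given by two opposite corners."""
    x0: float
    y0: float
    x1: float
    y1: float


def adjacent(a: Rect, b: Rect) -> bool:
    """Relationship test: do two rectangles share part of an edge?"""
    touch_vertical = (a.x1 == b.x0 or b.x1 == a.x0) and min(a.y1, b.y1) > max(a.y0, b.y0)
    touch_horizontal = (a.y1 == b.y0 or b.y1 == a.y0) and min(a.x1, b.x1) > max(a.x0, b.x0)
    return touch_vertical or touch_horizontal


width_rule = RangeConstraint("width", 20, 100)   # x > 20 and x < 100
office = Rect(0, 0, 4, 3)
corridor = Rect(4, 0, 6, 3)
```

The range constraint needs only one object and one attribute, whereas the adjacency test needs two objects and their geometry, which is why a single value-based description does not cover both.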

2.2 Common Information Storage and Retrieval

The current SEED-Layout implementation embeds the storage and retrieval of C++ objects into methods and is typical of class library based persistence mechanisms. Essentially, these methods “parse” a file format. Data exchange between modules requires, in the worst case, (n - 1)^2 translator implementations from one module to another. The programmer effort needed to develop and maintain these translators is quite significant. Therefore, any reduction in level of effort is desirable. Some of this complexity can be reduced if all the modules utilize a common database access mechanism. Another way of viewing this requirement is as achieving data independence; data independence separates the physical storage of data from the manipulation of data. An example of data independence is the SQL language for relational databases.
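Data independence can be demonstrated with a small sketch using Python's standard `sqlite3` module. The table name `design_unit` and its columns are assumptions made for illustration only. The same query text is run before and after the physical storage is changed by adding an index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE design_unit (name TEXT, width REAL, height REAL)")
conn.executemany(
    "INSERT INTO design_unit VALUES (?, ?, ?)",
    [("office", 12.0, 10.0), ("corridor", 40.0, 6.0)],
)

# The query states what is wanted, not how the rows are stored or scanned.
QUERY = "SELECT name FROM design_unit WHERE width > ? ORDER BY name"

before = conn.execute(QUERY, (20,)).fetchall()

# Change the physical organization; the query text is left untouched.
conn.execute("CREATE INDEX idx_design_unit_width ON design_unit (width)")

after = conn.execute(QUERY, (20,)).fetchall()
```

A module written against the query does not need to change when the storage layout changes, which is the property the file-parsing methods lack.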

What is not yet clear is the level of abstraction that this mechanism should provide. For example, the logical model provided by a relational database management system may or may not be sufficient in meeting the database requirements, but it is clear that anything of a lesser abstraction will not suffice.

Other examples of abstraction levels are object-oriented databases and modeling systems. Object-oriented databases can generally be classified into two groups: extensible and schema evolution. Extensible databases are similar to relational databases but add the ability for user-defined abstract datatypes. Schema evolution database systems allow for classes and instances to be modified dynamically. Most database systems that fall into these categories support object versioning or time travelling.[3]

2. Constraints and tests need to be rethought from a comprehensive point of view. I believe that a more general constraint mechanism can be developed and will greatly increase the complexity that can be defined without requiring equivalent programmer effort.

3. Versioning and time travel are very similar concepts. Time travel differs from versioning in that an object must be referenced by time; the same object looks different depending upon when you reference it.


2.3 Common Vocabulary

The other issue related to the discussion in section 2.2 is the naming of classes, instances, and attributes of the objects represented in a design. If modules name the objects differently, then the names must be mapped from one module’s form to another. If there are no ambiguities between the namings, then a one-to-one function exists that will map the names correctly. Otherwise, it is not clear how the names should be mapped. If no unique mapping exists, it becomes difficult to specify how modules should exchange information. Usually this ambiguity results in information loss during the translation process. Assuming that a mapping does exist, we may assume that the mapping function is redundant in that the “original” naming could be used.
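The one-to-one condition can be checked mechanically. The following Python sketch uses invented module vocabularies (`Room`, `Hall`, `Zone`, and so on are hypothetical names) and shows why an ambiguous mapping cannot be reversed without losing information.

```python
def is_one_to_one(mapping: dict) -> bool:
    """True when no two source names map to the same target name."""
    return len(set(mapping.values())) == len(mapping)


def invert(mapping: dict) -> dict:
    """Return the reverse mapping, refusing ambiguous vocabularies."""
    if not is_one_to_one(mapping):
        raise ValueError("mapping is ambiguous; translation would lose information")
    return {target: source for source, target in mapping.items()}


# Hypothetical vocabularies of two modules.
unambiguous = {"Room": "Space", "Hall": "Corridor"}
ambiguous = {"Room": "Space", "Zone": "Space"}

round_trip = invert(unambiguous)
```

With the ambiguous vocabulary, a `Space` coming back from the second module could be either a `Room` or a `Zone`, which is the information loss described above.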

2.4 Inheritance and Subclass/Subtype Reasoning

Before a meaningful discussion about inheritance and subclasses can occur, the terms need to be clarified and defined. Essentially, the question “what is a class?” must be answered. There are three general concepts in use which define the term class. First, traditional object-oriented programming languages use a class to refer to an Abstract Data Type (ADT), which is rigorously defined in (Meyer 1988). In this context, subclasses really refer to subtypes in that the derived object is used as a datatype and can be type-checked by a compiler. The persistence mechanisms provided within these environments essentially build methods into classes which can read and write object instances, not classes, to and from disks; the contents are “parsed”, and memory pointers must be converted to some other representation. These persistence mechanisms are not recoverable or concurrent.[4]

The second, and perhaps oldest concept, originated from object-centered knowledge representations. Semantic networks of frames are a common manifestation of this type of representation (Quillian 1968, Minsky 1975). These structures were built to support generalized knowledge representations as well as to capture the meaning of taxonomic relationships. Most generalized modeling tools behave similarly to semantic networks. Common to most systems that fall into this category is the notion of a class. Subclasses of a class are denoted by explicit relationships defined within the subclass; for example, IS-A relationships are specified directly in the subclass specification. Therefore, subclasses can naturally answer the question “is class X a subclass of class Y?”; many other types of relationships can also be supported. Smalltalk is an example language that falls between Object Oriented Programming (OOP) and knowledge representation.

Lastly, within the database management field, classes typically borrow from either ADTs or knowledge representation, or both. Extensible databases allow for user-defined datatypes that include the definition of methods for the datatypes. Object databases allow complex object descriptions and modifications including schema evolution that changes the class structure of the database objects. New types of inheritance can be supported within this environment including instance-to-instance and selective inheritance.

4. These features are traditionally considered database management functions.

2.4.1 Subclass and Subtype Differences

There are two critical differences between ADTs (subtypes) and subclasses. First, the class representations for subtype instances do not exist at run time, while they do exist in the subclass environment. Second, subtypes are required to contain all of their parent’s attributes and methods; instances are structural copies of their associated subtypes. In contrast, subclasses are not required to contain all their parent’s attributes, and instances have the potential to choose which attributes to share or copy from their derived subclass. Another way of viewing this distinction is that subtypes have implicit hard-coded IS-A relationships imposed on them, and all instances of a type are supersets of parent attribute aggregations. Subclasses are not bound to implicit relationship types.[5] While some of the issues related to subclass versus subtype are implementation dependent, subclasses represent a significantly different notion from subtypes.

The issue of inheritance is also more complicated than the simple inheritance mechanisms in programming languages such as C++. C++ uses a static, class-to-class, full inheritance mechanism, which means essentially that all subtypes are complete structural copies of their parent plus additional attributes. Instances are structural duplicates of the classes from which they were derived. In this case, it is not necessary to have the class description around during execution of a program because the class lattice is static. Because the class descriptions do not exist at run time, the class descriptions cannot be queried for membership. Therefore, any class-based query must be explicitly coded within the methods of the instances.

In several SEED modules, both inheritance and subclass tests are needed. As an example, a corridor can be considered a subclass of a circulation space, which in turn can be considered a space in the generic sense. While the determination of subclass can be achieved utilizing ADTs and inheritance, the issues of inheritance and class membership are intersecting problem sets. From the examples encountered within the current SEED requirements, the desired feature is subclass querying and not subtyping, even though subtyping can simulate subclassing by explicitly adding methods to instances which can answer class queries. However, the burden of managing the subclass complexity then falls to a programmer and is not automatically managed.
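As an illustration of subclass querying, the corridor example can be written in a language that keeps class descriptions available at run time. Python is used here only as a convenient stand-in; the class names are taken from the example above, and the helper `ancestors` is a hypothetical name.

```python
class Space:
    """A space in the generic sense."""


class CirculationSpace(Space):
    pass


class Corridor(CirculationSpace):
    pass


class Office(Space):
    pass


def ancestors(cls: type) -> list:
    """Names of all superclasses of cls, nearest first, excluding object."""
    return [c.__name__ for c in cls.__mro__[1:-1]]


# The class lattice itself answers the query; no method had to be
# written on the instances to report class membership.
corridor_is_space = issubclass(Corridor, Space)
office_is_circulation = issubclass(Office, CirculationSpace)
```

Here the subclass complexity is managed by the environment rather than by the programmer, which is the behavior the requirement asks for.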

Because classes do not exist during run time, subtyping does not allow for classes to be modified at run time. However, the need to modify both classes and instances has been identified and will be discussed from another point of view in section 2.5. During a module session, new classes need to be added into an existing class lattice. If a class is going to inherit from another class, the description of the class must exist during program execution. Again, subtyping does not permit this to occur. In a similar vein, subtyping requires instances to be structurally identical. However, a well known requirement for exceptions exists. The nature of exceptions allows for attributes to be added to or removed from a class or instance, thereby deviating the structure from the “normal”. Other types of exceptions include negative links, for example “Nixon is not a pacifist”. It can be argued that all relationship types should be made explicit including the IS-A relationship (Rumbaugh 1987, Rumbaugh 1988).[6]

5. Some systems support efficient implementations of certain relationship types such as IS-A and HAS-A.
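A minimal sketch of both capabilities, adding a class to an existing lattice during execution and attaching an exceptional attribute to a single instance, is given below. It relies on Python's built-in `type()` constructor; the helper `add_class` and the class and attribute names are assumptions for illustration, not part of any SEED module.

```python
class Space:
    """Root of a small, hypothetical class lattice."""


class CirculationSpace(Space):
    pass


def add_class(name: str, parent: type, **defaults) -> type:
    """Create a new class under `parent` while the program is running."""
    return type(name, (parent,), dict(defaults))


# A class that did not exist when the program was written.
Stairwell = add_class("Stairwell", CirculationSpace, floors_served=2)

stair = Stairwell()
other = Stairwell()

# Exception: one instance deviates from the "normal" structure.
stair.fire_rated = True
```

This is possible only because the parent class description exists at run time; a statically compiled class lattice offers no equivalent of `add_class`.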

2.4.2 Required Inheritance Mechanisms

The three required inheritance mechanisms are class-to-class, class-to-instance, and instance-to-instance. Class-to-class inheritance allows properties (i.e. attributes, relationships, methods, and constraints) to be inherited from one class to another. Single inheritance allows a class to inherit from only one class, while multiple inheritance allows classes to inherit from more than one class. In addition, a class may inherit all (full inheritance) or some (partial inheritance) of the properties from a parent class or classes. A class definition may also need to be modified at run time. Changes to a class schema can have dramatic impact on existing classes and instances. The most obvious use of class-to-class inheritance is for specializing or refining classes into more specific ones. For example, it may be useful to represent the class schema shown below.

[Figure: example class schema containing the classes space, office, manager office, public circulation, and corridor]

The most prevalent use of class-to-instance inheritance will be the propagation of default properties to instances. Any methods, attributes, relationships, or constraints which should be part of an instance by default can be automatically added when the instance is created. An example usage of class-to-instance inheritance might be the default glazing for windows. All instances of the window class should have the glazing default value unless an exception to the glazing is specified. By implication, the value is shared between the class and its associated instances. In other words, if the class value changes, all the associated instance values are automatically changed as well.

6. More discussion on this topic is needed.
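The glazing example can be sketched with a class-level default in Python. This is only an illustration of the sharing behavior; the glazing values are invented.

```python
class Window:
    # Default held by the class and shared with every instance
    # that has not specified an exception.
    glazing = "double"


standard = Window()
special = Window()

# Exception: this instance overrides the default for itself only.
special.glazing = "triple"

# Changing the class value changes it for all instances still sharing it.
Window.glazing = "low-e double"
```

After the last statement, `standard.glazing` reflects the new class value because the value was never copied into the instance, while `special` keeps its exception.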

One of the more controversial forms of inheritance is instance-to-instance inheritance. Instance-to-instance inheritance allows the sharing of selective properties between instances. The behavior is similar to class-to-instance inheritance but provides a finer grain of control. A particularly useful example is the construction material of all structural columns on the first floor of a building. The first floor columns may be required to be of the same material type. The material attribute can be inherited from another column instance; the value of the material attribute is not copied, it is shared between the instances. Therefore, when any instance is modified, all the other instances are also modified.
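One possible way to obtain sharing rather than copying is to let several instances reference the same value holder. The sketch below is a hedged illustration in Python; `SharedValue`, `Column`, and the `inherit_material_from` parameter are hypothetical names.

```python
class SharedValue:
    """A cell holding one value that several instances may reference."""

    def __init__(self, value):
        self.value = value


class Column:
    def __init__(self, name, material=None, inherit_material_from=None):
        self.name = name
        if inherit_material_from is not None:
            # Share the other column's cell; nothing is copied.
            self._material = inherit_material_from._material
        else:
            self._material = SharedValue(material)

    @property
    def material(self):
        return self._material.value

    @material.setter
    def material(self, value):
        self._material.value = value


c1 = Column("C1", material="steel")
c2 = Column("C2", inherit_material_from=c1)
c3 = Column("C3", material="timber")   # not part of the sharing group

c2.material = "concrete"               # modifies c1 as well
```

Only the material attribute is shared; any other attribute of a column remains independent, which is the finer grain of control mentioned above.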

2.5 Object and Attribute Indexing for Case-Based Retrieval

The information stored within a database must be in a form that can be used for case-based indexing, matching, and retrieval. In particular, indexing must occur on both classes and instances, the attributes of the classes and instances, and potentially recursively on any subclasses and attributes of subclasses. This implies that the class definitions must exist persistently. The case base should also allow for graphical user navigation in a manner similar to class browsers.

Another needed feature is the ability to adapt an index to better match a case. Therefore, a given index may be altered by the matching process. Because the matching is to be done on both attributes and subclasses, classes, instances, and attribute values must be modifiable during the matching process. Again, the need arises for these objects to exist persistently.

Finally, indexing, matching, and retrieval all potentially rely on subclass determination. For example, an index with class “public circulation space” is used to match against the case base. Classes or instances of “corridor” or “stairwell” are subclasses of “public circulation space” and should have the potential to match on the given index. The motivation for this feature is at minimum the need for general index specification, which minimizes required human system knowledge, and the relaxing of an index which may be too specific.
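Matching through subclass determination might look like the following Python sketch. The `PARENT` table, the case records, and the function names are assumptions for illustration; a real case base would hold the class lattice persistently in the database.

```python
# Explicit IS-A links, stored as data so they can be queried and changed.
PARENT = {
    "corridor": "public circulation space",
    "stairwell": "public circulation space",
    "public circulation space": "space",
    "office": "space",
}


def is_a(cls, ancestor):
    """Follow IS-A links upward from cls until ancestor or the root is reached."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = PARENT.get(cls)
    return False


def match(index_class, cases):
    """Return the ids of the cases whose class is index_class or a subclass of it."""
    return [case["id"] for case in cases if is_a(case["class"], index_class)]


cases = [
    {"id": 1, "class": "corridor"},
    {"id": 2, "class": "office"},
    {"id": 3, "class": "stairwell"},
]

hits = match("public circulation space", cases)
relaxed_hits = match("space", cases)   # relaxing an index that was too specific
```

Relaxing the index is then simply a matter of moving one step up the lattice with `PARENT` and matching again.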

2.6 Generalized Constraint Mechanisms

An essential feature of functional units is the presence of constraints. These constraints exist in several different forms including domain value constraints (i.e. ranges and sets), constraints on relationship cardinalities (e.g. a room belongs to only one building), and inequalities (e.g. a room dimension must be greater than 15 ft.). Simple constraints reference information which is contained completely within a particular object (e.g. a functional unit). More complex constraints may reference information in other objects such as other functional units or design units. Complex constraints are anticipated to be the most prevalent. For example, constraints within functional units will frequently constrain some attribute within an associated design unit.

A generalized representation of these constraints can be useful in two important ways. First, if constraints are consistently represented, they can be consistently utilized by different modules; the complexity of constraint violation communication also becomes reduced if a single representation is utilized. Second, the storage and retrieval of constraints can be generalized with the associated representations and mechanisms. The generalization of the constraints should not group together constraints which are conceptually different. For example, some value domain constraints can be represented as an inequality, but do not take into account set membership. Careful consideration should be given to which constraint types are generalized.
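One possible shape for a single constraint representation is sketched below in Python. The `Constraint` record, its `kind` labels, and the dictionary used for the room are hypothetical; the sketch keeps the conceptual kind explicit so that different constraint types are stored uniformly without being merged into one type.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Constraint:
    """Hypothetical uniform record for a constraint of any kind."""
    kind: str                      # "domain", "cardinality", or "inequality"
    description: str
    check: Callable[[dict], bool]  # predicate over the constrained object

    def evaluate(self, obj: dict) -> str:
        # Every module reports the same two states for every kind of constraint.
        return "satisfied" if self.check(obj) else "unsatisfied"


constraints = [
    Constraint("domain", "material is in an allowed set",
               lambda o: o["material"] in {"steel", "concrete"}),
    Constraint("cardinality", "room belongs to exactly one building",
               lambda o: len(o["buildings"]) == 1),
    Constraint("inequality", "width is greater than 15 ft",
               lambda o: o["width_ft"] > 15),
]

room = {"material": "steel", "buildings": ["B1"], "width_ft": 12}
states = {c.kind: c.evaluate(room) for c in constraints}
print(states)
```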

2.7 Module and Agent Communication

Eventually, the SEED environment will consist of more than one operating system (e.g. UNIX) process. A process is composed of one or more SEED modules and is frequently referred to as an agent within the distributed computing literature. To allow communication between agents, a software infrastructure is needed that addresses the following basic issues: the packaging and unpackaging of information, the transmission of information, and the coordination of information.

The packaging and unpackaging of information pertains to the physical assembly of information into a form which can be transmitted from one agent to another. A solution to this issue must address the differences between heterogeneous machine architectures. For example, some machines store the least significant byte first, while others store the most significant byte first. Some standardized mechanism should be developed to achieve the goal of heterogeneous data exchange.
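The byte-order issue can be made concrete with a short Python sketch that packs a record in a fixed, machine-independent layout. The record layout (an unsigned 32-bit identifier followed by a 64-bit float) and the function names are assumptions for illustration; the `!` prefix of Python's `struct` module selects network (big-endian) byte order with standard sizes.

```python
import struct

# Hypothetical wire format: unsigned 32-bit room id, then a 64-bit float area.
WIRE_FORMAT = "!Id"


def pack_room(room_id: int, area: float) -> bytes:
    """Package a record in network byte order, independent of the host CPU."""
    return struct.pack(WIRE_FORMAT, room_id, area)


def unpack_room(payload: bytes):
    """Unpackage a record produced by pack_room on any machine."""
    return struct.unpack(WIRE_FORMAT, payload)


payload = pack_room(7, 182.5)
print(len(payload), unpack_room(payload))
```

Both the sender and the receiver agree on the wire format, so neither needs to know the native byte order of the other.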

The transmission of information from one process to another or one machine to another should be as transparent as possible. Information from one agent must be transmitted to other appropriate agents. However, several problems must be solved to achieve this. First, the sending agent may or may not know which agent to send the message to; in fact, an explicit list of recipients is often undesirable because every new agent would have to be added to the list in the sending agent. This means that each agent must be aware of all the other agents, and every existing agent may require modification when a new agent is added. This type of communication mechanism is very “brittle” and should be avoided. The second important issue is how the received information is incorporated within the receiving agent’s environment.

Because many types of information can be transmitted, some types of information need to be coordinated. For example, a request for a computation or the result of a calculation may need to be transmitted to another agent. A set of protocols and mechanisms needs to be developed that allows efficient coordination to occur between agents. In addition to agent communication, some information may need to be communicated to the user directly. Again, the determination of the information’s destination might not be included within the sending agent. By implication, some type of message coordination must be performed outside the “normal” agents, typically by what are referred to as coordination agents. An overall philosophy must be developed to determine how and when agents send and receive information.
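One way to avoid explicit recipient lists is a publish/subscribe arrangement in which a coordination agent routes messages by topic. The Python sketch below is an in-process illustration only; the `Coordinator` class, the topic name, and the message fields are hypothetical, and a real system would route messages between processes or machines.

```python
from collections import defaultdict


class Coordinator:
    """Hypothetical coordination agent that routes messages by topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        # A new agent registers itself; no existing sender has to change.
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # The sender names a topic, not a list of recipients.
        handlers = list(self._subscribers[topic])
        for handler in handlers:
            handler(message)
        return len(handlers)


coordinator = Coordinator()
layout_inbox, user_inbox = [], []
coordinator.subscribe("constraint_violation", layout_inbox.append)
coordinator.subscribe("constraint_violation", user_inbox.append)

delivered = coordinator.publish(
    "constraint_violation", {"unit": "room-12", "issue": "area too small"}
)
print(delivered, layout_inbox, user_inbox)
```

Adding another receiver, including one that presents messages to the user, requires only a further `subscribe` call.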

2.8 Geometric Relationships and Topologies

The need for descriptions of geometric relationships and topologies exists for several modules. The geometric relationships primarily relate to design unit adjacencies, for example rooms to other rooms. The geometric relationships between walls and rooms are also needed. Another important but currently unused feature is the detection of “inside” and “outside”. For example, a duct which passes through a space can be considered as both inside and outside that space. In a 2 1/2 D or 3D environment, adjacencies become more complicated. The notions of “over” and “under” are also needed. Their computation may or may not be similar to 2D adjacency calculation, but is nevertheless needed.7

In addition to relationships, more complex spatial arrangement descriptions are needed, in particular, access paths from one room to another. The goal is to be able to calculate the distance from one place to another. An example is the distance from any room to an external exit. While several well known methods for computing this information exist, it is not clear if a unified conceptual framework exists in which these computations can be performed.
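As one example of the well known methods mentioned above, the following Python sketch computes the shortest access distance with Dijkstra's algorithm over a weighted access graph. The graph, its room names, and the distances are invented for illustration, and the choice of algorithm is an assumption rather than a method prescribed for SEED.

```python
import heapq


def shortest_distance(graph, start, goal):
    """Dijkstra's algorithm over a dict of {node: {neighbor: length}}."""
    queue = [(0.0, start)]
    best = {start: 0.0}
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, length in graph.get(node, {}).items():
            candidate = dist + length
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return None  # goal is not reachable from start


# Hypothetical access graph; edge weights are walking distances in meters.
access = {
    "office": {"corridor": 4.0},
    "corridor": {"office": 4.0, "lobby": 10.0, "stairwell": 6.0},
    "stairwell": {"corridor": 6.0, "exit": 3.0},
    "lobby": {"corridor": 10.0, "exit": 5.0},
    "exit": {},
}
print(shortest_distance(access, "office", "exit"))
```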

Each type of geometric computation usually utilizes a specialized representation (e.g. wall representations, boundary solid models). These representations do not take the form utilized within typical object modelling or knowledge representation systems. Therefore, some method of translating the geometric representation into an associated knowledge representation must exist. In addition, the two models must be synchronized: changes in one model must be reflected in the other in as automated a fashion as possible.

2.9 Coordination and Communication of Constraint Violations

SEED modules generally have the ability to evaluate portions of a design. The evaluation can contain both an identification of a constraint violation and an associated design change which would remedy the violation. In such a case, three important issues must be resolved and are discussed below.

7. Obviously, many other types of relationships can be computed, and I expect that others will surface as the discussion continues.

A module must be able to formulate the notification in such a way that other modules can utilize the information within the evaluation and therefore allow some other module to correct the problem. For example, if a module determines that the area of a space is too small, the answer can be communicated in at least two forms: the simple attribute area or the aggregation of the values which make up the area. This type of module communication is different from information exchange in that descriptions of descriptions are being communicated. In other words, constraint violations contain metadata.

The second critical issue is the communication of an evaluation to other modules. If a single CPU and single process are used for computation, this problem is reduced to passing memory pointers between functions within the process. However, if many processes and/or many CPUs are used (i.e. distributed computing), then other mechanisms are required to communicate evaluations. The distributed mechanisms are more complex and costly, and therefore careful consideration must be given to this issue.

Lastly, once an evaluation can be formulated and sent to another module, something must be done with the evaluation on the receiving end. However, each type of evaluation may be intended for one or more modules, and the destination must also be specified somewhere. Again, several implementations are known and possible, but this issue must be considered together with other information communication requirements. The discussion in section 2.3 on a common vocabulary is also critical to this issue. If the sender and receiver do not utilize the same representation, then a translation must occur. The translation is a critical design issue and can occur in one of three places: the sending end, the receiving end, or in transit.

2.10 New Modules and Agents

A critical requirement for the database and related issues is the ability for new modules and agents to be added to an existing software base with a minimum of change to the existing system. Any needed changes should be identifiable and minimized to a set of “well-known” modules; coordination agents are an example. Two example systems which attempt to ease the integration of new modules are the ICADS model (Pohl et al. 1988) and the Federation architecture (Genesereth et al. 1992).

The database architecture plays an important facilitation role by providing the information storage and retrieval mechanisms. In addition, depending upon the level of knowledge representation, a knowledge structure and vocabulary may be defined. The existing case-base can play a pivotal role in the speed with which new modules can be developed and tested. As mentioned in previous sections, a comprehensive user interface is necessary to provide the browsing capabilities to aid in module development. In other words, as the systems become more complex, the need increases for quick information access that assists module developers in managing development complexity.

2.11 Information System Management

A significant amount of information gathered during the design process is not directly represented within a session or project; the information is not really representable as a functional unit or a design unit. However, this type of information can be essential in supplying historical context to people new to a project and can be stored within an information modeling system. Generalized information modeling must be a strategic direction of the SEED project. However, this section concentrates on hypermedia systems, which can be considered a subset of general information modeling. Hypermedia systems can be conceptualized as directed graphs where the vertices are media elements and the directed edges are hyperlinks from one media element to another. “Browsing” or navigation is the process of moving from one vertex to another via a hyperlink. Traversing a hyperlink may have an associated action tied to it; displaying a video sequence is an example.
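The directed-graph view of hypermedia can be sketched in a few lines of Python. The `MediaElement` record, the `traverse` function, and the sample elements are hypothetical; the action attached to a link simply returns a string where a real system would start a viewer.

```python
from dataclasses import dataclass, field


@dataclass
class MediaElement:
    """A vertex of the hypermedia graph (hypothetical structure)."""
    name: str
    media_type: str
    # Outgoing hyperlinks: link label -> (target element, optional action).
    links: dict = field(default_factory=dict)


def traverse(element, label, log):
    """Follow one hyperlink and run the action tied to it, if any."""
    target, action = element.links[label]
    if action is not None:
        log.append(action(target))
    return target


notes = MediaElement("site-notes", "text")
clip = MediaElement("walkthrough", "video")
notes.links["see walkthrough"] = (clip, lambda t: f"play {t.name}")

log = []
current = traverse(notes, "see walkthrough", log)
print(current.name, log)
```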

A general information modeling system provides a framework to integrate both a knowledge representation scheme and a hypermedia system by defining hypermedia access techniques that can be stored within the method of a class in the knowledge representation scheme. As a result, it makes sense to have types of objects within the knowledge representation scheme other than functional units or design units. While a term for these types of objects has not been defined within SEED, it should allow for the anticipated types of information. The following subsections identify the multimedia needs via a layered software architecture which is described below. The specific implementation information is provided in section 3.

2.11.1 Media Presentation

The general types of media that can be presented include the following: unformatted text, formatted text, hypertext, still images, video images, audio information including voice, and tabular or spreadsheet data. The presentation of media types is typically achieved through the use of viewers. For example, the program Xview allows for the display of still images. Ideally, the presentation of a medium should be separated from the access methods and supporting hardware. For example, within the X-window environment, this can be easily achieved by providing a window for visual display media.

Presenting the media becomes difficult when several different types of media are included in a single hypermedia document. When a single medium is utilized, a simple viewer is appropriate. However, more complicated viewers are necessary when media are integrated. The presentation of media information is highly impacted by the developments within the standards community. Browsing and navigation tools must not only support knowledge representation schemes, but they must also support sophisticated viewing methods. A critical tool, both for information navigation and as a benchmark, is NCSA Mosaic. This hypermedia tool provides a high degree of information access and integration into a single viewer and is a state-of-the-art example of a hypermedia system.

2.11.2 Media Formats

Several types of media formats exist for each type of media. The ASCII representation of unformatted text is by far the most common. Formatted text can be stored in a large variety of formats, with some of the more common being TeX, PostScript, Rich Text Format (RTF), Standard Generalized Markup Language (SGML), and HyperText Markup Language (HTML). Still images are stored in a variety of compressed and uncompressed forms, including GIF, JPEG, TIFF, RGB, X-Bitmap, HDF, and PBM.

Video information is provided in an analog form or a digital form. Analog information can be obtained from typical consumer electronic equipment such as VCRs. The digital information is obtained through either a compressed or uncompressed sequence of video frames or still images. The most common and portable digital video form is MPEG.

Audio information is provided in either a digital sampling format or a discrete musical event format. The MPEG specification allows for both audio and video to be combined into a single format, which simplifies the synchronization of the audio and video frames. Depending upon the type of computer and operating system, there are several file formats for storing audio data. The Musical Instrument Digital Interface (MIDI) specification is a protocol for specifying events rather than the sound itself. This protocol relies on the capabilities of a device to synthesize the sound in real time, which is a standard feature of modern synthesizers and music keyboards.

Both spreadsheets and relational databases provide an abstract notion of tabular information that can be translated into a formatted textual representation. However, it can be necessary or useful to keep the notion of a table within the presentation of the media. Relational database management systems provide a standard, data-independent method for accessing data. Spreadsheets, however, are newer and do not have standard access techniques, which complicates data interchange. It is important to note, however, that standardized file formats for the storage of spreadsheets do exist. In a similar vein, Geographic Information Systems (GIS) provide a database of information which can be queried. Eventually, these systems will be integrated into existing hypermedia environments.

An important feature related to still images is the concept of image maps, which allow portions of an image to be mapped to a hypermedia link. Typical shapes for an image map are a circle, a rectangle, or a polygon represented as a finite set of vertices. When an event such as a mouse click in a map is detected, a hyperlink is traversed in the same manner as a hypertext link.
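Resolving a mouse click against an image map amounts to a point-in-shape test for each mapped region. The Python sketch below covers the three shapes named above; the function names, the region list, and the link targets are hypothetical, and the polygon test uses the standard ray-casting technique.

```python
def in_rectangle(point, rect):
    x, y = point
    x0, y0, x1, y1 = rect
    return x0 <= x <= x1 and y0 <= y <= y1


def in_circle(point, center, radius):
    dx, dy = point[0] - center[0], point[1] - center[1]
    return dx * dx + dy * dy <= radius * radius


def in_polygon(point, vertices):
    """Ray casting: count edge crossings of a horizontal ray from the point."""
    x, y = point
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def resolve_click(point, image_map):
    """Return the link target of the first region containing the point."""
    for shape_test, params, target in image_map:
        if shape_test(point, *params):
            return target
    return None


# Hypothetical image map: (shape test, shape parameters, link target).
image_map = [
    (in_rectangle, ((0, 0, 50, 20),), "plan.html"),
    (in_circle, ((80, 80), 10), "section.html"),
    (in_polygon, ([(100, 0), (140, 0), (120, 40)],), "detail.html"),
]
print(resolve_click((120, 10), image_map))
```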

2.11.3 Media Access Methods

Hypermedia are accessed as either a stream of characters or a stream of bits. The information can be obtained via a file system, a network connection, or a combination of the two. Most viewers operate on files with a specified format (e.g. HTML). However, many types of information are not stored on a locally available file system; therefore they must be retrieved from some other location via a network connection. Connections are typically made to an information server, some of which are FTP, Gopher, Archie, World Wide Web (WWW), and Wide Area Information Server (WAIS). Robust viewers combine several of these access methods together with several presentation methods, in the hope that this allows multiple media types within the same hyperdocument.

Information within a relational database management system can also be accessed in a manner similar to FTP. Essentially, these methods can both be considered a client/server architecture for information exchange; FTP happens to be primitive in comparison.

2.11.4 Media Hardware Support

Several types of media cannot be displayed without special hardware support; for example, analog video needs a specialized graphics card adapter to digitize the analog signal into digital video frames. Many computers now supply digital signal processing cards as a standard feature, which are being utilized quite frequently. Other types of media distribution also rely on specialized hardware such as CD-ROM and 3D solid modeling graphics cards. As new media are invented, new hardware support may be required. Therefore, it is essential that any information system allow new types of media to be integrated, especially via network connections.

2.12 Requirements Summary

The following bullet list summarizes the requirements in general terms. It is intended to be used as a reference to compare implementation features to requirements.

2.12.1 Database Management System Support

• The database system must represent and store problem statements, functional units, design units, and solutions persistently.

• The database system should be used as a common information storage and retrieval mechanism for most or all data that must be stored beyond a module session.

• The database system will be used to store and retrieve building information; functional unit libraries and architectural programming data are examples.

• The database system must also support the storage of multiple media types for use in a hypermedia system.

2.12.2 Agent Communication and Coordination Framework

• The communication of design descriptions must be supported transparently by the communication framework.

• All agents must communicate with each other using the common vocabulary.

• Constraint violations and design change hypotheses must be communicated in a standard representation.

• A generalized constraint mechanism must be supported by the database system, the communication framework, and the knowledge representation; it must include value domain constraints, relationship constraints, as well as constraint states (satisfied/unsatisfied).

• The communication framework must allow for the addition and removal of new modules or agents (possibly at run time).

2.12.3 Knowledge Representation Support

• The knowledge representation scheme must support the following types of inheritance: class/class, class/instance, instance/instance.

• Classes and instances must be modifiable at run time, which implies that the meta-objects that represent these objects must be accessible.

• The knowledge representation scheme must support class queries such as subclass membership tests (e.g. is a corridor a public circulation space?).

• A class or instance must be usable as a target index into a case base.

• Case-based indexing, matching, adaptation, and retrieval must be supported by both the database mechanisms and knowledge representation scheme.

• The integration of geometric relationships and topologies into the design representation must be supported as automatically as possible.

3 Design and Implementation Issues

This section examines the design and implementation issues pertaining to the database management system as it is affected by knowledge representation, distributed communication and coordination, information storage and retrieval, and software tools and programming environments. The impact of standards within each area is addressed, not necessarily to advocate the adoption of a particular standard, but rather to ensure that any standard that is rejected is rejected for clear reasons and not overlooked.

3.1 Knowledge Representation

Three important issues revolve around the representation of objects within the SEED environment, which are covered in the following subsections. First is an examination of the underlying computational representation of objects (i.e. functional and design units). Second, the indexing of classes, instances, and attributes into a case-base is addressed. Lastly, the matching and adaptation of case indices are discussed.

3.1.1 Metaobjects or Database Objects

There are two basic approaches to representing functional units and design units within the context of a database management system. The objects can be directly represented within the facilities of the database management system, or the metaobjects which describe the components of functional units and design units can be represented; each of these alternatives is discussed.

Representing objects via the database management system descriptions has certain advantages. Primarily, all the functionality supplied by the database can be directly utilized; class browsers are an example. In addition, any optimizations or features provided with the database system can be taken advantage of. Many of these issues relate to performance. The problem with utilizing database representations is that they tend to be derived from programming language constructs, namely abstract data types. Even if schema evolution is supported, the ability to index objects for case-based retrieval may not be supportable; partial class descriptions cannot exist. Therefore, two different representations would have to be developed: one which models functional and design units and one which models cases. These representations would have to be translated from one to another.

The alternative to representing knowledge by the database facilities is to store the metaobjects within the database. These metaobjects model the structure of the functional and design units as well as all of their composite parts; for example, attributes and values would be represented as database objects. The main advantage of this approach is the flexibility provided for expansion and enhancement of the knowledge representation scheme, particularly for the case-based aspect of the knowledge representation. Additional metaobjects can be defined within the database for indexing mechanisms, thereby allowing experimentation and improvement on the indexing techniques without manipulating or destroying the functional and design unit metaobjects. The main disadvantage with this approach is that tools such as class browsers are no longer useful as end-user tools; browsers would have to be developed. However, it is likely that the class browsers provided with the database systems would not be suited to the SEED requirements anyway. Within this framework, the class browsers would only be used as a software development tool.
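The metaobject approach can be pictured with a small Python sketch in which classes and attributes are ordinary data records rather than programming language types. The names `ClassMeta` and `AttributeMeta` and the sample attributes are hypothetical; in the approach described above such records would be stored as database objects.

```python
from dataclasses import dataclass, field


@dataclass
class AttributeMeta:
    """Metaobject describing one attribute (hypothetical structure)."""
    name: str
    value_type: str


@dataclass
class ClassMeta:
    """Metaobject describing a class; it is plain data and can change at run time."""
    name: str
    parent: "ClassMeta | None" = None
    attributes: dict = field(default_factory=dict)

    def add_attribute(self, name, value_type):
        self.attributes[name] = AttributeMeta(name, value_type)

    def all_attributes(self):
        # Attributes of the parent chain are inherited; local ones take precedence.
        inherited = self.parent.all_attributes() if self.parent else {}
        return {**inherited, **self.attributes}

    def is_subclass_of(self, other):
        node = self
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False


functional_unit = ClassMeta("functional_unit")
functional_unit.add_attribute("name", "string")
room = ClassMeta("room", parent=functional_unit)
room.add_attribute("area", "number")
room.add_attribute("daylight", "boolean")  # schema extended at run time
print(sorted(room.all_attributes()))
```

Because the class description is data, it can be extended, queried for subclass membership, or left partially specified without recompiling any application code.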

Common to both implementations is the issue of programming language access to the knowledge representation. If the objects are represented by the database facilities, the objects are usually automatically translated into an equivalent programming language object, for example a C++ class. If the metaobject approach were taken, an additional software interface would have to be developed to provide access to the knowledge representation objects.

For the purposes of the SEED project, the metaobject approach is the most viable for several reasons. First, the representation of knowledge within SEED is still somewhat undefined and will most likely change to some degree. The metaobject approach allows for a redefinition of the behavior of the representation scheme while providing some level of data independence for the applications built using the representation. Secondly, the metaobject approach allows for the same representation to be built in multiple execution environments. For example, more than one database management system can be used to implement the representation; multiple programming languages can also be supported. If the database object approach were taken, representations and programming languages would be limited to the set provided by the developers or vendors.

The ASCEND system (Piela et al. 1990) was designed as a “structured approach to developing and solving equational models.” The development of ASCEND was motivated by the inadequacies of constraint solving systems such as ThingLab and CONSTRAINTS. These tools are motivated by simulation technology and are not appropriate for knowledge level development. More specifically, ASCEND uses a strongly-typed single inheritance mechanism which does not allow for most of the operations previously identified. The basic feature of ASCEND is the ability to move an object up or down a single path in a type hierarchy.

3.1.2 Common Vocabulary

A significant number of researchers agree that distributed, collaborative systems must communicate using an underlying common vocabulary. This common vocabulary not only consists of the names of objects and attributes, but also of the representational capabilities of the objects themselves. The capabilities of a representational system should be defined independently of the particular software implementation. For example, frame-based systems can be implemented in all the programming language paradigms (procedural, production rules, deductive systems), but the capabilities of the frame representation are the same regardless of the implementation.

What is not typically addressed in discussions related to common vocabularies is the significant overhead in the construction and maintenance of these vocabularies. The two important issues relating to construction and maintenance are an expressive knowledge representation scheme, and the consistency maintenance of applications that use the vocabulary. Like any other type of designed artifact, these common vocabularies are developed in an iterative manner; information is gathered incrementally. The representational system must be capable of supporting this activity. More importantly, as agents and modules are developed that utilize this vocabulary, they become dependent on it; if a modification to the vocabulary occurs, the modules and agents must be altered to accommodate the change. In order to deal with the information gathering and change complexity, a software tool set must be in place that at least assists the application developers in the process of vocabulary and agent development. This issue is discussed in more detail in section 3.4.

3.1.3 Consistency Maintenance

Two types of consistency maintenance must be incorporated into a representational framework: value dependencies and truth maintenance. Value dependencies based on derived values should be handled in an automatic manner without effort on the part of a developer; an example of automatic dependency detection can be seen within the framework of a representational system (Snyder 1993). Truth maintenance is the process of ensuring that information that was generated as a result of an inference remains true if changes occur to the knowledge base; the “frame problem” is a common term for this. There are several formal solutions to the truth maintenance problem, such as an Assumption-Based Truth Maintenance System (ATMS). The selection of the type of truth maintenance depends heavily on the form and capabilities of the knowledge representation system.
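A simplified picture of value dependency maintenance is given in the Python sketch below. The `Frame` class is hypothetical and is not the mechanism of (Snyder 1993): here the inputs of each derived value are declared explicitly, whereas the text calls for dependencies to be detected automatically. The sketch also assumes that the dependencies contain no cycles.

```python
class Frame:
    """Hypothetical frame that recomputes derived values when their inputs change."""

    def __init__(self):
        self._values = {}
        self._formulas = {}  # derived name -> (function, input names)

    def define(self, name, function, inputs):
        # Simplification: dependencies are declared rather than detected.
        self._formulas[name] = (function, inputs)
        self._recompute(name)

    def set(self, name, value):
        self._values[name] = value
        # Propagate the change to every derived value that depends on this one.
        for derived, (_, inputs) in self._formulas.items():
            if name in inputs:
                self._recompute(derived)

    def _recompute(self, name):
        function, inputs = self._formulas[name]
        if all(i in self._values for i in inputs):
            self.set(name, function(*(self._values[i] for i in inputs)))

    def get(self, name):
        return self._values.get(name)


room = Frame()
room.set("width", 4.0)
room.set("length", 5.0)
room.define("area", lambda w, l: w * l, ("width", "length"))
room.define("cost", lambda a: a * 100, ("area",))
room.set("width", 6.0)  # area and cost are updated without further calls
print(room.get("area"), room.get("cost"))
```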

3.1.4 Constraint Mechanisms

Four types of constraints have been identified for use within the SEED knowledge representation: value domain constraints, relationship constraints, linear constraint solving, and non-linear constraint solving. Value domain constraints restrict the values that an attribute can assume. Example value constraints are ranges and sets as well as the number of values, such as a single or multivalued attribute.

Relationship constraints deal with relationships between objects. For example, specifying that a room can only belong to one building is a constraint on the cardinality of the relationship. Other types of relationship constraints are related to the addition and removal of objects in the knowledge representation scheme. An example of this type of operation is the automatic removal of dependent objects when some event occurs, such as the deletion of a collection. Recently, mechanisms referred to as triggers have been proposed as a means to automatically manage the lifetime of objects (MacKeller 1992).
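The idea of a trigger that removes dependent objects can be sketched as follows in Python. The `Store` class, the `cascade` trigger, and the object names are hypothetical and are not taken from (MacKeller 1992); each object records a single owner, which also reflects a cardinality of one on the ownership relationship.

```python
class Store:
    """Hypothetical object store with delete triggers."""

    def __init__(self):
        self.objects = {}   # object id -> owner id (or None)
        self.triggers = []  # callables invoked after each deletion

    def add(self, obj_id, owner=None):
        self.objects[obj_id] = owner

    def on_delete(self, trigger):
        self.triggers.append(trigger)

    def delete(self, obj_id):
        if obj_id not in self.objects:
            return
        del self.objects[obj_id]
        for trigger in self.triggers:
            trigger(self, obj_id)


def cascade(store, deleted_id):
    """Trigger: remove every object owned by the object just deleted."""
    children = [c for c, owner in store.objects.items() if owner == deleted_id]
    for child in children:
        store.delete(child)


store = Store()
store.add("site")
store.add("building")
store.add("floor-1", owner="building")
store.add("room-101", owner="floor-1")
store.on_delete(cascade)
store.delete("building")
print(sorted(store.objects))
```

Deleting the building removes the floor and, through the same trigger, the room it contains, while the unrelated site object is untouched.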

Both linear and non-linear constraint solving systems deal with different types of problems from the two previously discussed. Relationship and value constraints are knowledge representation issues, whereas constraint solving is an evaluation technique. The main implementation issue is whether the constraint solving systems should be incorporated into the knowledge representation scheme, or whether the information in the knowledge representation should be used to generate problems to be solved via constraint satisfaction.

The WRIGHT constraint solving system is a disjunctive constraint satisfaction system which can be utilized within the 2D layout algorithms for SEED-LOOS. This technique has continued to evolve and should be a primary candidate for incorporation into the representational system.

3.1.5 Partial Matching of Case Indices and Index Adaptation

The type of case-based reasoning utilized within SEED is the problem-solving style (Kolodner 1991). In this type of reasoning, a case index is used to search the case-base and derive a new solution that is mostly correct. The main problem to be resolved with this portion of SEED is how to adapt a solution to the current problem specification. The adaptation process usually will involve the retrieval and adaptation of other cases. Therefore, a mechanism for limiting the recursive nature of the process must be examined. Allowing the taxonomy of the knowledge representation system to limit the recursion is not a general solution and severely limits the type of reasoning that can be done.

The types of facilities needed to support case retrieval are superclass queries, subclass queries, attribute value constraints, and constraints on relationships (e.g. find a meal that does not have meat in it). However, the reasoner needs help determining what can be adapted and how things can be adapted. These hints are used in the retrieval process to more quickly find solutions that meet the current problem. Once a retrieved case has been adapted, it must be tested to verify its usefulness in solving the problem.
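The meal example above can be turned into a small Python sketch of retrieval with a hard relationship constraint and a soft, partially matched attribute. The case data, the `MEAT` set, and the `retrieve` function are invented for illustration and do not reproduce any particular case-based reasoner.

```python
# Hypothetical case base of meals.
cases = [
    {"name": "lasagna", "kind": "pasta", "ingredients": {"beef", "tomato", "cheese"}},
    {"name": "ratatouille", "kind": "stew", "ingredients": {"eggplant", "tomato"}},
    {"name": "pesto pasta", "kind": "pasta", "ingredients": {"basil", "pine nuts"}},
]
MEAT = {"beef", "chicken", "pork"}


def retrieve(cases, kind=None, excluded=frozenset()):
    """Return case names, best match first.

    Excluded ingredients act as a hard constraint on the relationship between
    a meal and its ingredients; the requested kind is only a preference, so
    cases that miss it are still returned as candidates for adaptation.
    """
    results = []
    for case in cases:
        if case["ingredients"] & excluded:
            continue  # the relationship constraint is violated
        score = 1 if kind is not None and case["kind"] == kind else 0
        results.append((score, case["name"]))
    ranked = sorted(results, key=lambda r: -r[0])  # stable sort keeps input order on ties
    return [name for _, name in ranked]


print(retrieve(cases, kind="pasta", excluded=MEAT))
```

The lower-ranked result is a partial match: it satisfies the constraint but would need adaptation and subsequent testing before it could serve as a solution.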

3.1.6 Case Indexing of Classes, Instances, and Attributes

While the retrieval and adaptation of cases is not trivial, the major obstacle remaining is the knowledge required to allow for successful adaptation. This information cannot be derived via a computation. The identification of appropriate indexing information must be done by a human. In the JULIA system, a frame-based knowledge representation scheme is utilized (Hinrichs 1992). However, the frames are embellished with information that aids the reasoner during case retrieval. The use cases within the current SEED specification do not address the process of embellishing a solution for indexing and storage.

Assuming a frame-like knowledge representation scheme for SEED, there are only three things that can be used in an index, namely classes, instances, and attributes. However, to support the variety of possible transformations, additional attribute descriptors are needed to guide the adaptation and criticism processes within the case reasoner. Example embellishments can be seen in (Hinrichs 1992). An algorithm for case-based retrieval in SEED is outlined in (Flemming 1993).

3.1.7 The Impact of Standards

There are two standards organizations which impact the area of SEED knowledge representation: the Knowledge Sharing Initiative (KSI) and the Object Management Group (OMG). The KSI is sponsored by the Advanced Research Projects Agency (ARPA), with the primary research effort being the Interlingua Project. This project is attempting to build a set of standards (also known as an Agent Communication Language (ACL)) for the exchange of knowledge between collaborating computer agents and is divided into two areas: Knowledge Interchange Format (KIF) and Knowledge Query and Manipulation Language (KQML). Implicit within this approach is an underlying common vocabulary of a particular design domain. In the case of the SEED project, a common vocabulary for building related issues would have to be specified. This standard impacts the SEED project from other perspectives which are covered in section 3.2.

The OMG is a consortium of researchers and vendors whose goal is to produce standards for distributed object paradigms. A primary standard being developed by the OMG is the Common Object Request Broker Architecture (CORBA), which specifies a platform-independent mechanism for distributed object interaction. Industry is widely supporting this effort and several compliant programs have been developed and are being sold; many vendors have committed to migrating to the standard; some object-oriented database development companies are taking note of these standards and moving towards them. As the scope of SEED continues to expand in the area of distributed computing, this standard, and potentially others, will have to be addressed and potentially dismissed with clear reasons. Nevertheless, this standard must be addressed because the long-term industry acceptance of SEED-like systems will eventually rely on distributed computing standards such as CORBA.

3.2 Distributed Communication and Coordination

Because the database management system will not be used as a centralized communication system, any agent collaboration must be done on copies of objects which reside in each agent's address space.8 However, in order for the agents to coordinate with each other, some type of synchronization mechanism must be in place. The following subsections address the issues related to communication and coordination among agents that are separated by address space or over a network.9

3.2.1 Common Vocabulary and Translation Issues

The relevance of a common vocabulary has been illustrated by blackboard and derivative systems. Only the AI field of Cooperative Distributed Problem Solving (CDPS) attempts to build systems specifically excluding the notion of a common vocabulary (Barr et al. 1989). Therefore, the requirement that all modules communicate with a common vocabulary to agents outside their domain should be imposed on the SEED system. However, most modules will certainly have domain-specific models which do not need to be communicated outside their domain. Two important facilities must be available to modules and module developers: the ability to translate to and from the common vocabulary, and mechanisms to help automate the updating of values from the local domain to the common vocabulary objects. Some software systems such as the ISO Development Environment (ISODE) address these issues and are covered in section 3.2.5.

8. Some database systems provide for change notification, but the practice of using the database system as a distributed communication framework is still rare.

9. Research in the area of distributed computing is quite active. While many issues, such as reliable versus unreliable networks, are not addressed here, they must be addressed during the implementation of any software system. These skills are typically rare.

In order for proper translations to be developed, domain developers must be able to browse the common vocabulary to understand the consequences of a particular translation. Therefore, browsing tools must be available that help filter information in the vocabulary down to relevant views of the knowledge. Otherwise, the amount of information will be too large to understand or view on a screen. A basic approach is to allow a user to display a node in the representation scheme and incrementally expand objects “near” the current node. The expansion can filter out other nodes via class membership and relationship types (e.g. PART-OF).
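The incremental expansion described above can be sketched as a bounded graph traversal with two filters. This is only an illustration under assumed names: `Node`, `expand`, the sample vocabulary, and the ADJACENT-TO relationship are hypothetical and are not part of any SEED specification.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    cls: str
    # Outgoing links as (relationship type, target node name) pairs.
    links: list = field(default_factory=list)


def expand(vocab, start, depth=1, relations=None, classes=None):
    """Return names of nodes within `depth` links of `start`, after filtering.

    `relations` restricts which relationship types are followed;
    `classes` restricts which classes of target node are shown.
    """
    seen = {start}
    frontier = [start]
    for _ in range(depth):
        next_frontier = []
        for name in frontier:
            for rel, target in vocab[name].links:
                if relations is not None and rel not in relations:
                    continue
                if classes is not None and vocab[target].cls not in classes:
                    continue
                if target not in seen:
                    seen.add(target)
                    next_frontier.append(target)
        frontier = next_frontier
    return sorted(seen - {start})


vocab = {
    "building": Node("building", "Structure"),
    "floor": Node("floor", "Structure", [("PART-OF", "building")]),
    "room": Node("room", "Space", [("PART-OF", "floor"), ("ADJACENT-TO", "corridor")]),
    "corridor": Node("corridor", "Space", [("PART-OF", "floor")]),
}

part_of_view = expand(vocab, "room", depth=2, relations={"PART-OF"})
space_view = expand(vocab, "room", depth=2, classes={"Space"})
```

Increasing `depth` one step at a time gives the incremental behavior: each call shows only the next ring of nodes that pass the filters.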

3.2.2 Information Routing, Notification, and Coordination

The routing of information from one module or agent to another involves a significant amount of underlying software support. Layers of network and operating system mechanisms must be combined in such a way that modules do not actually have any knowledge of how or to whom they are sending messages. In addition, if information changes which an agent depends upon, a notification must be sent to the agent, implying that agents must be capable of processing asynchronous information streams, that is, of following an event-driven computational model. Two important factors must be resolved during the implementation. First, the coordination of the values must be done somewhere. Two example solutions are the Message Router in the ICADS model (Pohl et al. 1991) and the Federation Architecture (Genesereth et al. 1992). In both of these solutions, each agent informs a “service” agent (facilitator) of the items it is interested in knowing about. Enhancements to this scheme require agents to inform the “service” agent of the types of information that are produced, thereby allowing the facilitator to more intelligently route requests for information or services. The IBDE project (Fenves et al. 1994) provides valuable discussion and experience on this issue.
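A minimal sketch of the facilitator idea follows: agents register interest in named items, producers advertise what they produce, and the facilitator forwards changed values without the sender knowing who receives them. It does not reproduce the ICADS Message Router or the Federation Architecture; the class and method names are assumptions, and delivery is done by a direct in-process callback rather than over a network.

```python
from collections import defaultdict


class Facilitator:
    """Hypothetical 'service' agent that routes values between agents."""

    def __init__(self):
        self._interests = defaultdict(list)  # item name -> callbacks of interested agents
        self._producers = defaultdict(set)   # item name -> names of producing agents

    def register_interest(self, item, callback):
        """An agent declares that it wants to be notified when `item` changes."""
        self._interests[item].append(callback)

    def advertise(self, agent_name, item):
        """An agent declares that it produces values for `item`."""
        self._producers[item].add(agent_name)

    def producers_of(self, item):
        """Let the facilitator (or an agent) find out who can supply `item`."""
        return sorted(self._producers.get(item, ()))

    def publish(self, item, value):
        """Notify every interested agent; return how many were notified."""
        callbacks = self._interests.get(item, ())
        for callback in callbacks:
            callback(item, value)
        return len(callbacks)


facilitator = Facilitator()
received = []
facilitator.register_interest("room.area", lambda item, value: received.append((item, value)))
facilitator.advertise("layout-module", "room.area")
delivered = facilitator.publish("room.area", 12.5)
```

In a distributed setting the callbacks would be replaced by messages placed on each agent's event queue, which is where the asynchronous, event-driven requirement arises.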

Second, if multiple agents are allowed to establish different values for a particular attribute of an object, some type of resolution to the differing values must occur. This process is called “Conflict Resolution” within the ICADS model and has an agent dedicated to the task of resolving these conflicts. However, there are some important deficiencies with this particular implementation, primarily relating to the hand encoding of rules to resolve these conflicts. This approach is not believed to be scalable (Logan et al. 1992). A critical capability that is needed, which is missing from the ICADS model, is the ability for an agent to specify a collection of values which will satisfy its requirements. Examples of this type of specification are sets, expressions, or ranges of values. Given this information, coordination can occur at a much higher level than described in the ICADS and, possibly, the Federation models.
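As an illustration of coordinating on collections of acceptable values rather than on single values, the sketch below intersects the ranges or sets that several agents declare. The function names and the sample values are assumptions; expressions, the third form mentioned above, are not covered.

```python
def common_range(ranges):
    """Intersect closed intervals (low, high); return None when they do not overlap."""
    low = max(r[0] for r in ranges)
    high = min(r[1] for r in ranges)
    return (low, high) if low <= high else None


def common_values(value_sets):
    """Intersect discrete sets of acceptable values."""
    return set.intersection(*(set(s) for s in value_sets))


# Two agents state acceptable ceiling heights in meters.
ceiling_height = common_range([(2.4, 3.0), (2.7, 3.5)])
# Two agents whose ranges cannot both be satisfied: a genuine conflict.
no_overlap = common_range([(1.0, 2.0), (3.0, 4.0)])
# Two agents state acceptable structural materials.
materials = common_values([{"brick", "steel", "timber"}, {"steel", "timber"}])
```

Only when the intersection is empty does a conflict need to be escalated, which is the sense in which coordination can occur at a higher level than value-by-value resolution.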

3.2.3 Collaboration

While little discussion has occurred within the SEED project regarding collaborative computing, it will eventually become part of the SEED environment, especially if multiple users are allowed to access the design representation at the same time. The details of these issues cannot realistically be described until more discussion occurs within the SEED group. However, there are software tools available for the collaboration between designers, which should be investigated simply for the experience of current notions of collaboration. In particular, NCSA has developed the Distributed Transport Model (DTM) for the transmission and collaboration of graphical information over the Internet.

3.2.4 Constraints and Violations

The issue of constraint satisfaction and violation permeates most aspects of the SEED environment at the knowledge and development levels. At the knowledge level, the maintenance and communication of constraints must be handled by some agent, most likely a coordinator or facilitator. In addition, once a violation has been detected, it must be communicated to relevant agents. However, there are many different types of constraint violations, and each one must be identified and examined individually. For example, given a set of equations to solve, a violation might be defined as the absence of a solution to the system of equations. However, from the point of view of a standards processing agent, a violation would be a building component which does not meet some criteria. Clearly, these two types of constraints and associated violations are different in nature and cannot be represented in the same way.

At the development level, module and agent developers must be able to respond to relevant notifications of violations. Again, these notifications will arrive in an asynchronous manner and have serious implications on the software architecture of each agent. Ideally, the same communication mechanism should be used for all agents.

3.2.5 Impact of Standards

Borrowing from computer science, examples which address these needs are the following: the Open Systems Interconnection X.400 protocol suite, the Open Software Foundation’s (OSF) Distributed Computing Environment (DCE), the previously mentioned CORBA standard, and the Interlingua project’s Knowledge Query and Manipulation Language (KQML). X.400 is an Open Systems Interconnection (OSI) compliant network protocol suite. It provides for the specification and communication of arbitrary types of data to be sent and received by distributed processes on a network; an example application is electronic messaging systems. An important specification within X.400 is Abstract Syntax Notation 1 (ASN.1); complex object descriptions can be described, packaged, and unpackaged utilizing an ASN.1 grammar of the object contents.10 This type of software infrastructure is an example of a protocol that is expandable and modifiable at the “user-level” and is therefore a desirable possibility.

DCE and CORBA are companion/competitor systems. Both deal with the issues of distributed information management. In addition, many software manufacturers have committed to making their systems both DCE and CORBA compliant. Essentially, both of these systems specify a programming language interface to an abstraction imposed over networking, operating system, and object facilities. The goal is to allow applications to work in a well-defined distributed computing environment regardless of the type of hardware or operating system facilities available. Issues such as host name resolution, network security, and quality of service are all addressed by these standards.

KQML provides a standardized mechanism for the interchange of knowledge among programs. This project is attempting to formulate a generalized representation and associated implementation semantics for distributed, cooperative agents. The areas of coverage are vast and include mechanisms for agent “bartering” of services. While this type of tool may have desirable features, a clear matching of SEED requirements to KQML features needs to be performed. This type of system may, in fact, be too complicated for the needs of the SEED project. However, the complexity of agent communication must not be oversimplified either.

3.3 Information Storage and Retrieval

It is quite likely that information not contained within any SEED representation will be needed by agents. For example, parts catalogs and climate information are needed but will probably not be stored within the knowledge representation scheme developed for SEED. However, it is desirable to have access to the recovery and concurrency facilities of a database management system. In particular, the data independence achieved by a database allows applications to be fairly stable even if the underlying information structure changes.

3.3.1 External Database Connections

It would be naive to assume that a single database system will contain all the necessary information for a SEED session. Information will certainly be maintained outside the scope of the SEED database system. Therefore, connections to external database systems must be provided. The notion of database in this context takes on a loose definition. For example, the World Wide Web (WWW) and Wide Area Information Server (WAIS) can be considered to be databases. Multiple types of database connections must be integrated into a single environment. These areas deal with the information modeling issues discussed previously. The end goal is that the user should not be aware of the type or location of the external information; the connections should behave in as transparent a fashion as possible.

10. The ISODE package implements the X.400 protocol suite utilizing TCP/IP as a transport protocol.

3.3.2 Multimedia Manipulations

The immediate goal of multimedia capabilities is to have the ability to view the different media types within one environment. The media types that should be initially available are still images, video images, and hypertext. These media types should be storable within the framework of the database management system. However, the method for viewing them may not be storable in terms of database methods. Therefore, the simple mechanism of allowing a viewer process to be fork()ed to display the data can be used and is common in most multimedia applications.

3.3.3 Module Specific Information Storage

Module and agent developers should be encouraged to use database mechanisms to store domain information, allowing other people to take advantage of the information. The information may be useful to other developers, and having a common storage mechanism reduces the number of things people must understand to use existing information.

3.3.4 The Impact of Standards

The most significant standards relating to information storage are query languages, primarily SQL. These languages, referred to as fourth generation languages, allow data searches to be specified independently of the access mechanisms. SQL access must be supported to allow connection to external databases. Several object-oriented extensions to SQL are being proposed, but there is no convergence of opinion.
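The independence of a query from its access mechanism can be illustrated with a small sketch. It uses Python's standard `sqlite3` module purely as a convenient SQL engine, not as a candidate for SEED; the `parts` table and its contents are invented for the example. The same query text is run before and after an index is added, and only the access path changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (name TEXT, supplier TEXT, price REAL)")
conn.executemany(
    "INSERT INTO parts VALUES (?, ?, ?)",
    [("door", "Acme", 120.0), ("window", "Acme", 80.0), ("hinge", "Bolt Co", 5.0)],
)

# The query states what is wanted, not how the rows are to be located.
QUERY = "SELECT name FROM parts WHERE price < ? ORDER BY name"

before = conn.execute(QUERY, (100,)).fetchall()

# Changing the access mechanism (adding an index) requires no change to the query.
conn.execute("CREATE INDEX parts_by_price ON parts (price)")
after = conn.execute(QUERY, (100,)).fetchall()
```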

The other standard pertaining to information storage is C++. Several database management systems utilize C++ as the run time representational object. The important issue here is the version of C++ that is chosen to implement the database model in. Currently, there are about four versions of C++ being used throughout the industry, all supporting different features. If a database system does not evolve with the C++ standard, then software limitations are implicitly inherited. If the database system does evolve with the C++ standard, then the migration of existing application code must be evaluated before database upgrades occur. This process is very complex and must be factored into the selection of the database system.

3.4 Software Tools and Programming Environments

The successful implementation of the SEED environment hinges on the ability of the various implementation tools to work together in an integrated fashion. The database management system must be able to work with the project management and development environment tools. In addition, the integration of database management class libraries with other class libraries may only be possible through the use of multiple inheritance. Therefore, a recent C++ compiler must be utilized that supports both templates and a full implementation of multiple inheritance.

The integration of the database management environment with specific domain classes can be viewed as bridging two parallel development environments. For example, most commercial database management systems provide class browsers and user interface development tools which access database objects. However, many SEED modules will have their own notion of both domain object behavior and user interface development. These two worlds must be bridged. The difficulty potentially lies in the complexity of interaction between these two worlds and is termed cognitive distance (Krueger 1992). Ideally, a seamless environment should be created, but this may in fact be impossible to implement with current technology.

3.4.1 Software Engineering Tools

The primary software engineering tools needed apply to both the database world and the programming language world and include compilers, source level debuggers, class browsers, user interface development tools (possibly interface builders), and project management tools. Project management tools can be divided into configuration management, revision control, installation management, and reverse engineering tools. Configuration management is the process of maintaining and developing software components in a reproducible manner. For example, most executable programs are composed of numerous other code libraries, and reproducing an executable mandates that every component be reproducible. Configuration management must not be confused with revision control, which is the process of maintaining source code level changes to a code set. Revision control is a subtask of configuration management. Installation management is the process of collecting the output from a configuration and placing it in a location that supports the execution. Typically, any associated resources are installed as part of the installation process. Lastly, reverse engineering tools aid developers in understanding a software structure. Conceptually, these tools can be viewed as query systems where the questions are about programming language constructs such as classes and variables.

Most object-oriented database systems provide a C++ source file generation capability, where the definition of a database object is translated into a C++ specification. Consistency must be maintained between the C++ sources and the object descriptions. The database systems should provide assistance with this problem; otherwise, keeping the two consistent can become too complicated to manage. An example solution to this problem is through the use of triggers; triggers allow for the specification of an event to detect and an associated action to invoke when the event occurs. When an object description is changed, a new C++ specification can automatically be generated.
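The trigger mechanism can be sketched as an event (a changed object description) paired with an action (regenerating the C++ specification). The sketch below is an in-memory stand-in and does not use the API of any real database product; `Schema`, `add_trigger`, `define`, and `cpp_declaration` are hypothetical names, and the generated declaration is deliberately simplistic.

```python
class Schema:
    """Hypothetical store of object descriptions with change triggers."""

    def __init__(self):
        self._classes = {}   # class name -> {attribute name: C++ type}
        self._triggers = []  # actions invoked whenever a description changes

    def add_trigger(self, action):
        self._triggers.append(action)

    def define(self, name, attributes):
        """Create or change an object description, then fire every trigger."""
        self._classes[name] = dict(attributes)
        for action in self._triggers:
            action(name, self._classes[name])


def cpp_declaration(name, attributes):
    """Translate an object description into a minimal C++ class declaration."""
    lines = [f"class {name} {{", "public:"]
    lines += [f"    {ctype} {attr};" for attr, ctype in attributes.items()]
    lines.append("};")
    return "\n".join(lines)


generated = {}  # class name -> latest generated C++ source


def regenerate(name, attributes):
    generated[name] = cpp_declaration(name, attributes)


schema = Schema()
schema.add_trigger(regenerate)
schema.define("Room", {"area": "double"})
first_version = generated["Room"]
schema.define("Room", {"area": "double", "floor": "int"})
```

Because regeneration is tied to the change event itself, the generated source cannot silently fall behind the object description.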

3.4.2 Knowledge Engineering Tools

Two knowledge-related issues impact the database management system. First, it is highly likely that a common vocabulary will be developed that all modules will utilize. However, a global vocabulary will probably not be developed. The description of these vocabularies should be stored in a computable form within the database management system. Tools for maintaining these vocabularies will be needed. If a representation is implemented by the SEED group, tools for maintaining the vocabularies will have to be developed. Otherwise, the database management system should provide the necessary capabilities.

The second area of importance relates to programs written in knowledge-based languages such as Prolog, OPS, or CLIPS. Without support tools to debug and verify the knowledge bases, it becomes increasingly difficult to maintain them. The database management system can aid the development of these programs by allowing some type of source code verification to occur. For example, an invalid reference to a slot in an object can easily be detected by a program. However, it is quite likely that these tools do not exist and would have to be developed by the SEED team.
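A check for invalid slot references might look like the following sketch. It assumes a made-up rule syntax in which slots are written as `Class.slot`, and a schema given as a mapping from class names to sets of slot names; neither reflects actual Prolog, OPS, or CLIPS syntax, for which a real parser would be needed.

```python
import re

# Matches references of the assumed form Class.slot.
SLOT_REF = re.compile(r"\b([A-Za-z_]\w*)\.([A-Za-z_]\w*)\b")


def invalid_slot_references(source, schema):
    """Return (line number, reference, reason) for each reference not in the schema."""
    errors = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for cls, slot in SLOT_REF.findall(line):
            if cls not in schema:
                errors.append((lineno, f"{cls}.{slot}", "unknown class"))
            elif slot not in schema[cls]:
                errors.append((lineno, f"{cls}.{slot}", "unknown slot"))
    return errors


schema = {"Room": {"area", "height"}}
rules = "if Room.area < 10 then warn\nif Room.colour = red then warn"
problems = invalid_slot_references(rules, schema)
```

In practice the schema would be read from the database rather than written by hand, which is how the database management system would support the verification.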

3.4.3 The Impact of Standards

The impact of standards on the software engineering environments can be quite significant. C++ now has an ANSI version, and some compilers do not support the ANSI specification. In particular, G++ does not support the ANSI specification. Most database management systems support ANSI C++ exclusively; however, many support G++ as well.

The only standardized query language for object database systems is Object SQL (OSQL). This version of SQL is an extension to the existing ANSI SQL-2 standard and is currently in the standards committee; the UniSQL product SQL/X is the basis for the ANSI draft. The lack of a standardized query language has been a major criticism of object database systems and has been a significant hindrance for industry acceptance of object database technology. Any database system not supporting standard object SQL must have clear benefits that outweigh the lack of standard compliance.

The final standards, KIF and KQML, have less significance than the previous two; however, they should be considered. Based on conversations with faculty at Stanford, both KIF and KQML are rapidly evolving standards and have little built-in support for debugging. If this method of agent collaboration is chosen, mechanisms for the verification and debugging of KIF and KQML statements should be included. Most likely, these tools will be developed by SEED project members.


4 Recommendations

The following recommendations are divided into commercial and research database systems. The specific facilities the database system must support are used as criteria for evaluation. These criteria are given with a short definition of the concept in italics and are summarized below.

• Platform Support: The hardware, operating system, networking, and windowing software supported by the database. A typical example configuration is a DEC Alpha with OSF/1, TCP/IP, and X-Windows, which is the platform chosen for the SEED project.

• Compiler Support: The programming languages that can access database information. Included in this requirement is the need for debugging support. Some compilers do not have debugging support, and therefore should not be used. The SEED project requires a compiler that is both ANSI compatible and supports interactive debugging; G++ is not ANSI compliant, and AT&T cfront does not support debuggers directly11.

• Object Model: The underlying concept of an object supported by the database. The two basic object modeling approaches are to define a language-independent object model and develop language bindings for the model, or to extend programming language objects and make them persistent. The SEED project requirements exclude the language-specific implementation.

• Schema Evolution: The method for managing structural object changes. The database system should support dynamic schema evolution while the database system is active. In other words, the database should not have to be taken “off-line” and reorganized due to a schema change. In addition, multiple inheritance should be supported. One of the most important uses of multiple inheritance is the integration of several class library trees into a single class library.

• Query Language Support: The method of manipulating the database. The database should support both an Application Programming Interface (API) and an embedded query language that gives programs data independence. The query language should be based on a standard.

• Subclass Queries: Automatic query support for subclass inclusion. The database system must be able to retrieve subclasses of a given superclass in a query. An object of type corridor should be retrieved if a public-circulation-space is given in the query. The method for specifying the subclass query should be a natural extension of the query language.

• Event Notification: The method of tracking database changes. When changes occur to the database, some mechanism for capturing the event must be provided. Included in this mechanism must be notification of schema changes; without this particular type of notification, it becomes quite difficult to maintain consistency of programming language objects and database objects.

• Multiple Database Integration: The method of transparently accessing multiple databases. The database engine should support access to multiple database management systems including relational databases. The SEED team anticipates that data from existing database systems will be needed in the future and should be automatically supported.

• Compliance with Computing Standards: The supported computing standards. The three most important standards the database must be in compliance with are the new ANSI SQL-3 draft, the CORBA standard, and the Object Management Group’s Core Object Model (COM). The database should also support client/server interfaces over a network such as TCP/IP or OSF’s Distributed Computing Environment (DCE).

11. Some compiler vendors, such as Hewlett-Packard, support debuggers for cfront.
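The Subclass Queries criterion above can be illustrated with a small sketch in which a query for a class also returns instances of its subclasses. The class hierarchy, object identifiers, and function names are assumptions for illustration; each class may list several superclasses, so multiple inheritance is covered as well.

```python
def is_a(hierarchy, cls, ancestor):
    """True if `cls` equals `ancestor` or inherits from it, directly or indirectly."""
    stack = [cls]
    seen = set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(hierarchy.get(current, ()))
    return False


def query(objects, hierarchy, cls):
    """Return ids of all objects whose class is `cls` or one of its subclasses."""
    return [oid for oid, ocls in objects if is_a(hierarchy, ocls, cls)]


# Each class maps to its direct superclasses.
hierarchy = {
    "space": [],
    "public-circulation-space": ["space"],
    "corridor": ["public-circulation-space"],
    "lobby": ["public-circulation-space"],
    "office": ["space"],
}
objects = [("c1", "corridor"), ("o1", "office"), ("l1", "lobby")]

circulation = query(objects, hierarchy, "public-circulation-space")
```

A database meeting the criterion would perform this inclusion itself as part of its query language, rather than leaving it to application code as done here.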

What is not included in our criteria is object versioning and work group support. These areas are not as clearly defined and are currently being investigated by other researchers as well. For example, the debate over object versioning versus time travelling is not yet resolved; no clear answer has emerged to the question “what makes up a version of an object?”

4.1 Evaluation of Commercial Systems

Several commercial database systems were investigated and rejected because they did not support a language-independent object model. Many systems were exclusively C++ oriented or were persistent extensions of object-oriented programming languages such as Smalltalk or C++. The commercial database systems that most closely match our requirements were UniSQL Inc.’s UniSQL/X, Versant Object Technology’s Versant, and ADB, Inc.’s Matisse.

4.1.1 UniSQL

UniSQL supports a wide variety of hardware and operating system platforms, including DEC Alpha/OSF1. No particular C++ compiler is required; G++ should work as well as any ANSI compliant C++. The object model is independent of any language and supports both schema evolution and multiple inheritance. Extensive subclass queries are supported which include the ability to exclude certain subclasses from the results; essentially, the set of classes included in the query can be specified in the query. UniSQL supports an extended version of the ANSI SQL-2 query language and is the basis for the SQL-3 ANSI draft. Event notification is supported via a trigger mechanism including the database class DB_Trigger. With the optional component UniSQL/M, multiple databases can be accessed including relational databases; a uniform object representation exists where relational schemas are static classes. UniSQL is an active member of several standards committees as well as industry consortiums.


4.1.2 Versant

While Versant runs on several platforms, there are several restrictions, the most significant being the lack of a DEC Alpha/OSF1 implementation. In addition, the compiler normally supported is AT&T cfront 3.0, which typically does not have debugging support. These two restrictions are significant enough to exclude it from site evaluation. The object model is separate from any language and supports schema evolution as well as multiple inheritance. Subclass queries are supported within the context of the same SQL/X provided by UniSQL; Versant’s SQL was licensed from UniSQL. Event notification is still being verified. Versant does not support multiple database integration. Versant is a member of several database standards committees.

4.1.3 Matisse

Matisse runs on a limited number of platforms, and the DEC Alpha/OSF1 is not supported. Matisse is not dependent on any language or compiler; a database object model is supported. Schema evolution and multiple inheritance are supported. Subclass queries are still being verified. Matisse supports an ANSI SQL-2 query language without object extensions. Event notification is supported via triggers but does not monitor all database changes. Multiple database integration is not supported directly but can be achieved via metaschema manipulations; essentially, relational database information can be integrated into Matisse objects via user-defined metaschema object attributes. While Matisse is not compliant with any database standard, its vendor is a member of several database standards committees.

4.2 Evaluation of Research Systems

Several research database systems exist; however, the distribution of the research code to the general public is a necessary requirement for a research system. Therefore, many research database systems are not included in this evaluation. The systems that are included are the Exodus system from the University of Wisconsin and AT&T’s ODE.

4.2.1 Exodus and E

The E programming language is a persistent extension of the C++ programming language. C++ objects which are to be stored persistently are identified with the persistent keyword. As a result of this design, the semantics of inheritance are bound by the C++ object model. Platform support is limited to just a few machines because the C++ compiler is based on GNU G++ version 2.3 and will not be upgraded; both compiler and platform support are severely limited. The object model is based on the C++ programming language; E was originally intended as a database systems programming language and evolved into a more general purpose language (Richardson et al. 1993). Schema evolution is provided only by creating new subclasses or modifying the E language and object representations. Queries are supported by collection and iterator classes as well as template-based persistent types (e.g. collection of people). Subclass queries are automatically supported in collections through the polymorphic behavior of C++ classes; subtypes of a class can be put in the same collection12. Neither event notification nor multiple database integration is supported by E; however, both facilities could be written within the framework of E classes. E also is not compliant with any computing standard.

4.2.2 ODE and O++

AT&T’s object database system, ODE, and its associated programming language, O++, are similar to Exodus/E but were designed more recently (Agrawal and Gehani, 1989). The same platform and language constraints found in E apply to O++. The object model is based on persistent C++ objects. Again, because O++ is based on C++, schema evolution is only supported by subtyping O++ classes. Queries are supported via predefined query classes or CQL++ (an SQL derivative). Subclass queries are supported by class extents (collections) and polymorphic subtyping in a manner similar to E. Event notification is directly supported with both triggers and constraints, where constraints define relationships between objects that must be preserved. Multiple database integration is not directly supported but could be added by the same procedure as for Exodus/E. ODE/O++ is not compliant with any computing standard.

4.3 Similar Research Efforts

Several research efforts are actively working in areas similar to the SEED projectand are of interest to the SEED team members because of the similarity of the research.Several prominent research systems are listed below.

• The Integrated Building Design Environment (IBDE). (Fenves et al. 1994).

• The Intelligent Computer-Aided Design System (ICADS). (Myers et al. 1993, Pohl et al. 1988).

• The Shared Dependency Engineering System (SHADE). (McGuire et al. 1994).

• The Distributed and Integrated Environment for Computer-Aided Engineering (DICE). (Sriram et al. 1992).

• The System Workbench for Integrating Facilitating Teams (SWIFT). (Lu et al. 1994).

• The Engineering Data Model (EDM). (Eastman et al. 1991).

4.4 Ranked Recommendations

Given our bias towards a commercial database system, the following systems areranked in order of preference based on the previously defined criteria.

12. It is important to note that subclasses here are compiler language types and not subclasses in the more generalsense.

SEED Database Requirements 7/19/94 Page 31

5 ConclusionsUniSQL meets a substantial number of requirements stated in this paper. The three

most significant capabilities are the integration of the Object and Relational models, anSQL-based query language that incorporates methods, and a robust schema evolution capa-bility. While not all the requirements are met, a substantial set of functionality is providedand is the basis for a robust environment.

The importance of the Object/Relational database systems cannot be stressedenough. It has become quite clear both in industry and academia that complex applicationscannot model information adequately in a pure relational environment. However, existingdata sets should not be thrown out. UniSQL provides the multidatabase feature to allow theintegration of object and relational systems into a comprehensive whole.

The current experiments with UniSQL have utilized the object model providedwithin the database system. While this model is quite powerful, it is still not yet rich enoughfor all the identified needs within the SEED project. There are two parallel directions ofresearch that we are considering pursuing. First, the database objects can be embellishedwith a significant number of methods and triggers. However, it may be that the ever increas-ing number of methods and triggers will yield unsatisfactory performance. Therefore, a sec-ond line of research is being pursued. A more robust object model is being defined and willthen be implemented in terms of the UniSQL object model. In other words, the UniSQLobjects become the construction components of the new object model.

The only significant problem encountered with the UniSQL object model pertains to inheritance. Attributes and methods inherited by a subclass cannot be redefined in the subclass. From the perspective of methods, the method cannot be specialized in the subclass. Default values for attributes cannot be redefined within the subclass either.
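To make the limitation concrete, the following minimal Python sketch shows the two kinds of redefinition that the paragraph above reports as unavailable: a subclass that redefines an inherited attribute default and specializes an inherited method. The class names Space and Office, the attribute min_area, and the method describe are hypothetical and are not taken from the SEED schema or from UniSQL; Python is used purely for illustration.

```python
class Space:
    """Hypothetical superclass; not part of the SEED schema."""
    min_area = 10.0  # class-level default value for an attribute

    def describe(self):
        return f"space with minimum area {self.min_area}"


class Office(Space):
    """Hypothetical subclass that redefines what it inherits."""
    min_area = 12.5  # redefines the inherited default value

    def describe(self):
        # Specializes the inherited method while reusing the superclass version.
        return "office: " + super().describe()
```

In a model without these capabilities, Office would have to carry a differently named attribute and method, and every client would need to know which one to call.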

Table 1: Ranked Database Selections

Database System    Rank
UniSQL             1
Versant            2
Matisse            3
ODE and O++        4
Exodus and E       5

6 Bibliography

[1] Agrawal, R., & Gehani, N. H. (1989). ODE (Object Database Environment): The language and the data model. In Proceedings of the ACM-SIGMOD Conference, Portland, OR, June.

An early paper that discusses the capabilities of the ODE object-oriented database from AT&T. Included in this discussion is the database language O++, an extension of C++.

[2] Barr, A., & Feigenbaum, E. (Eds.). (1981). The Handbook of Artificial Intelligence (Vol. I). Los Altos: William Kaufmann, Inc.

A collection of survey papers reporting on the initial results of Artificial Intelligence research. Includes both comprehensive coverage of the topics and large author and bibliographic indexes.

[3] Barr, A., Cohen, P. R., & Feigenbaum, E. A. (Eds.). (1989). The Handbook of Artificial Intelligence (Vol. IV). Reading: Addison-Wesley, Inc.

A companion to the previous volume that summarizes more recent developments in Artificial Intelligence, including Blackboard Systems, Expert Systems, Computer Vision, Distributed AI, and Simulation.

[4] Barsalou, T., & Wiederhold, G. (1990). Complex objects for relational databases. Computer-Aided Design. 22(8). pp. 458-468.

Describes a method for defining and manipulating complex, nested object descriptions within the context of a traditional relational database management system.

[5] Björk, B. (1992). A conceptual model of spaces, space boundaries and enclosing structures. Automation in Construction. 1(3). pp. 193-214.

Describes a conceptual model for representing and reasoning about architectural information with an emphasis on defining spatial structures as recursive sets of enclosed spaces.

[6] Brachman, R. J. & H. J. Levesque (Eds.) (1985). Readings in Knowledge Representation. Los Altos: Morgan Kaufmann Publishers, Inc.

A highly referenced collection of papers which discuss most aspects of knowledge description and manipulation. Semantic network related structures dominate most of the discussions.

[7] Cardenas, A. F., & D. McLeod (Eds.) (1990). Research Foundations in Object-Oriented and Semantic Database Systems. Englewood Cliffs: Prentice Hall.

A collection of recent papers which are representative of research results within the area of object-oriented and extensible database management systems.

[8] Chang, E. E., & Katz, R. (1990). Inheritance in computer-aided design databases: semantics and implementation issues. Computer-Aided Design. 22(8). pp. 489-499.

Discusses inheritance related issues within the context of object-oriented databases intended to support computer-aided design. In particular, it addresses the inadequacy of traditional super/subclass hierarchies and proposes instance-to-instance inheritance as well as selective inheritance; each object should be able to determine what items are inherited from another object.

[9] Coyne, R. D., Rosenman, M. A., Radford, A. D., Balachandran, M., & Gero, J. S. (1991). Knowledge-Based Design Systems. Reading: Addison-Wesley, Inc.

This reference book provides an introduction to the application of AI techniques and methods to the field of design, in particular building design.

[10] Dittrich, K. R., U. Dayal, & A. P. Buchmann (Eds.) (1991). On Object-Oriented Database Systems. New York: Springer-Verlag.

A collection of papers which discuss some of the latest database management research issues.

[11] Eastman, C. M., Chase, S. C., & H. Assal. (1991). System Architecture for Computer Integration of Design and Construction Knowledge. Working Paper. Graduate School of Architecture and Urban Planning. University of California, Los Angeles.

This working paper describes the issues pertaining to information and knowledge integration through a common underlying representation called the Engineering Data Model (EDM).

[12] Fenves, S., Flemming, U., Hendrickson, C., Maher, M. L., Quadrel, R., Terk, M., & R. Woodbury (1994). Concurrent Computer-Integrated Building Design. Englewood Cliffs: PTR Prentice-Hall.

[13] Flemming, U., R. Coyne, R. Woodbury, S. Bhavnani, S. Chiou, B. Choi, R. Stouffs, T. Chang, S. Han, C. Jo, H. Kiliccote, J. Shaw, & K. Suwa (June 1992). SEED-LOOS Requirements Analysis. Engineering Design Research Center, Carnegie Mellon University. Unpublished working paper.

This document is a requirements analysis developed using the OOSE software engineering methodology.

[14] Flemming, U. J. (1993). Case-Based Design in the SEED System. Working Paper. Carnegie Mellon University, Pittsburgh, PA.

This paper compares and contrasts case-based and prototype-based design, and describes the requirements and algorithms for case-based retrieval and indexing within the SEED project.

[15] Genesereth, M. R. (1992). An Agent-Based Framework for Software Interoperability. Proceedings of DARPA Software Technology 1992. pp. 359-366.

This paper describes the concepts and issues related to software integration and engineering in a distributed, cooperative design environment; the approach is termed the Federation Architecture.

[16] Gero, J., Maher, M. L., & Zhang, W. (1988). Chunking Structural Design Knowledge as Prototypes. Sydney: Author.

This paper describes a knowledge representation schema which can store design information. Design Prototypes heavily borrow from AI methods and concepts.

[17] Giarratano, J., & Riley, G. (1989). Expert Systems: Principles and Programming. Boston: PWS-Kent Publishing Company.

A general introduction to the artificial intelligence techniques relating to rule-based expert systems and knowledge representation.

[18] Hinrichs, T. (1992). Problem solving in open worlds: a case study in design. Hillsdale: L. Erlbaum Associates.

A description of the JULIA meal-planning case-based design system, including the issues involved in designing and constructing a case-based reasoning system. Descriptions of some of the knowledge representation taxonomies are also presented.

[19] Kalay, Y. (1985). Redefining the role of computers in architecture: from drafting/modeling to knowledge-based design assistants. Computer-Aided Design. 17(7). pp. 319-322.

A position paper which identifies the limitations of existing CAD systems and proposes a new view in which computer tools serve as design assistants.

[20] Kolodner, J. (1991). Improving Human Decision Making through Case-Based Decision Aiding. AI Magazine, Summer 1991.

This paper characterizes case-based reasoning and how it can be used for decision aiding. In particular, design decision aiding is addressed in detail.

[21] Korth, H., & Silberschatz, A. (1986). Database System Concepts. New York: McGraw-Hill Book Company.

A comprehensive textbook on database management systems and the direction of current research.

[22] Krueger, C. (1992). Software Reuse. ACM Computing Surveys. 24(2). p. 131.

This survey comprehensively covers the issues relating to software reuse from multiple perspectives. In particular, the human cognitive issues are addressed.

[23] Lehman, F. (Ed.) (1992). Semantic Networks in Artificial Intelligence. Oxford: Pergamon Press.

A collection of papers containing recent work within the area of semantic networks, including both applied and theoretical information.

[24] Logan, B. S., Corne, D. W., & T. Smithers (1992). Enduring support: on defeasible reasoning in design systems. In J. Gero (Ed.) Artificial Intelligence in Design '92. Boston: Kluwer Academic Publishers.

In addition to describing their research directions, the authors discuss issues related to components in the ICADS model. (See Pohl, 1988).

[25] Lu, S., Lucenti, M., Smith, K., Jacobs, J., Herman, A., Chazin, D., Mattox, D., Lawley, M., Sillman, M., & M. Case. (1993). SWIFT: System Workbench for Integrating and Facilitating Teams. International Journal of Intelligent and Cooperative Information Systems.

The SWIFT system provides comprehensive knowledge representation mechanisms in a distributed, shared object system. In addition, multi-designer collaboration is supported.

[26] MacKeller, B. & J. Peckham (1992). Representing design objects in SORAC: a data model with semantic objects, relationships, and constraints. In J. Gero (Ed.) Artificial Intelligence in Design '92. Boston: Kluwer Academic Publishers.

The SORAC model describes a semantic network oriented data model which allows for very expressive knowledge representation systems with a relatively small effort on the part of a user. The concepts of relationships and constraints are both integrated into an active environment.

[27] Maier, D., Stein, J., Otis, A., & Purdy, A. (1986). Development of an Object-Oriented DBMS (Technical Report CS/E-86-005). Beaverton: Oregon Graduate Center.

A report which describes the implementation of the Smalltalk programming language into an object-oriented database management system called GemStone. This paper has subsequently been published in several journals.

[28] Meyer, B. (1988). Object-Oriented Software Construction. Englewood Cliffs: Prentice-Hall.

A seminal book which describes both the theory and use of object-oriented programming languages for the construction of software and includes discussions of inheritance, reuse, and persistence.

[29] Mill, F. G., Salmon, J. C., & Pedley, A. G. (1993). Representation problems in feature-based approaches to design and process planning. International Journal of Computer Integrated Manufacturing. 6(1). p. 27.

Describes many problems discovered in the feature-based approach to information representations and suggests corrections and alternatives.

[30] Minsky, M. (1975). A Framework for Representing Knowledge. In P. Winston (Ed.), The Psychology of Computer Vision. New York: McGraw-Hill.

The original paper which describes Frames as a general structure for representing taxonomic knowledge systems.

[31] Myers, L., Pohl, J., Cotton, J., Snyder, J., Pohl, K. J., Chien, S., Aly, S., & Rodriguez, T. (1993). Object Representation and the ICADS-Kernel Design (Technical Report CADRU-08-93), San Luis Obispo: CAD Research Center, Design Institute, Cal Poly, CA.

This technical report describes the knowledge representation issues under investigation by the CAD Research Center, as well as a distributed framework for computer-based agent cooperation.

[32] Myers, L., Snyder, J., & Chirica, L. (1992). Database Usage in a Knowledge Base Environment for Building Design. Building and Environment. 27(2). pp. 231-241.

A paper which describes the usage of database management systems within the ICADS model and identifies directions for further research in database/knowledgebase integration.

[33] McGuire, J., Kuokka, D., Weber, J., Tenenbaum, J., Gruber, T., & G. Olsen. SHADE: Technology for Knowledge-Based Collaborative Engineering. Journal of Concurrent Engineering: Research and Applications. To appear.

This paper discusses the issues and approaches taken in the Shared Dependency Engineering system. The primary focus is the sharing of design information.

[34] Nguyen, G. T. & D. Rieu. SHOOD: A design object model. In J. Gero (Ed.) Artificial Intelligence in Design '92. Boston: Kluwer Academic Publishers.

Describes an object model which supports complex information representation much like a semantic network.

[35] Ohsuga, S. (1989). Toward intelligent CAD systems. Computer-Aided Design. 21(5). pp. 315-337.

This paper describes a framework for developing intelligent CAD systems, defined as distributed sets of knowledge-based agents that assist a designer during a design process.

[36] Piela, P., Epperly, T., Westerberg, K., & A. Westerberg (1990). ASCEND: An Object-Oriented Computer Environment for Modeling and Analysis. Part-1 The Modeling Language. Technical Report EDRC 06-88-90, Engineering Design Research Center, Carnegie Mellon University, Pittsburgh, PA.

This technical report describes the motivation and capability of the ASCEND equation modeling language.

[37] Pohl, J., Chapman, A., Chirica, L. & Myers, L. (1988). ICADS: Toward an Intelligent Computer-Aided Design System. Technical Report CADRU-01-88, CAD Research Unit, Design Institute, Cal Poly, San Luis Obispo, Calif., U.S.A.

An initial technical report which describes the original concepts and issues pertaining to the ICADS model.

[38] Quillian, M. (1968). Semantic Memory. In M. Minsky (Ed.) Semantic Information Processing. Cambridge: The MIT Press.

This paper is the original description of Semantic Networks as a knowledge representation system for reasoning within a computer environment.

[39] Rich, E., & Knight, K. (1991). Artificial Intelligence (2nd ed.). New York: McGraw-Hill, Inc.

This book is a comprehensive reference on the techniques and tools utilized within AI. Most general topics are discussed, with some exceptions such as computer vision.

[40] Richardson, J. E., Carey, M. J., & Schuh, D. T. (1993). The Design of the E Programming Language. ACM Transactions on Programming Languages and Systems. 15(3). pp. 494-534.

This paper describes the design issues and uses of the E programming language. E is an extension of C++ and utilizes the Exodus storage manager from the University of Wisconsin.

[41] Rumbaugh, J. (1987). Relations as Semantic Constructs in an Object-Oriented Language. OOPSLA '87 as ACM SIGPLAN 22(12). pp. 466-481.

This paper proposes the idea that relationships, which can themselves have attributes, can also act as semantic constructs in modeling objects in an object-centered approach.

[42] Rumbaugh, J. (1988). Controlling Propagation of Operations Using Attributes on Relations. OOPSLA '88 as ACM SIGPLAN 23(11). pp. 285-296.

This paper proposes the idea that attributes associated with object relationships can help manage the complexity in object-centered systems.

[43] Smithers, T. (1989). AI-Based design versus geometry-based design or Why design cannot be supported by geometry alone. Computer-Aided Design. 21(3). pp. 141-150.

This paper argues that geometric descriptions are not sufficient for use as a design description and communication mechanism and proposes alternative methods of computer usage.

[44] Snyder, J. (1993). A Semantic Modeling System for CAD. MS Thesis. Department of Architecture. California Polytechnic State University, U.S.A.

This thesis describes a semantic modeler which integrates a geometric model with a semantic model. The geometric model is synchronized with the semantic model, while the semantic model can manipulate the geometric model.

[45] Snyder, J., & Chirica, L. (1990). An SQL Query Generator for CLIPS. First CLIPS Conference Proceedings. Houston: NASA.

This paper describes a method for integrating a rule-based expert system shell with a relational database management system.

[46] Sowa, J. (Ed.) (1991). Principles of Semantic Networks: Explorations in the Representation of Knowledge. San Mateo: Morgan Kaufmann, Inc.

A comprehensive collection of papers which discuss the more complex issues relating to semantic networks including the complexity of different types of object-based inheritance.

[47] Sriram, D., Logcher, R., Grouleau, N., & J. Cherneff (1992). DICE: An Object-Oriented Programming Environment for Cooperative Engineering Design. AI in Engineering Design Volume III. Tong, C. and Sriram, D. (Eds.) Academic Press.

The Distributed and Integrated environment for Computer-aided Engineering is described in detail. In particular, the implementation of the shared object representation is discussed.

[48] Stonebraker, M. (1990, March). Introduction to the Special Issue on Database Prototype Systems. IEEE Trans. on Knowledge and Data Engineering. 2(1). pp. 1-3.

This introductory paper brings out the relevance of defining the intentions of definitions within object-oriented database systems and makes a convincing argument that no definition of object-oriented databases exists.

[49] Stonebraker, M., Rowe, L., & Hirohama, M. (1990). The Implementation of POSTGRES. IEEE Trans. on Knowledge and Data Engineering. 2(1). pp. 125-142.

This paper describes the issues relating to a computer implementation of an extensible database management system.

[50] Stroustrup, B. (1991). The C++ Programming Language (2nd ed.). Englewood Cliffs: Prentice-Hall.

This reference was written by the author of the C++ language and describes the language as well as intended uses via examples.