Object-Oriented Modeling of Geographic Data

Eliseo Clementini and Paolino Di Felice
Department of Electrical Engineering, University of L’Aquila, 67040 L’Aquila, Italy. E-mail: eliseo@dsiaql.ing.univaq.it

Two levels of data abstraction, namely conceptual and logical, are essential in the process of designing an object-oriented geographic database. The aim of the present article is to establish a consistent framework for this bilevel design. At the conceptual level, we develop an object-oriented model useful for organizing the designer’s knowledge about a geographic application in terms of the basic concepts of the object-oriented paradigm. Both conceptual abstraction primitives (namely, classification, generalization, and aggregation) and a spatial abstraction primitive (namely, location) are discussed. The dynamic part of the model is due to the methods associated with objects. At the logical level, we propose an object-oriented structure of classes and instances, which is uniform with the conceptual model and, therefore, facilitates the conceptual-logical mapping. On the whole, the model-based approach adopted in this article offers object orientation, conceptual abstraction, and system extendibility.

Introduction

The area of spatial information systems has generated considerable interest among computer scientists and practitioners in the last decade (Abel & Ooi, 1993; Günther & Schek, 1991; Iyengar & Kashyap, 1988; Laurini & Thompson, 1992). Many issues have been investigated, such as pattern recognition, image processing and analysis, spatial data modeling, geographic integrity constraints, geographic information retrieval, visual query languages, and map-based interaction techniques. This kind of research is expected to lead to the definition of new types of spatial databases, combining advances in database technology and human-computer interaction.

Frank (1988) reports that most existing systems were obtained by extending a relational system with additional features for storing and visualizing images. Although the relational data model is simple and easy to implement, it lacks flexibility to handle spatial data and has limited expressive power, as pointed out by Worboys, Hearnshaw, and Maguire (1990). Other deficiencies of present spatial information systems regard their limited capabilities to perform spatial analysis and modeling (Dutton, 1991; Rhind, 1988). Chou and Ding (1992) identified two important steps of GIS design methodology to overcome the above limitations: developing a conceptual model and mapping it to a software (logical) data model.

© 1994 John Wiley & Sons, Inc.

Tsichritzis and Lochovsky (1982) describe two levels of data modeling depending on the viewpoint adopted for data representation. The first level is related to the user’s viewpoint, while the second one is related to the computer system’s viewpoint. Correspondingly, there are two categories of data models: the first category includes the E-R model (Chen, 1976) and semantic models (Hammer & McLeod, 1981; Levesque & Mylopoulos, 1979), in which phenomena and their associations are visually represented on behalf of users and implementation issues are not considered. The second category, including the relational model, represents the first step in computing, conveying information from a conceptual phase toward an implementation-oriented design.

Laurini and Thompson (1992, p. 362) state that the two modeling levels, called conceptual and logical, respectively, are essential in geographic database design. In fact, due to the complexity of geographic data in terms of relationships to be modeled and operations to be performed, a conceptual analysis of the specific application is strongly required. On the other hand, the “schema” design is not trivial, since geographic entities are not easy to transfer into a database format.

At present, there is an increasing interest in the object-oriented approach as a way of overcoming the drawbacks of traditional data models in the management of geographic data. Object orientation is an emerging technology suitable for the management of large amounts of complex data, and can be used with surprisingly good results at both conceptual and logical levels. Major merits of object orientation come from modeling power, abstract data types, subpart sharing, object identity, code/schema extendibility, methods definition, inheritance, and encapsulation. Though no common definition for “the” object-oriented data model has emerged, a reasonable agreement exists over the features that are expected to be in an object-oriented database (Khoshafian & Abnous, 1990), which is a system that integrates object orientation with traditional database capabilities, such as persistence, transactions, concurrency control, and querying.

JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE. 45(9):694-704, 1994 CCC 0002-8231/94/090694-11

The aim of the present article is to establish an object-oriented framework for both conceptual and logical design of geographic databases. We propose an object-oriented conceptual data model tailored for organizing and representing basic map elements and their relationships, as well as operations of interest for the management of geographic data. The model provides designers with a conceptual tool useful for organizing their knowledge about a geographic application in terms of the basic concepts of the object-oriented paradigm (i.e., classes, instances, and methods). Both conceptual abstraction primitives (Smith & Smith, 1977a, b; Tsichritzis & Lochovsky, 1982) (namely, classification, generalization, and aggregation) and a spatial abstraction primitive (Mohan & Kashyap, 1988) (namely, location) are discussed. The model is also helpful for the database schema design, which in an object-oriented context corresponds to the definition of a structure for classes and instances and their hierarchical organization.

Each geographic entity is surrounded by a context of other geographic entities. The study of spatial relationships among entities is very relevant to spatial information modeling. The proposed model is flexible with respect to the decision a designer has to take about which knowledge should be modeled in terms of relationships among objects and which as an operation applicable to an object. A basic fact that needs to be modeled (Laurini & Thompson, 1992, p. 27) is the location of some particular feature in the geographic space. In fact, two frequently asked questions are: “What is in a particular place?” and “Where is a certain entity?” The first query involves the retrieval of the information “inside” a given boundary, and the second one asks for the location of an entity. A possible answer to the latter question would be to give the geographic coordinates, but, from the user’s point of view, it may be better to have a “context-related” answer, such as the name of a significant area surrounding the entity itself. In this sense, the process of giving a location to objects is part of the conceptual design of a geographic database. The proposed model allows a direct representation of the location of geographic entities (a relationship in the adopted formalism). Other kinds of spatial relationships (Egenhofer & Franzosa, 1991) are appropriately described by operations (methods in object-oriented terms).

The remainder of this article often refers to a running design example to illustrate the meaning of the object-oriented formalism. Its intensional part has classes such as GeographicArea, State, and Country, while its extensional part contains simple instances belonging to the same classes.

Related Work

The idea of introducing an object-oriented formalism helpful for the conceptual analysis of geographic maps was initially given in a previous study (Clementini, Di Felice, & D’Atri, 1991). The work that most closely resembles ours is by Mohan and Kashyap (1988). In that study, the authors proposed an object-oriented formalism for the representation of spatial knowledge. They outlined the drawbacks of the relational data model for modeling geographic applications and pointed out the merits of the object-oriented approach. The major difference between their proposal and ours is that they considered a “structural” object-oriented model and coupled it with predicate logic to deduce implicit information, while we include dynamic aspects (properties and methods) inside the object-oriented model. In this way we obtain a uniform conceptual frame for modeling explicit and implicit knowledge associated with geographic maps. Another difference is that the “is-in” spatial relationship introduced in this study is more general than Mohan and Kashyap’s (1988) “belongs-to,” because the latter relationship models the strict containment of a geographic area (e.g., a lake) in another one (e.g., a state), while the “is-in” relationship models the (complete or partial) containment of any geographic element (e.g., a city, a river, and a lake) in a geographic area (e.g., a state).

The Mohan and Kashyap approach is similar to the one adopted by van Oosterom and van den Bos (1989) to model spatial abstraction (called “generalization”). Kainz (1988) applies data structures of graph theory to the modeling of spatial overlays. Egenhofer and Frank (1989) claim the appropriateness of object-oriented modeling for the conceptual design of geographic applications. They identify four conceptual abstraction primitives, namely, classification, generalization, aggregation, and association. Association may be compared to our location fact, since association models the relationship “inside” between a geographic object and an area containing it. However, association is a general set constructor mechanism, while location is a specific spatial primitive. Furthermore, the authors emphasize the use of operations for deriving all spatial relationships of an object. The same four conceptual abstraction primitives are considered important by Nyerges (1991).

Choi and Luk (1990) propose a geographic-specific layer on top of a general-purpose object-oriented database. This starting project is similar to our approach with regard to the bilevel geographic model. Zhan and Mark (1992) discuss a plain object-oriented approach for representing basic geometric classes, while they formalize topological knowledge as classes for representing spatial relationships; for example, “point/line” is a class designed for handling spatial relationships between a point and a line.

In the PROBE database management system (Orenstein & Manola, 1988), the term object-oriented is used in a way slightly different from that of this article. In fact, the PROBE data model (Manola & Orenstein, 1986) relies on the “functional” data model built in terms of entities and functions. Worboys et al. (1990) demonstrate the applicability of an object-oriented design methodology to the design of geographical information systems. Their approach is an application of the IFO formalism (Abiteboul & Hull, 1987) to examples in the geographic context. However, IFO is a semantic database model and, compared with object-oriented models, it lacks the dynamic part, that is, the specification of operations for its abstract types.

At the logical level, the suitability of object-oriented database systems for geographic data is demonstrated in several experimental works. Williamson and Stucky (1991) designed and implemented a prototype based on Vbase. Milne, Milton, and Smith (1993) experimented with the use of ONTOS to handle geographic data. Scholl and Voisard (1992) discussed an implementation based on O2.

The Conceptual Level

In this section, we present an object-oriented conceptual model for the representation of geographic knowledge. First, we describe the conceptual abstraction primitives (Smith & Smith, 1977a, b; Tsichritzis & Lochovsky, 1982), besides the spatial abstraction primitive (Mohan & Kashyap, 1988). Abstraction primitives are conceptual mechanisms able to identify facts and relationships between entities of the geographic reality. Then, by means of formal properties, we model such facts in terms of directed acyclic graphs. Furthermore, properties supply a deductive mechanism for inferring knowledge. The dynamic part of the model is due mainly to operations that are associated with objects. The presence of operations makes the data model object-oriented, like other models that can be found in the literature (Banerjee et al., 1987; Lecluse, Richard, & Velez, 1988; Wand, 1989), and differentiates it from semantic and entity-relationship data models. We take the approach of an object-oriented purist à la Smalltalk (Goldberg & Robson, 1983), in which everything is an object (even the number ‘0’).

Geographic Elements and Relationships

Geographic elements are represented by objects of two kinds: class objects (C) and instance objects (I). Exam- ples of geographic class objects are Continent, Country, River, Lake, while examples of instance objects are North America, USA, Mississippi, and Ontario.

FIG. 1. Examples of classification (a), generalization (b), and aggregation (c, d) facts.

Among objects there exist binary relationships. The formalism (O1, r, O2) denotes that the object O1 has the relationship r with the object O2; we call this triplet a fact. Facts of the kind (C1, r, C2) provide an intensional description of data, while facts of the kind (I1, r, I2) provide an extensional description of data. Notice that facts of the kind (I, r, C) (or (C, r, I)) act as a link between these two data description levels. Four different kinds of facts will be analyzed with respect to their relevance in the geographic context: classification, generalization, aggregation (conceptual abstraction primitives), and location (spatial abstraction primitive).
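As an illustration only, the fact formalism can be sketched in Python as a collection of triples. The names `Fact`, `CLASSES`, `INSTANCES`, and `level` are assumptions made for this sketch and are not part of the model; the sketch also assumes that any object not listed as a class is an instance.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    """A fact (O1, r, O2): object O1 has relationship r with object O2."""
    source: str
    relationship: str
    target: str


# Hypothetical example sets of class objects and instance objects.
CLASSES = {"Continent", "Country", "River", "Lake"}
INSTANCES = {"North America", "USA", "Mississippi", "Ontario"}


def level(fact: Fact) -> str:
    """Classify a fact by the kinds of objects it connects.

    Assumption: every object that is not in CLASSES is an instance.
    """
    src_is_class = fact.source in CLASSES
    tgt_is_class = fact.target in CLASSES
    if src_is_class and tgt_is_class:
        return "intensional"   # (C1, r, C2)
    if not src_is_class and not tgt_is_class:
        return "extensional"   # (I1, r, I2)
    return "link"              # (I, r, C) or (C, r, I)


facts = [
    Fact("Country", "is-in", "Continent"),
    Fact("USA", "is-in", "North America"),
    Fact("Mississippi", "is-an-instance-of", "River"),
]
levels = [level(f) for f in facts]
```

Keeping facts as plain triples, rather than as attributes of the objects, mirrors the formalism directly and makes the properties discussed later easy to state as checks over a set.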

(I, is-an-instance-of, C) is a classification fact. It associates instance objects with their class. As an example, we have: (Mississippi, is-an-instance-of, River). We allow an instance object to belong to many classes. For example, Hawaii can be considered to be at the same time an archipelago and an American state: (Hawaii, is-an-instance-of, Archipelago) and (Hawaii, is-an-instance-of, State). Classification facts organize classes and instances in bilevel directed graph structures, where classes are the roots and instances are the leaves (Fig. 1a).

(C1, is-a-subclass-of, C2) is a generalization fact. We can say that C1 inherits all the characteristics of C2; moreover, C1 may have additional characteristics that are not inherited. For instance, (Lake, is-a-subclass-of, GeographicArea) tells us that a lake is still a geographic area, but it is a more specialized entity. In general, we can have multiple inheritance, i.e., a class may be a subclass of several classes. Generalization facts define a unique lattice of classes (Fig. 1b), since we can assume the existence of a single root.

(O1, is-a-part-of, O2) (where O1 and O2 are either both classes or both instances) is an aggregation fact. Aggregation facts express that an object is formed by a certain number of component objects. With regard to the extensional level, each time that an instance object I1, participating in an aggregation fact (I1, is-a-part-of, I), is an instance of many classes, its role in the aggregation fact is not unique. To manage this problem, it is necessary to introduce the concept of the role played by the instance I1 in the aggregation fact. This is carried out by associating a classification fact (I1, is-an-instance-of, C1) to the aggregation fact. At the class level, examples of aggregation facts are (Capital, is-a-part-of, Country) and (Population, is-a-part-of, Country); correspondingly, at the instance level, examples are (Rome, is-a-part-of, Italy) with the role (Rome, is-an-instance-of, Capital), and (55.155.993 inhabitants, is-a-part-of, Italy) with the role (55.155.993 inhabitants, is-an-instance-of, Population). We admit subpart sharing, that is, a class may be a component of more than one class. Aggregation facts define many directed graph structures at both class and instance levels (Fig. 1c, d).

We briefly introduce the dual facts of the conceptual abstraction primitives: instantiation, specialization, and disaggregation, with the relationships “has-the-instance,” “is-a-superclass-of,” and “has-the-component,” respectively. Examples are: (River, has-the-instance, Mississippi), (GeographicArea, is-a-superclass-of, Lake), (Country, has-the-component, Capital), (Italy, has-the-component, Rome).
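A minimal sketch of how dual facts could be derived mechanically, assuming facts are held as plain Python tuples. The table `DUAL` and the function `dual` are names invented for this illustration; the relationship names are those given in the text.

```python
# Mapping from each conceptual relationship to its dual, as named in the text.
DUAL = {
    "is-an-instance-of": "has-the-instance",
    "is-a-subclass-of": "is-a-superclass-of",
    "is-a-part-of": "has-the-component",
}


def dual(fact):
    """Return the dual of a fact (O1, r, O2), that is, (O2, r', O1)."""
    o1, r, o2 = fact
    return (o2, DUAL[r], o1)


examples = [
    ("Mississippi", "is-an-instance-of", "River"),
    ("Lake", "is-a-subclass-of", "GeographicArea"),
    ("Rome", "is-a-part-of", "Italy"),
]
duals = [dual(f) for f in examples]
```

Because a dual carries no information beyond the original fact, a design could store only one direction and compute the other on demand.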

So far, we have described facts that occur each time we model a fragment of reality. As discussed in the Introduction, to model geographic applications, the designer must represent “spatial facts” (Laurini & Thompson, 1992, p. 645), which are statements that include location properties of objects. Therefore, we represent in terms of facts the location of geographic objects in a geographic area, since location is essential to describe the structural part of a spatial object. Examples of this category of facts are: a state belongs to a country; a lake belongs either completely or partially to a state; a river crosses one or more states; a city belongs to a country; a highway crosses a city; the summit of a mountain is located in a country; and so on. From a geometric point of view, these examples correspond to a nonvoid intersection of a geometric element (point, line, or area) with an area; two cases can arise: the geometric element is completely contained within the boundary of the area, or it overlaps (crosses) the area.

The orientation of this kind of spatial relationship is fixed by the conceptual designer in accordance with the meaning commonly associated with geographic situations. For example, if the territory of a state and the surface of a lake overlap each other, we say that the lake belongs (at least partly) to the state, but we do not say that the state belongs to the lake, as this would be nonsense.

We introduce the fact (O1, is-in, O2) (where O1 and O2 are either both classes or both instances) to represent in a uniform way the spatial relationships discussed above. We name (O1, is-in, O2) a location fact. At the intensional level, examples are: (Country, is-in, Continent), (River, is-in, State); some corresponding examples at the extensional level are: (Canada, is-in, North America), (Mississippi, is-in, Louisiana), (Mississippi, is-in, Arkansas). Location facts define hierarchies on classes and instances. In general, a single link at the intensional level (Fig. 2a, c) originates many links at the extensional level (Fig. 2b, d). Situations of “complete” containment cause “many-to-one” is-in relationships at the instance level (Fig. 2b), while situations of “partial” containment cause “one-to-many” is-in relationships (Fig. 2d, f). In detail, Figure 2d points out a situation where one-to-many is-in links are more usual; in fact, a river usually crosses many states. Figure 2f points out a situation where one-to-many is-in links may be considered an exception; in fact, Russia is one of the few countries in the world whose territory geographically belongs to two continents, Europe and Asia.

FIG. 2. Examples of location facts at the intensional (a, c, e, g) and extensional (b, d, f, h) level.

It is worthwhile to notice that location hierarchies correspond to a representation of spatial data at various levels of spatial abstraction. For instance, the intensional facts (State, is-in, Country) and (Country, is-in, Continent) correspond at the extensional level to geographic entities defined at different levels of spatial resolution (Fig. 2g, h).

The dual relationship of is-in is the contains relationship. We name (O1, contains, O2) (where O1 and O2 are either both classes or both instances) a containment fact. By using the contains relationship, at the intensional level we can model facts like: (Continent, contains, Country) and (State, contains, River), while at the extensional level we can model facts like: (America, contains, USA), (America, contains, Canada), (Arkansas, contains, Mississippi), (Arkansas, contains, White River).

Properties of Facts

Herein, we define the set of formal properties that the abstraction primitives must satisfy. The aim of these properties is twofold: first, they specify the data structures associated with the model; specifically, classification, generalization, aggregation, and location facts (both at the intensional and extensional level) can be modeled as directed acyclic graphs. Second, properties act as an inferring tool to derive new facts starting from known facts.

Symbols ℑ and ℭ will denote the sets of all instances and all classes, respectively, with ℑ ∩ ℭ = ∅.

Properties of classification facts:

(1) Required classification for instances: for every I ∈ ℑ, there exists C ∈ ℭ such that (I, is-an-instance-of, C).

Property (1) states that every instance is a member of at least one class. This assumption defines a strictly typed model (Tsichritzis & Lochovsky, 1982).
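Property (1) lends itself to a simple consistency check. The following is only a sketch, assuming facts are stored as tuples; the function name `unclassified` and the sample data are hypothetical.

```python
def unclassified(instances, facts):
    """Return the instances that violate property (1).

    An instance violates the property when no classification fact
    (I, is-an-instance-of, C) mentions it as the first element.
    """
    classified = {o1 for (o1, r, _) in facts if r == "is-an-instance-of"}
    return set(instances) - classified


# Hypothetical data: Ontario has not been assigned a class yet.
instances = {"Hawaii", "Mississippi", "Ontario"}
facts = {
    ("Hawaii", "is-an-instance-of", "Archipelago"),
    ("Hawaii", "is-an-instance-of", "State"),
    ("Mississippi", "is-an-instance-of", "River"),
}
violations = unclassified(instances, facts)
```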

Properties of generalization facts:

(2) Root of the generalization hierarchy: there exists the class Object ∈ ℭ such that, for every C ∈ (ℭ − {Object}), (C, is-a-subclass-of, Object);

(3) Irreflexivity: for every C1, C2 ∈ ℭ, if (C1, is-a-subclass-of, C2), then C1 ≠ C2;

(4) Transitivity: for every C1, C2, C3 ∈ ℭ, if (C1, is-a-subclass-of, C2) and (C2, is-a-subclass-of, C3), then (C1, is-a-subclass-of, C3).

The generalization hierarchy has the class Object as a single root [property (2)]. Together, properties (3) and (4) guarantee the acyclicity of this hierarchy. In fact, assume by contradiction that the structure contains a cycle, that is, that there exist a class C and a sequence of classes C1, C2, ..., CN ∈ ℭ, where N > 0, such that (C, is-a-subclass-of, C1), (C1, is-a-subclass-of, C2), ..., and (CN, is-a-subclass-of, C). Then, by applying transitivity [property (4)] N times, it is possible to deduce (C, is-a-subclass-of, C); this contradiction with irreflexivity [property (3)] proves the claim.
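The argument above can be mirrored in code: close the generalization facts under transitivity and look for a reflexive pair. This is a naive sketch for small examples, not an efficient algorithm; `transitive_closure`, `is_acyclic`, and the sample hierarchies are names and data assumed for illustration.

```python
def transitive_closure(pairs):
    """Close a set of (subclass, superclass) pairs under property (4)."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over snapshots so the set can grow inside the loops.
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def is_acyclic(pairs):
    """A hierarchy is acyclic iff its closure has no pair (C, C),
    which is exactly what irreflexivity, property (3), forbids."""
    return all(a != b for (a, b) in transitive_closure(pairs))


hierarchy = {("Lake", "GeographicArea"), ("GeographicArea", "Object")}
# Hypothetical faulty design: Object declared a subclass of Lake.
cyclic = hierarchy | {("Object", "Lake")}
```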

Properties of aggregation facts:

(5) Irreflexivity:
(a) intensional level: for every C1, C2 ∈ ℭ, if (C1, is-a-part-of, C2), then C1 ≠ C2;
(b) extensional level: for every I1, I2 ∈ ℑ, if (I1, is-a-part-of, I2), then I1 ≠ I2;

(6) Acyclicity:
(a) intensional level: for every C ∈ ℭ, there does not exist a sequence of classes C1, C2, ..., CN ∈ ℭ, where N > 0, such that (C, is-a-part-of, C1), (C1, is-a-part-of, C2), ..., and (CN, is-a-part-of, C);
(b) extensional level: for every I ∈ ℑ, there does not exist a sequence of instances I1, I2, ..., IN ∈ ℑ, where N > 0, such that (I, is-a-part-of, I1), (I1, is-a-part-of, I2), ..., and (IN, is-a-part-of, I);

(7) Uniqueness of aggregation at the extensional level: for every C, C1 ∈ ℭ and for every I ∈ ℑ, if (I, is-an-instance-of, C) and (C1, is-a-part-of, C), then there exists a unique I1 ∈ ℑ, where (I1, is-an-instance-of, C1), such that (I1, is-a-part-of, I).

FIG. 3. An instance I with an extra-component I4.

Properties (5) and (6) state the irreflexivity and acyclicity of aggregation hierarchies (both at the intensional and extensional level). Property (7) establishes that an instance has at least all the subparts specified in its class. However, property (7) does not exclude that an instance might have extra-components, that is, other components having no counterpart at the class level (the fact (I4, is-a-part-of, I) in Fig. 3). It is important to specify the role of extra-components, in order to determine the class to which they belong. A benefit of extra-components is that they allow specific features to be modeled at the instance level when it is not convenient to model them at the class level and, hence, to impose them as common features of all instances.
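One way to read property (7) operationally is as a check that compares the components an instance actually has with those required by its class. The sketch below assumes each instance-level aggregation fact is stored together with its role; the function `check_components` and the generic names C, C1, C2, C4, I, I1, I2, I4 are illustrative only.

```python
def check_components(instance, cls, class_parts, instance_parts):
    """Check property (7) for one instance of class `cls`.

    class_parts: set of (C1, C) pairs meaning (C1, is-a-part-of, C).
    instance_parts: set of (I1, role, I) triples meaning
        (I1, is-a-part-of, I) with the role (I1, is-an-instance-of, role).
    Returns three sets of component classes: missing, duplicated, extra.
    """
    required = {c1 for (c1, c) in class_parts if c == cls}
    roles = [role for (_, role, i) in instance_parts if i == instance]
    missing = {r for r in required if roles.count(r) == 0}
    duplicated = {r for r in required if roles.count(r) > 1}
    extra = set(roles) - required   # roles of extra-components
    return missing, duplicated, extra


# Generic example: class C has parts C1 and C2; instance I also
# carries an extra-component I4 whose role is a class C4.
class_parts = {("C1", "C"), ("C2", "C")}
instance_parts = {("I1", "C1", "I"), ("I2", "C2", "I"), ("I4", "C4", "I")}
missing, duplicated, extra = check_components("I", "C", class_parts, instance_parts)
```

Property (7) holds for the instance when `missing` and `duplicated` are both empty; a non-empty `extra` is permitted and simply lists the roles of the extra-components.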

Properties of location facts:

(8) Irreflexivity: for every C1, C2 ∈ ℭ, if (C1, is-in, C2), then C1 ≠ C2;

(9) Transitivity at the intensional level: for every C1, C2, C3 ∈ ℭ, if (C1, is-in, C2) and (C2, is-in, C3), then (C1, is-in, C3);

(10) Partial transitivity at the extensional level: for every C1, C2, C3 ∈ ℭ and for every I1, I2 ∈ ℑ, if (C1, is-in, C2), (C2, is-in, C3), (I1, is-an-instance-of, C1), (I2, is-an-instance-of, C2), and (I1, is-in, I2), then there exists at least one I3 ∈ ℑ, where (I2, is-in, I3) and (I3, is-an-instance-of, C3), such that (I1, is-in, I3);

(11) Required location:
(a) intensional level: for every I1, I2 ∈ ℑ, where (I1, is-in, I2), there exist C1, C2 ∈ ℭ, where (I1, is-an-instance-of, C1) and (I2, is-an-instance-of, C2), such that (C1, is-in, C2);
(b) extensional level: for every C1, C2 ∈ ℭ and for every I1 ∈ ℑ, where (C1, is-in, C2) and (I1, is-an-instance-of, C1), there exists at least one I2 ∈ ℑ, where (I2, is-an-instance-of, C2), such that (I1, is-in, I2).

Properties (8) and (9) guarantee the acyclicity of the location hierarchy at the intensional level. The proof is analogous to that given for generalization facts. Property (11a) states that if a location fact exists at the instance level, then it must exist also at the class level. This property may be regarded as a strict typing of location facts. Property (11b) is the dual of property (11a). The acyclicity at the intensional level and property (11a) ensure that acyclicity also holds at the extensional level. In fact, let I1, I2, ..., IN ∈ ℑ be a sequence of instances such that (I, is-in, I1), (I1, is-in, I2), ..., and (IN, is-in, I); then, by applying property (11a), it would be possible to construct a sequence of classes C1, C2, ..., CN ∈ ℭ such that (C, is-in, C1), (C1, is-in, C2), ..., and (CN, is-in, C); this contradiction with the acyclicity at the intensional level proves the claim.

At the extensional level, the transitivity of location facts is not valid in general. Property (10) establishes the existence of “paths” in the location hierarchy where transitivity is valid, but it says nothing about which paths are valid. For example, from (Siberia, is-in, Russia), (Russia, is-in, Asia), and (Russia, is-in, Europe), the only valid fact that can be deduced is (Siberia, is-in, Asia).
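The Siberia example can be sketched as follows. The function `candidate_locations` is a hypothetical helper that enumerates the two-step is-in paths starting from an instance; it deliberately omits the class-level premises of property (10) and returns candidates only, since the property guarantees that at least one of them holds without saying which.

```python
def candidate_locations(i1, is_in):
    """Return the pairs (i1, i3) reachable through one intermediate
    instance i2, i.e. the facts that property (10) makes *possible*.

    is_in: set of (a, b) pairs meaning (a, is-in, b) at the instance level.
    """
    direct = {b for (a, b) in is_in if a == i1}
    return {(i1, i3) for i2 in direct for (a, i3) in is_in if a == i2}


is_in = {("Siberia", "Russia"), ("Russia", "Asia"), ("Russia", "Europe")}
candidates = candidate_locations("Siberia", is_in)
```

Both (Siberia, Asia) and (Siberia, Europe) come out as candidates. Selecting the valid one requires knowledge outside the location hierarchy itself, for instance a geometric test.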

Acyclicity of the four kinds of facts ensures that each hierarchy can be represented as a directed acyclic graph. In the following, we give properties that combine different kinds of facts and concern the inheritance of classification, aggregation, and location over generalization (see also Fig. 4):

(12) Inheritance of classification: for every I ∈ ℑ and for every C, C1 ∈ ℭ, if (I, is-an-instance-of, C) and (C, is-a-subclass-of, C1), then (I, is-an-instance-of, C1);

(13) Inheritance of aggregation: for every C, C1, C2 ∈ ℭ, if (C1, is-a-part-of, C) and (C2, is-a-subclass-of, C), then (C1, is-a-part-of, C2);

(14) Inheritance of location: for every C, C1, C2 ∈ ℭ, if (C1, is-in, C2) and (C, is-a-subclass-of, C1), then (C, is-in, C2).

Transitivity and inheritance constitute a mechanism for deducing new facts starting from known facts. We call weak facts those that can be deduced by applying either transitivity or inheritance, and strong facts those that are not weak.

FIG. 4. Inheritance of classification (a), aggregation (b), and location (c) over generalization. Solid lines denote strong facts, while dashed lines denote weak facts.
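The deduction of weak facts from strong ones can be illustrated with a small Python sketch. It is not part of the model's definition: facts are stored as plain triples, the function name close_facts is hypothetical, and the strong fact (Surface, is-a-part-of, GeographicArea) is an assumption made only for the example. The sketch applies the transitivity of generalization [property (4)] and the inheritance properties (12), (13), and (14) until no new fact appears.

```python
SUB, INST, PART, IN = ("is-a-subclass-of", "is-an-instance-of",
                       "is-a-part-of", "is-in")

def close_facts(strong):
    """Return the strong facts plus every fact deducible from them."""
    facts = set(strong)
    while True:
        new = set()
        for (a, r1, b) in facts:
            for (c, r2, d) in facts:
                if r2 != SUB:
                    continue
                # transitivity of generalization: (a SUB b), (b SUB d)
                if r1 == SUB and b == c:
                    new.add((a, SUB, d))
                # property (12): (a INST b), (b SUB d) gives (a INST d)
                if r1 == INST and b == c:
                    new.add((a, INST, d))
                # property (13): (a PART b), (c SUB b) gives (a PART c)
                if r1 == PART and b == d:
                    new.add((a, PART, c))
                # property (14): (a IN b), (c SUB a) gives (c IN b)
                if r1 == IN and a == d:
                    new.add((c, IN, b))
        if new <= facts:          # fixed point reached
            return facts
        facts |= new

# Strong facts; the aggregation fact is assumed for illustration only.
strong = {
    ("City", SUB, "GeographicArea"),
    ("GeographicArea", SUB, "Area"),
    ("Surface", PART, "GeographicArea"),
    ("Phoenix", INST, "City"),
}
weak = close_facts(strong) - strong
```

With these inputs, the weak facts are (City, is-a-subclass-of, Area), (Surface, is-a-part-of, City), (Phoenix, is-an-instance-of, GeographicArea), and (Phoenix, is-an-instance-of, Area). Transitivity of location facts is deliberately left out, since it is not valid in general at the extensional level.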

Operations and Their Properties

Besides describing the static structure of the objects in the system and their relationships, a dynamic component is also needed in the conceptual model. Spatial information processing is carried out by operations. It follows that operations have a twofold role: they are a tool for inferring spatial knowledge not directly represented in the model, as well as for performing any other kind of computation.

In this section, we introduce the formalism for specifying the operations (called methods) applicable to objects. We distinguish between operations applicable to instances (I-methods) and operations applicable to classes (C-methods). We denote the association between an object and a method with a couple (O, method), which we call an operation set. In particular, the fact that an instance I has an I-method is expressed as: (I, I-method). Similarly, the fact that a C-method may be applied to a class is expressed as: (C, C-method). If an I-method is shared among all the instances of a class C, we write: (C, I-method).

With regard to I-methods, the basic categories of operations that can take place over geographic elements are geometric, topological, and metric operations. It is hard to identify a complete set of geometric operations (Joseph & Cardenas, 1988; Roussopoulos, Faloutsos, & Sellis, 1988), since they strongly depend on the particular objects of the application. Some simple geometric operations concern the retrieval of the following spatial data:

• boundary and centroid of an area;
• intersection between two geometric elements;
• union between two geometric elements;
• difference between two geometric elements.

Topological operations are those related to the retrieval of topological knowledge (Clementini et al., 1993); some topological operations make it possible to find the following spatial information:

• inclusion of geometric elements;
• adjacency of areas;
• connectivity of lines;
• overlapping of lines or areas;
• crossing of lines.

Metric operations (Frank, 1991; Hernandez, 1991) compute metric and orientation aspects of spatial knowledge, such as:

• relative position (“north of,” “east of,” ...);
• length of a line;
• surface and perimeter of an area;
• distance between geometric elements.
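As an illustration of some of the metric operations listed above, the following Python sketch computes the length of a line and the perimeter and surface of an area. It assumes planar Cartesian coordinates and a representation of lines and areas as lists of (x, y) vertices; neither the representation nor the function names are prescribed by the model.

```python
import math

def length(line):
    """Length of a polyline given as a list of (x, y) vertices."""
    return sum(math.dist(p, q) for p, q in zip(line, line[1:]))

def perimeter(area):
    """Perimeter of a simple polygon given as a list of (x, y) vertices."""
    return length(area + area[:1])   # close the ring, then measure it

def surface(area):
    """Surface of a simple polygon, computed with the shoelace formula."""
    ring = area + area[:1]
    twice = sum(x1 * y2 - x2 * y1
                for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice) / 2

rectangle = [(0, 0), (4, 0), (4, 3), (0, 3)]
diagonal = [(0, 0), (3, 4)]
```

In the terminology of the model, each of these functions would be an I-method attached to the appropriate class, so that every instance of that class obtains it through property (15).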

The examples above are modeled by I-methods, since they are applicable to instance objects. On the other hand, C-methods represent operations relative to a class object. An incomplete list of operations modeled by C-methods is the following:

• creating a new instance of a class;
• deleting an instance;
• counting all instances of a class;
• adding a fact;
• removing a fact.
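The C-methods listed above can be sketched in Python as operations of an object that stands for the class itself. The name ClassObject and its method names are hypothetical, chosen only for this illustration, and the storage of facts as triples is an assumption.

```python
class ClassObject:
    """Hypothetical stand-in for a class object offering C-methods."""

    def __init__(self, name):
        self.name = name
        self.instances = {}   # instance name -> dict of components
        self.facts = set()    # (subject, relation, object) triples

    def create_instance(self, name, **components):
        self.instances[name] = dict(components)
        self.add_fact((name, "is-an-instance-of", self.name))

    def delete_instance(self, name):
        del self.instances[name]
        self.remove_fact((name, "is-an-instance-of", self.name))

    def count_instances(self):
        return len(self.instances)

    def add_fact(self, fact):
        self.facts.add(fact)

    def remove_fact(self, fact):
        self.facts.discard(fact)

city = ClassObject("City")
city.create_instance("Phoenix", population="582,000 inhabitants")
count_after_create = city.count_instances()
facts_after_create = set(city.facts)
city.delete_instance("Phoenix")
count_after_delete = city.count_instances()
```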

Methods satisfy the following properties:

(15) Required operation set for instances: for every class C and for every instance I, if (C, I-method) and (I, is-an-instance-of, C), then (I, I-method);

(16) Inheritance over generalization: for every pair of classes C, C1, if (C, is-a-subclass-of, C1) and (C1, method), then (C, method).

Property (15) tells us that a method associated with a class is also associated with all instances of the class. This does not exclude that other methods not contemplated in the class might be associated with an instance (extra-methods). Extra-methods provide the same benefits as extra-components. Property (16) states that both I-methods and C-methods associated with a class propagate through the hierarchy of subclasses.
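Properties (15) and (16) can be read as a simple lookup procedure, sketched below in Python. The dictionaries, the function names, and the choice of GeographicArea as the class owning border and distance are assumptions made for the example; the model itself only states the two properties.

```python
def ancestors(cls, superclasses):
    """All classes reachable from cls through is-a-subclass-of links."""
    seen, stack = set(), [cls]
    while stack:
        for parent in superclasses.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def class_i_methods(cls, superclasses, i_methods):
    """Property (16): own I-methods plus those of every ancestor."""
    result = set(i_methods.get(cls, ()))
    for anc in ancestors(cls, superclasses):
        result |= set(i_methods.get(anc, ()))
    return result

def instance_i_methods(classes, extra, superclasses, i_methods):
    """Property (15), plus the extra-methods of the instance."""
    result = set(extra)
    for cls in classes:
        result |= class_i_methods(cls, superclasses, i_methods)
    return result

superclasses = {"City": ["GeographicArea"], "GeographicArea": ["Area"]}
i_methods = {"GeographicArea": {"border", "distance"}, "City": {"routeTo"}}
phoenix_methods = instance_i_methods(["City"], set(), superclasses, i_methods)
```

Here phoenix_methods contains border, distance, and routeTo. Because the classes argument is a list, the sketch also covers an instance that belongs to more than one class.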

Adding New Classes and Instances

Let us consider the conceptual design of a small part of a geographic application in terms of the object-oriented model. Using again example classes such as GeographicArea, State, Country, and some instances belonging to them, the class City and the instance Phoenix will be added to the design.

First, it is necessary to append the class City to the generalization hierarchy. The superclass should include most of the subparts and methods that are common to City and other geographic classes. Estimating that the class GeographicArea is a good superclass for City, we write: (City, is-a-subclass-of, GeographicArea). If we have (GeographicArea, is-a-subclass-of, Area), then by applying transitivity [property (4)] we can also deduce (City, is-a-subclass-of, Area), and so on if there are other superclasses.

Subsequently, by applying inheritance [property (13)], we can deduce many aggregation facts (the components of the class City) from superclasses, such as: (Surface, is-a-part-of, City) and (Map, is-a-part-of, City). We can do the same for methods [property (16)] and derive several I-methods such as: (City, border) and (City, distance).

The design then continues by adding some features specific to the class City (components and operations), for example: (Population, is-a-part-of, City), (Altitude, is-a-part-of, City), (City, routeTo). If the classes Population and Altitude have not been defined previously, then we have to define them as (Population, is-a-subclass-of, Integer) and (Altitude, is-a-subclass-of, Real).

As the next step of the design, we have to identify significant location facts for cities. A meaningful piece of information to be represented along with cities could be which county they belong to. Therefore, we impose at the intensional level: (City, is-in, County). If the fact (County, is-in, State) holds, then by transitivity [property (9)] we can also deduce (City, is-in, State). Similarly, we could also deduce other facts such as (City, is-in, Country).

Now, let us move to the design of the instance Phoenix. Initially, we have to insert it among the instances of the class City [property (1)]; formally: (Phoenix, is-an-instance-of, City). After that, as stated by property (7), we have to define at least the following aggregation facts: (89 square miles, is-a-part-of, Phoenix) with the role (89 square miles, is-an-instance-of, Surface); (map 212, is-a-part-of, Phoenix) with the role (map 212, is-an-instance-of, Map); (582,000 inhabitants, is-a-part-of, Phoenix) with the role (582,000 inhabitants, is-an-instance-of, Population); (2532 feet, is-a-part-of, Phoenix) with the role (2532 feet, is-an-instance-of, Altitude). The instance Phoenix has all the I-methods of the class City [property (15)]; therefore, we have (Phoenix, border), (Phoenix, distance), (Phoenix, routeTo).

With regard to location, according to property (11b), we have to point out the county where Phoenix is located, that is, (Phoenix, is-in, Maricopa). If (Maricopa, is-in, Arizona) holds and we assume that this is the only strong location fact applying to the county of Maricopa, then we can employ transitivity [property (10)] to deduce the fact (Phoenix, is-in, Arizona).
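The role played by the assumption that (Maricopa, is-in, Arizona) is the only strong location fact for Maricopa can be illustrated with a short Python sketch. It is a conservative reading rather than a statement of property (10): a one-step transitive deduction is accepted only when the intermediate object has exactly one strong container, and is otherwise reported as ambiguous. The function name and the pair representation are hypothetical.

```python
from collections import defaultdict

def deduce_locations(strong_facts):
    """Split one-step transitive is-in deductions into safe and ambiguous.

    strong_facts: iterable of (inner, outer) pairs for (inner, is-in, outer).
    """
    containers = defaultdict(set)
    for inner, outer in strong_facts:
        containers[inner].add(outer)

    safe, ambiguous = set(), {}
    for inner, outer in strong_facts:
        above = containers.get(outer, set())
        if len(above) == 1:
            # a single container: the deduction cannot pick a wrong path
            safe.add((inner, next(iter(above))))
        elif len(above) > 1:
            # several containers: the model does not say which one applies
            ambiguous.setdefault(inner, set()).update(above)
    return safe, ambiguous

facts = [("Phoenix", "Maricopa"), ("Maricopa", "Arizona"),
         ("Siberia", "Russia"), ("Russia", "Asia"), ("Russia", "Europe")]
safe, ambiguous = deduce_locations(facts)
```

For these facts the sketch deduces (Phoenix, is-in, Arizona) and flags Siberia as ambiguous between Asia and Europe; deciding that only (Siberia, is-in, Asia) is valid requires knowledge that the location facts alone do not carry.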

The Logical Level

According to object-oriented terminology, a schema corresponds to a network of classes and their relationships. Each class contains a structural part (i.e., variables) and a behavioral part (i.e., methods), as well as the description of its position in the class hierarchy. Classes may be either standard classes whose instances belong to simple or predefined data types (e.g., Integer, Real, String, Set, List, Text, Picture, and so on) or application-dependent classes whose instances represent geographic entities (e.g., City, River, State).

Therefore, designing an object-oriented database schema means defining the structure of the application-dependent classes. Later, during the use of the database, each time the need for a new instance of a specific class arises, the class instantiation process will be run. All instances of a class share exactly the same structural characteristics and the same behavior of the generating class.

There are at least six approaches for incorporating object-oriented capabilities in databases (Khoshafian & Abnous, 1990, p. 273). One of these approaches is to develop an entirely new database programming language and database management system with object-oriented features. In this study, we pursue this approach, proposing the structure of classes and instances. The similarity with the conceptual model facilitates the conceptual-logical mapping.

The Class Structure

The structure of classes of Figure 5 supports the direct representation of the four different facts introduced at the conceptual level. Emphasis has been put on the representation of strong facts. Location facts could always be computed by operations capable of extracting this information from spatial data, for example, by using the point-in-polygon algorithm. Such computations would be time consuming since, out of all spatial queries, two-dimensional region queries are the most frequently asked in a spatial information system (Laurini & Thompson, 1992, p. 540). Therefore, it is better to represent strong location facts explicitly.

    class <class name>
      interface
        links
          <is-a-subclass-of> {, <is-a-subclass-of>}
          <is-in> {, <is-in>}
        class variables
          /* a list of constants common to all instances of the class */
        instance variables
          /* a list of components structuring the class */
        class methods
          /* a list of methods applicable to the class */
        instance methods
          /* a list of methods applicable to all instances of the class */
      implementation
        /* the code of both class and instance methods */
    endclass

FIG. 5. The general structure of classes.
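To make concrete the kind of computation that explicit location facts avoid, the following Python sketch shows one common form of the point-in-polygon test, the ray-casting (even-odd) variant. The text above does not say which variant is meant, so this choice is an assumption, as are the function name and the vertex-list representation. Points lying exactly on the boundary are not treated specially.

```python
def point_in_polygon(point, polygon):
    """Ray casting: count edge crossings of a ray going right from point.

    polygon is a list of (x, y) vertices of a simple polygon.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # the edge matters only if it straddles the horizontal line at y
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                inside = not inside   # an odd number of crossings means inside
    return inside

county = [(0, 0), (4, 0), (4, 4), (0, 4)]   # illustrative square region
inside_point = point_in_polygon((2, 2), county)
outside_point = point_in_polygon((5, 2), county)
```

Each test visits every edge of the polygon, so answering an is-in query this way costs time proportional to the size of the boundary, whereas a stored strong location fact is a direct lookup.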

On the other hand, weak facts can be computed. Given the strong facts (Lake, is-a-subclass-of, GeographicArea) and (GeographicArea, is-a-subclass-of, Area), the fact (Lake, is-a-subclass-of, Area) does not need to be represented since it can be deduced via transitivity. Similarly, given, for instance, the facts (Arizona, is-an-instance-of, State) and (State, is-a-subclass-of, GeographicArea), the fact (Arizona, is-an-instance-of, GeographicArea) does not need to be represented since it can be deduced via the inheritance of classification over generalization.

A class has two parts (Fig. 5): an interface, containing the information needed for using the class from the outside, and an implementation, containing the code hidden from the user. In turn, the interface is structured in terms of three blocks: the links block, the variables block, and the methods block.

The links block of a class contains the is-a-subclass-of and is-in links. Classes contain, in the variables block, instance variables and class variables. Instance variables correspond to the dual link of the is-a-part-of link and define data items for instances of a class, while class variables represent constants common to all instances of a class. The methods block contains the headings of instance methods and class methods, while the code associated with them is given in the implementation part. Instance methods represent operations applicable to instances; class methods have an analogous meaning with regard to classes. A subclass inherits all “features” from the chain of its ancestors, that is, links, variables, and methods.


    instance <instance name>
      links
        <is-an-instance-of> {, <is-an-instance-of>}
        <is-in> {, <is-in>}
      instance variables
        /* a list of components structuring the instance */
      instance methods
        /* a list of the extra-methods applicable to the instance */
    endinstance

FIG. 6. The general structure of instances.

The Instance Structure

The need to have a nonstandard structure for instances arises because an instance may belong to more than one class, and may have extra-components as well as extra-methods. Instances of a class are structured in three blocks (Fig. 6): the links block, the instance variables block, and the instance methods block. The links block may contain the is-an-instance-of and is-in links. The instance variables block contains values associated with instance variables, possibly also representing extra-components. Finally, the instance methods block contains the extra-methods of the instance.
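The class and instance structures described above can be mirrored by two small Python records, as in the following sketch. It only illustrates the blocks and the inheritance of variables along the ancestor chain; the names ClassDef, InstanceDef, and inherited_instance_variables are hypothetical, and placing surface and map in GeographicArea is an assumption made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class ClassDef:
    name: str
    superclasses: list = field(default_factory=list)        # is-a-subclass-of
    is_in: list = field(default_factory=list)               # is-in links
    class_variables: dict = field(default_factory=dict)     # shared constants
    instance_variables: dict = field(default_factory=dict)  # name -> type
    class_methods: list = field(default_factory=list)
    instance_methods: list = field(default_factory=list)

@dataclass
class InstanceDef:
    name: str
    classes: list = field(default_factory=list)             # is-an-instance-of
    is_in: list = field(default_factory=list)               # is-in links
    instance_variables: dict = field(default_factory=dict)  # name -> value
    instance_methods: list = field(default_factory=list)    # extra-methods only

def inherited_instance_variables(cls, schema):
    """Instance variables of cls, including those of all its ancestors."""
    merged = {}
    for parent in schema[cls].superclasses:
        merged.update(inherited_instance_variables(parent, schema))
    merged.update(schema[cls].instance_variables)
    return merged

schema = {
    "GeographicArea": ClassDef(
        "GeographicArea",
        instance_variables={"surface": "Surface", "map": "Map"}),
    "City": ClassDef(
        "City", superclasses=["GeographicArea"], is_in=["County"],
        instance_variables={"population": "Integer", "altitude": "Real"},
        instance_methods=["routeTo"]),
}
city_variables = inherited_instance_variables("City", schema)
```

Keeping classes as a list in InstanceDef reflects the fact that an instance may belong to more than one class, which is the reason given above for the nonstandard instance structure.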

Adding New Classes and Instances

The aim of this section is to show a fragment of schema design, by mapping at the logical level the examples of classes and instances already considered at the conceptual level. Classes are organized in a generalization hierarchy (Fig. 7). Such a hierarchy strongly influences the extendibility of the schema, since it is essential for appending a new geographic class. Let us add the class City to the design, considering that, besides the inheritance hierarchy, new classes have to be appended at the right place in the location hierarchy (Fig. 8). As shown in Figure 9, the class City has been appended to GeographicArea to inherit all the characteristics associated with that class and, with regard to the location hierarchy, to County as the immediate spatial abstraction containing it.

[Figure 7: tree diagram over the classes Object, GeometricElement, Point, Line, GeographicLine, Railway, Road, River, Area, GeographicArea, AdministrativeUnit, County, State, Country, Continent, and Lake.]

FIG. 7. The generalization hierarchy of the running example. According to the model, the class Object is the root of this hierarchy.

[Figure 8: diagram over the classes Continent, Country, State, and County.]

FIG. 8. The location hierarchy considered in the example.

Considering the creation of the instance Phoenix, the designer has to specify the values of links and instance variables (Fig. 10). Instance variables may come from the definition of the class City (population, altitude, history, and touristInformation), from superclasses of City (surface and map), or they may be extra-components (botanicalGarden). The instance Phoenix has no extra-methods.
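The three possible origins of the instance variables of Phoenix can be checked mechanically, as in the Python sketch below. The values are those given for Phoenix in Figure 10; the function name and the way the declared variables are passed in are assumptions of the example.

```python
def classify_variables(values, own, inherited):
    """Group the variables of an instance by where they are declared."""
    groups = {"class": [], "superclass": [], "extra": []}
    for name in values:
        if name in own:
            groups["class"].append(name)
        elif name in inherited:
            groups["superclass"].append(name)
        else:
            groups["extra"].append(name)   # extra-component of the instance
    return groups

city_own = {"population", "altitude", "history", "touristInformation"}
city_inherited = {"surface", "map"}
phoenix = {
    "surface": "89 square miles",
    "population": "582,000 inhabitants",
    "altitude": "2532 feet",
    "history": "text.4324",
    "touristInformation": "text.1873",
    "map": "map.212",
    "botanicalGarden": "text.5437",
}
groups = classify_variables(phoenix, city_own, city_inherited)
```

The only variable classified as an extra-component is botanicalGarden, in agreement with the description above.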

    class City
      interface
        links
          is-a-subclass-of: GeographicArea;
          is-in: County;
        instance variables
          population: Integer;
          altitude: Real;
          history: Text;
          touristInformation: Text;
        instance methods
          routeTo(destCity: City): Road;
            -- it returns the shortest route plan between two cities.
      implementation
    endclass -- City

FIG. 9. An incomplete definition of the class City.

    instance Phoenix
      links
        is-an-instance-of: City;
        is-in: Maricopa;
      instance variables
        surface: 89 square miles;
        population: 582,000 inhabitants;
        altitude: 2532 feet;
        history: text.4324;
        touristInformation: text.1873;
        map: map.212;
        botanicalGarden: text.5437;
    endinstance -- Phoenix

FIG. 10. A definition of the instance Phoenix.

Conclusions

This research concerns the achievement of a consistent object-oriented framework for a bilevel geographic database design. Specifically, a contribution has been made with respect to the conceptual and logical levels of data abstraction. The object-oriented model plays a central role in our study. Basic features of the proposed conceptual model can be stated in terms of simplicity, data structuring flexibility, and object-orientation.

Simplicity. We describe the knowledge associated with geographic maps in terms of four basic facts, resulting in a simple and homogeneous data structure for each category of facts (directed acyclic graphs).

Data structuring flexibility. For any knowledge representation model to be successful, it is essential that its “structures” are consistent with the real world entities. The conceptual model described herein provides the appropriate framework for representing knowledge about geographic entities. Specifically, location facts are suitable for capturing the inherent structure of spatial data. In fact, objects (either classes or instances) can be created at various levels of spatial resolution and the information concerning the links between them can be stored along with the objects themselves.

Object-orientation. The knowledge about geographic maps can be expressed in terms of basic concepts of the object-oriented paradigm; namely, classes, instances, and methods. As a consequence, the mapping of the conceptual design to the schema design is facilitated.

Acknowledgments

This work was supported by the CNR Grant 92.01574.PF69 under project “Progetto Finalizzato Sistemi Informatici e Calcolo Parallelo” and by a MURST 1990 grant under project “Metodi Formali e Strumenti per Basi di Dati Evolute.”

References

Abel, D., & Ooi, B. C. (Eds.) (1993). Advances in spatial databases, 3rd symposium, SSD’93. Lecture Notes in Computer Science 692. Singapore: Springer Verlag.

Abiteboul, S., & Hull, R. (1987). IFO: A formal semantic database model. ACM Transactions on Database Systems, 12, 525.

Banerjee, J., et al. (1987). Data model issues for object-oriented applications. ACM Transactions on Office Information Systems, 5(1).

Chen, P. P. (1976). The entity-relationship model: Toward a unified view of data. ACM Transactions on Database Systems, 1, 9-36.

Choi, A. Y. L., & Luk, W. S. (1990). A bi-level object-oriented data model for geographic information systems. Proceedings of IEEE Compsac ‘90, Chicago, IL.

Chou, H., & Ding, Y. (1992). Methodology of integrating spatial analysis/modeling and GIS. Proceedings of the 5th International Symposium on Spatial Data Handling, Charleston, SC, pp. 514-523.

Clementini, E., Di Felice, P., & D’Atri, A. (1991). An object-oriented conceptual model for the representation of geographic information. Proceedings of the ACM-IEEE Symposium on Applied Computing, Kansas City, MO, pp. 472-480.

Clementini, E., Di Felice, P., & van Oosterom, P. (1993). A small set of formal topological relationships for end-user interaction. Advances in spatial databases, 3rd symposium, SSD’93. Lecture Notes in Computer Science 692. (pp. 277-295). Singapore: Springer Verlag.

Dutton, G. (1991). Improving spatial analysis in GIS environments. Proceedings of AUTO-CARTO 10, Baltimore, MD, pp. 168-185.

Egenhofer, M. J., & Frank, A. U. (1989). Object-oriented modeling in GIS: Inheritance and propagation. Proceedings of AUTO-CARTO 9, Baltimore, MD, pp. 588-598.

Egenhofer, M. J., & Franzosa, R. (1991). Point-set topological spatial relations. International Journal of Geographical Information Systems, 5, 161-174.

Frank, A. U. (1991). Qualitative spatial reasoning about cardinal directions. Proceedings of AUTO-CARTO 10, Baltimore, MD, pp. 148-161.

Frank, A. U. (1988). Requirements for a database management system for a GIS. Photogrammetric Engineering and Remote Sensing, 54, 1557-1564.

Goldberg, A., & Robson, D. (1983). Smalltalk-80: The language and its implementation. Reading, MA: Addison-Wesley.

Günther, O., & Schek, H.-J. (Eds.) (1991). Advances in spatial databases, 2nd symposium, SSD ‘91. Lecture Notes in Computer Science 525. Zurich: Springer Verlag.

Hammer, M., & McLeod, D. (1981). Database description with SDM: A semantic database model. ACM Transactions on Database Systems, 6(3).

Hernandez, D. (1991). Relative representation of spatial knowledge: The 2-d case. In D. Mark & A. Frank (Eds.), Cognitive and linguistic aspects of geographic space (pp. 373-385). Dordrecht: Kluwer Academic.

Iyengar, S. S., & Kashyap, R. L. (1988). Guest editors’ introduction on image databases. IEEE Transactions on Software Engineering, 14, 608-609.

Joseph, T., & Cardenas, A. (1988). PICQUERY: A high level query language for pictorial database management. IEEE Transactions on Software Engineering, 14, 630-638.

Kainz, W. (1988). Application of lattice theory to geography. Third International Symposium on Spatial Data Handling, Sydney, Australia, pp. 135-142.

Khoshafian, S., & Abnous, R. (1990). Object orientation. New York: Wiley.

Laurini, R., & Thompson, D. (1992). Fundamentals of spatial information systems. New York: Academic Press.

Lecluse, C., Richard, P., & Velez, F. (1988). O2, an object-oriented data model. Proceedings of the ACM International Conference on the Management of Data, Chicago, IL.

Levesque, H. J., & Mylopoulos, J. (1979). A procedural semantics for semantic networks. In N. Findler (Ed.), Associative networks (pp. 93-120). New York: Academic Press.

Manola, F. A., & Orenstein, J. A. (1986). Toward a general spatial data model for an object-oriented DBMS. Proceedings of the 12th International Conference on Very Large Data Bases, Kyoto, Japan, pp. 328-335.

Milne, P., Milton, S., & Smith, J. L. (1993). Geographical object-oriented databases: a case study. International Journal of Geographical Information Systems, 7, 39-55.


Mohan, L., & Kashyap, R. L. (1988). An object-oriented knowledge representation for spatial information. IEEE Transactions on Software Engineering, 14, 615-681.

Nyerges, T. L. (1991). Representing geographical meaning. In B. P. Buttenfield & R. B. McMaster (Eds.), Map generalization: Making rules for knowledge representation (pp. 59-85). London: Longman.

van Oosterom, P., & van den Bos, J. (1989). An object-oriented approach to the design of geographic information systems. Computers & Graphics, 13, 409-418.

Orenstein, J. A., & Manola, F. A. (1988). PROBE spatial data modeling and query processing in an image database application. IEEE Transactions on Software Engineering, 14, 611-629.

Rhind, D. (1988). A GIS research agenda. International Journal of Geographical Information Systems, 2, 23-28.

Roussopoulos, N., Faloutsos, C., & Sellis, T. (1988). An efficient pictorial database system for PSQL. IEEE Transactions on Software Engineering, 14, 639-650.

Scholl, M., & Voisard, A. (1992). Geographic applications: An experience with O2. In F. Bancilhon, C. Delobel, & P. Kanellakis (Eds.), Building an object-oriented database system: The story of O2 (pp. 585-618). San Mateo, CA: Morgan Kaufmann.

Smith, J. M., & Smith, D. C. P. (1977a). Database abstractions: Aggregation. Communications of the ACM, 20, 405-413.

Smith, J. M., & Smith, D. C. P. (1977b). Database abstractions: Aggregation and generalization. ACM Transactions on Database Systems, 2, 105-133.

Tsichritzis, D. C., & Lochovsky, F. H. (1982). Data models. Englewood Cliffs, NJ: Prentice-Hall.

Wand, Y. (1989). A proposal for a formal model of objects. In W. Kim and F. H. Lochovsky (Eds.), Object-oriented concepts, databases, and applications (pp. 537-559). New York: ACM Press.

Williamson, R., & Stucky, J. (1991). An object-oriented geographical information system. In R. Gupta and E. Horowitz (Eds.), Object-oriented databases with applications to CASE, networks, and VLSI CAD (pp. 296-312). Englewood Cliffs, NJ: Prentice-Hall.

Worboys, M. F., Hearnshaw, H. M., & Maguire, D. J. (1990). Object-oriented modelling for spatial databases. International Journal of Geographical Information Systems, 4, 369-383.

Zhan, F., & Mark, D. M. (1992). Object-oriented spatial knowledge representation and processing: Formalization of core classes and their relationships. Proceedings of the 5th International Symposium on Spatial Data Handling, Charleston, SC, pp. 662-611.

704 JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE-October 1994