Geodatabase Database Design Tomislav Sapic GIS Technologist Faculty of Natural Resources Management Lakehead University

Post on 19-Jun-2020


Page 1: Geodatabase Database Design - Lakehead Universityflash.lakeheadu.ca/~forspatial/4419/gdb_database_design.pdf · Geodatabase Database Design Tomislav Sapic GIS Technologist Faculty

Geodatabase Database Design

Tomislav Sapic, GIS Technologist

Faculty of Natural Resources Management, Lakehead University

Table Design

• Two types of attribute tables in GIS:

– Feature attribute table

– Non-spatial attribute table

• Table fields (columns) are also referred to as attributes. In GIS they are essentially variables whose values describe characteristics (measurements, descriptions) of the individual features represented by records (rows).

• When designing a table and its fields, the overall rules are:

– Simple values

– Values are of minimal length

– Each individual value is the smallest meaningful element

– Values are easily understandable and searchable

– Values are unambiguous

– Potential for errors is minimized

– Data redundancy is reduced

– Potential logical inconsistency is avoided

• In order to reduce the overall data size and redundancy a database is often separated into smaller tables that are related and linked to each other, creating a relational database.

• The process of redesigning tables to reduce data redundancy and potential logical inconsistency, including, when needed, creating separate tables linked to each other, is called normalization.

• Database normalization is defined through degrees of normalization – First Normal Form, Second Normal Form, Third Normal Form, Fourth Normal Form, Fifth Normal Form, etc.

• A typical GIS database conforms to the Third Normal Form, and therefore also to the First and Second Normal Forms.

• The goals of normalization:

– Avoid (eliminate) insertion, update and deletion anomalies.

– Avoid redundant data, which wastes space and may cause data integrity problems.

– Avoid possible logical inconsistency.

– Ensure that the separate tables can be maintained and updated separately and linked when necessary.

– Facilitate a distributed database (Chang 2006).

• Normalization of databases was first proposed and explained by E.F. Codd, an IBM researcher, in his 1970 paper “A Relational Model of Data for Large Shared Data Banks” (Codd 1970). Since then, the whole concept has been somewhat modified but the main components of it are still valid.
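The update anomaly named among the goals above can be sketched in a few lines of plain Python. The parcel and owner records below are hypothetical and are not taken from the slides; they only illustrate why a value that is stored more than once can drift out of step with itself.

```python
# Hypothetical un-normalized parcel table: the owner's address is repeated
# on every parcel that the owner holds.
parcels = [
    {"parcel_id": "P101", "owner": "Smith", "owner_address": "12 Oak St"},
    {"parcel_id": "P102", "owner": "Smith", "owner_address": "12 Oak St"},
    {"parcel_id": "P103", "owner": "Jones", "owner_address": "7 Elm Ave"},
]

# Update anomaly: only one of Smith's rows is edited.
parcels[0]["owner_address"] = "99 Pine Rd"
smith_addresses = {p["owner_address"] for p in parcels if p["owner"] == "Smith"}
# smith_addresses now holds two different addresses for the same owner.

# Normalized layout: the address is stored once, keyed by owner.
owners = {"Smith": "12 Oak St", "Jones": "7 Elm Ave"}
parcel_owner = {"P101": "Smith", "P102": "Smith", "P103": "Jones"}

owners["Smith"] = "99 Pine Rd"  # a single edit covers every parcel Smith holds
normalized_addresses = {
    owners[owner] for owner in parcel_owner.values() if owner == "Smith"
}
```

In the normalized layout there is only one place where the address can be edited, so the two parcels cannot disagree.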

Database Design

• Un-normalized table (multiple and redundant values)

First Normal Form (1NF)

Source: Chang (2006)

Database Normalization

1. There is no top-to-bottom ordering to the rows.

2. There is no left-to-right ordering to the columns.

3. There are no duplicated rows (the table has a Primary Key field, which may be composite).

4. Every row-and-field intersection contains exactly one value from the applicable domain (and nothing else).
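As a minimal sketch of the last two rules, the snippet below splits a multi-valued cell into one row per value and then checks that no row is duplicated. The field names and the semicolon separator are assumptions made for illustration; the example table in Chang (2006) is not reproduced here.

```python
# Hypothetical un-normalized rows: the "owners" cell holds several values.
unnormalized = [
    {"parcel_id": "P101", "owners": "Smith; Jones"},
    {"parcel_id": "P102", "owners": "Lee"},
]


def to_first_normal_form(rows):
    """Return one row per (parcel_id, owner) pair so every cell holds one value."""
    result = []
    for row in rows:
        for owner in row["owners"].split(";"):
            result.append({"parcel_id": row["parcel_id"], "owner": owner.strip()})
    return result


rows_1nf = to_first_normal_form(unnormalized)

# The composite primary key is (parcel_id, owner); check that it is unique.
keys = [(r["parcel_id"], r["owner"]) for r in rows_1nf]
has_duplicates = len(keys) != len(set(keys))
```

After the split, neither field alone identifies a row, which is why the primary key has to be the composite of both.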

Second Normal Form (2NF)

• The table should conform to 1NF.

• All non-prime attributes should be functionally dependent on the primary key (or on the whole of a composite primary key).

• A Primary Key is a field with unique values that identifies each record in the table.

• A Foreign Key is a field that can hold non-unique values and whose individual values match values in a Primary Key field.

• A Foreign Key field is used to connect a table to another table with a Primary Key field and can contain non-unique values.

• Another widely used definition of 2NF is that a non-prime field cannot be dependent on a subset of a composite primary key. Dependence of a field (attribute) on a subset of a composite primary key is called partial dependency.
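A partial dependency can be removed by moving the dependent field into its own table. The sketch below uses Python's built-in sqlite3 module rather than a geodatabase, and the table and field names (owner, ownership, owner_address) are hypothetical. The assumed starting point is a table keyed on (parcel_id, owner_name) in which owner_address depends on owner_name alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- owner_address depends on the owner alone, so it gets its own table.
    CREATE TABLE owner (
        owner_name    TEXT PRIMARY KEY,
        owner_address TEXT NOT NULL
    );
    -- Composite primary key (parcel_id, owner_name); no partially
    -- dependent field is left in this table.
    CREATE TABLE ownership (
        parcel_id  TEXT NOT NULL,
        owner_name TEXT NOT NULL REFERENCES owner(owner_name),
        PRIMARY KEY (parcel_id, owner_name)
    );
""")
conn.executemany(
    "INSERT INTO owner VALUES (?, ?)",
    [("Smith", "12 Oak St"), ("Jones", "7 Elm Ave")],
)
conn.executemany(
    "INSERT INTO ownership VALUES (?, ?)",
    [("P101", "Smith"), ("P101", "Jones"), ("P102", "Smith")],
)

# Smith holds two parcels, but the address is stored only once.
smith_parcels = conn.execute(
    "SELECT COUNT(*) FROM ownership WHERE owner_name = 'Smith'"
).fetchone()[0]
smith_address_rows = conn.execute(
    "SELECT COUNT(*) FROM owner WHERE owner_name = 'Smith'"
).fetchone()[0]
```

Here owner_name is the primary key of the owner table and acts as a foreign key in the ownership table.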

Source: Chang (2006)

Source: Chang (2006)

Third Normal Form (3NF)

• The table should conform to 2NF.

• All non-prime fields can be functionally dependent only on the primary key (single or composite) and not on another non-prime field. Dependence of a non-prime field on another non-prime field is called transitive dependency.

• The above type of functional dependency between non-prime fields can potentially cause logical inconsistency in the data (data quality!).
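The same decomposition idea applies to a transitive dependency. In the hypothetical sketch below (again using sqlite3, with invented table and field names), a zone name depends on the zone code, which in turn depends on the parcel ID. Storing the name in a separate zoning table means it can be changed in one place without leaving parcels that disagree.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- zone_name depends on zone_code, not on parcel_id, so it lives here.
    CREATE TABLE zoning (
        zone_code TEXT PRIMARY KEY,
        zone_name TEXT NOT NULL
    );
    -- Every non-prime field in parcel depends only on parcel_id.
    CREATE TABLE parcel (
        parcel_id TEXT PRIMARY KEY,
        area_ha   REAL,
        zone_code TEXT REFERENCES zoning(zone_code)
    );
""")
conn.executemany(
    "INSERT INTO zoning VALUES (?, ?)",
    [("R1", "Residential"), ("C1", "Commercial")],
)
conn.executemany(
    "INSERT INTO parcel VALUES (?, ?, ?)",
    [("P101", 1.2, "R1"), ("P102", 0.8, "R1"), ("P103", 2.5, "C1")],
)

# Renaming the zone is a single-row update in the zoning table.
conn.execute(
    "UPDATE zoning SET zone_name = 'Low-density residential' WHERE zone_code = 'R1'"
)

# Both R1 parcels now report the same, updated name.
r1_names = conn.execute("""
    SELECT DISTINCT z.zone_name
    FROM parcel AS p
    JOIN zoning AS z ON z.zone_code = p.zone_code
    WHERE p.zone_code = 'R1'
""").fetchall()
```

Had zone_name been kept in the parcel table, the rename would have needed one edit per parcel, and a missed row would have produced the logical inconsistency described above.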

[Figure: Third Normal Form (3NF); field labels: Primary Key, Foreign Key]

Database Design - Table Join in ArcGIS

• Related data are often stored in two or more tables to simplify the database, make editing and management easier, and reduce size.

• Two tables can be joined as long as each has a field in which some or all values match those in the corresponding field of the other table, and the values are stored as the same data type.

• The linking field in the origin table is called the primary key, and the linking field in the destination table is called the foreign key.
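The join conditions above can be illustrated outside ArcGIS with a small plain-Python sketch. The stand and stand-type records, the field names, and the `join_tables` helper are all hypothetical; this is not how ArcGIS implements a join, only a picture of the matching logic, including what happens when a key value is stored as a different data type.

```python
# Origin table: one record per stand type, keyed by the primary key "type_id".
stand_types = {1: "Jack pine", 2: "Black spruce"}

# Destination table: many stands may share one foreign key value.
stands = [
    {"stand_id": "S1", "type_id": 1},
    {"stand_id": "S2", "type_id": 2},
    {"stand_id": "S3", "type_id": 1},
    {"stand_id": "S4", "type_id": "2"},  # same value stored as text: no match
]


def join_tables(destination_rows, origin_lookup, key):
    """Append the matching origin value to each destination row (None if no match)."""
    return [
        dict(row, type_name=origin_lookup.get(row[key]))
        for row in destination_rows
    ]


joined = join_tables(stands, stand_types, "type_id")
joined_names = [row["type_name"] for row in joined]
```

The fourth stand fails to pick up a type name even though its key looks right, which is the practical reason both linking fields need the same data type.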

• There are three types of relationships between two tables:

o One to One

o One to Many

o Many to Many

[Figures: One to Many and Many to Many relationships between an Origin Table and a Destination Table, with the Primary Key and Foreign Key fields labelled. Sources: ArcGIS 10.1, Help; http://www.kenticosolutions.com/Developer-Tips/Tip/May-2011/Many-to-Many-relationships-in-the-Kentico-CMS-Cont.aspx]
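A many-to-many relationship is commonly implemented through an intermediate (junction) table that holds one row per pairing. The sketch below shows that general relational pattern with sqlite3; the stand and species tables and their values are hypothetical, and the code does not use any ArcGIS API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stand   (stand_id   TEXT PRIMARY KEY);
    CREATE TABLE species (species_id TEXT PRIMARY KEY, common_name TEXT);
    -- Junction table: each row pairs one stand with one species.
    CREATE TABLE stand_species (
        stand_id   TEXT REFERENCES stand(stand_id),
        species_id TEXT REFERENCES species(species_id),
        PRIMARY KEY (stand_id, species_id)
    );
""")
conn.executemany("INSERT INTO stand VALUES (?)", [("S1",), ("S2",)])
conn.executemany(
    "INSERT INTO species VALUES (?, ?)",
    [("Pj", "Jack pine"), ("Sb", "Black spruce")],
)
conn.executemany(
    "INSERT INTO stand_species VALUES (?, ?)",
    [("S1", "Pj"), ("S1", "Sb"), ("S2", "Sb")],
)

# One stand can hold many species ...
species_in_s1 = [
    row[0]
    for row in conn.execute(
        "SELECT species_id FROM stand_species "
        "WHERE stand_id = 'S1' ORDER BY species_id"
    )
]
# ... and one species can occur in many stands.
stands_with_sb = [
    row[0]
    for row in conn.execute(
        "SELECT stand_id FROM stand_species "
        "WHERE species_id = 'Sb' ORDER BY stand_id"
    )
]
```

Seen from either side, the junction table turns the many-to-many relationship into two one-to-many relationships.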

Database Design

• In ArcGIS, a relationship class is defined between an origin table and a destination table, and the relationship is either simple or composite.

Primary key Foreign key

Database Design

• In ArcGIS, it is important to get the origin and destination table right.

Sources:

Chang, Kang-tsung. 2006. Introduction to Geographic Information Systems. Third Edition. McGraw Hill Higher Education.

Codd, E.F. 1970. A relational model of data for large shared data banks. Communications of the ACM, Vol. 13. https://www.seas.upenn.edu/~zives/03f/cis550/codd.pdf

Kent, W. 1983. A Simple Guide to Five Normal Forms in Relational Database Theory. Communications of the ACM, Vol. 26(2). http://www.bkent.net/Doc/simple5.htm