©2007 by Austin Troy. All rights reserved
Lecture 6
Database Concepts
Introduction to GIS
By Brian Voigt, University of Vermont
Thanks are due to Dr Troy, upon whose lecture much of this material is based.
Fundamentals of GIS
Lecture slides by Austin Troy & Brian Voigt, University of Vermont, © 2011
How is Data Stored?
People
• typically use a decimal (base 10) number system
• each digit corresponds to 10 to some power
• a number with 3 digits has 10^3 or 1,000 possibilities
Computers
• use a base 2 system: a bit (Binary digIT) has 2 possible values
• a byte is 8 “on-off switches” or bits
• each switch/bit represents one binary digit
• one byte has 2^8 or 256 possibilities
Example: 12,345 (base 10)
1*10,000 + 2*1,000 + 3*100 + 4*10 + 5*1
1*10^4 + 2*10^3 + 3*10^2 + 4*10^1 + 5*10^0
How do binary numbers translate to real numbers?
• Switch combinations determine the decimal number based on the formula:
• N_10 = s_1*2^(b-1) + s_2*2^(b-2) + … + s_b*2^(b-b)
• Where b = number of bits storing the number and s_i = state of the i-th switch (1 for on, 0 for off), counting from the left
• Hence the binary numbers
11111111_2 = 2^7*1 + 2^6*1 + 2^5*1 + 2^4*1 + 2^3*1 + 2^2*1 + 2^1*1 + 2^0*1 = 255_10
11111110_2 = 2^7*1 + 2^6*1 + 2^5*1 + 2^4*1 + 2^3*1 + 2^2*1 + 2^1*1 + 2^0*0 = 254_10
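A minimal sketch of the place-value rule for binary numbers, assuming the bits arrive as a text string with the highest power of 2 on the left. The helper name `bits_to_decimal` is hypothetical and not part of the lecture.

```python
def bits_to_decimal(bits):
    """Sum switch_state * 2**position, with the leftmost bit as the highest power."""
    b = len(bits)  # number of bits storing the number
    total = 0
    for i, ch in enumerate(bits):
        if ch not in "01":
            raise ValueError(f"not a binary digit: {ch!r}")
        total += int(ch) * 2 ** (b - 1 - i)
    return total


print(bits_to_decimal("11111111"))  # 255
print(bits_to_decimal("11111110"))  # 254
```

Python's built-in `int("11111110", 2)` performs the same conversion; the loop is written out only to mirror the sum of powers.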
Number of Possible Values
• Number of possible values for a unit of data is an exponential function of the number of switches
• 2^8 = 256 (eight-bit data)
• 2^16 = 65,536 (sixteen-bit data)
• 2^32 = 4,294,967,296 (thirty-two-bit data)
Number of bits determines data types
Examples of integer data types
• Byte: 2^8 values (0 to 255)
• Short Integer: 2^16 values (ranges from -32,768 to +32,767 in the common two's complement encoding, without decimals; the sixteenth bit determines sign)
• Long Integer: 2^32 values (-2,147,483,648 to +2,147,483,647)
Floating point: precision can go to either side of the decimal place
• Single precision: 32 bit
• Double precision: 64 bit
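As a rough illustration of how bit count sets the range of a data type, the sketch below computes integer ranges and round-trips a decimal value through 32-bit and 64-bit floating point storage. It assumes two's complement for signed integers, which allows one more negative value than a sign-and-magnitude scheme; the function names are hypothetical.

```python
import struct


def unsigned_range(bits):
    """Smallest and largest value of an unsigned integer with this many bits."""
    return 0, 2 ** bits - 1


def signed_range(bits):
    """Range when one bit carries the sign (two's complement assumed)."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


for bits in (8, 16, 32):
    print(bits, 2 ** bits, unsigned_range(bits), signed_range(bits))

# Round-trip 0.1 through single (32-bit) and double (64-bit) precision storage.
single = struct.unpack("<f", struct.pack("<f", 0.1))[0]
double = struct.unpack("<d", struct.pack("<d", 0.1))[0]
print(single == 0.1, double == 0.1)  # False True
```

The single precision copy no longer equals 0.1 because 32 bits hold fewer significant digits than the 64 bits Python uses for its own floats.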
Other Data Types
• Currency
• Date (recognizes order in dates)
• String (text)
When numbers are represented as text they have no numerical properties (e.g. zip codes)
• Boolean (TRUE/FALSE; yes/no)
• Object (“BLOB”, e.g., image file)
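A small sketch of why the choice of data type matters, using made-up values (the zip code and the dates below are illustrative only). Text keeps a leading zero but sorts character by character, while a date type recognizes chronological order.

```python
from datetime import date

# A zip code stored as text keeps its leading zero; as a number it does not.
zip_text = "05401"
zip_number = int(zip_text)
print(zip_text, zip_number)  # 05401 5401

# Numbers stored as text sort alphabetically, not numerically.
values_as_text = ["100", "20", "3"]
print(sorted(values_as_text))                   # ['100', '20', '3']
print(sorted(int(v) for v in values_as_text))   # [3, 20, 100]

# Dates stored as text compare character by character;
# a date type compares them chronologically.
text_order = "3-12-01" < "4-15-00"
date_order = date(2001, 3, 12) < date(2000, 4, 15)
print(text_order, date_order)  # True False
```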
Three Classic Database Models
• Hierarchical
• Network
• Relational - ArcView and ArcInfo use this model
Hierarchical Database Model
A one-to-many method for storing data in a database that looks like a family tree with one root and a number of branches or subdivisions. Problem: table linkages must be known in advance.
[Figure: tree diagram with root “Groovy 70s TV”; genre branches Action shows, Drama, Sitcoms; shows Dukes of Hazzard, CHIPs, Dallas, Fantasy Island, Love Boat, Starsky and Hutch; actors Tom Wopat, Eric Estrada, Larry Wilcox, Larry Hagman, Ricardo Montalban, Gavin McLeod, Ted Lange, David Soul]
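One way to picture the hierarchical model is a nested structure in which every record hangs off exactly one parent. The sketch below is an assumption-laden illustration: the placement of each show under a genre is a guess at the slide's diagram, and `find_path` is a hypothetical helper.

```python
# Root -> genres -> shows -> actors. Genre placement is assumed for illustration.
tree = {
    "Groovy 70s TV": {
        "Action shows": {
            "Dukes of Hazzard": ["Tom Wopat"],
            "CHIPs": ["Eric Estrada", "Larry Wilcox"],
            "Starsky and Hutch": ["David Soul"],
        },
        "Drama": {
            "Dallas": ["Larry Hagman"],
            "Fantasy Island": ["Ricardo Montalban"],
        },
        "Sitcoms": {
            "Love Boat": ["Gavin McLeod", "Ted Lange"],
        },
    }
}


def find_path(node, target, path=()):
    """Depth-first search; returns the one root-to-target path, or None."""
    if isinstance(node, list):  # leaf level: a list of actor names
        return path + (target,) if target in node else None
    for name, child in node.items():
        if name == target:
            return path + (name,)
        found = find_path(child, target, path + (name,))
        if found:
            return found
    return None


print(find_path(tree, "Larry Hagman"))
# ('Groovy 70s TV', 'Drama', 'Dallas', 'Larry Hagman')
```

Because each record has a single parent, there is only one path to any item, which is why the linkages have to be decided before the data are loaded.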
Example where this model works well:
• Taxonomies
• Soil classification
Works when: classes are totally mutually exclusive / exhaustive
Problem:
• Does not work when there are entities that belong to several classes or do not have mutual exclusivity
• Think about the problems with Windows Explorer
• Example: classifying your music collection
Networked Database Model
A database design for storing information by linking all records that are related with a list of “pointers.” Problem: linkages in the tables must be known in advance. Not adaptable to change.
[Figure: network diagram with nodes for genres (Action shows, Drama, Sitcoms), shows (Dukes of Hazzard, CHIPs, Dallas, Fantasy Island, Love Boat, Three’s Company) and networks (ABC, CBS, NBC)]
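A rough sketch of the pointer idea, in which a record may be linked to more than one parent. The `Record` class and `link` helper are hypothetical, and the assignment of shows to genres and networks is assumed purely for illustration.

```python
class Record:
    """A record that stores a list of pointers to related records."""

    def __init__(self, name):
        self.name = name
        self.links = []


def link(a, b):
    """Store the pointer on both records so the link can be followed either way."""
    a.links.append(b)
    b.links.append(a)


action = Record("Action shows")
sitcoms = Record("Sitcoms")
abc = Record("ABC")
cbs = Record("CBS")
dukes = Record("Dukes of Hazzard")
love_boat = Record("Love Boat")

# Unlike the hierarchical model, one show can point to a genre AND a network.
link(dukes, action)
link(dukes, cbs)
link(love_boat, sitcoms)
link(love_boat, abc)

print([r.name for r in dukes.links])  # ['Action shows', 'CBS']
print([r.name for r in abc.links])    # ['Love Boat']
```

Every link is written by hand before any question is asked, which mirrors the limitation noted above: the linkages must be known in advance.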
Relational (Tabular) Database Model
A design used in database systems in which relationships are created between two or more flat files or tables, based on the idea that each pair of related tables has a field in common, or “key”. In a relational database, the records are generally different in each table.
The advantages:
• each table can be prepared and maintained separately
• tables can remain separate until a query requires connecting, or relating, them
• relationships can be one to one, one to many, or many to one
Data Tables (flat files)
Name    Phone    Address    Student ID
***     ***      ***        ***
***     ***      ***        ***
***     ***      ***        ***
***     ***      ***        ***
***     ***      ***        ***
• Headings are the labels for the columns
• Fields, or columns, are attribute categories
• Records, or rows, are the unit that the data are specific to
• Cells are where individual values of a record for a field are stored
[Figure callouts on the table: RECORD (a row), FIELD (a column), CELL (a single value)]
Data key
The key is a field that is common to two or more flat files; it allows a query to be done across multiple tables or allows two tables to be joined
Flat file: professor info
Name    Phone    Address    Faculty ID
***     ***      ***        ***
***     ***      ***        ***
***     ***      ***        ***
***     ***      ***        ***
***     ***      ***        ***
Flat file: course info
Course name    Course number    Enrollment    Faculty ID
***            ***              ***           ***
***            ***              ***           ***
***            ***              ***           ***
***            ***              ***           ***
***            ***              ***           ***
Joining Tables
• Based on the values of a field that can be found in both tables
• The name of the field does not have to be the same
• The data type has to be the same
• In this case we have a one to one join; here the key is unique
Table 1
Key    A    B
1      1    2
2      3    4
3      5    6
Table 2
ID    C
1     10
2     20
3     50
Joined Table
Key    A    B    C
1      1    2    10
2      3    4    20
3      5    6    50
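A minimal sketch of a one-to-one join in plain Python, using the Table 1 and Table 2 values from the slide (reading the A and B values row by row is an assumption). The `join` helper is hypothetical; it shows that the key fields may have different names but must hold the same data type.

```python
table1 = [
    {"Key": 1, "A": 1, "B": 2},
    {"Key": 2, "A": 3, "B": 4},
    {"Key": 3, "A": 5, "B": 6},
]
table2 = [
    {"ID": 1, "C": 10},
    {"ID": 2, "C": 20},
    {"ID": 3, "C": 50},
]


def join(left, right, left_key, right_key):
    """Attach the matching right-hand row to every left-hand row."""
    lookup = {row[right_key]: row for row in right}
    joined = []
    for row in left:
        match = lookup.get(row[left_key])
        if match is not None:
            extra = {k: v for k, v in match.items() if k != right_key}
            joined.append({**row, **extra})
    return joined


joined = join(table1, table2, "Key", "ID")
for row in joined:
    print(row)

# If one table stores the key as text, nothing matches: 1 is not "1".
table2_text = [{"ID": str(row["ID"]), "C": row["C"]} for row in table2]
print(join(table1, table2_text, "Key", "ID"))  # []
```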
Join Tables
In this case we have a one to many join; here the key is not unique
Table 1
Key    A    B
1      1    2
1      3    4
2      5    6
Table 2
Key    C
1      10
2      20
Joined Table
Key    A    B    C
1      1    2    10
1      3    4    10
2      5    6    20
Relational (Tabular) Database Model: 70s TV example
Now we can have various flat files (tables) with different record types and with various attributes specific to each record
Table 1 - specific to actors
Actor           Year born*    Sideburn length
David Soul      1948          serious
Eric Estrada    1949          moderate
Larry Wilcox    1953          slight
Tom Wopat       1950          major
Table 2 - specific to shows
Show                 Lead actor      Co-star                Network*
Starsky and Hutch    David Soul      Paul Michael Glaser    ABC
CHIPs                Eric Estrada    Larry Wilcox           CBS
Dukes                Tom Wopat       John Schneider         NBC
*entirely guessed at - I am not responsible for mistaken TV trivia
Relational (Tabular) Database Model
This allows queries that go across tables, like: which CBS lead actors were born before 1951? Answer: Tom Wopat and David Soul
It does this by combining information from the two tables, using common key fields
Actor           Year born*    Sideburn length
David Soul      1948          serious
Eric Estrada    1949          moderate
Larry Wilcox    1953          slight
Tom Wopat       1950          major
Show                 Lead actor      Co-star                Network*
Starsky and Hutch    David Soul      Paul Michael Glaser    CBS
CHIPs                Eric Estrada    Larry Wilcox           ABC
Dukes                Tom Wopat       John Schneider         CBS
*entirely guessed at - I am not responsible for mistaken TV trivia
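The cross-table question on this slide can be sketched as a SQL query run through Python's built-in `sqlite3` module. The table and column names are assumptions made for this example, and the values are the slide's own (self-described as guessed) data.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE actors (actor TEXT PRIMARY KEY, year_born INTEGER, sideburns TEXT);
    CREATE TABLE shows (title TEXT PRIMARY KEY, lead_actor TEXT, co_star TEXT, network TEXT);
    INSERT INTO actors VALUES
        ('David Soul', 1948, 'serious'),
        ('Eric Estrada', 1949, 'moderate'),
        ('Larry Wilcox', 1953, 'slight'),
        ('Tom Wopat', 1950, 'major');
    INSERT INTO shows VALUES
        ('Starsky and Hutch', 'David Soul', 'Paul Michael Glaser', 'CBS'),
        ('CHIPs', 'Eric Estrada', 'Larry Wilcox', 'ABC'),
        ('Dukes', 'Tom Wopat', 'John Schneider', 'CBS');
""")

# The actor name is the common key: lead_actor in shows matches actor in actors.
query = """
    SELECT a.actor
    FROM shows AS s
    JOIN actors AS a ON a.actor = s.lead_actor
    WHERE s.network = 'CBS' AND a.year_born < 1951
    ORDER BY a.actor
"""
lead_actors = [row[0] for row in con.execute(query)]
print(lead_actors)  # ['David Soul', 'Tom Wopat']
```

Neither table alone can answer the question: the network lives in the shows table and the birth year in the actors table.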
Relational (Tabular) Database Model
Object-relational databases can contain other objects as well, like images, video clips, executable files, sounds, links
Actor           Year born*    Sideburn length    Picture
David Soul      1948          serious            [image]
Eric Estrada    1949          moderate           [image]
Larry Wilcox    1953          slight             [image]
Tom Wopat       1950          major              [image]
Relational Database: another example: property lot info
One-to-one relationship
Parcel ID    Street address    Zoning
11           15 Maple St.      Residential-1
12           85 Brooks Ave     Commercial-2
13           74 Windam Ct.     Residential 4

Owner        Parcel ID    Occupation
J. Smith     13           lawyer
R. Jones     11           dentist
T. Flores    12           Real estate developer
Relational database: one to many relationship
One-to-many relationship
Parcel ID    Street address    Zoning
11           15 Maple St.      Residential-1
12           85 Brooks Ave     Commercial-2
13           74 Windam Ct.     Residential 4

Owner        Parcel ID    Occupation
J. Smith     13           lawyer
R. Jones     11           dentist
J. McCann    12           financier
T. Flores    12           Real estate developer
In this case, several people co-own the same lot, so it is no longer one lot, one person
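A minimal sketch of the parcel example with Python's built-in `sqlite3` module, using the slide's values. The table and column names are assumptions for illustration. Joining on parcel ID returns parcel 12 twice, once per co-owner.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE parcels (parcel_id INTEGER PRIMARY KEY, street_address TEXT, zoning TEXT);
    CREATE TABLE owners (owner TEXT, parcel_id INTEGER, occupation TEXT);
    INSERT INTO parcels VALUES
        (11, '15 Maple St.', 'Residential-1'),
        (12, '85 Brooks Ave', 'Commercial-2'),
        (13, '74 Windam Ct.', 'Residential 4');
    INSERT INTO owners VALUES
        ('J. Smith', 13, 'lawyer'),
        ('R. Jones', 11, 'dentist'),
        ('J. McCann', 12, 'financier'),
        ('T. Flores', 12, 'Real estate developer');
""")

# parcel_id is unique in parcels but repeated in owners: one parcel, many owners.
parcel_owners = con.execute("""
    SELECT p.parcel_id, p.street_address, o.owner
    FROM parcels AS p
    JOIN owners AS o ON o.parcel_id = p.parcel_id
    ORDER BY p.parcel_id, o.owner
""").fetchall()

for row in parcel_owners:
    print(row)
```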
Assuming each owner owned several parcels, we would structure the database differently
One-to-many relationship
Parcel ID    Street address    Zoning
11           15 Maple St.      Residential-1
12           85 Brooks Ave     Commercial-2
13           74 Windam Ct.     Residential 4

Owner        Occupation               # properties owned
J. Smith     lawyer                   2
R. Jones     dentist                  5
J. McCann    financier                2
T. Flores    Real estate developer    3

Properties owned by T. Flores
Owner     Parcel ID    Date of transaction
Flores    13           4-15-00
Flores    15           4-17-01
Flores    19           3-12-99
Note: this table includes data pertinent only to Flores’ ownership of these properties
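A small sketch of how the ownership table can be used, with the Flores rows from the slide. It assumes the dates are written month-day-year with a two-digit year; the helper name `parcels_by_date` is hypothetical.

```python
from datetime import datetime

# One row per (owner, parcel) pair, as in the "Properties owned" table.
ownership = [
    {"owner": "Flores", "parcel_id": 13, "date": "4-15-00"},
    {"owner": "Flores", "parcel_id": 15, "date": "4-17-01"},
    {"owner": "Flores", "parcel_id": 19, "date": "3-12-99"},
]


def parcels_by_date(rows, owner):
    """Parcel IDs for one owner, oldest transaction first."""
    owned = [r for r in rows if r["owner"] == owner]
    # Assumed date layout: month-day-two-digit-year.
    owned.sort(key=lambda r: datetime.strptime(r["date"], "%m-%d-%y"))
    return [r["parcel_id"] for r in owned]


flores_parcels = parcels_by_date(ownership, "Flores")
print(flores_parcels)       # [19, 13, 15]
print(len(flores_parcels))  # 3
```

The count of 3 matches the "# properties owned" value listed for T. Flores, so that figure could be derived from the ownership table rather than stored separately.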
Example
Here’s an example of a chart showing the relationships between flat files in a sample relational database for food suppliers* in Microsoft Access
* This comes from an MS ACCESS sample database
A real-time RDBMS allows for real-time linking and embedding of tables based on common fields
Here we see all the orders for product ID 3; there is no need to include product ID in that sub-table