the information school of the university of washington info-340: database management &...
TRANSCRIPT
The Information School of the University of Washington
INFO-340: Database Management &
Information Retrieval
David Hendry
Class L-02
INFO-340: Class 2 2
Topics
• Information Systems
• Database systems: Short History
• Three-level ANSI-SPARC Architecture
• Functions of a DBMS
Q & A
Syllabus
Assignment #1
Information Systems
• Examples
– Airline reservation system
– ATM network
– File system on a PC
– CD collection at home
– Museum or art gallery
– Website
– File sharing system
– A personal stamp collection or family scrapbook
• An Information System: The resources that enable the collection, management, control, and dissemination of information throughout an organization
Components of an Information System
• Stakeholders
– Management
– Division workers
– Customers
– Partners
• Inputs & Outputs
– Traffic
– Sales
• Data
– Plans
– Calendars & events
– Part assemblies
– Business transactions
• Procedures
– Updating data
– Transferring data
Components of Systems
[Diagram labels: Environment, System, Supplier, Customer, Input, Process, Output, Stakeholder]
[Diagram labels: System, Sub-system, Boundary]
Three Key Ideas
• Systems are hierarchical
– Systems consist of sub-systems
• Systems are nearly decomposable
– Interaction between subsystems is weak
• System boundaries are arbitrary
– Where you set a boundary requires judgment
Class Exercise: Museum as Information System
• What questions should you answer?
Museum as Information System
• Who are the stakeholders?
• What is the environment?
• What are the inputs, processes & outputs?
• Where are the system boundaries?
• How does the system decompose hierarchically?
• Where does the strict decomposition fail?
• Where are the feedback loops?
Components of Systems
• Environment: Where the system operates
• System: Interacting components that work together to complete a function
• Subsystem: A system is made up of other systems (HIERARCHICAL)
• Boundary: What is inside and outside the system
• Inputs & Outputs: Material flowing into and out of a system
• Process: What gets done?
Development Lifecycle
• Define: Vision/scope, needs assessment
– Vision/scope document
• Design: Invent the technological solution
– Design specifications document
• Develop: Build the technology
– Beta software
• Deploy: Deliver stable technology
– Version release
Database Development
1. Analysis of functional requirements
2. Conceptual design
3. Logical design
4. Physical design
5. Implement
6. Test
7. Maintain
Database Systems
Evolution of Database Systems
• File-based systems (1950s – now)
– Application programs process files
• 1st Generation (mid 1960s – mid 1980s)
– Hierarchical & Network databases
• 2nd Generation (mid 1970s – now)
– Relational database systems
• 3rd Generation (early 1990s – now)
– Object-oriented database systems
File Systems
• Application programs manage their own data files and produce reports
• Collections of programs were often organized by functional area (payroll vs. personnel)
File-Based Data Processing
[Diagram labels: Payroll System, Personal Data, Tax Data, Projects Data, Project Management System, Personal Data, S1, S2]
Weaknesses
• Program-data dependence
• Separation and isolation of data
• Duplication of data
• Incompatibility of files
• Many, many application programs
Key Lessons Learned
1. Program-data independence is good
– Programs should not be responsible for the definition of data formats
2. Centralized control of data access is good
– Programs should not be responsible for security, access control, and certain kinds of data integrity
1st Generation: Record-Based DBMS
• To address these problems, two types of databases were developed in the 1960s and early 1970s
– Network data models
– Hierarchical data models
Hierarchical/Network Data Model
[Diagram labels: Courses, Students]
• Collections of ‘records’
• Pointers used to create ‘sets’
Lessons Learned
• Better on
– Data independence
– Sharing data
• However, complex application programming
– Chasing ‘pointers’ to navigate data
2nd Generation: Relational Model
• Data modeled as tables, rows, and columns
• No pointer chasing
• Grounded in theory (relational algebra)
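To make the contrast with ‘pointer chasing’ concrete, here is a minimal sketch in Python. The student and course data, the field name `course_ptrs`, and both function names are assumptions made up for illustration; the relational half uses the standard-library `sqlite3` module rather than any system named in the lecture.

```python
import sqlite3

# Navigational (1st-generation) style: each student record holds
# "pointers" (here, dictionary keys) to the course records it belongs to.
courses = {"C1": {"title": "Databases"}, "C2": {"title": "Networks"}}
students = [
    {"name": "Ann", "course_ptrs": ["C1", "C2"]},
    {"name": "Bob", "course_ptrs": ["C1"]},
]

def students_in_course_navigational(title):
    """The application program itself walks the pointers."""
    found = []
    for student in students:
        for ptr in student["course_ptrs"]:
            if courses[ptr]["title"] == title:
                found.append(student["name"])
    return sorted(found)

# Relational (2nd-generation) style: the same facts as tables, rows, columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE course (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE enrollment (student_id INTEGER, course_id INTEGER);
INSERT INTO student VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO course VALUES (1, 'Databases'), (2, 'Networks');
INSERT INTO enrollment VALUES (1, 1), (1, 2), (2, 1);
""")

def students_in_course_relational(title):
    """The program states what it wants; the DBMS decides how to find it."""
    rows = conn.execute(
        """
        SELECT s.name
        FROM student s
        JOIN enrollment e ON e.student_id = s.id
        JOIN course c ON c.id = e.course_id
        WHERE c.title = ?
        ORDER BY s.name
        """,
        (title,),
    )
    return [row[0] for row in rows]
```

Both functions answer the same question, but only the first one has to know how the records are linked together.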
3rd Generation: Object-Oriented Database Management Systems
• Domain objects (entities, relationships, etc.) modeled directly rather than with tables, rows, columns
• Very important in Engineering Domains
Three-level ANSI-SPARC architecture
External Level
• Different users require different data views
– Specific information for goals, job roles, etc.
• Some information is derived/calculated
– Dynamic calculations (age)
– Complex combinations of data
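One way to picture an external view with a derived value is a SQL view, sketched here with Python's standard-library `sqlite3` module. The `staff` table, its rows, the view name `staff_directory`, and the one-row `clock` table are all assumptions for illustration; `clock` pins "today" to a fixed date only so the result is reproducible, where a real view would more likely use the current date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual level: what is stored (hypothetical example data).
CREATE TABLE staff (
    id INTEGER PRIMARY KEY,
    name TEXT,
    birth_date TEXT,
    salary REAL
);
INSERT INTO staff VALUES
    (1, 'Ann', '1980-06-15', 52000),
    (2, 'Bob', '1975-01-02', 61000);

-- Fixed reference date so the derived age is reproducible (an assumption).
CREATE TABLE clock (today TEXT);
INSERT INTO clock VALUES ('2005-01-10');

-- External level: a view for users who need names and ages but not
-- salaries. Age is calculated on demand, never stored.
CREATE VIEW staff_directory AS
SELECT s.name,
       CAST(strftime('%Y', c.today) AS INTEGER)
       - CAST(strftime('%Y', s.birth_date) AS INTEGER)
       - (strftime('%m-%d', c.today) < strftime('%m-%d', s.birth_date)) AS age
FROM staff s, clock c;
""")

rows = conn.execute(
    "SELECT name, age FROM staff_directory ORDER BY name"
).fetchall()
```

The view exposes only `name` and `age`, so a user of `staff_directory` never sees the stored `birth_date` or `salary` columns.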
Conceptual Level
• What data is stored and the relationships between the data
• Key concerns:
– Entities, attributes, relationships
– Data types
– Constraints
– Security and integrity info
Internal Level
• How the data is stored
– Optimal run-time performance
– Optimal space utilization
• Key concerns:
– Storage space for data and indices
– Record size and placement
– Data compression and encryption
Schemas: Contain information for mapping from one level to the next
Data Independence
• Logical data independence
– Changes in the conceptual schema do not cause the external schemas to ‘break’ (if they fail, they fail gracefully)
• Physical data independence
– Changes to the internal schema do not cause the conceptual schema to ‘break’
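Both kinds of independence can be sketched with Python's standard-library `sqlite3` module. The `course` table, the view `course_titles`, and the index name are assumptions for illustration. Adding an index stands in for an internal-schema change, and adding a column stands in for a conceptual-schema change; real systems vary in how far this independence extends.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE course (id INTEGER PRIMARY KEY, title TEXT);
INSERT INTO course (title) VALUES ('Databases'), ('Networks');
-- External schema: the only thing this "application" depends on.
CREATE VIEW course_titles AS SELECT title FROM course;
""")

def titles():
    """The application's query, written once against the external view."""
    rows = conn.execute("SELECT title FROM course_titles ORDER BY title")
    return [row[0] for row in rows]

before = titles()

# Internal-schema change: add an index. Storage and access paths change,
# but neither the table definition nor the query has to.
conn.execute("CREATE INDEX idx_course_title ON course (title)")
after_index = titles()

# Conceptual-schema change: add a column. The view, and therefore the
# application, is unaffected.
conn.execute("ALTER TABLE course ADD COLUMN credits INTEGER")
after_column = titles()
```

The same `titles()` call returns the same answer before and after both changes, which is the property the slide calls not ‘breaking’.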
Class Exercise
1. Working in teams of 3-4, select an example database application and sketch a picture of:
– External schema
– Conceptual schema
– Internal schema
2. Give an example of data independence and data dependence
Functions of DBMS
(See Chapter #2)
1. Data storage, retrieval, and update
2. A user-accessible catalog
3. Transaction support
4. Concurrency control
5. Recovery services
6. Authorization services
7. Support for data communication
8. Integrity services
Summary
• Evolution of Database Systems
– File-based
– 1st – 3rd generation systems
• Three-level ANSI-SPARC Architecture
• Functions of a DBMS
Data storage, retrieval, and update
Ability to store, retrieve and update data
Key idea: Hide internal representation of how this is achieved
A user-accessible catalog
• Provide users with a catalog that completely describes the database
– Tables and relationships
– Names, types and sizes of data items
– Etc.
• Purposes:
– “Self revealing” for understanding data
– Data integrity and security is enforced
– Store auditing information
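As an illustration of a ‘self revealing’ catalog, SQLite lets a program ask the database what it contains. The sketch below uses Python's standard-library `sqlite3` module. The `staff` and `task` tables and the helper names `list_tables` and `describe` are assumptions for illustration; other DBMSs expose their catalogs through different tables and commands.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT NOT NULL, salary REAL);
CREATE TABLE task (
    id INTEGER PRIMARY KEY,
    title TEXT,
    staff_id INTEGER REFERENCES staff (id)
);
""")

def list_tables(conn):
    """Ask the catalog (sqlite_master) which tables exist."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]

def describe(conn, table):
    """Return (column name, declared type) pairs for one table.

    PRAGMA table_info yields rows of (cid, name, type, notnull,
    dflt_value, pk). The table name is interpolated into the statement,
    so pass only trusted names.
    """
    return [(row[1], row[2]) for row in conn.execute(f"PRAGMA table_info({table})")]
```

For example, `describe(conn, "staff")` lists the names and types of the data items in `staff` without the caller having seen the `CREATE TABLE` statement.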
Transaction support
• A transaction is a series of actions
– Example: Staff member quits
1. Delete staff member from database
2. Re-assign responsibilities to another staff member
• Issue: Must avoid putting the database into an inconsistent state
• Thus: All steps of a transaction are completed or none are completed
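The ‘staff member quits’ example can be sketched with Python's standard-library `sqlite3` module, where using the connection as a context manager commits on success and rolls back if an exception is raised. The table layout, the sample rows, the function name `staff_member_quits`, and the `fail_midway` flag (which simulates a crash between the two steps) are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE task (id INTEGER PRIMARY KEY, title TEXT, staff_id INTEGER NOT NULL);
INSERT INTO staff VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO task VALUES (10, 'Payroll run', 1), (11, 'Audit', 1);
""")

def staff_member_quits(conn, leaving_id, replacement_id, fail_midway=False):
    """Run both steps as one transaction: all of them happen, or none do."""
    try:
        with conn:  # commits on normal exit, rolls back on an exception
            # Step 1: delete the staff member.
            conn.execute("DELETE FROM staff WHERE id = ?", (leaving_id,))
            if fail_midway:
                # Simulated failure between the two steps (hypothetical).
                raise RuntimeError("simulated crash")
            # Step 2: re-assign responsibilities to another staff member.
            conn.execute(
                "UPDATE task SET staff_id = ? WHERE staff_id = ?",
                (replacement_id, leaving_id),
            )
        return True
    except RuntimeError:
        return False
```

If the failure happens after step 1, the rollback restores the deleted row, so no task is left pointing at a staff member who no longer exists.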
Concurrency control
• Ensuring that multiple users do not conflict with each other and put the database into an inconsistent state
• Easy for read-only situations
• Hard when multiple users can read and write
• See lost-update problem
Recovery services
• Databases ‘crash’
– Power goes out
– Disks and CPUs fail
– Intruders cause systems to fail
– Etc.
• Provide a method for recovering the database and returning it to a consistent state
Authorization services
• Depending on job role, users have access to different information and operations
– Querying data
– Changing data
– Deleting data
– Adding data
• Must be able to give ‘access permissions’ to people
Support for data communication
• Ability to access central databases from remote client locations
– This idea, of course, ‘powers the web’
• Databases must handle requests and responses
Integrity services
• Rules that specify the valid states of the data within the database
• Examples
– Every employee must have a manager
– Managers supervise a max of 10 employees
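The two example rules can be sketched as declarative constraints, here in SQLite via Python's standard-library `sqlite3` module. The table layout, the trigger name `max_ten_reports`, and the helper `try_insert` are assumptions for illustration. A `NOT NULL` foreign key stands in for ‘every employee must have a manager’, and a trigger for the limit of 10. Other DBMSs offer different mechanisms for the second rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
conn.executescript("""
CREATE TABLE manager (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE employee (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    -- Rule 1: every employee must have a manager, and it must exist.
    manager_id INTEGER NOT NULL REFERENCES manager (id)
);
-- Rule 2: a manager supervises at most 10 employees.
CREATE TRIGGER max_ten_reports
BEFORE INSERT ON employee
WHEN (SELECT COUNT(*) FROM employee WHERE manager_id = NEW.manager_id) >= 10
BEGIN
    SELECT RAISE(ABORT, 'a manager supervises at most 10 employees');
END;
INSERT INTO manager VALUES (1, 'Mia');
""")

def try_insert(name, manager_id):
    """Attempt to add an employee; return False if the DBMS rejects the row."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO employee (name, manager_id) VALUES (?, ?)",
                (name, manager_id),
            )
        return True
    except sqlite3.DatabaseError:
        return False
```

The rules live in the database rather than in each application program, so every program that inserts employees is held to them.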
4GLs
• High-level applications that are ‘closer’ to users goals
• Example types (e.g., Access):
– Form generators
– Report generators
– Graphics generators
– Application generators