
Foundations of Databases (Datenbanken I)

Prof. Dr. Torsten Grust

U Tübingen, Database Systems

[email protected]

Summer term 2009

Welcome to the completely rectangular world of ...

... Relational Database Management Systems (RDBMSs).

What this course is about:

• Convince you that there is more to database technology than just open-file(), read()/write(), close-file().

• Make you see how versatile the strictly tabular data model supported by relational databases can be.

• Make you best friends with SQL, the principal language spoken by relational database systems.

• We will encounter a healthy mix of good, clean theory and highly relevant CS practice: knowledge of RDBMSs and SQL makes you sexy (in a sense).


Administrativa (1)

Lectures

Time slot                Room
Monday, 15:15–16:45      Sand 6/7, gr. Hörsaal
Tuesday, 08:15–09:45     Sand 6/7, gr. Hörsaal

Practice

Time slot                Room
Thursday, 10:15–11:45    Sand 6/7, kl. Hörsaal

Will this be a good fit for most of you? Please speak up.

Administrativa (2)

End-term Exam

• 90-minute examination on Monday, July 20th, 15:15.

• You may bring an A4 double-sided hand-written cheat sheet.

• Passing earns you 6 ECTS.

Assignments and Grading

• We will distribute, collect, and grade weekly assignments.

• You may (and you should) work in teams of two.

• Scoring 2/3 of the overall points in the assignments earns you an additional 2 ECTS.


Administrativa (3)

Course web home

http://www-db.informatik.uni-tuebingen.de/teaching/ss09/db1

• Download slides (PDF; please bring a print-out and take notes)

• Download assignments, sample tabular data, code snippets, ...

• Please check now and then (“... assignment unsolvable as given ...”, “... no lecture on ...”, etc.)

• Contact information: just drop by our offices; send e-mail first if you require specific help or longer attention.

These Slides

Examples

Definitions

Code snippets

Quizzes

• A specific slide set suitable for printing (lighter colors, ...) will be available on the course web home.


Read a Book, Write some SQL

Text book (any introductory book is probably fine; ask me)

Alfons Kemper, André Eickler
Datenbanksysteme: Eine Einführung
Oldenbourg Verlag
6th or 7th ed.

Install IBM DB2 V9.5 Express-C

• DB2 Express-C: full-featured, fast, freely available.

• We will bring it with us for almost any lecture.

• Download @ http://db2express.com/


ActiveRecord (Ruby on Rails)

• If time permits, we will close the course with an introduction to and overview of ActiveRecord.

  ◦ ActiveRecord enables a truly seamless embedding of database access and query functionality into programming and scripting languages (here: Ruby).

  ◦ You write Ruby fragments; the ActiveRecord framework generates equivalent SQL commands for you.

  ◦ ActiveRecord is the glue between relational databases and front-end web applications, usually developed using Ruby on Rails.

Introduction

• After completing this chapter, you should be able to:

  ◦ explain basic notions: database state, schema, query, update, data model, DDL, DML,

  ◦ explain the role of the DBMS,

  ◦ explain data independence, declarativity, and the three-schema architecture,

  ◦ name different classes of users of a database application system,

  ◦ name some DBMS tools.


Introduction

Overview

1. Basic Database Notions

2. Database Management Systems (DBMS)

3. Programmer’s View, Data Independence

4. Database Users and Database Tools

Task of a Database (1)

• What is a database? Difficult question. There is no precise and generally accepted definition.

• Naïve approach: The main task of the database system (DBS) is to answer certain questions about a subset of the real world, e.g.

Questioning a DBS

“Which homeworks has Ann Smith completed?” → Database System → 1, 2


Task of a Database (2)

• The DBS acts only as a store for information. The information must first be entered and then kept current.

Keeping a DBS current

“Ann Smith has done Homework 3 and received 10 points for it.” → Database System → ok.

• A DBS is a computerized version of a card-index box/filing cabinet (but more powerful and efficient).

Task of a Database (3)

• Normal database systems do not perform particularly complicated computations on the stored data in order to answer questions.

• However, a DBS can retrieve the requested data quickly from a huge set of data (gigabytes, terabytes, ≫ main memory size).

• A DBS can also aggregate/combine several pieces of stored data to answer more complex questions (“Compute the average points for Homework 3.”)
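The aggregation question quoted above can be sketched as a single query. This is a minimal illustration, not the course's own code: it assumes Python's standard sqlite3 module with an in-memory SQLite database as a stand-in for DB2, borrows the SOLVED table that the slides introduce later, and uses made-up sample rows.

```python
import sqlite3

# In-memory SQLite database: an assumed stand-in for a full RDBMS such as DB2.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE SOLVED (STUDENT VARCHAR(40), HOMEWORK NUMERIC(2), POINTS NUMERIC(2))"
)
# Made-up sample rows for illustration only.
con.executemany(
    "INSERT INTO SOLVED VALUES (?, ?, ?)",
    [("Ann Smith", 3, 10), ("Michael Jones", 3, 7), ("Ann Smith", 2, 8)],
)

# "Compute the average points for Homework 3."
avg_points = con.execute(
    "SELECT AVG(POINTS) FROM SOLVED WHERE HOMEWORK = 3"
).fetchone()[0]
print(avg_points)
```

The DBS combines two stored rows into one answer; the row for Homework 2 does not contribute.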


Task of a Database (4)

• Above, the question “Which homeworks has Ann Smith completed?” was shown in natural (English) language.

• Making machines understand natural language is a tough task (and bears a large potential for misunderstandings).

• Therefore, questions (or queries) are normally written in a formal language, these days typically SQL.

SQL

SQL ≡ Structured Query Language, developed at IBM in the 1970s, first standardized in 1986; later revisions include SQL:2003.

Pronounced S-Q-L, or Sequel.

State, Query, Update

• The set of stored data is called the database state:

Query

    SELECT HOMEWORK
    FROM SOLVED
    WHERE STUDENT = 'Ann Smith'

Current State → Query → Answer

• Entering, modifying, or deleting information changes the database state:

Update

    INSERT INTO SOLVED
    VALUES ('Ann Smith', 3, 10)

Current State → Update → New State
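A small sketch of the state/query/update cycle, assuming Python's sqlite3 module and an in-memory SQLite database in place of DB2. The two initial rows are made-up sample data, and the ORDER BY clause is added here only to make the output order predictable.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # assumed stand-in for the course's DBMS
con.execute(
    "CREATE TABLE SOLVED (STUDENT VARCHAR(40), HOMEWORK NUMERIC(2), POINTS NUMERIC(2))"
)
# Current state: two made-up rows.
con.executemany(
    "INSERT INTO SOLVED VALUES (?, ?, ?)",
    [("Ann Smith", 1, 10), ("Ann Smith", 2, 8)],
)

query = "SELECT HOMEWORK FROM SOLVED WHERE STUDENT = 'Ann Smith' ORDER BY HOMEWORK"

# A query reads the current state and leaves it unchanged.
before = [row[0] for row in con.execute(query)]

# An update turns the current state into a new state.
con.execute("INSERT INTO SOLVED VALUES ('Ann Smith', 3, 10)")

# The same query, evaluated against the new state, gives a different answer.
after = [row[0] for row in con.execute(query)]
print(before, after)
```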


Structured Information (1)

• Each database can store only information of a predeclared structure (a limited domain of discourse):

Structure mismatch

“Today’s special in the cafeteria is pizza.” → Homework DBS → Error.

• Because the data are structured, not simply text, complex query formulations are possible, e.g. “How many homeworks has each student done?”
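Because every row has the same declared structure, the per-student question above reduces to grouping and counting. The sketch below is one possible formulation, assuming Python's sqlite3 module, an in-memory SQLite database, the SOLVED table from the surrounding slides, and made-up rows.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # assumed stand-in for the course's DBMS
con.execute(
    "CREATE TABLE SOLVED (STUDENT VARCHAR(40), HOMEWORK NUMERIC(2), POINTS NUMERIC(2))"
)
con.executemany(
    "INSERT INTO SOLVED VALUES (?, ?, ?)",
    [  # made-up sample rows
        ("Ann Smith", 1, 10),
        ("Ann Smith", 2, 8),
        ("Ann Smith", 3, 10),
        ("Michael Jones", 1, 9),
    ],
)

# "How many homeworks has each student done?"
counts = con.execute(
    "SELECT STUDENT, COUNT(*) FROM SOLVED GROUP BY STUDENT ORDER BY STUDENT"
).fetchall()
print(counts)
```

A sentence about cafeteria pizza has no place in this table, which is exactly why the grouping is possible.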

Structured Information (2)

• Actually, a database system stores only plain data (character strings, numbers), and not information.

• Data becomes information by interpretation.

• Therefore, real-world concepts like students, homework, cafeterias, etc., need to be defined/declared before the database can be used.

A pure text database?

Which types of questions could we pose on a DBS storing text (character strings) only with no further structure provided?


State vs. Schema (1)

• Database Schema:

  ◦ Formal definition of the structure of the database contents.

  ◦ Determines the possible database states.

  ◦ Defined only once (when the DB is created).

  ◦ In a programming language, this corresponds to variable declaration (assigning a type to a variable).

Variable declaration

Example: variable declaration in C: short int i

Possible states of variable i? -32768 ≤ i ≤ 32767

State vs. Schema (2)

• Database State (Instance of the Schema):

  ◦ Contains the actual data, structured according to the schema.

  ◦ Changes often (whenever database information is updated).

  ◦ Corresponds to current contents/value of a programming language variable.

Variable state change

In state s, variable i has value 41. Now perform state change (s to s′) via assignment i = i + 1.


State vs. Schema (3)

• In the relational model, the data is structured in the form of tables (relations).

• Each table has a name, a sequence of named columns (attributes), and a set of rows (tuples).

A table

SOLVED (table name and column headings: DB schema; rows: DB state)

STUDENT         HOMEWORK   POINTS
Ann Smith       1          10
Ann Smith       2          8
Michael Jones   1          9
Michael Jones   2          9

Data Model (1)

• Defines a formal language (syntax & semantics) for

  ◦ declaring database schemas,

  ◦ querying the current database state,

  ◦ changing the database state. [1]

• Examples: (Network Model, Hierarchical Model), Relational Model, Entity Relationship Model, Object-Oriented Models, UML, XML.

[1] “Data model” is, regrettably, widely used for “database schema”.


Introduction

Overview

1. Basic Database Notions

2. Database Management Systems (DBMS)

3. Programmer’s View, Data Independence

4. Database Users and Database Tools

DBMS (1)

• A Database Management System (DBMS) is an application-independent software system that implements a data model, i.e., allows for

  ◦ definition of a DB schema for some concrete application,

  ◦ storage of an instance of this schema on, e.g., a disk,

  ◦ querying the current instance (database state),

  ◦ changing the database state.

Application-independent vs. concrete application

Since a DBMS is application-independent, how can the DBMS ensure that it interprets the stored application data correctly?


DBMS (2)

• Normal users do not need to use SQL for their daily tasks of data entry or data lookup.

• These users use application programs that have been developed specifically for this task and offer a more accessible user interface.

• Internally, these application programs translate the user requests into SQL statements (queries, updates) in order to communicate with the DBMS.
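A hedged sketch of what such an application program might look like internally. The function names `completed_homeworks` and `record_result` are hypothetical, and Python's sqlite3 module with an in-memory SQLite database stands in for a real client connecting to DB2. The user only ever calls the functions; the SQL stays inside.

```python
import sqlite3


def completed_homeworks(con, student):
    """Hypothetical lookup screen: translates a user request into a SQL query."""
    cur = con.execute(
        "SELECT HOMEWORK FROM SOLVED WHERE STUDENT = ? ORDER BY HOMEWORK",
        (student,),
    )
    return [row[0] for row in cur]


def record_result(con, student, homework, points):
    """Hypothetical data-entry form: translates a user request into a SQL update."""
    con.execute(
        "INSERT INTO SOLVED (STUDENT, HOMEWORK, POINTS) VALUES (?, ?, ?)",
        (student, homework, points),
    )


con = sqlite3.connect(":memory:")  # assumed stand-in for the DBMS server
con.execute(
    "CREATE TABLE SOLVED (STUDENT VARCHAR(40), HOMEWORK NUMERIC(2), POINTS NUMERIC(2))"
)
record_result(con, "Ann Smith", 1, 10)
record_result(con, "Ann Smith", 2, 8)
print(completed_homeworks(con, "Ann Smith"))
```

The `?` placeholders let the database driver pass the user's input as data, so the application never pastes it into the SQL text.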

DBMS (3)

• Often, several different application programs are used to access the same centralized database.

• For example, the Homework DBS might provide:

  ◦ A read-only web interface for students.

  ◦ A program used by the TA (Hiwi) to load homework and exam points.

  ◦ A program that prints a report for the professor used to assign grades.

• The interactive SQL interface (SQL console) that comes with the DBMS is simply yet another way to access the DBMS.


DBMS (4)

[Figure: User A ↔ Application Program ↔ Database Management System (DBMS); User B ↔ DBMS Tool (e.g., SQL console) ↔ DBMS; the DBMS in turn accesses the DB Schema and the DB State.]

DB Application Systems (1)

• Often, different users access the same database concurrently (i.e., at the same time, touching the same data).

• The DBMS is usually implemented as a background server process (or set of such processes) that is accessed over the network by application programs (clients).

• One can also view the DBMS as an extension of the operating system (a more powerful file system).


DB Application Systems (2)

[Figure: Client-Server Architecture. Two clients, User A (Application) and User B (SQL console), connect over the network to the server running the DBMS.]

DB Application Systems (3)

[Figure: Three-Tier Architecture. Thin clients, User A (Browser) and User B (Browser), talk to an application server (Web Server plus Application), which talks to the server running the DBMS.]


DB Application Systems (4)

• A recap of database vocabulary:

  ◦ A database (DB) consists of a DB schema and a DB state.

  ◦ A database management system (DBMS) is a software system that implements a data model (e.g., a Relational DBMS (RDBMS) implements the relational model).

  ◦ A database system (DBS) consists of a DBMS and a database.

  ◦ A database application system consists of a DBS and a set of application programs.

Introduction

Overview

1. Basic Database Notions

2. Database Management Systems (DBMS)

3. Programmer’s View, Data Independence

4. Database Users and Database Tools


Persistent Storage (1)

• Today: 5 → factorial → 120

• Tomorrow: 5 → factorial → 120

⇒ To evaluate factorial (n ↦ n!), no persistent storage is necessary. The output is a function of the input only.

Persistent Storage (2)

• Today: Ann → Homework points → 20

• Tomorrow: Ann → Homework points → 30

⇒ The output is a function of the input and a persistent state.


Persistent Storage (3)

A DBS provides persistent state

[Figure: the function “Homework points” maps the input Ann to the output 30, using the persistent state.]

Persistent information

Information that lives longer than a single process (program execution). Survives power outage and a reboot of the operating system.
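The “today 20, tomorrow 30” behaviour can be imitated with a database file that outlives each connection. This is only a sketch: it assumes Python's sqlite3 module, a temporary file with the hypothetical name homework.db, and made-up rows; every call opens and closes its own connection to mimic separate program runs.

```python
import os
import sqlite3
import tempfile

# Hypothetical database file; it lives on disk, not inside any one process.
db_path = os.path.join(tempfile.mkdtemp(), "homework.db")


def store(student, homework, points):
    """One program run: open the file, add a row, commit, and exit."""
    con = sqlite3.connect(db_path)
    try:
        con.execute(
            "CREATE TABLE IF NOT EXISTS SOLVED "
            "(STUDENT VARCHAR(40), HOMEWORK NUMERIC(2), POINTS NUMERIC(2))"
        )
        con.execute("INSERT INTO SOLVED VALUES (?, ?, ?)", (student, homework, points))
        con.commit()
    finally:
        con.close()


def homework_points(student):
    """Another program run: the output depends on the input and the stored state."""
    con = sqlite3.connect(db_path)
    try:
        row = con.execute(
            "SELECT SUM(POINTS) FROM SOLVED WHERE STUDENT = ?", (student,)
        ).fetchone()
        return row[0]
    finally:
        con.close()


store("Ann Smith", 1, 10)
store("Ann Smith", 2, 10)
today = homework_points("Ann Smith")

store("Ann Smith", 3, 10)
tomorrow = homework_points("Ann Smith")
print(today, tomorrow)
```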

Persistent Storage (4)

Which of the following processes/devices need persistent storage? If so, for which particular task?

① Web browser

② Pocket calculator

③ Mobile phone

④ Screen saver

⑤ DVD recorder


Typed Persistent Data (1)

• Classical way to implement persistence:

  ◦ Information needed in subsequent program invocations is saved into a file.

  ◦ The operating system (OS) maintains the file on disk.

  ◦ Disks provide persistent memory: the contents are not lost if the machine is switched off or the OS is rebooted.

OS files and persistence

The above statement is basically true but care should be taken nevertheless. Why?

  ◦ File systems are predecessors of modern DBMSs.

Typed Persistent Data (2)

• Implementing persistence with files:

  ◦ OS files are usually nothing but sequences of bytes.

  ◦ A record structure must be defined on top of this (much like in Assembler languages):

    Offset:   0                     40    42    44
    Content:  A n n   S m i t h ... 0 3   1 0

  ◦ The record and file structure is contained only in the programmers’ heads.

  ◦ The OS file system cannot prevent misinterpretation, overflows, etc., because it is not aware of the file structure.
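To make the risk concrete, here is a sketch of such a byte-level record using Python's struct module. The layout is an assumption read off the slide's offsets (0, 40, 42, 44): a 40-byte name followed by two 2-byte fields holding ASCII digits. The second, wrong layout is hypothetical and shows that nothing in the bytes themselves stops a program from misreading them.

```python
import struct

# Assumed layout: 40 bytes name, 2 bytes homework number, 2 bytes points.
RECORD = struct.Struct("40s2s2s")

record = RECORD.pack(b"Ann Smith".ljust(40), b"03", b"10")

# A program that knows the layout decodes the record correctly.
name, homework, points = RECORD.unpack(record)
decoded = (name.decode().rstrip(), int(homework), int(points))

# A program that wrongly assumes a 38-byte name field reads padding
# where it expects the homework number. The file system cannot object.
WRONG = struct.Struct("38s2s4s")
_, wrong_homework, _ = WRONG.unpack(record)

print(decoded, wrong_homework)
```

Both layouts describe 44 bytes, so the misreading raises no error; it just yields the wrong data.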


Typed Persistent Data (3)

• Implementing persistence with a DBMS:

  ◦ The structure of the information to be stored must be defined in a way the DBMS understands:

SQL DDL command

    CREATE TABLE SOLVED (STUDENT  VARCHAR(40),
                         HOMEWORK NUMERIC(2),
                         POINTS   NUMERIC(2))

  ◦ The file structure is formally documented.

  ◦ The system can detect type errors in application programs.

  ◦ Simplified programming (higher abstraction level).

A Subprogram Library (1)

• Most DBMSs use OS files to store the data. (Some use raw disk device access.)

• One can view a DBMS as a subprogram library that can be used for file access.

• Compared with the direct OS system calls for file access, the DBMS offers higher-level operations.

• The DBMS offers a wide variety of algorithms that one would otherwise have to program.


A Subprogram Library (2)

• For instance, a typical Relational DBMS contains routines for

  ◦ sorting (e.g., external merge sort),

  ◦ searching (e.g., B-trees),

  ◦ file space management, buffer management,

  ◦ aggregation, statistical evaluation.

• The algorithms are optimized for large data sets (that do not fit into main memory).

• The DBMS also offers multi-user support (locking) and safety measures to protect data against system crashes.

Data Independence (1)

• The DBMS is a layer of software above the OS files. The files can be accessed only via the DBMS.

• The DBMS may change the file structure internally (reorder records, split files, etc.) for performance reasons. This goes unnoticed by the application program.

• Compare with the idea of abstract data types: The implementation changes, the interface is kept stable.


Data Independence (2)

• Typical example:

  ◦ At the beginning, a professor used the homeworks DB only for his courses in the current term.

  ◦ Since the DB was small and there were relatively few accesses, it was sufficient to store the data as a heap file.

  ◦ Later, the entire university used the DB, and information on previous courses had to be kept for some time.

  ◦ The DB grew significantly and was accessed much more frequently.

  ◦ An index file (e.g., a B-tree) is now needed to provide fast access.

Data Independence (3)

• Without DBMS:

  ◦ Using the new B-tree index to access the file must be explicitly built into the lookup (query) commands.

  ◦ Thus, application programs need to be changed if the mode of file access is changed.

  ◦ If one forgets to change a seldom used application program, and this program does not update the index when the data has been updated, the DB becomes inconsistent.


Data Independence (4)

• With Relational DBMS:

  ◦ Already at the interface, the system completely hides the (non-)existence of indexes on files.

  ◦ Queries and updates do not have to and cannot refer to indexes.

  ◦ The system automatically

    ① modifies the index in case of data updates,

    ② uses the index to evaluate queries against the indexed data when advantageous.
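The following sketch illustrates physical data independence under stated assumptions: Python's sqlite3 module and SQLite stand in for DB2, the index name SOLVED_STUDENT_IDX is made up, and SQLite's EXPLAIN QUERY PLAN is used only to peek at the execution plan. The query text never mentions the index, yet the plan changes once the index exists.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # assumed stand-in for the course's DBMS
con.execute(
    "CREATE TABLE SOLVED (STUDENT VARCHAR(40), HOMEWORK NUMERIC(2), POINTS NUMERIC(2))"
)
con.executemany(
    "INSERT INTO SOLVED VALUES (?, ?, ?)",
    [
        ("Ann Smith", 1, 10),
        ("Ann Smith", 2, 8),
        ("Michael Jones", 1, 9),
        ("Michael Jones", 2, 9),
    ],
)

# The application's query: it refers to the table only, never to an index.
query = "SELECT HOMEWORK, POINTS FROM SOLVED WHERE STUDENT = 'Ann Smith' ORDER BY HOMEWORK"


def plan(sql):
    """Return SQLite's plan description (last column of EXPLAIN QUERY PLAN)."""
    return " | ".join(row[-1] for row in con.execute("EXPLAIN QUERY PLAN " + sql))


rows_before, plan_before = con.execute(query).fetchall(), plan(query)

# Change to the internal schema only: add an index on STUDENT.
con.execute("CREATE INDEX SOLVED_STUDENT_IDX ON SOLVED (STUDENT)")

rows_after, plan_after = con.execute(query).fetchall(), plan(query)

# The index is also maintained automatically on updates.
con.execute("INSERT INTO SOLVED VALUES ('Ann Smith', 3, 10)")
rows_updated = con.execute(query).fetchall()

print(plan_before)
print(plan_after)
```

The exact plan wording differs between SQLite versions, so the sketch prints it rather than relying on a particular string.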

Data Independence (5)

• Conceptual Schema (“interface”):

  ◦ Only logical information content of the database, relevant to the subset of the real world modelled in the DB.

  ◦ Simplified view of the DB: physical storage details hidden.

• Internal/Physical Schema (“implementation”):

  ◦ Indexes,

  ◦ Division of tables among disks,

  ◦ Storage management if tables grow or shrink,

  ◦ Placement of new rows in a table (sort order, clustering).


Data Independence (6)

① The user enters a query (e.g., in SQL) that only refers to the conceptual schema.

② The DBMS translates this into a query/program (execution plan) which refers to the internal schema. This is done by the query optimizer.

③ The DBMS executes the translated query on the persistent instance of the internal schema.

④ The DBMS translates the result back to the conceptual level.

Back-translation?

Why would this be necessary and what would be typical back-translation steps?

Data Independence (7)

Changing the internal schema

[Figure: the conceptual schema stays the same; only the translation is new, mapping it to the new internal schema (with B-tree index) instead of the old internal schema (no B-tree index).]


Declarative Languages (1)

• Physical data independence requires that the query language (SQL) cannot refer to indexes.

• Declarative query languages go one step further:

  ◦ Queries should only describe what information is sought,

  ◦ but should not prescribe any particular method how to compute/retrieve the desired information.

Kowalski

Algorithm = Logic + Control

Imperative/Procedural Languages: explicit control, implicit logic

Declarative/Descriptive Languages: implicit control, explicit logic

Declarative Languages (2)

• SQL is a declarative language. The user describes conditions the requested data is required to fulfill:

SQL query

    SELECT X.POINTS
    FROM SOLVED X
    WHERE X.STUDENT = 'Ann Smith'
    AND X.HOMEWORK = 3

• Often, simpler formulations of the same query are possible: with SQL, users do not have to think about efficient execution.

• More concise than imperative programming: less expensive program development and maintenance.
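The contrast between explicit and implicit control can be sketched side by side. The snippet assumes Python's sqlite3 module with an in-memory SQLite database and made-up rows; the loop is one of many possible imperative formulations, not a prescribed one.

```python
import sqlite3

rows = [  # made-up sample data
    ("Ann Smith", 1, 10),
    ("Ann Smith", 3, 10),
    ("Michael Jones", 3, 9),
]

# Imperative: the program spells out how to find the data (scan, test, collect).
imperative = []
for student, homework, points in rows:
    if student == "Ann Smith" and homework == 3:
        imperative.append(points)

# Declarative: the query states only which conditions the data must fulfill.
con = sqlite3.connect(":memory:")  # assumed stand-in for the course's DBMS
con.execute(
    "CREATE TABLE SOLVED (STUDENT VARCHAR(40), HOMEWORK NUMERIC(2), POINTS NUMERIC(2))"
)
con.executemany("INSERT INTO SOLVED VALUES (?, ?, ?)", rows)
declarative = [
    row[0]
    for row in con.execute(
        "SELECT X.POINTS FROM SOLVED X "
        "WHERE X.STUDENT = 'Ann Smith' AND X.HOMEWORK = 3"
    )
]
print(imperative, declarative)
```

The loop fixes a full scan as the evaluation method, whereas the DBMS is free to pick any method that returns the rows described by the query.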


Declarative Languages (3)

• Declarative query languages

  ◦ allow powerful optimizers (no evaluation method is prescribed),

  ◦ need powerful optimizers (naïve evaluation is almost always too inefficient).

• Independence of current hardware technology and software quality:

  ◦ Today’s queries will use tomorrow’s DBMS setup and algorithms when a new version of the DBMS is released.

Logical Data Independence (1)

• Logical data independence allows for changes to the logical information content of the database.

• Such changes are obviously restricted to additions to the logical information content.

  ◦ Example: add column SUBMISSION_DATE to table SOLVED.

• Such additions may be required for new applications.

• It should not be necessary to change old applications only because records now contain additional information.
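A minimal sketch of the column addition named above, assuming Python's sqlite3 module and SQLite in place of DB2. The column is written SUBMISSION_DATE with an underscore, and the date value is made up. The old application's query runs unchanged before and after the schema change.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # assumed stand-in for the course's DBMS
con.execute(
    "CREATE TABLE SOLVED (STUDENT VARCHAR(40), HOMEWORK NUMERIC(2), POINTS NUMERIC(2))"
)
con.execute(
    "INSERT INTO SOLVED (STUDENT, HOMEWORK, POINTS) VALUES ('Ann Smith', 3, 10)"
)

# Query of an old application, written before the schema change.
old_query = "SELECT POINTS FROM SOLVED WHERE STUDENT = 'Ann Smith' AND HOMEWORK = 3"
before = con.execute(old_query).fetchall()

# A new application needs additional information.
con.execute("ALTER TABLE SOLVED ADD COLUMN SUBMISSION_DATE DATE")
con.execute(
    "INSERT INTO SOLVED (STUDENT, HOMEWORK, POINTS, SUBMISSION_DATE) "
    "VALUES ('Michael Jones', 3, 9, '2009-05-04')"  # made-up date
)

# The old query still works and still returns the same answer.
after = con.execute(old_query).fetchall()
columns = [row[1] for row in con.execute("PRAGMA table_info(SOLVED)")]
print(before, after, columns)
```

One caveat applies: an old INSERT that omits the column list would now supply too few values, so naming columns explicitly, as done here, keeps old programs unaffected.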


Logical Data Independence (2)

• Logical data independence is important when there are application programs with distinct, but overlapping information needs.

• Logical data independence also helps to integrate previously distinct databases.

  ◦ In earlier times, every department of a company had its own DB/data files.

  ◦ Now, businesses generally aim at one central DB.

Logical Data Independence (3)

• If a company uses more than one DB, the information in these databases will normally overlap, i.e., some pieces of information will be stored several times.

• Data is called redundant if it can be derived from other data and knowledge internal to the application.

• Problems with redundancy:

  ◦ Duplicates data entry and update efforts.

  ◦ Sooner or later, data copies will get out-of-sync and thus inconsistent.

  ◦ Wastes storage space, also on backup media.


Logical Data Independence (4)

• External Schemas/Views:

  ◦ Logical data independence requires a third level of database schemas, the external schemas or views.

  ◦ Each user (department, ...) may have an individual view of the data.

  ◦ An external view contains a subset of the information in the database, maybe slightly restructured. Views may also be vital for security reasons.

  ◦ In contrast, the conceptual schema describes the complete information content of the database.
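An external view can be sketched with CREATE VIEW. The view name PARTICIPATION is hypothetical; the example assumes Python's sqlite3 module and SQLite as a stand-in for DB2. The view exposes who handed in which homework and hides the points, which also illustrates the security aspect.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # assumed stand-in for the course's DBMS
con.execute(
    "CREATE TABLE SOLVED (STUDENT VARCHAR(40), HOMEWORK NUMERIC(2), POINTS NUMERIC(2))"
)
con.executemany(
    "INSERT INTO SOLVED VALUES (?, ?, ?)",
    [("Ann Smith", 1, 10), ("Ann Smith", 2, 8), ("Michael Jones", 1, 9)],
)

# Hypothetical external schema: a subset of the conceptual schema's information.
con.execute("CREATE VIEW PARTICIPATION AS SELECT STUDENT, HOMEWORK FROM SOLVED")

view_rows = con.execute(
    "SELECT * FROM PARTICIPATION ORDER BY STUDENT, HOMEWORK"
).fetchall()

# Users of the view cannot see the POINTS column.
try:
    con.execute("SELECT POINTS FROM PARTICIPATION")
    points_visible = True
except sqlite3.OperationalError:
    points_visible = False

print(view_rows, points_visible)
```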

Three-Schema Architecture

[Figure: Three-Schema Architecture (ANSI/SPARC 1978). Users work with External Schema 1 ... External Schema n; these map to the Conceptual Schema, which maps to the Internal Schema describing the stored data.]


More DBMS Functions (1)

• Transactions:

  ◦ Sequences of DB commands (queries and updates) are executed as an atomic unit (“all or nothing”).

    ▪ The DBMS may crash during/after a sequence of commands is/has been executed. The DBMS then performs undo/redo.

  ◦ Support for backup and recovery.

  ◦ Support of concurrent users.

    ▪ Each user is given the illusion of being the only DB user at any time. The DBMS performs locking and conflict detection.
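The “all or nothing” property can be sketched with a simulated failure. This assumes Python's sqlite3 module, whose connection object, used as a context manager, commits when the block succeeds and rolls back when an exception escapes; the RuntimeError stands in for a crash and the rows are made up.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # assumed stand-in for the course's DBMS
con.execute(
    "CREATE TABLE SOLVED (STUDENT VARCHAR(40), HOMEWORK NUMERIC(2), POINTS NUMERIC(2))"
)
con.executemany(
    "INSERT INTO SOLVED VALUES (?, ?, ?)",
    [("Ann Smith", 1, 10), ("Ann Smith", 2, 8)],
)
con.commit()  # the initial state is durable from here on


def count_rows():
    return con.execute("SELECT COUNT(*) FROM SOLVED").fetchone()[0]


# A sequence of two updates that fails in the middle: nothing is kept.
try:
    with con:
        con.execute("INSERT INTO SOLVED VALUES ('Ann Smith', 3, 10)")
        raise RuntimeError("simulated failure before the second insert")
except RuntimeError:
    pass
count_after_failure = count_rows()

# The same sequence without a failure: both updates are kept.
with con:
    con.execute("INSERT INTO SOLVED VALUES ('Ann Smith', 3, 10)")
    con.execute("INSERT INTO SOLVED VALUES ('Michael Jones', 3, 9)")
count_after_success = count_rows()

print(count_after_failure, count_after_success)
```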

More DBMS Functions (2)

• Security:

  ◦ Access rights: Who may perform which operations on which table?

  ◦ Auditing: DBMS remembers who did what/when.

• Integrity:

  ◦ The DBMS checks that the entered data is plausible/complete (such checks may also span several tables).

  ◦ DBMS rejects updates (insertions and deletions) which would violate defined business rules.
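A sketch of integrity enforcement, assuming Python's sqlite3 module and SQLite. The business rule used here (points must lie between 0 and 10, no field may be empty) is invented for illustration and declared with NOT NULL and CHECK constraints; the DBMS then rejects a violating insertion on its own.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # assumed stand-in for the course's DBMS
con.execute(
    """
    CREATE TABLE SOLVED (
        STUDENT  VARCHAR(40) NOT NULL,
        HOMEWORK NUMERIC(2)  NOT NULL,
        POINTS   NUMERIC(2)  NOT NULL CHECK (POINTS BETWEEN 0 AND 10)
    )
    """
)

# A plausible row is accepted.
con.execute("INSERT INTO SOLVED VALUES ('Ann Smith', 3, 10)")

# A row violating the (hypothetical) business rule is rejected by the DBMS.
try:
    con.execute("INSERT INTO SOLVED VALUES ('Ann Smith', 4, 25)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

stored = con.execute("SELECT COUNT(*) FROM SOLVED").fetchone()[0]
print(rejected, stored)
```

Because the rule lives in the schema, every application program is subject to it, including ones written later.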


More DBMS Functions (3)

• Data Dictionary:

  ◦ Metadata (“data about data”, schema, user list, access rights) is available in system tables, e.g.:

System tables

SYS_TABLES

TABLE_NAME    OWNER
SOLVED        GRUST
SYS_TABLES    SYS
SYS_COLUMNS   SYS

SYS_COLUMNS

TABLE_NAME    SEQ   COL_NAME
SOLVED        1     STUDENT
SOLVED        2     HOMEWORK
SOLVED        3     POINTS
SYS_TABLES    1     TABLE_NAME
SYS_TABLES    2     OWNER
SYS_COLUMNS   1     TABLE_NAME
SYS_COLUMNS   2     SEQ
SYS_COLUMNS   3     COL_NAME
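Real systems name their catalog differently from the schematic SYS_TABLES and SYS_COLUMNS shown on the slide. As a sketch, SQLite (used here through Python's sqlite3 module as an assumed stand-in) exposes its catalog as the table sqlite_master and the pragma table_info, and both can be queried like ordinary data.

```python
import sqlite3

con = sqlite3.connect(":memory:")  # assumed stand-in for the course's DBMS
con.execute(
    "CREATE TABLE SOLVED (STUDENT VARCHAR(40), HOMEWORK NUMERIC(2), POINTS NUMERIC(2))"
)

# Counterpart of SYS_TABLES: which tables exist?
tables = [
    row[0]
    for row in con.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# Counterpart of SYS_COLUMNS: position and name of each column.
# table_info numbers columns from 0, so 1 is added to match SEQ on the slide.
columns = [(row[0] + 1, row[1]) for row in con.execute("PRAGMA table_info(SOLVED)")]

print(tables, columns)
```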

Introduction

Overview

1. Basic Database Notions

2. Database Management Systems (DBMS)

3. Programmer’s View, Data Independence

4. Database Users and Database Tools


Database Users (1)

• Database Administrator (DBA):

  ◦ Should know about all schemas, may change the conceptual and the internal schema (creates tables, creates/drops indexes). Can damage everything.

  ◦ Gives access rights to users. Ensures security.

  ◦ Monitors system performance. (Transaction throughput #TX/s, # concurrent users, index sizes, ...)

  ◦ Monitors available disk space and installs new disks.

  ◦ Ensures that backup copies of the data are made. Does recovery after disk failures, etc.

Database Users (2)

• Application Programmer:

  ◦ Writes programs for standard, everyday tasks, to be used by the naïve users (see below):

    ▪ safe data entry,

    ▪ report generation,

    ▪ data browsing.

  ◦ Knows SQL well, plus programming languages and development tools.

  ◦ Usually supervised by DBA.

  ◦ Might do conceptual schema design (knows which tables the application will need to access/create).


Database Users (3)

• Sophisticated User (one kind of “end user”):

  ◦ Knows SQL and/or some query tools, may use SQL console.

  ◦ Does non-standard aggregations/evaluations of the data without help from application programmers.

  ◦ May generate complex queries.

• Naïve User (the other kind of “end user”):

  ◦ Uses DB only via application programs, often unaware of existence of DBMS back-end.

  ◦ Primarily data entry user, simple browsing-style queries against external views.

Database Tools

• Interactive SQL console

• Graphical/menu-based query tools

• Interface for DB access from standard programming languages (C, C++, Java)

• Tools for form-based DB application (4GL)

• Report generators

• Web interface

• Tools for data import/export, backup & recovery, performance monitoring, ...


Summary (1)

• Functions of database systems:

  ◦ Persistence

  ◦ Integration/Redundancy Avoidance

  ◦ Physical and Logical Data Independence

  ◦ Subprogram Library: many algorithms built-in, especially tuned for external memory access (disks)

  ◦ Query and Update evaluation

Summary (2)

• Functions of database systems, continued:

  ◦ High data safety and availability (Backup & Recovery)

  ◦ Combinations of operations into atomic transactions

  ◦ Multi-user support: synchronization of concurrent accesses

  ◦ Integrity Enforcement

  ◦ View management

  ◦ Security via data access control

  ◦ System catalog management (metadata)


Summary (3)

• The main goal of the DBMS is to give the user a simplified view on the persistent storage, i.e., to hide any complications introduced by the DBMS physical layer.

• The user does not worry about

  ◦ physical storage details,

  ◦ different information needs of other users,

  ◦ efficient query formulation,

  ◦ possibility of system crashes/disk failures,

  ◦ presence of concurrent users accessing identical data subsets.

Exercise

Data-intensive programming

• Suppose homework points data is stored in a line-structured text file with the format

    Student Name:Homework Number:Points

  e.g.,

    Ann Smith:3:10

• Suppose you have to write a C program that prints the total number of points per student (students sorted alphabetically).

• How would you judge the programming effort (in terms of lines and time)?

• In SQL, this takes 4 lines and approx. 1 1/2 minutes.
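One possible four-line SQL formulation is sketched below; it is an illustration, not necessarily the one the lecturer had in mind. It assumes Python's sqlite3 module with an in-memory SQLite database, made-up input lines in the format given above, and a SOLVED table loaded from those lines.

```python
import sqlite3

# Made-up contents of the line-structured text file.
lines = ["Ann Smith:3:10", "Michael Jones:1:9", "Ann Smith:1:10"]

con = sqlite3.connect(":memory:")  # assumed stand-in for the course's DBMS
con.execute(
    "CREATE TABLE SOLVED (STUDENT VARCHAR(40), HOMEWORK NUMERIC(2), POINTS NUMERIC(2))"
)
con.executemany(
    "INSERT INTO SOLVED VALUES (?, ?, ?)",
    [(s, int(h), int(p)) for s, h, p in (line.split(":") for line in lines)],
)

# Total points per student, students sorted alphabetically.
totals = con.execute(
    """
    SELECT STUDENT, SUM(POINTS)
    FROM SOLVED
    GROUP BY STUDENT
    ORDER BY STUDENT
    """
).fetchall()
print(totals)
```

Grouping, summation, and sorting are all delegated to the DBMS, which is where a hand-written C program would spend most of its lines.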