overview of a database management system single boxes:- represent system components double boxes:-...

38
Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :- indicate control and data flow Dashed Lines:- indicate data flow only

Upload: edwina-shields

Post on 11-Jan-2016

219 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Overview of a Database Management System

Single Boxes:- represent system components

Double Boxes:- represent in-memory data structures

Solid Lines :- indicate control and data flow

Dashed Lines:- indicate data flow only

Page 2: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-
Page 3: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Overview of a Database Management System

Sources of commands to the DBMS.1. End Users and application programs that ask

for data or modify data

2. A DBA (Database Administrator) a person or persons responsible for the structure or

schema of the database.

(Schema – the overall description of the database logical structures that is defined by the DDL)

Page 4: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Three main components of DBMS1. Storage Manager - responsible for storing

data, metadata, indexes and logs. 2. Query Manager – parses queries, optimized

them by selecting query plan, and executes the plan on the stored data

3. Transaction Manager - logging database changes to support recovery after a system crashes. Also support concurrent execution of transactions in:

• Atomicity – a tranx is performed either completely or not at all

• Isolation – a tranx is executed as if there was no other concurrently executing tranx.

Page 5: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Data-Definition Language Commands

• DDL (Data Definition Language) – defines the format of each data element in a database. Database tables for eg. Are created and dropped using the DDL

• Parsed by a DDL processor and pass to the execute engine

• Then goes through the index / file / record manager to alter the metadata, that is the schema information for the database.(metadata – information describing the nature and structure of an

organization’s data: data about data)

Page 6: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Overview of Query Processing

• A user or an application program initiates some action that does not affect the schema of the database, but may affect the content of the database (if the action is a modification command) or will extract data from the database (if action is a query)

• These command are expressed in DML (Data Manipulation Language)

• ie,SQL is an example of a DML.

Page 7: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Overview of Query Processing

• DML statements are handled by two separate subsystems:-

- Answering the Query

- Transaction processing

Page 8: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Answering the Query

1:- Query is parsed and optimized by a Query Compiler.

2:- The resulting Query Plan or sequence of actions the DBMS will perform to answer the Query is passed to the execution engine.

Page 9: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Answering the Query

3:- The execution engine issues a sequence of requests for small pieces of data, typically records or tuples of a relation, to a resource manager that knows about data files (holding relations), that format and size of records in those files, and index files which help find elements of data files quickly.

Page 10: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Answering the Query

4:- The requests for data are translated into pages and these requests are passed to the buffer manager.

(Task of a buffer manager is to bring appropriate portions of the data from secondary storage (disk) where it kept permanently to main memory buffer. Normally the page or disk block is the unit of transfer between buffer and disk).

Page 11: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Answering the Query

- The buffer manager communicates with a storage manager to get data from disk

- The storage manager might involve OS command

- Typically DBMS issues commands directly to the disk controller

Page 12: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Transaction Processing

• Queries and other DML actions are grouped into transactions, which are units that must be executed atomically and isolation from one another.

• Each Query or modification action is a transaction by itself.

• Execution of a transactions must be durable.

Page 13: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Transaction Processing

• Two major parts of a Transaction Processor

1. Concurrency control manager or scheduler

– responsible for answering atomicity and isolation of transaction.

2. Logging and recovery manager

– responsible for the durability of transactions

Page 14: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Transaction Processing

• DBMS offer the guarantee of durability

• Transaction manager therefore accepts transaction commands from an application which tell the transaction manager when transaction begin and end as well as information about the expectations of the application

Page 15: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Transaction Processing

• Task that transaction processor performs

1.Logging: -to assure durability, every change in the database is logged separately on disk. Log Manager assure when a system failure occurs, a recovery manager will be able to examine the log of changes and restore the database.

Page 16: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Transaction Processing

2. Concurrency control:- - assure that the individual actions of multiple transactions are executed in such an order that the net effect is the same as if the transactions had in fact executed in their entirely, one at a time.

Page 17: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Transaction Processing

Typical scheduler does its work by maintaining LOCK on certain pieces of the database. These locks prevent two transactions from accessing the same piece of data in ways that interact badly. Lock are generally stored in a main memory lock tables

Page 18: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Transaction Processing

3. Deadlock resolution:- A situation where none can proceed because each needs something another transaction has. –

Transaction Manager has the responsibility to intervene and cancel (roll back or abort) one or more transactions to let the other proceed.

Page 19: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Storage and Buffer Management

• Database normally resides in the secondary memory

• Data must be in the main memory for any useful operation to be perform

• It is the storage manager’s job to control the placement of data on disk and its movement between disk and main memory.

Page 20: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Storage and Buffer Management

• Simple database – storage manager might be nothing more than the file system of the underlying OS.

• For efficiency purpose DBMS normally control storage on disk directly.

• Storage manager keeps track of the location of files on the disk and obtains the block or blocks containing a file on request from the buffer manager.

Page 21: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Storage and Buffer Management

• The buffer manager is responsible for partitioning the available main memory into buffers, which are page-sized regions into which disk blocks can be transferred.

Page 22: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Storage and Buffer Management

• All DBMS components that need information from disk will interact with the buffers and the buffer manager, either directly or through the execution engine.

Page 23: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Storage and Buffer Management

• Kind of information that various component may need include:

Data :- the contents of the database

Metadata :- the database schema that describes the structure of, and constrains on, the database

Statistics :- Information gathered and stored by the DBMS about data properties.

Page 24: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

The Query Processor

• The portion of the DBMS that most affects the performance that the user sees is the query processor.

• Two component that represent the query processor

- Query Compiler

- Execution Engine.

Page 25: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

The Query Processor

• Query Compiler

-translate the query into an internal form called query plan

- then a sequence of operations performed on the data

Page 26: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

The Query Processor

• Query Compiler consists of three major units(i) Query parser :- which build a tree structure from the textual form of the query

(ii) Query Processor :- which perform semantic checks on the query and perform some tree transformations to turn the parse tree into a tree of algebraic operators representing the initial query plan.

(iii) Query Optimizer :- transforms the initial query plan into the best available sequence of operations on

the actual data.

Page 27: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

The Query Processor

• The query compiler uses metadata and statistics about the data to decide which sequence of operations is likely to be the fastest.– For example: the existence of an index, which

is a specialized data structure that facilitates access to data, given values for one or more components of that data, can make one plan much faster then another.

Page 28: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

The Query Processor

• Execution Engine :- has the responsibility for executing each of the steps in the chosen query plan

The execution engine interacts with most of the other component of the DBMS, either directly or through the buffers. Data must get into the Buffer in order to be manipulate. It needs to interact with the scheduler to avoid accessing data that is locked and with the log manager to make sure that all database changes are properly logged.

Page 29: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

The Acid Properties of Transactions

• Properly implemented transaction are commonly said to meet the “ACID test” where :-“A” = atomicity, the all or nothing execution of transactions.“I” = isolation, the fact that each transaction must appear to be executed as if no other transaction is executing at the same time.

Page 30: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

The Acid Properties of Transactions

“D”= durability, the condition that the effect on the database of a transaction must never be lost once the transaction has completed.

The remaining “C” stands for consistency. That is, all databases have consistency constraints, or expectations about relationships among data elements (eg. Account balances may not be negative) Transactions are expected to preserver the consistency of the database.

Page 31: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Summary

• End of lecture

Page 32: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Outline of Database System Studies

• Ideas related to Database system can be divided into three broad categories.1. Design of Databases

How does one develop a useful database?What kind of information go into database?How is the information structured?What assumption are made about types or values of data items?How do data items connect?

Page 33: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Outline of Database System Studies

2. Database Programming.

How does one express queries and other operations on the database?

How does one use other capabilities of a DBMS such as transactions or constraints

in an application?

How is database programming combined with conventional programming?

Page 34: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Outline of Database System Studies

3. Database System Implementation.How does one build a DBMS, including such matters as query processing and organizing storage for efficient assess?

Storage management: how secondary storage is used effectively to hold data and allow it to be accessed quickly.

Query processing: how queries expressed in a very high-level language such as SQL can be executed efficiently.

Transaction management: how to support transaction with ACID properties.

Page 35: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Outline of Database System Studies

4. Information Integration

Much of the recent evolution of database systems has been toward capabilities that allow different data sources, which may be databases and/or information resources that are not managed by a DBMS, to work together in a larger whole.

Page 36: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Index• How Indexes are Implemented

You can use the Indexed property to set a single-field index. An index speeds up queries on the indexed fields as well as sorting and grouping operations. For example, if you search for specific employee names in a LastName field, you can create an index for this field to speed up the search for a specific name.

SettingThe Indexed property uses the following settings.

Setting DescriptionNo (Default) No index.Yes (Duplicates OK) The index allows duplicates.Yes (No Duplicates) The index doesn't allow duplicates.

Page 37: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Index• RemarksUse the Indexed property to find and sort records by

using a single field in a table. The field can hold either unique or non-unique values. For example, you can create an index on an EmployeeID field in an Employees table in which each employee ID is unique or you can create an index on a Name field in which some names may be duplicates.

• Note   You can't index Memo, Hyperlink, or OLE Object data type fields.

You can create as many indexes as you need. The indexes are created when you save the table and are automatically updated when you change or add records. You can add or delete indexes at any time in table Design view.

Page 38: Overview of a Database Management System Single Boxes:- represent system components Double Boxes:- represent in- memory data structures Solid Lines :-

Summary• Overview of a Database Management System

- DDL Data Definition Language Command- Query Processing- Storage and Buffer Management- Transaction Processing- The Query Processor

• Outline of Database-System Studies - Design of a Database- Database Programming- Database System Implementation- Information Integration