Dr. T. Y. Lin | SJSU | CS 157A | Fall 2011

Chapter 1

THE WORLDS OF DATABASE SYSTEMS

The Worlds of Database Systems

1.1 The Evolution of Database Systems

1.2 Overview of a Database Management System

1.3 Outline of Database-System Studies

1.L4 Big Data - Reading material

1.4 References for Chapter 1

The Worlds of Database Systems

1) Databases are involved with almost every business in the world.

2) Almost any website has a database behind the scenes that serves up the information you request.

3) Big Data on the Clouds

The Worlds of Database Systems

3) Corporations maintain all their important records in databases.

4) The power of databases comes from powerful software that has been developed over several decades and is called a Database Management System, or DBMS.

5) Big Data on the Clouds

See the reading material

THE EVOLUTION OF DATABASE SYSTEMS

Section 1.1

1.1 The Evolution of Database Systems

What is a database?

A database is a collection of information that exists over a long period of time, often many years.

(TYLin: once a database starts, it never ends until it dies.)

The term database refers to a collection of data that is managed by a DBMS.

What does a DBMS do?

1.1 The Evolution of Database Systems (cont'd)

A DBMS is expected to:

1. Allow users to create new databases by declaring the logical structure of the data (the schema) using a specialized language called Data Definition Language (DDL).

2. Give users the ability to query the data (a query is a question about the data) and to modify the data, using a specialized language called Data Manipulation Language (DML).

3. Support the storage of very large amounts of data using very efficient access methods.
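The DDL and DML roles in items 1 and 2 can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the Accounts table, its columns, and the sample rows are hypothetical and do not come from the slides, and SQLite stands in for whatever DBMS the course uses.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# DDL: declare the schema (logical structure) of the data.
# The Accounts table and its columns are hypothetical.
conn.execute(
    "CREATE TABLE Accounts ("
    " acct_no INTEGER PRIMARY KEY,"
    " owner TEXT NOT NULL,"
    " balance REAL NOT NULL)"
)

# DML, part 1: modify the data.
conn.executemany(
    "INSERT INTO Accounts (acct_no, owner, balance) VALUES (?, ?, ?)",
    [(1, "Alice", 500.0), (2, "Bob", 120.0), (3, "Carol", 980.0)],
)
conn.execute("UPDATE Accounts SET balance = balance + 30 WHERE acct_no = 2")

# DML, part 2: query the data ("which owners hold more than 200?").
rich_owners = [
    row[0]
    for row in conn.execute(
        "SELECT owner FROM Accounts WHERE balance > 200 ORDER BY owner"
    )
]
print(rich_owners)
```

Note that the query states what is wanted, not how to find it; choosing an access method is left to the DBMS.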

1.1 The Evolution of Database Systems (cont'd)

A DBMS is expected to: (cont'd)

4. Enable durability, the recovery of the data in the case of failures.

5. Control access to data from many users concurrently without any unexpected interactions (called isolation).
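One small piece of the recovery behaviour in item 4 can be sketched as a transaction rollback. The sketch below assumes SQLite through Python's sqlite3 module; the Accounts table, the transfer helper, and the simulated failure are all hypothetical. It shows only that a failed unit of work leaves the stored data unchanged; it does not demonstrate crash recovery or concurrent users.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Accounts (acct_no INTEGER PRIMARY KEY, balance REAL NOT NULL)"
)
conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [(1, 500.0), (2, 120.0)])
conn.commit()


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move money between two accounts as one all-or-nothing transaction."""
    try:
        # "with conn" commits on success and rolls back if an exception escapes.
        with conn:
            conn.execute(
                "UPDATE Accounts SET balance = balance - ? WHERE acct_no = ?",
                (amount, src),
            )
            if fail_midway:
                # Hypothetical failure between the debit and the credit.
                raise RuntimeError("simulated failure between the two updates")
            conn.execute(
                "UPDATE Accounts SET balance = balance + ? WHERE acct_no = ?",
                (amount, dst),
            )
        return True
    except RuntimeError:
        return False


def balances(conn):
    """Return {acct_no: balance} for every account."""
    return dict(conn.execute("SELECT acct_no, balance FROM Accounts ORDER BY acct_no"))


failed_ok = transfer(conn, 1, 2, 100.0, fail_midway=True)
after_failure = balances(conn)   # the debit was rolled back
success_ok = transfer(conn, 1, 2, 100.0)
after_success = balances(conn)   # both updates were applied together
```

After the simulated failure the balances are unchanged, so no money has left account 1 without arriving in account 2.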

1.1 The Evolution of Database Systems

1.1.1 Early Database Management Systems

1.1.2 Relational Database Systems

1.1.3 Smaller and Smaller Systems

1.1.4 Bigger and Bigger Systems

1.1.5 Information Integration

1.1.1 Early Database Management Systems

The first DBMS's appeared in the late 1960's.

These systems evolved from file systems, which could only store large amounts of data over a long period of time.

File systems did not support the other requirements listed on the previous slides.

1.1.1 Early Database Management Systems (cont'd)

The first important applications of DBMS's were:

Banking systems

Airline reservation systems

Corporate record keeping

The early DBMS's used several different data models, such as the 'hierarchical' or tree-based model and the 'network' or graph-based model.

These early DBMS's did not support a high-level query language.
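To make the last point concrete, here is a rough sketch of what working without a high-level query language looks like on tree-shaped data. It is not the interface of any historical DBMS; the bank/branch/account tree and the helper function are hypothetical, and plain Python containers stand in for the stored records.

```python
# Hypothetical tree-shaped ("hierarchical") store: bank -> branches -> accounts.
bank = {
    "name": "Bank",
    "branches": [
        {
            "name": "Downtown",
            "accounts": [
                {"owner": "Alice", "balance": 500.0},
                {"owner": "Bob", "balance": 120.0},
            ],
        },
        {
            "name": "Uptown",
            "accounts": [{"owner": "Carol", "balance": 980.0}],
        },
    ],
}


def owners_with_balance_over(root, threshold):
    """Walk the tree record by record; the program must know the storage shape."""
    found = []
    for branch in root["branches"]:          # navigate parent -> child
        for account in branch["accounts"]:   # navigate child -> grandchild
            if account["balance"] > threshold:
                found.append(account["owner"])
    return found


result = owners_with_balance_over(bank, 200)
```

The question "who holds more than 200?" is answered by hand-written navigation code, which would have to be rewritten if the tree were reorganized.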

1.1.2 Relational Database Systems

The Relational Model (RM) was born in 1970 with a famous paper written by Ted Codd.

TYLin: David Hsiao Column based RM

Codd proposed a new two-dimensional (table) organization of data, which in pure mathematics is called a relation.
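The mathematical sense of "relation" can be sketched in a few lines of Python: a relation is a subset of the Cartesian product of its attribute domains, that is, a set of tuples, and each tuple is one row of the table. The domains and tuples below are hypothetical examples.

```python
from itertools import product

# Hypothetical attribute domains.
owners = {"Alice", "Bob"}
branches = {"Downtown", "Uptown"}

# The Cartesian product holds every possible (owner, branch) pair.
all_pairs = set(product(owners, branches))

# A relation is any subset of that product: a set of tuples (table rows).
has_account_at = {
    ("Alice", "Downtown"),
    ("Bob", "Uptown"),
    ("Bob", "Downtown"),
}

# Check that the relation really is a subset of the product.
is_relation = has_account_at <= all_pairs

# Rows are unordered and duplicate-free, so results are sorted for display.
bobs_branches = sorted(branch for (owner, branch) in has_account_at if owner == "Bob")
```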

1.1.2 Relational Database Systems (cont'd)

In this new model, programmers were not involved with the storage structure.

Queries could be expressed in a very high-level language.

By 1990, relational database systems were the norm.
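The separation between queries and storage structure can be sketched as follows, assuming SQLite through Python's sqlite3 module. The Flights table, its rows, and the index name are hypothetical. The same query text is run before and after the storage structure changes (an index is added), and the answer is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Flights (flight_no TEXT, origin TEXT, seats_left INTEGER)"
)
conn.executemany(
    "INSERT INTO Flights VALUES (?, ?, ?)",
    [("SJ101", "SJC", 4), ("SF202", "SFO", 0), ("SJ303", "SJC", 12)],
)

# The query says what is wanted, not how to reach it on disk.
QUERY = (
    "SELECT flight_no FROM Flights "
    "WHERE origin = ? AND seats_left > 0 ORDER BY flight_no"
)

before = conn.execute(QUERY, ("SJC",)).fetchall()

# Change the storage structure: add an index on the origin column.
conn.execute("CREATE INDEX idx_flights_origin ON Flights (origin)")

# The query text is untouched and returns the same rows.
after = conn.execute(QUERY, ("SJC",)).fetchall()
```

The program that issues the query does not need to change when the index is added or dropped.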

1.1.3 Smaller and Smaller Systems

Originally, DBMS's were large, expensive software running on large computers.

The size was necessary because storing a gigabyte of data required a large computer.

But today, hundreds of gigabytes fit on a single disk and we can put it on a laptop!

1.1.3 Smaller and Smaller Systems (cont'd)

Another important trend (may not be there any more) is the use of documents written in XML (eXtensible Markup Language).

(In CS267) Large collections of small documents can serve as a database, and the methods of querying and manipulating them are different from those used in relational systems.

1.1.4 Bigger and Bigger Systems

A gigabyte is not much data anymore!

Corporations routinely use terabytes (10^12 bytes) and petabytes (10^15 bytes) of data storage. Here are some examples:

Google holds petabytes of data for its crawler of the Web.

Satellites send down petabytes of information.

Amazon keeps pictures and information for millions of products.

YouTube keeps millions of movies.

And so forth ...!

1.1.5 Information Integration

Consider a large company with many divisions.

Each division has its own database for its products and employees, independent of other divisions.

How can we integrate the information?

1.1.5 Information Integration (cont'd)

One popular approach is the creation of data warehouses, where information from many legacy databases is copied periodically.

Another approach is the implementation of middleware to integrate and translate data.
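The warehouse approach can be sketched as a periodic copy with translation. Everything here is an assumption made for illustration: two divisions keep employee data under different hypothetical schemas (one stores annual salary in dollars, the other monthly pay in cents), and a refresh function copies both into one common table. SQLite in-memory databases stand in for the legacy systems.

```python
import sqlite3


def make_division_a():
    """Hypothetical legacy database: annual salary in dollars."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE Employees (emp_name TEXT, salary_usd REAL)")
    db.executemany(
        "INSERT INTO Employees VALUES (?, ?)",
        [("Alice", 90000.0), ("Bob", 72000.0)],
    )
    return db


def make_division_b():
    """Hypothetical legacy database: monthly pay in cents."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE Staff (full_name TEXT, monthly_pay_cents INTEGER)")
    db.executemany("INSERT INTO Staff VALUES (?, ?)", [("Carol", 650000)])
    return db


def refresh_warehouse(warehouse, div_a, div_b):
    """Copy both sources into one common schema; meant to be rerun periodically."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS AllEmployees "
        "(division TEXT, name TEXT, annual_salary_usd REAL)"
    )
    warehouse.execute("DELETE FROM AllEmployees")  # full reload on each refresh
    for name, salary in div_a.execute("SELECT emp_name, salary_usd FROM Employees"):
        warehouse.execute("INSERT INTO AllEmployees VALUES ('A', ?, ?)", (name, salary))
    for name, cents in div_b.execute("SELECT full_name, monthly_pay_cents FROM Staff"):
        # Translate monthly cents into annual dollars.
        warehouse.execute(
            "INSERT INTO AllEmployees VALUES ('B', ?, ?)", (name, cents * 12 / 100)
        )
    warehouse.commit()


warehouse = sqlite3.connect(":memory:")
refresh_warehouse(warehouse, make_division_a(), make_division_b())
rows = warehouse.execute(
    "SELECT division, name, annual_salary_usd FROM AllEmployees ORDER BY name"
).fetchall()
```

A middleware design would instead leave the data in the division databases and perform the same kind of translation at query time.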

OVERVIEW OF A DATABASE MANAGEMENT SYSTEM

Section 1.2

1.2 Overview of a Database Management System

1.2.1 Data-Definition Language Commands

1.2.2 Overview of Query Processing

1.2.3 Storage and Buffer Management

1.2.4 Transaction Processing

1.2.5 The Query Processor

1.2.1 Data-Definition Language Commands

1.2.2 Overview of Query Processing

1.2.3 Storage and Buffer Management

1.2.4 Transaction Processing

1.2.5 The Query Processor

OUTLINE OF DATABASE-SYSTEM STUDIES

Section 1.3

1.3 Outline of Database-System Studies

REFERENCES FOR CHAPTER 1

Section 1.4

References for Chapter 1
