msc it ufce8k-15-m data management prakash chatterjee room 3p16 [email protected]...
TRANSCRIPT
MSc IT UFCE8K-15-M Data ManagementPrakash ChatterjeeRoom [email protected]://www.cems.uwe.ac.uk/~p-chatterjee/modules/dm
Lecture 2 : The Relational Data Model
UCE8K15-M Data Management 2012/13
Origins of the Relational Model
The relational model was developed by EF Codd in the early 1970s. Commercial systems based on the relational model appeared in the
late 1970s. At present there are several hundred relational DBMSs and most
computer vendors support 'relational' software. Examples of well-known products include Oracle, DB2, Sybase,
MySQL, MS.SQL Server and MS Access.
Informally, a relational system is a system in which:1. The data is perceived by the user as tables (and nothing but tables).2. The operators available to the user for (e.g.) retrieval are operators that derive “new” tables from "old" ones. For example, there is one operator, restrict, which extract a subset of the rows of a given table, and another, project, which extracts a subset of columns - and a row subset and a column subset of a table can both be regarded in turn as tables in their own right.
UCE8K15-M Data Management 2012/13
Components and terminology (1)
The model uses terminology taken from mathematics, particularly set theory and predicate logic. Basic terminology used in relational theory includes:
relation - this corresponds to a table or flat file with columns and rows
tuple - a row of a relation
attribute - a named column of a relation
domain - the set of allowable values for one or more attributes degree of a relation - the number of attributes it contains
cardinality of relation - the number of tuples it contains.
UCE8K15-M Data Management 2012/13
Properties of relations
There is only one data structure in the relational data model - the relation.
Every relation and every attribute within a relation must have a distinct name.
Attribute (column) values of a relation are atomic (i.e. single valued).
All values in an attribute (column) are taken from same domain.
The ordering of columns in a relation is not significant.
Duplicate tuples (rows) are not allowed (e.g. each row in a relation must be distinct).
The ordering of tuples (rows) and attributes (columns) is not significant.
UCE8K15-M Data Management 2012/13
Relational algebra & relational calculus
Relational algebra (ra) and relational calculus (rc) are both formal (mathematically based) languages defined by EF Codd.
ra & rc are logically equivalent languages. ra is “procedural” and rc is “declarative” in nature.
ra and rc are the formal grounding of the relational database model and illustrate the basic operations required by any data manipulation language such as SQL.
Relational algebra is an offshoot of first-order logic, is a set of relations closed under operators. Operators operate on one or more relations to yield a relation.
The “closure” property relates to the fact that from any given relational operation another relation is output - it os often refereed to as the “relations in – relations out” property.
UCE8K15-M Data Management 2012/13
Relational algebra operators (1)
Each relational operator takes one or more relations as its input and produces a new relation as output (closure). Codd originally defined eight operators, in two classes:
Set operators: UNION INTERSECTION
DIFFERENCE DIVIDE
The special relational operators: RESTRICT PROJECT
JOIN Cartesian PRODUCT
UCE8K15-M Data Management 2012/13
Relational algebra operators (3)dept – emp – salgrade example (1)
dept – emp – salgrade example
Department : dept (depno, dname, location)
Employee : emp (empno, ename, mgr, sal, deptno)
Salary Grade : salgrade (grade, losal, hisal)
UCE8K15-M Data Management 2012/13
Relational algebra operators (4)dept – emp – salgrade example (2)
dept table
deptno dname location
10 Accounting New York
20 Research Dallas
30 Sales Chicago
40 Operations Boston
UCE8K15-M Data Management 2012/13
Relational algebra operators (5)dept – emp – salgrade example (3)
empno ename mgr sal deptno
7369 SMITH 7902 £800.00 20
7499 ALLEN 7698 £1,600.00 30
7521 WARD 7698 £1,250.00 30
7566 JONES 7839 £2,975.00 20
7654 MARTIN 7698 £1,250.00 30
7698 BLAKE 7839 £2,850.00 30
7782 CLARK 7839 £2,450.00 10
7788 SCOTT 7566 £3,000.00 20
7839 KING £5,000.00 10
7844 TURNER 7698 £1,500.00 30
7876 ADAMS 7788 £1,100.00 20
7900 JAMES 7698 £950.00 30
7902 FORD 7566 £3,000.00 20
7934 MILLER 7782 £1,300.00 10
emp table
UCE8K15-M Data Management 2012/13
Relational algebra operators (6)dept – emp – salgrade example (4)
salgrade table
grade losal hisal
1 £700.00 £1,200.00
2 £1,201.00 £1,400.00
3 £1,401.00 £2,000.00
4 £2,001.00 £3,000.00
5 £3,001.00 £99,999.00
UCE8K15-M Data Management 2012/13
Relational algebra operators (7)dept – emp – salgrade example (5)
Restrict Subset of the Rows in a Table
RESTRICT EMP WHERE sal > 2000
empno ename mgr sal deptno
7566 JONES 7839 £2,975.00 20
7698 BLAKE 7839 £2,850.00 30
7782 CLARK 7839 £2,450.00 10
7788 SCOTT 7566 £3,000.00 20
7839 KING £5,000.00 10
7902 FORD 7566 £3,000.00 20
UCE8K15-M Data Management 2012/13
Relational algebra operators (8)dept – emp – salgrade example (6)
Projectsubset the Columns in a Table
PROJECT EMP [EMPNO, SAL,DEPTNO]empno sal deptno
7369 £800.00 20
7499 £1,600.00 30
7521 £1,250.00 30
7566 £2,975.00 20
7654 £1,250.00 30
7698 £2,850.00 30
7782 £2,450.00 10
7788 £3,000.00 20
7839 £5,000.00 10
7844 £1,500.00 30
7876 £1,100.00 20
7900 £950.00 30
7902 £3,000.00 20
7934 £1,300.00 10
UCE8K15-M Data Management 2012/13
Relational algebra operators (9)dept – emp – salgrade example (7)
Restrict-ProjectRESTRICT EMP WHERE SAL >2000PROJECT EMP[EMPNO, SAL, DEPTNO]
empno sal deptno
7566 £2,975.00 20
7698 £2,850.00 30
7782 £2,450.00 10
7788 £3,000.00 20
7839 £5,000.00 10
7902 £3,000.00 20
call this EMPX
Could you reverse these operations - always? ( project then restrict?)
UCE8K15-M Data Management 2012/13
Relational algebra operators (10)dept – emp – salgrade example (8)
Product combine each row of one table with each row of the other
PRODUCT DEPT with EMPXempno sal EMPX.
deptno dept.
Depnodname loc
7566 £2,975.00 20 10 Accounting New York
7698 £2,850.00 30 10 Accounting New York
7782 £2,450.00 10 10 Accounting New York
7788 £3,000.00 20 10 Accounting New York
7839 £5,000.00 10 10 Accounting New York
7902 £3,000.00 20 10 Accounting New York
7566 £2,975.00 20 20 Research Dallas
7698 £2,850.00 30 20 Research Dallas
7782 £2,450.00 10 20 Research Dallas
7788 £3,000.00 20 20 Research Dallas
7839 £5,000.00 10 20 Research Dallas
7902 £3,000.00 20 20 Research Dallas
UCE8K15-M Data Management 2012/13
Relational algebra operators (11)dept – emp – salgrade example (9)
7566 £2,975.00 20 30 Sales Chicago
7698 £2,850.00 30 30 Sales Chicago
7782 £2,450.00 10 30 Sales Chicago
7788 £3,000.00 20 30 Sales Chicago
7839 £5,000.00 10 30 Sales Chicago
7902 £3,000.00 20 30 Sales Chicago
7566 £2,975.00 20 40 Operations Boston
7698 £2,850.00 30 40 Operations Boston
7782 £2,450.00 10 40 Operations Boston
7788 £3,000.00 20 40 Operations Boston
7839 £5,000.00 10 40 Operations Boston
7902 £3,000.00 20 40 Operations Boston
UCE8K15-M Data Management 2012/13
Relational algebra operators (12)dept – emp – salgrade example (10)
Product (Cartesian product)
DEPT has 4 recordsEMPX has 6 records
so DEPT x EMPX has 24 records
but not very useful
UCE8K15-M Data Management 2012/13
Relational algebra operators (13)dept – emp – salgrade example (11)
Equi-Joinproduct restricted to rows which have matching common domain
empno
sal EMPX. deptno
dept. deptno
dname loc
7566 £2,975.00 20 20 Research Dallas
7698 £2,850.00 30 30 Sales Chicago
7782 £2,450.00 10 10 Accounting New York
7788 £3,000.00 20 20 Research Dallas
7839 £5,000.00 10 10 Accounting New York
7902 £3,000.00 20 20 Research Dallas
UCE8K15-M Data Management 2012/13
Relational algebra operators (14)dept – emp – salgrade example (12)
Natural Joinequi-join projected with the duplicate column removed
empno sal deptno
dname loc
7566 £2,975.00 20 Research Dallas
7698 £2,850.00 30 Sales Chicago
7782 £2,450.00 10 Accounting New York
7788 £3,000.00 20 Research Dallas
7839 £5,000.00 10 Accounting New York
7902 £3,000.00 20 Research Dallas
UCE8K15-M Data Management 2012/13
Basic SQL
SELECT * FROM EMP WHERE SAL > 2000;
SELECT ENAME,SAL,DEPTNO FROM EMP;
SELECT ENAME,SAL,DEPTNO FROM EMP WHERE SAL > 2000;
SELECT * FROM EMP, DEPT WHERE SAL > 2000;
SELECT * FROM EMP,DEPT WHERE SAL > 2000 AND
EMP.DEPTNO = DEPT.DEPTNO;
SELECT EMPNO, SAL, DEPTNO, DNAME FROM EMP,DEPT
WHERE SAL > 2000 AND EMP.DEPTNO = DEPT.DEPTNO;
UCE8K15-M Data Management 2012/13
Bibliography / Readings / Home based activities
Bibliography- An Introduction to Database Systems (8th ed.), C J Date, Addison Wesley 2004- Database Management Systems, P Ward & G Defoulas, Thomson 2006
Readings- Introduction to SQL’ McGraw-Hill/Osbourne (handout)
Home based activities- Ensure you download xampp and install on home PC or laptop (if you have a
slow home internet connection – download to data key or CD here at UWE)- Copy the SQL Workbook onto your data key or CD.- Import the tables from the SQL Workbook into your home MySQL DB. Begin
working through some of the query examples in the workbook using PHPMyAdmin.