sql i. sql – introduction standard dml/ddl for relational db’s dml = “data manipulation...

24
SQL I SQL I

Post on 22-Dec-2015

240 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: SQL I. SQL – Introduction  Standard DML/DDL for relational DB’s  DML = “Data Manipulation Language” (queries, updates)  DDL = “Data Definition Language”

SQL I SQL I

Page 2: SQL I. SQL – Introduction  Standard DML/DDL for relational DB’s  DML = “Data Manipulation Language” (queries, updates)  DDL = “Data Definition Language”

SQL – IntroductionSQL – Introduction

Standard DML/DDL for relational DB’s DML = “Data Manipulation Language” (queries, updates)

DDL = “Data Definition Language” (create tables, indexes, …)

Also includes: view definition

security

integrity constraints

transactions

History: System R project at IBM: “SEQUEL”

later, becomes standard: Structured Query Language

Page 3: SQL I. SQL – Introduction  Standard DML/DDL for relational DB’s  DML = “Data Manipulation Language” (queries, updates)  DDL = “Data Definition Language”

A Simple SELECT-FROM-WHERE QueryA Simple SELECT-FROM-WHERE Querybname lno amt

Downtown L-170 3000 Redwood L-230 4000 Perry L-260 1700 Redwood L-450 3000

SELECT bnameFROM loanWHERE amt > 1000

Similar to:

bname ( amt>1000 (loan))

But not quite….

bname

Downtown Redwood Perry Redwood

Duplicates are retained,i.e. result not a set

Why preserve duplicates?

•eliminating them is costly•often, users don’t care•can also write: SELECT DISTINCT bname FROM loan WHERE amt> 1000

Page 4: SQL I. SQL – Introduction  Standard DML/DDL for relational DB’s  DML = “Data Manipulation Language” (queries, updates)  DDL = “Data Definition Language”

Another SFW queryAnother SFW query

SELECT cname, balanceFROM depositor, accountWHERE depositor.acct_no = account.acct_no

depositor (customer-name, account-number)account (account-number, branch-name, balance)

Similar to :

cname, balance (depositor account)

Note: you can also write SELECT cname, balance FROM depositor AS d, account AS a WHERE d.acct_no = a.acct_no

cname balance

Johnson 500 Smith 400 Turner 350 Smith 300 Jones 240 Smith 300

Page 5: SQL I. SQL – Introduction  Standard DML/DDL for relational DB’s  DML = “Data Manipulation Language” (queries, updates)  DDL = “Data Definition Language”

In generalIn general

SELECT A1, A2, …, An

FROM r1, r2, …, rm

[WHERE P] WHERE clause optional (missing WHERE clause means WHERE is true)

Conceptual Algorithm:

1. FROM clause: cartesian product ( X )• t1 r1 x r2 x … x rm

2. WHERE clause: selection ( s )

• t2 P ( t1)

3. SELECT clause: projection ( p )

• result A1, A2, …., An (t2)

Note: will never be implemented with product (X) !

Page 6: SQL I. SQL – Introduction  Standard DML/DDL for relational DB’s  DML = “Data Manipulation Language” (queries, updates)  DDL = “Data Definition Language”

The SELECT clauseThe SELECT clause

equivalent to projection, despite name

can use “*” to get all attributes

e.g.,

can write SELECT DISTINCT to eliminate duplicates

can write SELECT ALL to preserve duplicate (default)

can include arithmetic expressions

e.g., SELECT bname, acct_no, balance*1.05

FROM account

SELECT *FROM loan

Page 7: SQL I. SQL – Introduction  Standard DML/DDL for relational DB’s  DML = “Data Manipulation Language” (queries, updates)  DDL = “Data Definition Language”

The WHERE clauseThe WHERE clause

equivalent to selection, despite name…

WHERE predicate can be: Simple:

attribute relop attribute or constant

(relop: )

Complex: using AND, OR, NOT, BETWEEN

e.g.

, , , , ,

SELECT lnoFROM loanWHERE amt BETWEEN 9000 AND 10000

SELECT lnoFROM loanWHERE amt >= 9000 AND amt <= 10000

Page 8: SQL I. SQL – Introduction  Standard DML/DDL for relational DB’s  DML = “Data Manipulation Language” (queries, updates)  DDL = “Data Definition Language”

Formal Semantics of SQLFormal Semantics of SQL

RA can only express SELECT DISTINCT queries

to express SQL, must extend to a bag algebra ,

a bag (aka: multiset) like sets, but can have duplicates

e.g. { 4, 5, 4, 6}

e.g.

cname balance

Johnson 500 Smith 400 Turner 350 Smith 300 Jones 240 Smith 300

balances =

Page 9: SQL I. SQL – Introduction  Standard DML/DDL for relational DB’s  DML = “Data Manipulation Language” (queries, updates)  DDL = “Data Definition Language”

The FROM clauseThe FROM clause

Equivalent to cartesian product ( X )

(or depending on the WHERE clause)

binds tuples in relations to variable names

e.g.: FROM borrower, loan computes: borrower x loan

identifies borrower, loan columns (attrs) in the results

e.g. allowing one to write:

WHERE borrower.lno = loan.lno

FROM borrower b, loan lWHERE b.lno = l.lno •Simplifies the expression

•Needed for self-joins

Page 10: SQL I. SQL – Introduction  Standard DML/DDL for relational DB’s  DML = “Data Manipulation Language” (queries, updates)  DDL = “Data Definition Language”

Step 1 – Cross ProductStep 1 – Cross Product

S.sid S.name S.login S.age S.gpa E.sid E.cid E.grade 53666 Jones jones@cs 18 3.4 53831 Carnatic101 C 53666 Jones jones@cs 18 3.4 53832 Reggae203 B 53666 Jones jones@cs 18 3.4 53650 Topology112 A 53666 Jones jones@cs 18 3.4 53666 History105 B 53688 Smith smith@ee 18 3.2 53831 Carnatic101 C 53688 Smith smith@ee 18 3.2 53831 Reggae203 B 53688 Smith smith@ee 18 3.2 53650 Topology112 A 53688 Smith smith@ee 18 3.2 53666 History105 B

SELECT S.name, E.cid FROM Students S, Enrolled E WHERE S.sid=E.sid AND E.grade=‘B'

Page 11: SQL I. SQL – Introduction  Standard DML/DDL for relational DB’s  DML = “Data Manipulation Language” (queries, updates)  DDL = “Data Definition Language”

Step 2) Discard tuples that fail predicateStep 2) Discard tuples that fail predicate

S.sid S.name S.login S.age S.gpa E.sid E.cid E.grade 53666 Jones jones@cs 18 3.4 53831 Carnatic101 C 53666 Jones jones@cs 18 3.4 53832 Reggae203 B 53666 Jones jones@cs 18 3.4 53650 Topology112 A 53666 Jones jones@cs 18 3.4 53666 History105 B 53688 Smith smith@ee 18 3.2 53831 Carnatic101 C 53688 Smith smith@ee 18 3.2 53831 Reggae203 B 53688 Smith smith@ee 18 3.2 53650 Topology112 A 53688 Smith smith@ee 18 3.2 53666 History105 B

SELECT S.name, E.cid FROM Students S, Enrolled E WHERE S.sid=E.sid AND E.grade=‘B'

Page 12: SQL I. SQL – Introduction  Standard DML/DDL for relational DB’s  DML = “Data Manipulation Language” (queries, updates)  DDL = “Data Definition Language”

Step 3) Discard Unwanted ColumnsStep 3) Discard Unwanted Columns

S.sid S.name S.login S.age S.gpa E.sid E.cid E.grade 53666 Jones jones@cs 18 3.4 53831 Carnatic101 C 53666 Jones jones@cs 18 3.4 53832 Reggae203 B 53666 Jones jones@cs 18 3.4 53650 Topology112 A 53666 Jones jones@cs 18 3.4 53666 History105 B 53688 Smith smith@ee 18 3.2 53831 Carnatic101 C 53688 Smith smith@ee 18 3.2 53831 Reggae203 B 53688 Smith smith@ee 18 3.2 53650 Topology112 A 53688 Smith smith@ee 18 3.2 53666 History105 B

SELECT S.name, E.cid FROM Students S, Enrolled E WHERE S.sid=E.sid AND E.grade=‘B'

Page 13: SQL I. SQL – Introduction  Standard DML/DDL for relational DB’s  DML = “Data Manipulation Language” (queries, updates)  DDL = “Data Definition Language”

Formal Semantics of SQL: RA*Formal Semantics of SQL: RA*

1. *P(r) : preserves copies in r

• * cname = “Smith” (balances)

*A1, A2, …, An (r): no duplicate elimination

* cname ( balances) =

cname balance

Smith 400 Smith 300 Smith 300

cname

Johnson Smith Turner Smith Jones Smith

cname balance

Johnson 500 Smith 400 Turner 350 Smith 300 Jones 240 Smith 300

balances

Page 14: SQL I. SQL – Introduction  Standard DML/DDL for relational DB’s  DML = “Data Manipulation Language” (queries, updates)  DDL = “Data Definition Language”

Formal Semantics of SQL: RA*Formal Semantics of SQL: RA*

3. r * s : additive union:

e.g. if r = , s =

then: r * s =

A B1 a1 a2 b

A B2 b3 a1 a

A B1 a1 a2 b2 b3 a1 a

Page 15: SQL I. SQL – Introduction  Standard DML/DDL for relational DB’s  DML = “Data Manipulation Language” (queries, updates)  DDL = “Data Definition Language”

Formal Semantics of SQL: RA*Formal Semantics of SQL: RA*

4. r -* s : bag difference

e.g. r -* s =

s - * r =

A B1 a

A B3 a

A B1 a1 a2 b

A B2 b3 a1 a

r s

Page 16: SQL I. SQL – Introduction  Standard DML/DDL for relational DB’s  DML = “Data Manipulation Language” (queries, updates)  DDL = “Data Definition Language”

Formal Semantics of SQL: RA*Formal Semantics of SQL: RA*

5. r x* s : cartesian product with bags

e.g. if: r = , s =

then: r x* s =

A B1 a1 a2 b

C

A B C1 a 1 a 1 a 1 a 2 b 2 b

Page 17: SQL I. SQL – Introduction  Standard DML/DDL for relational DB’s  DML = “Data Manipulation Language” (queries, updates)  DDL = “Data Definition Language”

Formal Semantics of SQL: RA*Formal Semantics of SQL: RA*

SELECT A1, A2, …, An FROM r1, r2, …, rm WHERE P

Query

Semantics:

* A1, A2, ..., An( *P (r1 x* r2 x* ... rm))

Query: SELECT DISTINCT A1, A2, ..., An FROM r1, r2, ..., rm WHERE P

Q: What is the only RA operator that need be changed above?

Ans: *

Page 18: SQL I. SQL – Introduction  Standard DML/DDL for relational DB’s  DML = “Data Manipulation Language” (queries, updates)  DDL = “Data Definition Language”

More SQL: ASMore SQL: AS

1. Using AS in FROM clause

introduces tuple variables

e.g.:

2. Using AS in SELECT clause

renames columns in result ( )

e.g.:

SELET DISTINCT T.bnameFROM branch AS T, branch AS SWHERE T.assets > S.assets

SELET bname, acct_no, balance*1.05 AS newbalFROM account bname acct_no newbal

Downtown A-170 450 Redwood A-230 400

Page 19: SQL I. SQL – Introduction  Standard DML/DDL for relational DB’s  DML = “Data Manipulation Language” (queries, updates)  DDL = “Data Definition Language”

More SQL: IntroMore SQL: Intro

Give a name to a query result ( )

E.g.

intuitively: BranchNames

CREATE TABLE BranchNames ASSELECT DISTINCT bnameFROM branch

SELECT DISTINCT bnameFROM branch

Page 20: SQL I. SQL – Introduction  Standard DML/DDL for relational DB’s  DML = “Data Manipulation Language” (queries, updates)  DDL = “Data Definition Language”

More SQL - String OperationsMore SQL - String Operations

SQL includes a string-matching operator

percent (%). The % character matches any substring.

underscore (_). The _ character matches any character.

E.g. Find the names of all customers whose street includes the substring “Main”.

SELECT customer-nameFROM customerWHERE cstreet LIKE ‘%Main%’

match the name “Main%”: use: like ‘Main\%’ escape ‘\’

SQL supports a variety of string operations such as•concatenation (using “||”)• converting from upper to lower case (and vice versa)• finding string length, extracting substrings, etc.

Page 21: SQL I. SQL – Introduction  Standard DML/DDL for relational DB’s  DML = “Data Manipulation Language” (queries, updates)  DDL = “Data Definition Language”

More SQL: Set/Bag operationsMore SQL: Set/Bag operations

Set operations:

UNION , INTERSECT, EXCEPT (MINUS)

Bag operations:

UNION ALL, INTERSECT ALL, EXCEPT ALL

Duplicate counting: Given m copies of in r and n copies of in s

Q: How many copies of in ....

1. r UNION ALL s

2. r INTERSECT ALL s 3. r EXCEPT ALL s

1. Ans : m + n

2. Ans: min (m,n)

3. Ans: max(0, m-n)

Page 22: SQL I. SQL – Introduction  Standard DML/DDL for relational DB’s  DML = “Data Manipulation Language” (queries, updates)  DDL = “Data Definition Language”

More SQL: Set/Bag operationsMore SQL: Set/Bag operations

Example Queries:

(SELECT cname FROM depositor)

(SELECT cname FROM borrower)

?

? = UNION•returns names of customers with saving accts, loans, or both

? = INTERSECT•returns names of customers with saving accts AND loans

? = EXCEPT•returns names of customers with saving accts but NOT loans

Page 23: SQL I. SQL – Introduction  Standard DML/DDL for relational DB’s  DML = “Data Manipulation Language” (queries, updates)  DDL = “Data Definition Language”

Order byOrder by

Example: List in alphabetical order, the names of all customers with loans at Perry branch:

SELECT DISTINCT cnameFROM borrower b, loan lWHERE b.lno = l.lno AND bname = “Perry”ORDER BY cname

Result: cname Adams Byers Smith .....

can also write: ORDER BY cname DESC or ORDER BY cname ASC (default)

like SELECT DISTINCT, very expensive...

Page 24: SQL I. SQL – Introduction  Standard DML/DDL for relational DB’s  DML = “Data Manipulation Language” (queries, updates)  DDL = “Data Definition Language”

Aggregate OperatorsAggregate Operators

• Aggregate Operators:

AVG (col): average of values in colMIN (col) : minimum value in colMAX (col): maximum value in colSUM (col): sum of values in colCOUNT (col): number of values in col

Examples: 1. Find the average acct balance @ Perry: SELECT AVG (bal) FROM account WHERE bname = “Perry” 2. Find the number of tuples in customer: SELECT COUNT(*) FROM customer

account (acct_no, bname, bal)

3. Find the number of depositors SELECT COUNT( DISTINCT cname) FROM depositor

COUNT, SUM, AVGhave a DISTINCT version