
Upload: lillian-ray

Post on 28-Dec-2015


CS 338 SQL Queries 5-1

SQL Queries

Lecture Topics

• The SQL query language

• Table aliases

• Joins and unions

• Select statement syntax

• SQL and Views

• Nulls

Textbook

• Chapter 5

CS 338 SQL Queries 5-2

The SQL Query Language

• Expressing the algebraic operators

• More examples of querying in SQL

• Expressiveness and limitations

CS 338 SQL Queries 5-3

Retrieving all information from a table

E.g. “The vendors.”

select * from Vendor

CS 338 SQL Queries 5-4

Selecting data

E.g. “The vendors in Waterloo.”

select * from Vendor where City = 'Waterloo'

E.g. “Vendors that are in Waterloo or have a balance exceeding 100.”

select * from Vendor where City = 'Waterloo' or Vbal > 100

CS 338 SQL Queries 5-5

Projecting columns

E.g. “The names of vendors.”

select distinct Vname from Vendor

But, note:

select Vname from Vendor

Result:

Vname
Sears
Kmart
Esso
Esso

In SQL, a query returns a multiset of tuples; that is, the same row can appear more than once.

With distinct:

Vname
Sears
Kmart
Esso
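The two results can be checked against a real engine. The following is a minimal sketch, not part of the lecture: it uses Python's built-in sqlite3 module as a stand-in DBMS, and the vendor numbers, cities and balances are assumed sample rows chosen to agree with the names shown above.

```python
import sqlite3

# Assumed sample data: four vendors, two of them named 'Esso'.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table Vendor (Vno integer primary key, Vname text, City text, Vbal real)"
)
conn.executemany(
    "insert into Vendor values (?, ?, ?, ?)",
    [
        (1, "Sears", "Toronto", 200.00),
        (2, "Kmart", "Ottawa", 671.05),
        (3, "Esso", "Montreal", 0.00),
        (4, "Esso", "Waterloo", 2.25),
    ],
)

# Without distinct there is one output row per Vendor row, so 'Esso' repeats.
all_names = [row[0] for row in conn.execute("select Vname from Vendor")]

# With distinct, duplicate rows are removed from the result.
distinct_names = [row[0] for row in conn.execute("select distinct Vname from Vendor")]

print(sorted(all_names))
print(sorted(distinct_names))
```

The lists are sorted before printing because, without an order by clause, the engine is free to return rows in any order.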

CS 338 SQL Queries 5-6

E.g. “Names of customers and vendors that have a common transaction.”

Solution 1:

select Vname, Cname
from Customer, Transaction, Vendor
where Transaction.AccNum = Customer.AccNum
and Transaction.Vno = Vendor.Vno

Table aliases

• column names appearing in several tables must be made unambiguous

• alias: a name for referring to a table.

• terminology: table aliases, tuple variables, correlation names

CS 338 SQL Queries 5-7

Solution 2:

select Vname, Cname
from Customer as C,
     Transaction as T,
     Vendor as V
where T.AccNum = C.AccNum
and T.Vno = V.Vno

Alternate syntax:

select Vname, Cname
from Customer C, Transaction T, Vendor V
where T.AccNum = C.AccNum
and T.Vno = V.Vno

...continued [table aliases]

CS 338 SQL Queries 5-8

Cross products and joins

E.g. “All combinations of vendors and transactions.”

select * from Vendor, Transaction

E.g. “Names of vendors and their transaction amounts.”

select Vname, Amount from Vendor V, Transaction T where V.Vno = T.Vno

Result:

Vname  Amount
Kmart  13.25
Kmart  19
Esso   25
Esso   16.13
Esso   33.12
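The difference between a cross product and a join shows up directly in the row counts. The sketch below is an illustration only, run on SQLite through Python's sqlite3 module. The transaction numbers, account numbers and dates are assumed values chosen so that the join reproduces the amounts listed above, and the table name is written as "Transaction" in quotes because TRANSACTION is a keyword in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Vendor (Vno integer primary key, Vname text, City text, Vbal real);
create table "Transaction" (Tno integer primary key, Vno integer,
                            AccNum integer, Tdate integer, Amount real);
insert into Vendor values
  (1,'Sears','Toronto',200.00), (2,'Kmart','Ottawa',671.05),
  (3,'Esso','Montreal',0.00),   (4,'Esso','Waterloo',2.25);
insert into "Transaction" values
  (1001,2,101,960116,13.25), (1002,2,103,960117,19.00),
  (1003,3,101,960115,25.00), (1004,4,102,960120,16.13),
  (1005,4,103,960122,33.12);
""")

# Cross product: every vendor paired with every transaction (4 x 5 rows).
cross = conn.execute('select * from Vendor, "Transaction"').fetchall()

# Join: keep only the pairs whose vendor numbers match.
joined = conn.execute(
    'select V.Vname, T.Amount from Vendor V, "Transaction" T '
    "where V.Vno = T.Vno order by T.Tno"
).fetchall()

print(len(cross), len(joined))
print(joined)
```

The where clause is what turns the cross product into a join: it discards the pairs that combine a vendor with some other vendor's transaction.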

CS 338 SQL Queries 5-9

Set difference

E.g. “Vendor numbers for vendors with no transactions.”

select Vno from Vendor V
where not exists
  (select * from Transaction T
   where T.Vno = V.Vno)

• a set difference operator (EXCEPT) was not defined explicitly in earlier standards; it is standard in SQL92, and some products do support it

• where it is unavailable, use exists and subselects to compute set difference
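Both formulations of set difference can be compared side by side. This is a sketch under assumptions: SQLite (via Python's sqlite3) happens to support EXCEPT, the sample rows are invented, and the table name is quoted because TRANSACTION is a keyword in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Vendor (Vno integer primary key, Vname text);
create table "Transaction" (Tno integer primary key, Vno integer, Amount real);
insert into Vendor values (1,'Sears'), (2,'Kmart'), (3,'Esso'), (4,'Esso');
insert into "Transaction" values
  (1001,2,13.25), (1002,2,19.00), (1003,3,25.00), (1004,4,16.13), (1005,4,33.12);
""")

# Set difference written with a correlated not exists subselect.
with_not_exists = [r[0] for r in conn.execute(
    "select Vno from Vendor V where not exists "
    '(select * from "Transaction" T where T.Vno = V.Vno)'
)]

# The same set difference written with the EXCEPT operator.
with_except = [r[0] for r in conn.execute(
    'select Vno from Vendor except select Vno from "Transaction"'
)]

print(with_not_exists, with_except)
```

One difference worth keeping in mind: EXCEPT removes duplicates from its result, while the not exists form returns one row per qualifying Vendor row.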

CS 338 SQL Queries 5-10

Subselects

• Select statements can be nested almost anywhere:

• in a select list:

– lists vnames for each transaction

select tno,
  (select vname from vendor v where v.vno = t.vno)
from transaction t

– subselect returns single attribute & row

• in a from clause:

– list tno & vnames for Waterloo vendors

select tno, v.vname
from transaction t,
  (select * from vendor where city='Waterloo') as v
where t.vno=v.vno

– subselect returns a table with alias

– similar to views (without the view defn!)

CS 338 SQL Queries 5-11

…continued [subselects]

• In a where clause:

select * from transaction t
where exists
  (select * from vendor v
   where city='Waterloo' and v.vno=t.vno)

– useful with exists, not exists

– also useful with in operator (discussion following)

– can be used in place of any single value (see discussion on aggregate functions following)
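The three nesting positions can be tried out together. The sketch below is illustrative only: it runs on SQLite through Python's sqlite3 module, the rows are assumed sample data, the from-clause alias is named W here to keep it apart from the Vendor alias, and "Transaction" is quoted because it is a keyword in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Vendor (Vno integer primary key, Vname text, City text);
create table "Transaction" (Tno integer primary key, Vno integer, Amount real);
insert into Vendor values
  (1,'Sears','Toronto'), (2,'Kmart','Ottawa'),
  (3,'Esso','Montreal'), (4,'Esso','Waterloo');
insert into "Transaction" values
  (1001,2,13.25), (1002,2,19.00), (1003,3,25.00), (1004,4,16.13), (1005,4,33.12);
""")

# 1. Subselect in the select list: one vendor name per transaction.
in_select_list = conn.execute(
    "select Tno, (select Vname from Vendor V where V.Vno = T.Vno) "
    'from "Transaction" T order by Tno'
).fetchall()

# 2. Subselect in the from clause: a derived table with alias W.
in_from = conn.execute(
    'select T.Tno, W.Vname from "Transaction" T, '
    "(select * from Vendor where City = 'Waterloo') as W "
    "where T.Vno = W.Vno order by T.Tno"
).fetchall()

# 3. Subselect in the where clause, tested with exists.
in_where = [r[0] for r in conn.execute(
    'select Tno from "Transaction" T where exists '
    "(select * from Vendor V where V.City = 'Waterloo' and V.Vno = T.Vno) "
    "order by Tno"
)]

print(in_select_list)
print(in_from)
print(in_where)
```

Forms 2 and 3 pick out the same transactions; which one reads better depends on whether columns of the nested table are needed in the output.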

CS 338 SQL Queries 5-12

Outer Join

• Consider the following schema:

– F( fid, name, dean, budget, etc); foreign key dean references FM( eid );

– FM( eid, name, rank, salary, etc);

• Query: list all FMs and the name of the faculty of which he/she is the dean

• Easy to do the other way: list all faculty and the name of the dean

– following the FK connection “towards” the primary key is easy, but the opposite direction might not be

– might not be any corresponding value

CS 338 SQL Queries 5-13

…continued [outer join]

• use select list subselect:

select eid, name, rank,
  (select F.name from F where F.dean = FM.eid)
from FM

• if no row results from the subselect, NULL is substituted

– produces a column consisting of the name of the faculty the FM is dean of, or NULL

– won’t work if someone is dean of more than one faculty (why?)

• SQL defines a special operator to do this:

select eid, FM.name, rank, F.name
from FM left outer join F
on F.dean = FM.eid

CS 338 SQL Queries 5-14

…continued [outer join]

• variations of outer join:

– left outer join

– right outer join

– full outer join

• require use of on clause to identify foreign-key relationship

• basic operation:

– preserves all the rows in one table, and supplies nulls for the columns of the other table when no row there meets the join condition
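The select-list subselect and the left outer join can be compared on a small instance of the F / FM schema. This is a sketch with assumed rows (three faculty members, two faculties) and with the columns trimmed to those the query needs; it runs on SQLite through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table FM (eid integer primary key, name text);
create table F (fid integer primary key, name text,
                dean integer references FM(eid));
insert into FM values (1,'Ada'), (2,'Bob'), (3,'Cy');
insert into F values (10,'Math',1), (20,'Arts',3);
""")

# Select-list subselect: NULL is substituted when no faculty row matches.
via_subselect = conn.execute(
    "select eid, name, (select F.name from F where F.dean = FM.eid) "
    "from FM order by eid"
).fetchall()

# Left outer join: every FM row is preserved, padded with NULL if needed.
via_outer_join = conn.execute(
    "select eid, FM.name, F.name "
    "from FM left outer join F on F.dean = FM.eid order by eid"
).fetchall()

print(via_subselect)
print(via_outer_join)
```

Bob is not a dean, so both queries return him with None (SQL NULL) in the faculty column. With a plain inner join he would disappear from the result.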

CS 338 SQL Queries 5-15

Computing a set union

E.g. “Vendors that are in Waterloo or have a balance exceeding 100.”

(select * from Vendor where City = 'Waterloo')
union
(select * from Vendor where Vbal > 100)

CS 338 SQL Queries 5-16

More on SQL Queries

E.g. “Vendor names for vendor numbers 1, 2 and 3.”

select Vname from Vendor
where Vno in (1,2,3)

Result:

Vname
Sears
Kmart
Esso

• selecting rows based upon set membership

• in: set membership

CS 338 SQL Queries 5-17

E.g. “Names of vendors with no transactions on January 16, 1996.”

select Vname from Vendor
where Vno not in
  (select Vno from Transaction
   where Tdate = 960116)

Result:

Vname
Sears
Esso
Esso

Recall that SQL does not remove duplicates automatically.

...continued [in predicate]

• membership testing often useful with subqueries

CS 338 SQL Queries 5-18

select distinct Vname from Vendor
where Vno not in
  (select Vno from Transaction
   where Tdate = 960116)

Result:

Vname
Sears
Esso

...continued [in predicate, select distinct]

• avoiding duplicates: distinct
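The not in query with and without distinct can be reproduced as follows. This is a sketch, not the lecture's database: the rows are assumed, with exactly one transaction (for Kmart) dated 960116 so that the results match those shown above. It runs on SQLite via Python's sqlite3 module, and "Transaction" is quoted because it is a keyword in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Vendor (Vno integer primary key, Vname text);
create table "Transaction" (Tno integer primary key, Vno integer, Tdate integer);
insert into Vendor values (1,'Sears'), (2,'Kmart'), (3,'Esso'), (4,'Esso');
insert into "Transaction" values
  (1001,2,960116), (1002,2,960117), (1003,3,960115),
  (1004,4,960120), (1005,4,960122);
""")

# Hypothetical helper text shared by both queries below.
condition = 'where Vno not in (select Vno from "Transaction" where Tdate = 960116)'

# Membership test against a subquery; duplicates are kept.
names = [r[0] for r in conn.execute("select Vname from Vendor " + condition)]

# Same query with distinct; the second 'Esso' is dropped.
distinct_names = [r[0] for r in conn.execute(
    "select distinct Vname from Vendor " + condition
)]

print(sorted(names))
print(sorted(distinct_names))
```

A caution that applies to real data: if the subquery can return a null Vno, not in evaluates to unknown for every row and the result is empty, so not exists is often the safer form.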

CS 338 SQL Queries 5-19

…continued [column aliasing]

E.g. “Names of vendors and customers.”

(select Vname as Name from Vendor)
union
(select Cname as Name from Customer)

CS 338 SQL Queries 5-20

• terminology: column aliasing, expression aliasing

• can be used for column titles

...continued [column aliasing]

E.g. “Transaction amounts for Esso.”

select Amount as "Transaction Amounts"
from Vendor, Transaction
where Vendor.Vname = 'Esso'
and Vendor.Vno = Transaction.Vno

Result:

Transaction Amounts
25.00
16.13
33.12

CS 338 SQL Queries 5-21

E.g. “Names of customers with all transactions on vendors in the same city.”

select Cname from Customer C
where exists
  (select * from Transaction T1, Vendor V1
   where T1.AccNum = C.AccNum
   and T1.Vno = V1.Vno
   and not exists
     (select * from Transaction T2, Vendor V2
      where T2.AccNum = C.AccNum
      and T2.Vno = V2.Vno
      and V1.City <> V2.City))

...continued [exists predicate]

• testing for (non-)emptiness of a subquery

• exists sub-query: true if value of sub-query contains at least one tuple
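The nested exists / not exists query is easier to trust after running it on a small case. In this sketch all rows are assumed: customer 102 deals only with a Waterloo vendor, customers 101 and 103 deal with vendors in two cities, and customer 104 has no transactions at all. It runs on SQLite via Python's sqlite3 module, with "Transaction" quoted because it is a keyword there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Customer (AccNum integer primary key, Cname text);
create table Vendor (Vno integer primary key, Vname text, City text);
create table "Transaction" (Tno integer primary key, Vno integer, AccNum integer);
insert into Customer values (101,'Smith'), (102,'Jones'), (103,'Lee'), (104,'Wu');
insert into Vendor values
  (1,'Sears','Toronto'), (2,'Kmart','Ottawa'),
  (3,'Esso','Montreal'), (4,'Esso','Waterloo');
insert into "Transaction" values
  (1001,2,101), (1002,2,103), (1003,3,101), (1004,4,102), (1005,4,103);
""")

# Outer exists: the customer has some transaction (T1, V1).
# Inner not exists: no other transaction (T2, V2) is in a different city.
same_city = [r[0] for r in conn.execute("""
select Cname from Customer C
where exists
  (select * from "Transaction" T1, Vendor V1
   where T1.AccNum = C.AccNum and T1.Vno = V1.Vno
   and not exists
     (select * from "Transaction" T2, Vendor V2
      where T2.AccNum = C.AccNum and T2.Vno = V2.Vno
      and V1.City <> V2.City))
""")]

print(same_city)
```

Wu is excluded by the outer exists, which requires at least one transaction. Smith and Lee are excluded by the inner not exists, which finds a transaction in a second city.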

CS 338 SQL Queries 5-22

• tables are sets, order of rows indeterminate

• may want/need to order (sort) results

...continued [row ordering]

E.g. “Names of customers living in Ontario, in alphabetical order.”

select Cname from Customer
where Prov = 'Ont'
order by Cname

CS 338 SQL Queries 5-23

E.g. “Vendor cities, names and balances in alphabetical order of vendor names and in descending order of balances.”

select City, Vname, Vbal from Vendor
order by Vname, Vbal desc

Result:

City      Vname  Vbal
Waterloo  Esso   2.25
Montreal  Esso   0.00
Ottawa    KMart  671.05
Toronto   Sears  200

...continued [row ordering]

CS 338 SQL Queries 5-24

Additional operators for predicates:

• like pattern: string pattern matching

% matches any string (including zero-length)

_ (underscore) matches any single character

• Attr between Value1 and Value2: equivalent to ((Attr >= Value1) and (Attr <= Value2))

...continued [operators, string-matching]

CS 338 SQL Queries 5-25

E.g. “Employees whose name consists of ‘Wong’ preceded by five characters, and who live on Elm street.”

select Name from Employee
where Name like '_____Wong'
and Street like '%Elm street'

E.g. “Names of vendors whose balance is between $100 and $500.”

select VName from Vendor
where VBal between 100 and 500

...continued [operators, between]

Employee:

Name         Street
A. Wong      123 Elm street
B.C. Wong    1 Elm street
E.F. Wong    456 Elm street
G.H.I. Wong  456 Elm street
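Both predicates can be checked against the Employee rows listed above. The sketch below runs on SQLite via Python's sqlite3 module; the Employee rows are the ones from the slide, while the vendor numbers are assumed. Note that SQLite's like is case-insensitive for ASCII letters by default, which does not affect this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Employee (Name text, Street text);
insert into Employee values
  ('A. Wong', '123 Elm street'), ('B.C. Wong', '1 Elm street'),
  ('E.F. Wong', '456 Elm street'), ('G.H.I. Wong', '456 Elm street');
create table Vendor (Vno integer primary key, Vname text, City text, Vbal real);
insert into Vendor values
  (1,'Sears','Toronto',200.00), (2,'Kmart','Ottawa',671.05),
  (3,'Esso','Montreal',0.00),   (4,'Esso','Waterloo',2.25);
""")

# Five underscores demand exactly five characters before 'Wong';
# '%' lets the street number have any length.
wong = [r[0] for r in conn.execute(
    "select Name from Employee where Name like '_____Wong' "
    "and Street like '%Elm street' order by Name"
)]

# between is inclusive at both ends: 100 <= Vbal <= 500.
mid_balance = [r[0] for r in conn.execute(
    "select Vname from Vendor where Vbal between 100 and 500"
)]

print(wong)
print(mid_balance)
```

'A. Wong' has only three characters before 'Wong' and 'G.H.I. Wong' has seven, so neither matches the five-underscore pattern; the space before 'Wong' counts as one of the five characters.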

CS 338 SQL Queries 5-26

Aggregate functions:

• count(*)
– number of tuples

• count(column), count(distinct column)
– number of (nonduplicate) values

• sum(expr), sum(distinct expr)
– sum of values

• avg(expr), avg(distinct expr)
– average of values

• max(expr)
– largest value

• min(expr)
– smallest value

...continued [aggregate functions]

CS 338 SQL Queries 5-27

E.g. “Number of transactions.”

select count(*) from transaction

E.g. “Number of vendors with transactions.”

select count(distinct Vno) from transaction

E.g. “Total vendor balances.”

select sum(Vbal) from Vendor

E.g. “Average customer balance.”

select avg(Cbal) from Customer

E.g. “Transactions of less than average amt”

select * from transaction
where amount < (select avg(amount) from Transaction)

...continued [aggregate functions]

CS 338 SQL Queries 5-28

• grouping rows together, according to a common value

• Syntax: select list group by columns

• list contains only attributes used for grouping, or aggregate functions applied to the groups

...continued [row grouping]

E.g. “The total amount of transactions for each account.”

select AccNum, sum(Amount)
from Transaction
group by AccNum

Result:

AccNum  SUM(Amount)
101     38.25
102     16.13
103     52.12

CS 338 SQL Queries 5-29

• grouped select can be ordered, subject to the same restrictions on the select list

...continued

E.g. “The total amount of transactions for each account, in increasing order of amount.”

select AccNum, sum(Amount)
from Transaction
group by AccNum
order by sum(Amount)

Result:

AccNum  SUM(Amount)
102     16.13
101     38.25
103     52.12

CS 338 SQL Queries 5-30

E.g. “The total amount of transactions for accounts that have more than one transaction.”

select AccNum, sum(Amount)
from Transaction
group by AccNum
having count(*) > 1

Result:

AccNum  SUM(Amount)
101     38.25
103     52.12

...continued

• groups can be qualified using having
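Grouping, ordering of groups and having can be exercised on one table. In this sketch the transaction rows are assumed, chosen so that the per-account totals agree with the results shown above. It runs on SQLite via Python's sqlite3 module, with "Transaction" quoted because it is a keyword there, and the sums are rounded in Python only to hide floating-point noise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table "Transaction" (Tno integer primary key, Vno integer,
                            AccNum integer, Amount real);
insert into "Transaction" values
  (1001,2,101,13.25), (1002,2,103,19.00), (1003,3,101,25.00),
  (1004,4,102,16.13), (1005,4,103,33.12);
""")

def run(sql):
    """Run a two-column grouped query and round the totals to cents."""
    return [(acc, round(total, 2)) for acc, total in conn.execute(sql)]

# One output row per account, ordered by the aggregate value.
by_total = run(
    'select AccNum, sum(Amount) from "Transaction" '
    "group by AccNum order by sum(Amount)"
)

# having filters whole groups: only accounts with more than one transaction.
busy_accounts = run(
    'select AccNum, sum(Amount) from "Transaction" '
    "group by AccNum having count(*) > 1 order by AccNum"
)

print(by_total)
print(busy_accounts)
```

The where clause filters rows before grouping, while having filters groups afterwards, which is why count(*) may appear in having but not in where.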

CS 338 SQL Queries 5-31

Select statement syntax

• For all selects:

select [ all | distinct ] exp {,exp}
from table [[ as ] alias ] {,table [[ as ] alias ] }
[ where cond ]
[ group by col {,col} [ having cond ]]
[ union [ all ] select ]

• For top-level queries:

select
[ order by resultcol [ asc | desc ] {,resultcol [ asc | desc ]}]

CS 338 SQL Queries 5-32

Semantics of an SQL query

• compute cross product of all tables in from clause

• eliminate rows not satisfying where condition

• group rows according to group by clause

• eliminate groups not satisfying having condition

• evaluate expressions in select target list

• eliminate duplicate rows if distinct specified

• compute union of each select

• sort rows according to order by
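The evaluation steps above can be followed by hand in ordinary code. The sketch below is a conceptual model only, not how a real DBMS executes queries (an optimizer would never materialize the full cross product). It evaluates the hypothetical query "select Vname, sum(Amount) from Vendor V, Transaction T where V.Vno = T.Vno group by Vname having count(*) > 1 order by Vname" over assumed sample rows.

```python
from itertools import product

# Assumed sample rows: Vendor as (Vno, Vname), Transaction as (Vno, Amount).
vendors = [(1, "Sears"), (2, "Kmart"), (3, "Esso"), (4, "Esso")]
transactions = [(2, 13.25), (2, 19.00), (3, 25.00), (4, 16.13), (4, 33.12)]

# Step 1: cross product of all tables in the from clause.
rows = list(product(vendors, transactions))

# Step 2: eliminate rows not satisfying the where condition.
rows = [(v, t) for v, t in rows if v[0] == t[0]]

# Step 3: group rows according to the group by clause (here: Vname).
groups = {}
for v, t in rows:
    groups.setdefault(v[1], []).append(t[1])

# Step 4: eliminate groups not satisfying the having condition.
groups = {name: amounts for name, amounts in groups.items() if len(amounts) > 1}

# Step 5: evaluate the expressions in the select list, one row per group.
result = [(name, round(sum(amounts), 2)) for name, amounts in groups.items()]

# Steps 6 and 7 (distinct, union) do not apply to this query.

# Step 8: sort rows according to order by.
result.sort(key=lambda row: row[0])

print(result)
```

Grouping by Vname merges the two vendors named Esso into one group, and Sears never reaches the grouping step because it has no matching transaction.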

CS 338 SQL Queries 5-33

The power of the SQL query language

• can express anything in the relational algebra, and more:

– result of a query can have duplicate tuples

– result of a query can be ordered

– can count

– aggregate functions & grouping

• there are limitations:

– other aggregate functions?

– no aggregate functions on subqueries

– no recursion or iteration

– generalized constraints

– not programmable like ordinary programming languages

CS 338 SQL Queries 5-34

More views

• Definition: a view is a derived table whose definition, not the table itself, is stored

– the set of views and tables comprises the external schema

• Creating a view:

CREATE VIEW viewname [ (column-name {,column-name}) ]
AS select-statement;

• Example:

CREATE VIEW VTotals(vno,amt) AS
SELECT Vno, SUM(Amount)
FROM Transaction
GROUP BY Vno

• Removing views:

DROP VIEW viewname

• Example:

DROP VIEW Vtotals

CS 338 SQL Queries 5-35

...continued

• A view is a virtual table that is computed dynamically (not stored explicitly)

• Any derivable table can be defined as a view (some minor restrictions on the SELECT)

• A table defined as a view can be used in the same way as a base table:

– retrieval (SELECT)

– view definition (view of view)

• But: updates can be performed only on certain views

– views derived from a single base table

– views with each row and attribute corresponding to a distinct, unique row and attribute in the base table

CS 338 SQL Queries 5-36

Pros & cons of views

• Views provide several advantages:

– users are independent of DB growth

– users are independent of DB restructuring (except for updating)

– users’ perception can be simplified

– the same data (base table) can be viewed in different ways by different users

– security for hidden data

• Problem with views:

– creating a view requires special permission (DBA or “resource”)

– can use nested selects instead of view-name, i.e. use the select statement that defines the view

• can be arbitrarily complex, including aggregates, having, union, etc

CS 338 SQL Queries 5-37

The “view update” problem

• Consider the previous view example:

CREATE VIEW VTotals(vno,amt) AS
SELECT Vno, SUM(Amount)
FROM Transaction
GROUP BY Vno

• An update to this view cannot be translated to a base-table operation

• Example:

UPDATE VTotals SET amt=amt+1

– what rows in Transaction should be modified??

• There is no simple answer:

– non-deterministic

– still a research problem:

• DBMS can try to guess

• force the user/DBA to decide
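How one particular engine handles this can be observed directly. The sketch below uses SQLite through Python's sqlite3 module, purely as an example: SQLite treats every view as read-only unless INSTEAD OF triggers are defined, so it rejects the update rather than guessing, and other products may behave differently. The transaction rows are assumed, the view columns are named with aliases instead of a column list, and "Transaction" is quoted because it is a keyword in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table "Transaction" (Tno integer primary key, Vno integer, Amount real);
insert into "Transaction" values
  (1001,2,13.25), (1002,2,19.00), (1003,3,25.00), (1004,4,16.13), (1005,4,33.12);
create view VTotals as
  select Vno as vno, sum(Amount) as amt from "Transaction" group by Vno;
""")

# Retrieval works exactly as for a base table.
totals = [(vno, round(amt, 2)) for vno, amt in conn.execute(
    "select vno, amt from VTotals order by vno"
)]

# An update has no obvious translation to rows of "Transaction".
try:
    conn.execute("update VTotals set amt = amt + 1")
    update_rejected = False
except sqlite3.Error as exc:
    update_rejected = True
    print("update rejected:", exc)

print(totals)
```

Raising vendor 2's total by 1 could mean adding 1 to either of its two transactions, 0.50 to each, or inserting a new row, which is the ambiguity the slide describes.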

CS 338 SQL Queries 5-38

Nulls in SQL

• Unknown: not yet known, but will be known eventually

• Not applicable: does not apply to a particular tuple

• Not the same as 0 or ‘’ (null string)

• “Not applicable” often used to simplify DB design

• Null values complicate expression evaluation. E.g.:

select avg(vacation) from emps

select count(*) from emps

select name from emps where vacation <= 10

select name from emps where vacation > 10
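Running the four queries on a table with one null makes the complication concrete. This is a sketch with assumed rows (Cid's vacation is null), executed on SQLite via Python's sqlite3 module, where Python's None stands for SQL NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table emps (name text, vacation integer)")
conn.executemany(
    "insert into emps values (?, ?)",
    [("Ann", 5), ("Bob", 15), ("Cid", None)],  # None is stored as NULL
)

# avg ignores nulls: (5 + 15) / 2, not (5 + 15) / 3.
average = conn.execute("select avg(vacation) from emps").fetchone()[0]

# count(*) counts rows, count(column) counts non-null values.
row_count = conn.execute("select count(*) from emps").fetchone()[0]
value_count = conn.execute("select count(vacation) from emps").fetchone()[0]

# Two complementary conditions that together still miss the null row.
at_most_10 = [r[0] for r in conn.execute(
    "select name from emps where vacation <= 10"
)]
over_10 = [r[0] for r in conn.execute(
    "select name from emps where vacation > 10"
)]

print(average, row_count, value_count)
print(at_most_10, over_10)
```

Cid appears in neither of the last two results, so the union of "vacation <= 10" and "vacation > 10" is not the whole table. To retrieve such rows, add "or vacation is null" to the condition.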

CS 338 SQL Queries 5-39

Three-valued logic

• A where predicate returns unknown for any tuple that contains null

• Null also results from empty (sub)selects: select name from emps where exists(select...)

• Relational operations =, <>, <, <=, >, >= yield unknown if either operand is null

• Cannot use =, <> to test null, use:

expr is null

expr is not null

• Test for unknown with: expr is unknown

• Three-valued logic tables:

and  T  F  U
T    T  F  U
F    F  F  F
U    U  F  U

or   T  F  U
T    T  T  T
F    T  F  U
U    T  U  U

not
T    F
F    T
U    U
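The truth tables can be written as three small functions. This is a sketch of the logic itself, independent of any DBMS; the function names and the use of Python's None for unknown are illustrative choices.

```python
# Three-valued logic with True, False and None (None plays the role of unknown).

def and3(a, b):
    """and: false dominates, then unknown, otherwise true."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def or3(a, b):
    """or: true dominates, then unknown, otherwise false."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False

def not3(a):
    """not: unknown stays unknown."""
    return None if a is None else (not a)

VALUES = (True, False, None)
and_table = [[and3(a, b) for b in VALUES] for a in VALUES]
or_table = [[or3(a, b) for b in VALUES] for a in VALUES]
not_table = [not3(a) for a in VALUES]

print(and_table)
print(or_table)
print(not_table)
```

A where clause keeps a row only when its condition is true, so a condition that evaluates to unknown discards the row just as false does.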

CS 338 SQL Queries 5-40

Review of SQL statements

• DDL: {create|drop} {table|view}, grant, revoke

• DML: insert, delete, update, select

• more later (e.g. transaction processing)

Examples:

create table EssoVendors
  (Vno INTEGER not null,
   City VARCHAR(10),
   Vbal DECIMAL(10,2),
   primary key (Vno) );

insert into EssoVendors
select Vno, City, VBal from Vendor
where Vname like '%Esso%'

CS 338 SQL Queries 5-41

insert into EssoVendors
values (5, 'Kitchener', 123.45)

insert into EssoVendors (Vbal, Vno, City)
values (666.66, 6, 'Route 66')

update EssoVendors set Vbal = Vbal * 1.01

update EssoVendors
set Vbal = Vbal * 1.02
where Vbal < 50.00

delete from Transaction
where Vno in (select Vno from EssoVendors)

...continued
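The statements above can be run in sequence to see their combined effect. The sketch below executes them on SQLite via Python's sqlite3 module over assumed Vendor and Transaction rows. SQLite does not enforce VARCHAR lengths or DECIMAL precision, so the balances are ordinary floating-point values here, and "Transaction" is quoted because it is a keyword in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Vendor (Vno integer primary key, Vname text, City text, Vbal real);
create table "Transaction" (Tno integer primary key, Vno integer, Amount real);
insert into Vendor values
  (1,'Sears','Toronto',200.00), (2,'Kmart','Ottawa',671.05),
  (3,'Esso','Montreal',0.00),   (4,'Esso','Waterloo',2.25);
insert into "Transaction" values
  (1001,2,13.25), (1002,2,19.00), (1003,3,25.00), (1004,4,16.13), (1005,4,33.12);

create table EssoVendors (Vno integer not null, City varchar(10),
                          Vbal decimal(10,2), primary key (Vno));
insert into EssoVendors
  select Vno, City, Vbal from Vendor where Vname like '%Esso%';
insert into EssoVendors values (5, 'Kitchener', 123.45);
insert into EssoVendors (Vbal, Vno, City) values (666.66, 6, 'Route 66');
update EssoVendors set Vbal = Vbal * 1.01;
update EssoVendors set Vbal = Vbal * 1.02 where Vbal < 50.00;
delete from "Transaction" where Vno in (select Vno from EssoVendors);
""")

# Final balances keyed by vendor number.
balances = dict(conn.execute("select Vno, Vbal from EssoVendors"))

# Transactions that survive the delete.
remaining = [r[0] for r in conn.execute(
    'select Tno from "Transaction" order by Tno'
)]

print(balances)
print(remaining)
```

Vendor 4 receives both increases because its balance is still below 50.00 after the first update, while vendor 5 receives only the first. The delete removes every transaction of vendors 3 and 4, leaving only Kmart's.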

CS 338 SQL Queries 5-42

The “last word” on SQL – for now

• Many, many details omitted

– table-spaces, named schemas

– table ownership

– stored procedures & triggers

– constraints (unique, check, …)

– and others

• Most commercial products implement their own version of SQL

– typically a cross between SQL89 and SQL92

– lots of extra features

– “your mileage may vary”

• The SQL vendor documents are essential to any realistic SQL project