Introduction to Structured Query Language (SQL)
SQL is the most popular database language, and in fact the standard language for all
relational DBMS’s. It covers both data definition (DDL) and data manipulation (DML).
The main operations:
Definition and modification of tables, views: CREATE, DROP, ALTER
Inserting data, modification of data: INSERT, DELETE, UPDATE
Searching for some data: SELECT
CREATE TABLE command is used to specify:
(1) The table name
(2) Description of the attributes, including:
(a) Attribute name
(b) The data type of the attribute
(c) Constraints on the attribute values, including:
(i) Whether some tuple may have NULL value for this attribute;
(ii) The range of values allowed for this attribute
(iii) Referential constraints on the attribute
(3) Primary key attributes
(4) Foreign key attributes
Example 1:
Here is a simplified DB table for records of books in a library:
CREATE TABLE Book (
isbn VARCHAR(15) NOT NULL,
title VARCHAR(200) NOT NULL,
catalog_no VARCHAR(15) NOT NULL,
copy_no INT NOT NULL DEFAULT 1,
keywords CHAR(100) NULL,
purchase_date DATE NULL,
PRIMARY KEY CLUSTERED(catalog_no, copy_no))
The following table lists some of the commonly used data types. For a complete list, you can check any
online SQL documentation, e.g. at http://dev.mysql.com/doc/mysql/en/index.html
Data type              Description/Notes
BOOLEAN, INT           BOOLEAN is 1 bit; INT is a signed 32-bit integer.
FLOAT, DOUBLE          Single or double precision floating point numbers
CHAR(N), VARCHAR(N)    Character string of at most N characters
DATETIME               Date and time stored in format 'YYYY-MM-DD HH:MM:SS'
DATE                   Date stored in format 'YYYY-MM-DD'
BLOB                   Used to store large binary objects, e.g. files, images, …
Notes on DATETIME and DATE: You can specify Date and Time in several formats:
• As a string in either 'YYYY-MM-DD HH:MM:SS' or 'YY-MM-DD HH:MM:SS' format. Any punctuation character can be used as the delimiter between date parts or time parts, e.g., '98-12-31 11:30:45', '98.12.31 11+30+45', '98/12/31 11*30*45', …
• If only Date is specified for a datetime, then the time part of the entries will be 00:00:00.
Example 2:
The following examples show how some common types of constraints can be defined on tables. CHECK
constraints are checked whenever the value of the attribute is modified or a new row is inserted. The
requested operation is disallowed if the constraint fails. Foreign key constraints are checked on these two
types of actions and, in addition, whenever a referenced attribute in the parent table is changed or deleted.
CREATE TABLE Borrows (
catalog_num VARCHAR(15) NOT NULL,
copy_num INT NOT NULL,
issue_date DATE NOT NULL,
person_id CHAR(8) NOT NULL,
PRIMARY KEY CLUSTERED(catalog_num, copy_num, person_id, issue_date),
CONSTRAINT fk_borrows_book FOREIGN KEY(catalog_num, copy_num) REFERENCES
Book(catalog_no, copy_no),
CONSTRAINT fk_borrows_person FOREIGN KEY(person_id) REFERENCES Person(id) )
Example 3:
CREATE TABLE Person (
lname VARCHAR(35) NOT NULL,
fnames VARCHAR(50) NOT NULL,
email VARCHAR(60) NOT NULL UNIQUE CHECK ( email LIKE '%@%'),
id CHAR(8) NOT NULL,
phone CHAR(12) NULL,
PRIMARY KEY (id) )
Each time you insert a row of data (or even change the value of any attribute of a tuple in a table), several
types of constraints are checked to ensure against incorrect data entry. These include:
- the data is the correct type (domain integrity)
- disallowing a ‘null’ value if the design requires the attribute to be non-null
- the value entered must match any constraint specified via ‘CHECK’ functions
- if the attribute(s) are key, or UNIQUE, then no two rows in the table will be allowed to have these
values repeated
- If the attribute refers to an attribute of another table, the entered value must be present in at least one
row of the referred table
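The keyword CLUSTERED used in the examples above is specific to some DBMS’s (e.g. Microsoft SQL Server) and is optional there. In standard SQL, a primary key with more than one attribute is declared simply as PRIMARY KEY (a, b).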
WARNINGS ABOUT IMPLEMENTATIONS:
1. Some DBMS’s, e.g. MySQL, do not have a complete implementation yet -- for example,
CHECK (..) constraints do not work on the version provided by ITSC;
2. Referential constraints specification: Notice that when we define the Table ‘Borrows’, it
references entries from a table ‘Person’, which has not yet been defined. In many DBMS’s, such
references will not be allowed. In such cases, you will first need to create the table ‘Borrows’
without the constraint fk_borrows_person. Next, you create the table ‘Person’; and finally, you
add the fk_borrows_person constraint by the use of an “ALTER TABLE Borrows …” command.
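The constraint checks listed above can be tried out without a database server. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in DBMS (an assumption of this sketch; the course uses MySQL, whose behaviour may differ). The rows inserted are made-up test data, and the helper `accepted` is a hypothetical name. CLUSTERED is omitted because SQLite does not accept it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE Person (
    lname  VARCHAR(35) NOT NULL,
    fnames VARCHAR(50) NOT NULL,
    email  VARCHAR(60) NOT NULL UNIQUE CHECK (email LIKE '%@%'),
    id     CHAR(8)     NOT NULL,
    phone  CHAR(12),
    PRIMARY KEY (id)
);
CREATE TABLE Borrows (
    catalog_num VARCHAR(15) NOT NULL,
    copy_num    INT         NOT NULL,
    issue_date  DATE        NOT NULL,
    person_id   CHAR(8)     NOT NULL,
    PRIMARY KEY (catalog_num, copy_num, person_id, issue_date),
    CONSTRAINT fk_borrows_person FOREIGN KEY (person_id) REFERENCES Person(id)
);
""")


def accepted(sql, params):
    """Run one INSERT; return True if the DBMS accepted it, False if a constraint rejected it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError as exc:
        print("rejected:", exc)
        return False


person_sql = "INSERT INTO Person VALUES (?, ?, ?, ?, ?)"
borrow_sql = "INSERT INTO Borrows VALUES (?, ?, ?, ?)"

# All values below are made-up test data.
results = {
    "valid person": accepted(person_sql, ("Lee", "Ann", "ann@example.org", "00000001", None)),
    "email without @": accepted(person_sql, ("Roe", "Bob", "bob.example.org", "00000002", None)),
    "duplicate key": accepted(person_sql, ("Poe", "Cy", "cy@example.org", "00000001", None)),
    "NULL in NOT NULL column": accepted(person_sql, (None, "Di", "di@example.org", "00000003", None)),
    "borrow by known person": accepted(borrow_sql, ("QA76.9.D3", 1, "2004-10-01", "00000001")),
    "borrow by unknown person": accepted(borrow_sql, ("QA76.9.D3", 1, "2004-10-02", "99999999")),
}
print(results)
```

Only the first and fifth inserts succeed; each of the others trips one of the constraint types described above (CHECK, key uniqueness, NOT NULL, referential).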
DROP TABLE command is used to:
(1) Delete all the data in a table AND
(2) Delete the definition of the table itself from the DB.
Example 1:
DROP TABLE Person;
However, in this case, we have another table ‘Borrows’, which references some attribute from the
‘Person’ table. The deletion of all data from ‘Person’ will result in violation of referential constraints for
each row in ‘Borrows’. Therefore, most DBMS’s will warn/disallow the above DROP TABLE command.
In such cases, you may choose to use the CASCADE option:
Example 1a:
DROP TABLE Person CASCADE;
This command does the following:
- if there is any attribute of ‘Person’ that is referenced from another table, then those referential constraints
are deleted from the definition of the referencing table.
- all data in the table is deleted, and
- the definition of the table itself is deleted from the DB.
ALTER TABLE command is used to:
(1) Add a new column in a table
(2) Delete a column from a table
(3) Add/Delete a constraint specified on a table
Suppose we want to add a column, fines, to store the outstanding total fine that a person needs to pay.
Example 1:
ALTER TABLE Person ADD fines FLOAT;
In this case, the attribute ‘fines’ will be defined as ‘NULL’ by default, since there may already be some
rows of data in the table ‘Person’, which have no value set for ‘fines’. If you must add a ‘NOT NULL’
attribute, then you must also provide a default value for it. In our example, suppose that the library will
categorize all books so as to control the allowed period of borrowing. This attribute can be added as
follows:
Example 2:
ALTER TABLE Book ADD category VARCHAR(10) NOT NULL DEFAULT 'normal' CHECK
(category IN ('normal', 'reserve', 'media'));
In general, you should try to design a DB such that there is no need for ALTER commands. What
happens to data integrity after an ALTER TABLE command is issued?
Example 3a.
ALTER TABLE Borrows DROP CONSTRAINT fk_borrows_person;
Suppose we now enter a record in ‘Borrows’ with a person-id that is not in the Person table. This will be
allowed, since the corresponding foreign key constraint was dropped. Suppose we now realize our
mistake, and add back the foreign key constraint:
Example 3b:
ALTER TABLE Borrows ADD CONSTRAINT fk_borrows_person FOREIGN KEY(person_id)
REFERENCES Person( id);
What happens to the Person whose record was entered between the two ALTER TABLE commands ?
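How a DBMS reacts when the constraint is added back varies; many systems refuse to add a foreign key while violating rows exist. Either way, it helps to look for such "orphan" rows first. The following is a minimal sketch using Python's sqlite3 module as a stand-in DBMS (an assumption of this sketch), with simplified two-column tables and a made-up orphan id '99999999'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified versions of the two tables, with no foreign key declared,
-- mimicking the state between the two ALTER TABLE commands.
CREATE TABLE Person  (id CHAR(8) PRIMARY KEY, lname VARCHAR(35) NOT NULL);
CREATE TABLE Borrows (catalog_num VARCHAR(15) NOT NULL, person_id CHAR(8) NOT NULL);

INSERT INTO Person  VALUES ('09112001', 'Bush');
INSERT INTO Borrows VALUES ('QA76.9.D3', '09112001');
INSERT INTO Borrows VALUES ('QA76.9.D3', '99999999');  -- made-up id, not in Person
""")

# Rows of Borrows whose person_id has no matching row in Person.
orphans = conn.execute("""
    SELECT person_id
    FROM Borrows
    WHERE person_id NOT IN (SELECT id FROM Person)
""").fetchall()
print(orphans)
```

The query reports the one borrowing record that would violate fk_borrows_person; such rows need to be corrected or deleted before the constraint can hold again.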
INSERT INTO command is used to:
(1) Add one or more rows of data into a table
Example 1:
INSERT INTO Person VALUES ( 'Bush', 'George W.', '[email protected]', '09112001', NULL, 0);
Notice that
- Since the ‘fines’ attribute was added after the table was created, it is the last attribute in the table.
- Since NULL values are allowed for the attribute ‘phone’, we can enter NULL. Also, CHAR and
VARCHAR data types must be put in single-quotes. Numeric types (INT, FLOAT) are not quoted. In
mySQL, DATE type is usually input in the format ‘YYYY-MM-DD’ and is also displayed as such.
In many DBMS’s, you will be allowed to set your format for DATE and DATETIME types, after which
you can enter such data in your specified format.
Example 2:
INSERT INTO Book VALUES ( '0321122267', 'Fundamentals of Database Systems', 'QA76.9.D3', 1,
'Databases', '2004-09-25');
Notes:
- The arguments to VALUES are placed in exactly the same sequence as they occur in the CREATE
TABLE command.
- The data type of each argument must match the data type of the attribute
- A null entry can be entered as NULL (no quotes)
- Records are inserted one at a time, so it is quite tedious to enter each row of data into a table by
manually typing the INSERT INTO command -- it is much easier to create a data entry form, and use a
program to make and run the INSERT command (you will learn this in your labs).
In most DBMS’s, you will be allowed to directly import data from a text file into a table. However, there are
several restrictions regarding how the data in such input files must be formatted (e.g. each entry must be
separated by exactly one TAB, each row ends with a Newline, NULL entries are entered using special
symbols such as \N, and so on).
Most DBMS’s will also allow you to directly import multiple rows or entire tables and their definitions
from other RDBMS’s.
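As a rough illustration of using a program to build and run the INSERT commands, the sketch below loads several rows with one parameterised statement. It assumes Python's sqlite3 module as the DBMS, which is not the setup used in the labs. Only the first row comes from Example 2; the other two are made-up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Book (
    isbn          VARCHAR(15)  NOT NULL,
    title         VARCHAR(200) NOT NULL,
    catalog_no    VARCHAR(15)  NOT NULL,
    copy_no       INT          NOT NULL DEFAULT 1,
    keywords      CHAR(100),
    purchase_date DATE,
    PRIMARY KEY (catalog_no, copy_no)
)""")

# Values are listed in the same order as the columns of CREATE TABLE.
# Python's None is stored as SQL NULL.
rows = [
    ("0321122267", "Fundamentals of Database Systems", "QA76.9.D3", 1, "Databases", "2004-09-25"),
    ("0321122267", "Fundamentals of Database Systems", "QA76.9.D3", 2, "Databases", None),  # made-up second copy
    ("0000000000", "Hypothetical Title", "ZZ00.0", 1, None, None),                          # made-up book
]

# One parameterised INSERT, executed once per row; the '?' placeholders
# let the driver handle quoting instead of hand-typed string literals.
conn.executemany("INSERT INTO Book VALUES (?, ?, ?, ?, ?, ?)", rows)
conn.commit()

book_count = conn.execute("SELECT COUNT(*) FROM Book").fetchone()[0]
null_dates = conn.execute("SELECT COUNT(*) FROM Book WHERE purchase_date IS NULL").fetchone()[0]
print(book_count, null_dates)
```

Placeholders also avoid the quoting rules described in the notes above, since the program never assembles the literal values into the SQL text itself.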
DELETE FROM command is used to:
(1) Delete one or more rows of data from a table
Example 1:
DELETE FROM Person
WHERE id = '09112001';
The above command will delete the entire row corresponding to the person with id=’09112001’, namely
the record for the person called George W. Bush from our DB. DELETE can delete more than one row
from a table:
Example 2:
DELETE FROM Person WHERE lname='Bush';
This will delete all records for people with last name “Bush”.
Example 3:
DELETE FROM Borrows WHERE 1;
The above SQL command will delete every row of the table ‘Borrows’.
Example 4a:
DELETE FROM Borrows WHERE person_id IN ('09112001', '55554444', '12345678');
This will delete all records of borrowed books that were taken by persons whose id is in the provided list.
The list can be written explicitly (as in example 4a), or it can be generated by a different DB query, as in
example 4b below. The “SELECT …” part in the query below is itself an SQL query, which returns a list
of id’s from the table ‘Person’, for those persons who have last name “Bush”.
Example 4b:
DELETE FROM Borrows WHERE person_id IN (SELECT id FROM Person WHERE lname = 'Bush');
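The effect of Example 4b can be checked on a small data set. The sketch below assumes Python's sqlite3 module as the DBMS and uses simplified two-column tables; the catalogue numbers and the last names attached to the ids are made-up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Person  (id CHAR(8) PRIMARY KEY, lname VARCHAR(35) NOT NULL);
CREATE TABLE Borrows (catalog_num VARCHAR(15) NOT NULL, person_id CHAR(8) NOT NULL);

-- Made-up data: two persons named Bush, one named Lee.
INSERT INTO Person  VALUES ('09112001', 'Bush'), ('55554444', 'Bush'), ('12345678', 'Lee');
INSERT INTO Borrows VALUES ('A1', '09112001'), ('A2', '55554444'),
                           ('A3', '12345678'), ('A4', '12345678');
""")

# The inner SELECT produces the list of ids; the outer DELETE removes
# every Borrows row whose person_id appears in that list.
cur = conn.execute("""
    DELETE FROM Borrows
    WHERE person_id IN (SELECT id FROM Person WHERE lname = 'Bush')
""")
deleted = cur.rowcount

remaining = conn.execute(
    "SELECT catalog_num, person_id FROM Borrows ORDER BY catalog_num"
).fetchall()
print(deleted, remaining)
```

Two rows are deleted, and only the borrowings of the person who is not named Bush remain. The Person table itself is untouched.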
UPDATE command is used to:
(1) Modify the value of one or more cells in a table.
UPDATE modifies some data in rows that are already in the table. It cannot create new rows (you must
use the INSERT command to do so).
Example 1:
UPDATE Borrows SET issue_date=CURRENT_DATE( ) WHERE person_id='09112001';
There are two things to note in the above example:
- The function CURRENT_DATE( ) returns the current date according to the time/date set on the DB
server. The form with parentheses is MySQL’s; standard SQL writes CURRENT_DATE without them, and most DBMS’s have a similar function.
- The effect of the above operation is to change the issue_date of every book that has been borrowed by
the person with id = 09112001.
In the following example, we shall see that you can put arithmetic expressions inside SQL queries.
Example 2:
UPDATE Person SET fines = fines*2.0 WHERE id='09112001';
The above query will have the effect of doubling the fines for the person with id = 09112001. In practice,
we can use fairly complex conditions in the ‘WHERE’ clause to isolate particular tuples in which we
want to make changes. We will see many examples of this in our study of the SELECT command.
SELECT command is used to:
(1) Output required information from one or more tables.
The SQL SELECT command is equivalent to the combination of all the relational algebra (RA) operations, plus some
extra functionality. We shall use a ‘learn-by-example’ approach, using the Employee-Department-Projects
database from the Elmasri-Navathe textbook for all examples.
Example 1: Report the birth date and address of employee named "John Smith"
SELECT BDate, Address
FROM EMPLOYEE
WHERE Fname = 'John' AND
Lname = 'Smith';
OUTPUT BDate Address
9-Jan-55 731 Fonden
Notes:
- The SELECT command has at least two clauses: SELECT [list of attributes] FROM [tables]. If the
FROM clause has more than one table, then all of the named tables will be JOIN-ed. The attributes listed
after SELECT must belong to one of the tables named in the FROM clause.
- In the above example, the WHERE [expression] is also used. This expression is evaluated for each row
of the named table (EMPLOYEE in our example). If it is TRUE, then the named attributes of that row are
output.
- The output of a SELECT command may contain repeated identical rows (this is different from RA). If
you want SQL to eliminate any repeated rows, you can use the DISTINCT keyword:
Example 1a: Report the SSN of Employees who spend more than 15 hours on some project.
The difference in the outputs with or without the use of DISTINCT is shown below. Using DISTINCT is
preferred if we only want to know how many people work over 15 Hrs on some project; if we want to
know how many assignments of over 15 Hours per week are there, then the second query is useful.
SELECT DISTINCT ESSN
FROM WORKS_ON
WHERE Hours > 15;
OUTPUT ESSN
123456789
666884444
453453453
999887777
987987987
SELECT ESSN
FROM WORKS_ON
WHERE Hours > 15;
OUTPUT ESSN
123456789
666884444
453453453
453453453
999887777
987987987
987987987
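The two queries can be compared side by side on a few rows. The sketch below assumes Python's sqlite3 module as the DBMS; the five WORKS_ON rows are a small illustrative subset, and the project numbers in them are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE WORKS_ON (ESSN CHAR(9), PNo INT, Hours REAL)")
conn.executemany(
    "INSERT INTO WORKS_ON VALUES (?, ?, ?)",
    [
        ("123456789", 1, 32.5),
        ("123456789", 2, 7.5),   # not over 15 hours, filtered out by WHERE
        ("453453453", 1, 20.0),
        ("453453453", 2, 20.0),  # same employee qualifies a second time
        ("666884444", 3, 40.0),
    ],
)

# One row per employee who has at least one assignment over 15 hours.
with_distinct = conn.execute(
    "SELECT DISTINCT ESSN FROM WORKS_ON WHERE Hours > 15 ORDER BY ESSN"
).fetchall()

# One row per assignment over 15 hours, so employees may repeat.
without_distinct = conn.execute(
    "SELECT ESSN FROM WORKS_ON WHERE Hours > 15 ORDER BY ESSN"
).fetchall()

print(len(with_distinct), len(without_distinct))
```

With DISTINCT the result has three rows (three employees). Without it there are four rows (four qualifying assignments), because employee 453453453 appears twice.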
Example 2: Report the Name and address of employees working in the “Research” department.
SELECT Fname, Lname, Address
FROM EMPLOYEE, DEPARTMENT
WHERE Dname = 'Research' AND
Dnumber = Dno
- The information is spread across two tables, EMPLOYEE and DEPARTMENT, so we need a JOIN
operation, which is done by listing both tables in the FROM clause.
- The join-condition is: (Dnumber = Dno);
- The selection condition is (Dname = ‘Research’)
OUTPUT
Fname Lname Address
John Smith 731 Fonden
Franklin Wong 638 Voss
Ramesh Narayan 975 Fire Oak
Joyce English 5631 Rice
Example 3: For each project located in Stafford, list the project number, the controlling department, and
the department manager's last name and address.
SELECT Pnumber, Dnum, Lname, Address
FROM PROJECT, DEPARTMENT, EMPLOYEE
WHERE Dnum = Dnumber AND
MgrSSN = SSN AND
Plocation = 'Stafford';
OUTPUT
Pnumber Dnum Lname Address
10 4 Wallace 291 Berry
30 4 Wallace 291 Berry
- In this case, there are two join operations:
1. The join condition (Dnum = Dnumber) relates a project to its controlling department.
2. The join condition (MgrSSN = SSN) relates the controlling department to the employee who
manages that department.
The Dot-notation:
Sometimes, attributes in different tables can have the same name (e.g., Dnumber in DEPARTMENT and
DEPT_LOCATIONS). If two such tables need to be joined, then we must specify which attribute we are
really referring to. This is done by using the DOT-Notation for naming of attributes:
DEPARTMENT.Dnumber, DEPT_LOCATIONS.Dnumber, EMPLOYEE.Lname, etc.
ALIAS
Some queries need to refer to the same table twice. In this case, we can assign ALIAS names to the tables:
Example 4: For each employee, give the last name, and the last name of his/her supervisor.
SELECT E.Lname, S.Lname
FROM EMPLOYEE AS E, EMPLOYEE AS S
WHERE E.SuperSSN = S.SSN
OUTPUT
E.Lname S.Lname
Smith Wong
Wong Borg
Zeleya Wallace
Wallace Borg
Narayan Wong
English Wong
Jabbar Wallace
- An alias is defined using the keyword “AS”. It is a method of assigning an alternate name to an object,
e.g. a table or an attribute. E and S (both identical to the EMPLOYEE table), are joined using the
condition (E.SuperSSN = S.SSN).
Example 5: Print SSN of all employees.
SELECT SSN
FROM EMPLOYEE
OUTPUT SSN
123456789
333445555
999887777
987654321
666884444
453453453
987987987
888665555
- If the WHERE clause is missing, then SQL assumes that WHERE is TRUE for all rows; the above
query is the same as: SELECT SSN FROM EMPLOYEE WHERE 1;
Example 6: A common error in writing queries:
SELECT SSN, Dname
FROM EMPLOYEE, DEPARTMENT
- How many rows will the output contain? If you are JOIN-ing two or more tables, don’t forget to
specify the JOIN-condition!
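The question above can be answered by counting rows. The sketch below assumes Python's sqlite3 module as the DBMS and reduces the two tables to three columns each. The pairing of SSNs, last names and department numbers is an assumption inferred from the order of the outputs shown in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DEPARTMENT (Dnumber INT PRIMARY KEY, Dname VARCHAR(30));
CREATE TABLE EMPLOYEE   (SSN CHAR(9) PRIMARY KEY, Lname VARCHAR(30), Dno INT);

INSERT INTO DEPARTMENT VALUES (1, 'Headquarters'), (4, 'Administration'), (5, 'Research');
INSERT INTO EMPLOYEE VALUES
    ('123456789', 'Smith',   5), ('333445555', 'Wong',    5),
    ('999887777', 'Zeleya',  4), ('987654321', 'Wallace', 4),
    ('666884444', 'Narayan', 5), ('453453453', 'English', 5),
    ('987987987', 'Jabbar',  4), ('888665555', 'Borg',    1);
""")

# No join condition: every employee is paired with every department.
no_condition = conn.execute(
    "SELECT SSN, Dname FROM EMPLOYEE, DEPARTMENT"
).fetchall()

# With the join condition: each employee is paired only with his/her own department.
with_condition = conn.execute(
    "SELECT SSN, Dname FROM EMPLOYEE, DEPARTMENT WHERE Dno = Dnumber"
).fetchall()

print(len(no_condition), len(with_condition))
```

Without the condition the output has 8 x 3 = 24 rows (the Cartesian product); with it, there are 8 rows, one per employee.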
Example 7: You can use ‘*’ to denote all attributes
SELECT *
FROM EMPLOYEE
- This command will output the entire EMPLOYEE Table.
- The ‘*’ can also be used in the dot-notation:
Example 7a:
SELECT DEPT_LOCATION.*, DEPARTMENT.Dname
FROM DEPT_LOCATION, DEPARTMENT
WHERE DEPT_LOCATION.Dnumber = DEPARTMENT.Dnumber
OUTPUT
Dnumber Dlocation Dname
1 Houston Headquarters
4 Stafford Administration
5 Bellaire Research
5 Sugarland Research
5 Houston Research
Example 8: List all projects which either use an employee called "Wong", or are controlled by a
department managed by somebody called "Wong".
(SELECT Pname
FROM PROJECT, WORKS_ON, EMPLOYEE
WHERE Pnumber = PNo AND ESSN = SSN AND LName = 'Wong' )
UNION
(SELECT PName
FROM PROJECT, DEPARTMENT, EMPLOYEE
WHERE DNum = Dnumber AND SSN = MgrSSN AND LName = 'Wong');
Result of first sub-query (PName):
ProductY
ProductZ
Computerization
Reorganisation
Result of second sub-query (PName):
ProductX
ProductY
ProductZ
OUTPUT PName
ProductX
ProductY
ProductZ
Computerization
Reorganisation
- The output is the set union of the results of two separate sub-queries;
- In some systems, other set theoretic operators are available (set-difference is called EXCEPT, and
intersection is called INTERSECT); however, it is better not to assume that these are available
- For UNION to succeed, the results of the two sub-queries must have the same attributes, defined
identically.
Nested Queries
One of the most powerful features of SQL is that it allows arbitrary nesting of the queries within other
queries. This is good because it allows us to logically break down a complex query into simpler ones, and
then combine the queries to produce the final result. There are two types of nested queries: un-correlated
and correlated.
Un-correlated nested queries are those in which the nested query (the inner query) can be solved by
itself.
If the inner query also makes a reference to some attribute(s) of the outer query, then the query is called a
correlated nested query.
Example 9: Report the name and address of all employees working in the 'Research' department.
SELECT Fname, Lname, Address
FROM EMPLOYEE
WHERE Dno IN ( SELECT Dnumber
FROM DEPARTMENT
WHERE Dname = 'Research' )
The inner query, (SELECT Dnumber …) outputs a table with one column:
Result of inner query:
Dnumber
5
The outer query then tests the condition (WHERE Dno IN …): here, the result of the inner query is
treated as a set, and the IN-clause tests for set-membership. The type of the attribute compared using ‘IN’
(in our example, Dno) must match the type of the attribute output in the inner query (Dnumber). The
result of the query is:
OUTPUT
Fname Lname Address
John Smith 731 Fonden
Franklin Wong 638 Voss
Ramesh Narayan 975 Fire Oak
Joyce English 5631 Rice
Example 10: (correlated nested query) Get the names of all employees who have a dependent with the
same first name.
SELECT E.Fname, E.Lname
FROM EMPLOYEE AS E
WHERE E.SSN IN ( SELECT ESSN
FROM DEPENDENT
WHERE ESSN = E.SSN
AND
E.Fname = DependentName )
OUTPUT E.Fname E.Lname
- The output is empty (in this case), but it is useful to understand how nested queries work:
For each tuple of the outer query,
Evaluate the inner query;
Check if the ‘WHERE…’ is TRUE;
If TRUE, output the result for this tuple.
Go to next tuple.
- Some general guidelines when writing nested queries:
1. Do not write nested queries of more than 3-levels.
2. By using aliases, it is often possible to write a single-level query equivalent to a multi-
level query:
Example 10a:
SELECT E.Fname, E.Lname
FROM EMPLOYEE AS E, DEPENDENT AS D
WHERE E.SSN = D.ESSN AND E.Fname = D.DependentName
3. Always use explicit references in nested queries: tableName.attributeName
Why? Because the same table may be referred to in different levels of the nested query. A
reference to an unqualified attribute refers to the relation used in the innermost nested query
which uses that relation.
- Up to now, we have used nested queries in which the WHERE clause links to the inner query using the
set-element check, ‘IN’. This is only useful if the inner query has a single attribute in its “SELECT …”
list. However, often we need to identify tuples of inner queries by multiple attributes. In such cases, we
must use the EXISTS operator.
Example 11: Get names of employees who work for at least one project.
SELECT Fname, Lname
FROM EMPLOYEE
WHERE EXISTS ( SELECT *
FROM WORKS_ON
WHERE SSN = ESSN )
OUTPUT
Fname Lname
John Smith
Franklin Wong
Alicia Zeleya
Ramesh Narayan
Joyce English
Ahmad Jabbar
James Borg
Let’s see how the EXISTS operator works.
For each tuple of the outer sub-query {
Evaluate inner sub-query as OutputI;
if OutputI has at least one tuple,
EXISTS is TRUE, so report the attributes listed in the outer sub-query from this tuple; }
You can use logical inverse of the IN and EXISTS operators, by using “NOT IN” or “NOT EXISTS”.
The evaluation of the NOT EXISTS operator is as follows:
For each tuple of the outer sub-query {
Evaluate the inner query as OutputI;
If OutputI has no tuples,
NOT EXISTS is TRUE, so report the attributes listed in the outer sub-query from this tuple; }
NOTE: EXISTS is only useful in correlated nested queries (WHY?)
Example 12: Find names of employees who do not work for even one project.
SELECT Fname, Lname
FROM EMPLOYEE
WHERE NOT EXISTS ( SELECT *
FROM WORKS_ON
WHERE SSN = ESSN )
OUTPUT
Fname Lname
Jennifer Wallace
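The evaluation procedure for EXISTS and NOT EXISTS, and the question of why EXISTS is mainly useful in correlated queries, can be explored on a three-employee subset. The sketch assumes Python's sqlite3 module as the DBMS; the SSNs and project numbers are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (SSN CHAR(9) PRIMARY KEY, Fname VARCHAR(30), Lname VARCHAR(30));
CREATE TABLE WORKS_ON (ESSN CHAR(9), PNo INT, Hours REAL);

INSERT INTO EMPLOYEE VALUES
    ('123456789', 'John',     'Smith'),
    ('987654321', 'Jennifer', 'Wallace'),
    ('888665555', 'James',    'Borg');
-- Smith has two assignments, Borg one, Wallace none.
INSERT INTO WORKS_ON VALUES
    ('123456789', 1, 32.5), ('123456789', 2, 7.5), ('888665555', 20, NULL);
""")

# Correlated: the inner query is re-evaluated for each EMPLOYEE row via SSN = ESSN.
exists_rows = conn.execute("""
    SELECT Fname, Lname FROM EMPLOYEE
    WHERE EXISTS (SELECT * FROM WORKS_ON WHERE SSN = ESSN)
    ORDER BY Lname
""").fetchall()

not_exists_rows = conn.execute("""
    SELECT Fname, Lname FROM EMPLOYEE
    WHERE NOT EXISTS (SELECT * FROM WORKS_ON WHERE SSN = ESSN)
""").fetchall()

# Un-correlated: the inner query gives the same answer for every outer row,
# so EXISTS is either TRUE for all employees or FALSE for all of them.
uncorrelated_rows = conn.execute("""
    SELECT Fname, Lname FROM EMPLOYEE
    WHERE EXISTS (SELECT * FROM WORKS_ON)
""").fetchall()

print(exists_rows, not_exists_rows, len(uncorrelated_rows))
```

Smith appears once in the EXISTS result even though he has two assignments, because EXISTS only asks whether the inner result is non-empty. The un-correlated version returns every employee, which is why it says nothing about individual rows.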
String comparison using ‘wildcards’. For all CHAR(n), VARCHAR(n), and even DATE, DATETIME
type of entries, we can perform substring matching. This is very useful when we do not know the exact
string that is input. For substring matches, there are two wildcards: ‘%’ matches zero or more contiguous
characters; ‘_’ matches exactly one character. The only operator that allows the use of wildcards is the
LIKE operator.
Example 13: Find names of all Employees who live on Fonden street.
SELECT Lname
FROM EMPLOYEE
WHERE Address LIKE '%Fonden%';
OUTPUT Lname
Smith
- This is useful if you don’t recall whether your data entry was “731 Fonden St” or “731 Fonden”, or if you
even forgot the house number.
Example 13a: Find names of all projects whose name is ‘Product’ followed by exactly one character.
SELECT PName
FROM PROJECT
WHERE PName LIKE 'Product_';
OUTPUT PName
ProductX
ProductY
ProductZ
- Depending on which version of SQL your DBMS uses, LIKE may be case-sensitive or case-insensitive.
- Advanced users: A very powerful pattern matching function for strings is called REGEXP or RLIKE. It
acts just like a regular expression in PHP or PERL. For example, if you want to match any LName
starting with ‘J’ or ‘j’, you can use:
SELECT * FROM EMPLOYEE WHERE LName RLIKE '^[Jj]';
To select records of all LNames ending with ‘ja’, you can use:
SELECT * FROM EMPLOYEE WHERE LName RLIKE 'ja$';
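The difference between the two LIKE wildcards shows up as soon as a name has more than one character after the prefix. The sketch below assumes Python's sqlite3 module as the DBMS; 'ProductXL' is a made-up project name added for contrast, and `names` is a hypothetical helper. REGEXP/RLIKE is not shown because SQLite does not provide it by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PROJECT (PName VARCHAR(30))")
conn.executemany(
    "INSERT INTO PROJECT VALUES (?)",
    [("ProductX",), ("ProductY",), ("ProductZ",),
     ("ProductXL",),            # made-up name: two characters after 'Product'
     ("Computerization",)],
)


def names(pattern):
    """Return the project names matching a LIKE pattern, sorted by name."""
    rows = conn.execute(
        "SELECT PName FROM PROJECT WHERE PName LIKE ? ORDER BY PName", (pattern,)
    )
    return [row[0] for row in rows]


one_char = names("Product_")     # '_' : exactly one character after 'Product'
any_suffix = names("Product%")   # '%' : zero or more characters after 'Product'
ends_with = names("%ization")    # '%' at the front: anything ending in 'ization'
lower_case = names("product_")   # SQLite's LIKE ignores ASCII case by default

print(one_char, any_suffix, ends_with, lower_case)
```

'Product_' skips 'ProductXL' while 'Product%' includes it. The last query illustrates the note on case sensitivity: in SQLite the lower-case pattern matches the same rows, whereas other DBMS's may behave differently.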
Post-processing outputs: Aggregate Functions and Grouping. Often, you would like to generate an
output, and then perform some post-processing operations to get a meaningful result. This is a very useful
mechanism and you will find that you use it quite often. The usual mechanism is as follows:
1. Use the output of a SQL query;
2. Divide the output into groups, with each group having some common (specified) characteristic;
3. Compute some statistical value for each group
4. Report the output group-by-group
Common aggregation functions include:
Sum: find the total value of some numerical valued attribute over several tuples;
Max: find the maximum value of a given attribute over several tuples;
Min: find the minimum value of a given attribute over several tuples;
Avg: find the average value of some numerical valued attribute over several tuples;
Count: count the number of tuples.
Example 14: Get the minimum, maximum, average and total salaries for employees of the Research
department.
SELECT sum(Salary), max( Salary), min( Salary), avg( Salary)
FROM EMPLOYEE, DEPARTMENT
WHERE Dno = Dnumber AND Dname = 'Research'
OUTPUT
13300 4000 2500 3325
By default, SQL will not assign names for aggregated attributes. However, it is good practice to assign an
alias name to each:
Example 14a: Get the minimum, maximum, average and total salaries for employees of the Research
department.
SELECT sum(Salary) AS Tot, max( Salary) AS Max, min( Salary) AS Min, avg( Salary) AS Mean
FROM EMPLOYEE, DEPARTMENT
WHERE Dno = Dnumber AND Dname = 'Research'
OUTPUT
Tot Max Min Mean
13300 4000 2500 3325
- In the above example, we did not explicitly form a ‘group’, so all rows in the output of the query were
put into one group. It is more common to have several groups.
Example 15: For departments other than Headquarters, get the department number, the number of
employees in that department, and their average salary.
SELECT Dno, count(*) AS HeadCount, avg(Salary) AS MeanSalary
FROM EMPLOYEE, DEPARTMENT
WHERE Dno = Dnumber AND Dname <> 'Headquarters'
GROUP BY Dno;
OUTPUT
Dno HeadCount MeanSalary
5 4 3325
4 3 3100
1. The SELECT...FROM...WHERE query is first evaluated
2. In the resulting table, every tuple which has the same Dno value is ‘grouped’
3. The count(*) function prints out the number of rows in the group
4. The avg(Salary) function computes the mean of the ‘Salary’ attribute of the rows of each group
It is possible also to conditionally allow/exclude some groups from the results:
Example 16: For ‘Large’ departments other than Headquarters, get the department number, the number of
employees in that department, and their average salary.
SELECT Dno, count(*) AS HeadCount, avg(Salary) AS MeanSalary
FROM EMPLOYEE, DEPARTMENT
WHERE Dno = Dnumber AND Dname <> 'Headquarters'
GROUP BY Dno
HAVING HeadCount > 3;
OUTPUT
Dno HeadCount MeanSalary
5 4 3325
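Grouping and HAVING can be reproduced on the salary figures that appear in these notes. The sketch assumes Python's sqlite3 module as the DBMS; the assignment of employees to departments is an assumption inferred from the outputs of the earlier examples. HAVING is written with COUNT(*) rather than the alias HeadCount, since not every DBMS accepts an alias there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DEPARTMENT (Dnumber INT PRIMARY KEY, Dname VARCHAR(30));
CREATE TABLE EMPLOYEE   (Lname VARCHAR(30), Salary INT, Dno INT);

INSERT INTO DEPARTMENT VALUES (1, 'Headquarters'), (4, 'Administration'), (5, 'Research');
INSERT INTO EMPLOYEE VALUES
    ('Smith', 3000, 5), ('Wong', 4000, 5), ('Narayan', 3800, 5), ('English', 2500, 5),
    ('Wallace', 4300, 4), ('Zeleya', 2500, 4), ('Jabbar', 2500, 4),
    ('Borg', 5500, 1);
""")

# WHERE filters rows first, GROUP BY then forms one group per Dno,
# and the aggregates are computed once per group.
grouped = conn.execute("""
    SELECT Dno, COUNT(*) AS HeadCount, AVG(Salary) AS MeanSalary
    FROM EMPLOYEE, DEPARTMENT
    WHERE Dno = Dnumber AND Dname <> 'Headquarters'
    GROUP BY Dno
    ORDER BY Dno
""").fetchall()

# HAVING filters whole groups after the aggregates are known.
large = conn.execute("""
    SELECT Dno, COUNT(*) AS HeadCount, AVG(Salary) AS MeanSalary
    FROM EMPLOYEE, DEPARTMENT
    WHERE Dno = Dnumber AND Dname <> 'Headquarters'
    GROUP BY Dno
    HAVING COUNT(*) > 3
""").fetchall()

print(grouped, large)
```

WHERE removes the Headquarters row before grouping, while HAVING removes department 4 only after its head count of 3 has been computed.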
Example 17: Mathematical operators. Display the result of a 10% increase in Salary of employees
whose Last name starts with "B".
SELECT Lname, 1.1 * Salary AS IncreasedSalary
FROM EMPLOYEE
WHERE Lname LIKE 'B%'
OUTPUT
Lname IncreasedSalary
Borg 6050
Note that this does not change Borg’s salary -- it only displays what the increased value will be!
Sorted display of the output: Output can be sorted using the ORDER BY clause.
The default for the ordering is Ascending order. If you want to order in descending order, just use:
ORDER BY … DESC.
Example 18:
SELECT Lname, Salary
FROM EMPLOYEE
ORDER BY Salary DESC
OUTPUT
Lname Salary
Borg 5500
Wallace 4300
Wong 4000
Narayan 3800
Smith 3000
Zeleya 2500
English 2500
Jabbar 2500
SELECT is a powerful command. In addition, different DBMS’s will provide many extra functions to
allow the output to be formatted and grouped as you desire. It is important to read the user-guide and
tutorials for the DBMS you will use to learn these additional examples and functions.
VIEWS and Security Control
A view is a single, virtual table derived from a set of existing tables. It may be defined using any
combination of one or more existing tables or views. Views have two important uses:
(a) They can be used to show data that is conceptually related in one table, even though the Normalization
process has required us to store the data physically in separate tables.
(b) They can be used to hide some part of information from (some subset of) users, making it easier to
control data security.
Example 1: Create a view showing the names of employees, which project they work on, and how many
hours they spend on each project.
CREATE VIEW EMP_WORKS_ON
AS SELECT Fname, Lname, Pname, Hours
FROM EMPLOYEE, PROJECT, WORKS_ON
WHERE SSN = ESSN AND Pno = Pnumber;
Example 1a: Show the data in EMP_WORKS_ON.
SELECT * FROM EMP_WORKS_ON
EMP_WORKS_ON
Fname Lname Pname Hours
John Smith ProductX 32.5
John Smith ProductY 7.5
Ramesh Narayan ProductZ 40
Joyce English ProductX 20
Joyce English ProductY 20
Franklin Wong ProductY 10
Franklin Wong ProductZ 10
Franklin Wong Computerization 10
Franklin Wong Reorganization 10
Alicia Zeleya Newbenefits 30
Alicia Zeleya Computerization 10
Ahmad Jabbar Computerization 35
Ahmad Jabbar Newbenefits 5
Ahmad Jabbar Newbenefits 20
Ahmad Jabbar Reorganization 15
James Borg Reorganization null
Caution in using views
Views appear similar to any other table in a DB, yet it is important to understand the differences between
tables and views:
1. Each table has some information that exists in the computer memory (on the disk). A VIEW does
not correspond to any stored information on the disk. Data for a view is only generated when the query
is processed.
2. When you update a view attribute, the data in the underlying table is updated.
Example 2: What happens to employee hours if they work one-shift overtime?
UPDATE EMP_WORKS_ON
SET Hours = Hours * 1.5
What is the outcome ?
Since ‘Hours’ in EMP_WORKS_ON is actually derived from ‘Hours’ in WORKS_ON, therefore all the
data for ‘Hours’ in WORKS_ON will be modified.
This behaviour can cause some unexpected results. Suppose that John Smith, who is currently working on
‘ProductX’ project, is reassigned to ‘ProductY’ project. We may be tempted to write:
Example 3 (incorrect):
UPDATE EMP_WORKS_ON
SET Pname = 'ProductY'
WHERE Lname = 'Smith' AND
Pname = 'ProductX'
What happens now? Since the base table for ‘PName’ is PROJECT, the data
‘ProductX’ in that table is changed to ‘ProductY’ -- this is obviously incorrect (why ?).
The correct query should be something like the following.
Example 3:
UPDATE WORKS_ON
SET Pno = ( SELECT Pnumber FROM PROJECT WHERE Pname = 'ProductY')
WHERE ESSN = ( SELECT SSN FROM EMPLOYEE WHERE Lname = 'Smith')
AND
Pno = ( SELECT Pnumber FROM PROJECT WHERE Pname = 'ProductX');
3. A view can contain computed attributes, which are usually not explicitly stored in normal tables.
Example 3:
CREATE VIEW DEPT_INFO
AS SELECT DName, count(*) AS NumEmps, sum( Salary) AS TotalSalary
FROM DEPARTMENT, EMPLOYEE
WHERE DNumber = DNo
GROUP BY DName;
- Note that you cannot UPDATE a computed attribute.
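Point 1 above, that a view stores no data of its own, can be observed by changing the base table and re-reading the view. The sketch assumes Python's sqlite3 module as the DBMS and a one-employee subset of the data; the SSN and project numbers are assumptions. SQLite treats views as read-only, so it rejects the UPDATE through the view from Example 2; other DBMS's may allow it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (SSN CHAR(9) PRIMARY KEY, Fname VARCHAR(30), Lname VARCHAR(30));
CREATE TABLE PROJECT  (Pnumber INT PRIMARY KEY, Pname VARCHAR(30));
CREATE TABLE WORKS_ON (ESSN CHAR(9), Pno INT, Hours REAL);

INSERT INTO EMPLOYEE VALUES ('123456789', 'John', 'Smith');
INSERT INTO PROJECT  VALUES (1, 'ProductX'), (2, 'ProductY');
INSERT INTO WORKS_ON VALUES ('123456789', 1, 32.5), ('123456789', 2, 7.5);

CREATE VIEW EMP_WORKS_ON AS
    SELECT Fname, Lname, Pname, Hours
    FROM EMPLOYEE, PROJECT, WORKS_ON
    WHERE SSN = ESSN AND Pno = Pnumber;
""")

query = "SELECT Pname, Hours FROM EMP_WORKS_ON ORDER BY Pname"
before = conn.execute(query).fetchall()

# Change the base table only; the view definition is left untouched.
conn.execute("UPDATE WORKS_ON SET Hours = Hours * 1.5")
after = conn.execute(query).fetchall()

# In SQLite an UPDATE aimed at the view itself is refused.
try:
    conn.execute("UPDATE EMP_WORKS_ON SET Hours = Hours * 1.5")
    view_update_allowed = True
except sqlite3.Error as exc:
    print("view update rejected:", exc)
    view_update_allowed = False

print(before, after)
```

The second read of the view already shows the new hours, because the view's rows are generated from WORKS_ON each time the query runs.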
A big advantage of using a DBMS is that you can exercise an arbitrarily fine level of control over who can do
what operations to which data. This control is specified via the GRANT and REVOKE commands.
The DB Administrator (DBA) creates all user accounts (including user name and passwords), and has the
right to control all the privileges. In SQL, these privileges cover the commands SELECT (i.e. authority to
see the data), UPDATE, INSERT and DELETE (individual DBMS’s may provide further types of
security control, including encryption etc).
Example 4: Allow user U1 to see/modify all Employee data except Salaries.
CREATE VIEW EMP_PERSONNEL AS
SELECT Fname, Minit, Lname, SSN, BDate, Address, Sex, SuperSSN, Dno
FROM EMPLOYEE;
GRANT SELECT, UPDATE ON EMP_PERSONNEL TO U1;
If we expect that the user U1 will need to assign data-lookup duties to other users, then we could use:
GRANT SELECT, UPDATE ON EMP_PERSONNEL TO U1 WITH GRANT OPTION;
Now, U1 can log onto the DB, and allow other users, e.g. U2, to see (but not modify) the data:
GRANT SELECT ON EMP_PERSONNEL TO U2;
If later U2 is to be denied access, the privilege can be revoked:
REVOKE SELECT ON EMP_PERSONNEL FROM U2;
NOTE: In general, it is good practice to use VIEW’s to GRANT access for SELECT, but it is better to
use actual tables for GRANT on INSERT, DELETE and UPDATE commands. The reason is that
modifying data from a view will actually change the data in the underlying table, which can cause
unexpected results as we saw in Example 3 above.
- You can GRANT access for individual columns:
GRANT UPDATE (Salary) ON EMPLOYEE TO U3;