introduction to the structured query language (sql) ufce8k-15-m: data management

26
Introduction to the Structured Query Language (SQL) UFCE8K-15-M: Data Management

Upload: emmeline-black

Post on 31-Dec-2015

214 views

Category:

Documents


1 download

TRANSCRIPT

Introduction to the Structured Query Language(SQL)

UFCE8K-15-M: Data Management

Origins & history• Early 1970’s – IBM develops Sequel as part of the System R

project at its San Hose Research Lab;• 1986 - ANSI & ISO publish the standard SQL-86;• 1987 – IBM publishes its own “standard” SQL called

Systems Architecture Database Interface (SAA-SQL);• 1989 – SQL-89 published by ANSI (extended version of

SQL-86);• 1992 – SQL-92 published with better support for algebraic

operations; • 1999 – SQL-1999 published with support for typing, stored

procedures, triggers, BLOBs etc.

• SQL-92 remains the most widely implemented standard • – and most database vendors also provide their own

(proprietary) extensions.

Components of SQLThe SQL language has several parts:• Data-definition language (DDL). The SQL DDL provides commands for

defining relation schemas, deleting relations, and modifying relation schemas.

• Interactive data-manipulation language (DML). The SQL DML includes a query language based on both the relational algebra and the tuple relational calculus. It includes also commands to insert tuples into, delete tuples from, and modify tuples in the database.

• View definition. The SQL DDL includes commands for defining views.• Transaction control. SQL includes commands for specifying the

beginning and ending of transactions.• Embedded SQL and dynamic SQL. Embedded and dynamic SQL

define how SQL statements can be embedded within general-purpose programming languages, such as C, C++, Java, PL/I, Cobol, Pascal, and Fortran.

• Integrity. The SQL DDL includes commands for specifying integrity constraints that the data stored in the database must satisfy. Updates that violate integrity constraints are disallowed.

• Authorization. The SQL DDL includes commands for specifying access rights to relations and views.

SQL Example (example db)

The Supplier-Parts Database

sid sname status city

1 Smith 20 London

2 Jones 10 Paris

3 Blake 30 Paris

4 Clark 20 London

5 Adams 30 Athens

s

pid pname colour weight city

1 Nut Red 12.0 London

2 Bolt Green 17.0 Paris

3 Screw Blue 17.0 Oslo

4 Screw Red 14.0 London

5 Cam Blue 12.0 Paris

6 Cog Red 19.0 London

sid pid qty

1 1 300

1 2 200

1 3 400

1 4 200

1 5 100

1 6 100

2 1 300

2 2 400

3 2 200

4 2 200

4 4 300

4 5 400

p

sp

SQL Example (project)

Project the columns

sname

Smith

Jones

Blake

Clark

Adams

SELECT sname FROM s

computed columns:

SELECT sname, status * 5 FROM s

sname status * 5

Smith 100

Jones 50

Blake 150

Clark 100

Adams 150

renamed columns:

SELECT sname AS Supplier, status * 5 AS 'Status times Five' FROM s

Supplier Status times Five

Smith 100

Jones 50

Blake 150

Clark 100

Adams 150

SELECT statement (restrict)

Restrict the rowsSELECT * FROM s WHERE city=‘London’

sid sname status city

s1 Smith 20 London

s4 Clark 20 London

complex condition:

SELECT * FROM s WHERE city=‘London’ OR status = 30

sid sname status

city

s1 Smith 20 London

s3 Blake 30 Paris

s4 Clark 20 London

s5 Adams 30 Athens

SELECT statement (restrict & project)

Restrict & Project

city

London

London

SELECT city FROM s WHERE sname='smith' OR status='20'

remove duplicate rows:

SELECT DISTINCT city FROM s WHERE sname='smith' OR status='20'

city

London

SELECT statement (group by & having)

Use the ‘GROUP BY’ clause to aggregate related rows

city Total Status

Athens 30

London 40

Paris 40

SELECT city, SUM(status) AS 'Total Status' FROM s GROUP BY city

• Group By and Having

Use the ‘HAVING’ clause to restrict rows aggregated with ‘GROUP BY’

city Total Status

London 40

Paris 40

SELECT city, SUM(status) AS 'Total Status' FROM s GROUP BY city HAVING SUM(status) > 30

For many of the modern uses of databases, it is often necessary to select some subset of the records from a table, and let some other program manipulate the results. In SQL the SELECT statement is the workhorse for these operations.

A summary of the SELECT statement:

SELECT columns or computationsFROM tableWHERE conditionGROUP BY columnsHAVING conditionORDER BY column [ASC | DESC]LIMIT offset,count;

SELECT statement summarized :

In SQL, the WHERE clause is used to operate on subsets of a table. The following comparison operators are available:

•Usual logical operators: < > <= >= = <>•BETWEEN used to test for a range•IN used to test group membership•Keyword NOT used for negation•LIKE operator allows wildcards

• _ means single character, % means anything• SELECT salary WHERE name LIKE ’Fred %’;

SQL Comparison operators :

SQL supports a very large number of data types & formats for internal storage of data.

Numeric•INTEGER, SMALLINT, BIGINT•NUMERIC(w,d), DECIMAL(w,d) - numbers with width w and d decimal places•REAL, DOUBLE PRECISION - machine and database dependent•FLOAT(p) - floating point number with p binary digits of precision

SQL data types :

Character• CHARACTER(L) - a fixed-length character of length L• CHARACTER VARYING(L) or VARCHAR(L) - supports maximum length of L

Binary• BIT(L), BIT VARYING(L) - like corresponding characters• BINARY LARGE OBJECT(L) or BLOB(L)

Temporal• DATE• TIME• TIMESTAMP

SQL data types (cont.) :

SQL Functions :

o SQL provides a wide range of predefined functions to perform data manipulation.

o Four types of functions:arithmetic (sqrt(), log(), mod(), round()

…)date (sysdate(), month(), dayname() …)character (length(), lower(), upper()…)aggregate (min(), max(), avg(), sum() …)

Database & Table description commands :

Since a single server can support many databases, eachcontaining many tables, with each table having a variety of columns, it’s often necessary to view which databases are available and what the table structures are within a particular database.

The following SQL commands are often used for these purposes :

• SHOW DATABASES;• SHOW TABLES IN database;• SHOW COLUMNS IN table;• DESCRIBE table; - shows the columns and their types

Inserting Records :Individual records can be entered using the INSERT command:INSERT INTO s VALUES(6, Thomas, 40, Cardiff);

Using the column names:INSERT INTO s (sno, sname, status, city)VALUES(6, Thomas, 40, Cardiff);

Insert multiple records:INSERT INTO s (sno, sname, status, city)VALUES(6, Thomas, 40, Cardiff), (7, Hamish, 30, Glasgow);

Upload from file:LOAD DATA INFILE ’supplier.tab’INTO TABLE sFIELDS TERMINATED BY ’\t’;

Updating (Editing) Existing Records :

To change one or more values of columns of a table, the UPDATE command can be used.

Edits are provided as a comma-separated list of column/value pairs.

UPDATE s SET status=status + 10WHERE city=’London’;

Note that the UPDATE command without a WHERE clause will update all the rows of a table.

Deleting Records :

To delete existing record/s the DELETE FROM command is used.

Note the WHERE clause in the DELETE syntax. The WHERE clause specifies which record or records that should be deleted. If the WHERE clause is omitted, all records will be deleted!

DELETE FROM s WHERE city=’London’;

Joins (1)

o The m-f database

id name age

1 tom 23

2 dick 20

3 harry 30

id name age

1 mary 23

2 anne 30

3 sue 34

m f

Joins are used to re-combine records which have data spread across many tables. The following simple example database with two tables - m, f – is used to illustrate the various kinds of joins.

Joins (2)Product (or Cartesian Product)

id name age id name age

1 tom 23 1 mary 23

2 dick 20 1 mary 23

3 harry 30 1 mary 23

1 tom 23 2 anne 30

2 dick 20 2 anne 30

3 harry 30 2 anne 30

1 tom 23 3 sue 34

2 dick 20 3 sue 34

3 harry 30 3 sue 34

SELECT * FROM m, f

Synonymous with the CROSS JOIN, hence: SELECT * FROM m CROSS JOIN f; would return the same result. This is not very useful but is the basis for all other joins.

Joins (3)

Natural joinJoins tables using some shared characteristic – usually

(but not necessarily) a foreign key.

SELECT * FROM m,f WHERE m.age = f.age

id name age id name age

1 tom 23 1 mary 23

3 harry 30 2 anne 30

Joins (4)

Inner joinsThe previous example, besides being a natural join,

is also an example of an inner join. An inner join retrieves data only from those rows where the join condition is met.

id name age id name age

3 harry 30 1 mary 23

SELECT * FROM m,f WHERE m.age > f.age

Joins (5)Outer joins

Unmatched rows can be included in the output using as outer join.

id name age id name age

1 tom 23 1 mary 23

2 dick 20 NULL NULL NULL

3 harry 30 2 anne 30

id name age id name age

1 tom 23 1 mary 23

3 harry 30 2 anne 30

NULL NULL NULL 3 sue 34

Right outer join:SELECT * FROM m RIGHT OUTER JOIN f ON m.age = f.age

Left outer join: SELECT * FROM m LEFT OUTER JOIN f ON m.age = f.age

Joins (6)Self Join

Special case of the inner join – here the table employee shows employees and their managers. Ruth manages Joe who manages Tom, Dick and Harry.

emp_id emp_name mgr_id

1 Tom 4

2 Dick 4

3 Harry 4

4 Joe 5

5 Ruth NULL

Employee Manager

Tom Joe

Dick Joe

Harry Joe

Joe Ruth

Show who manages who by name:SELECT E1.emp_name AS Employee, E2.emp_name AS ManagerFROM employee AS E1INNER JOIN employee AS E2 ON E1.mgr_id = E2.emp_id

oA database daemon is a process that looks after a number of DATABASES each of which comprises a number of TABLES each of which comprises data arranges in ROWS and COLUMNS.

o You don't access database information directly on the disc from any application program - you access it via your database daemon (a program called mysqld) which is usually running all the time and MUST be running whenever you want to access your data.

o You SHOW to look at the structure but you SELECT to look at the content.

o You UPDATE the content of table rows but you ALTER the structure of a table.

o You choose rows from a result set with a WHERE clause but you choose rows after you've grouped them with a HAVING clause.

o You add tables to each other "side by side" with a JOIN and you add results sets above/below each other with a UNION.

MySQL & SQL – some key facts (1)

o You use a USE command (or other function call) to set the current DATABASE but there is no concept of current TABLE to carry through from one query to the next on a connection.

oREPLACE gets rid of existing rows and puts new data in their place, but UPDATE changes existing rows in situe and leaves columns that you don't re-specify unaltered. DELETE gets rid of rows completely, and INSERT puts in completely new rows.

o Every row in a table has the same structure (homogeneous) but different columns will have different structures (hetrogeneous).

o If you JOIN tables, you tell MySQL how the tables are to be connected to each other (which row(s) in the first table match which row(s) in the second) using an ON clause.

oA regular JOIN return only result rows where the ON clause matches up a row from each table exactly, and a LEFT JOIN return the same results and IN ADDITION any records from the left hand table specified, NULL padded to make a complete result record.

o You use = to check whether something has an exact value, but you use IS NULL to check for the total absence of any defined value.

MySQL & SQL – some key facts (2)