ZEIT2301 Design of Information Systems
SQL: Computing Statistics
School of Engineering and Information Technology, UNSW@ADFA
Dr Kathryn Merrick
Topic 11: SQL Computing Statistics
In this lecture you will learn to use functions in SQL to compute simple statistics on data:
1. Aggregate functions
2. Order functions
3. String functions
4. Date functions
Reference: http://www.w3schools.com/sql/
1. Aggregate Functions
Functions that operate on a single column (or expression) and return a single value
COUNT – counts the number of values
SUM – returns the total of the values
AVG – returns the average of the values
MIN – returns the minimum value
MAX – returns the maximum value
Used in the SELECT clause (and the HAVING clause)
NOT allowed in the WHERE clause (a very common mistake)
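To make the WHERE restriction concrete, here is a minimal sketch using Python's built-in sqlite3 module. The three sportClub rows are hypothetical (the lecture gives only the schema), and the subquery workaround is one common approach rather than something taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sportClub (sport TEXT, contactNo TEXT, sponsor TEXT, "
    "president TEXT, annualBudget REAL)"
)
# Hypothetical sample rows, for illustration only.
conn.executemany(
    "INSERT INTO sportClub VALUES (?, ?, ?, ?, ?)",
    [
        ("Rugby", "5551", "Acme", "Ann", 3000.0),
        ("Hockey", "5552", "Acme", None, 1000.0),
        ("Rowing", "5553", "Bolt", "Raj", 2000.0),
    ],
)

# An aggregate function directly in WHERE is rejected by the database.
try:
    conn.execute(
        "SELECT sport FROM sportClub WHERE annualBudget > AVG(annualBudget)"
    )
    where_error = None
except sqlite3.Error as exc:
    where_error = str(exc)

# Workaround: compute the aggregate in a subquery and compare against it.
above_average = conn.execute(
    "SELECT sport FROM sportClub "
    "WHERE annualBudget > (SELECT AVG(annualBudget) FROM sportClub)"
).fetchall()

print(where_error)
print(above_average)
```

The first query fails before any rows are read, because WHERE is evaluated row by row, before any aggregate value exists.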
COUNT()
How many clubs are there?
SELECT COUNT(*)
FROM sportClub;
How many club presidents are there?
SELECT COUNT(president)
FROM sportClub;
sportClub (sport, contactNo, sponsor, president, annualBudget )
Each query returns a table with one row and one column.
* is a special shorthand: COUNT(*) counts all rows of the table.
COUNT(president) does not count NULLs in the “president” column.
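The difference between COUNT(*) and COUNT(president) can be checked with a small sketch. It assumes Python's built-in sqlite3 module and three hypothetical sportClub rows, one of which has a NULL president.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sportClub (sport TEXT, contactNo TEXT, sponsor TEXT, "
    "president TEXT, annualBudget REAL)"
)
# Hypothetical rows: the Hockey club has no president (NULL).
conn.executemany(
    "INSERT INTO sportClub VALUES (?, ?, ?, ?, ?)",
    [
        ("Rugby", "5551", "Acme", "Ann", 3000.0),
        ("Hockey", "5552", "Acme", None, 1000.0),
        ("Rowing", "5553", "Bolt", "Raj", 2000.0),
    ],
)

# COUNT(*) counts rows; COUNT(column) skips rows where the column is NULL.
club_count = conn.execute("SELECT COUNT(*) FROM sportClub").fetchone()[0]
president_count = conn.execute(
    "SELECT COUNT(president) FROM sportClub"
).fetchone()[0]

print(club_count, president_count)  # 3 2
```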
COUNT(DISTINCT)
How many different club sponsors are there?
SELECT COUNT(DISTINCT sponsor)
FROM sportClub;
(supported by Oracle and SQL Server but not by Access)
sportClub (sport, contactNo, sponsor, president, annualBudget )
Discards duplicates
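A short sketch of COUNT(DISTINCT ...), assuming Python's built-in sqlite3 module (SQLite supports this form) and hypothetical rows in which two clubs share a sponsor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sportClub (sport TEXT, contactNo TEXT, sponsor TEXT, "
    "president TEXT, annualBudget REAL)"
)
# Hypothetical rows: Rugby and Hockey share the sponsor "Acme".
conn.executemany(
    "INSERT INTO sportClub VALUES (?, ?, ?, ?, ?)",
    [
        ("Rugby", "5551", "Acme", "Ann", 3000.0),
        ("Hockey", "5552", "Acme", None, 1000.0),
        ("Rowing", "5553", "Bolt", "Raj", 2000.0),
    ],
)

# Without DISTINCT every non-NULL sponsor value is counted.
all_sponsors = conn.execute("SELECT COUNT(sponsor) FROM sportClub").fetchone()[0]
# With DISTINCT duplicates are discarded before counting.
distinct_sponsors = conn.execute(
    "SELECT COUNT(DISTINCT sponsor) FROM sportClub"
).fetchone()[0]

print(all_sponsors, distinct_sponsors)  # 3 2
```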
SUM()
Each club has an annual budget. What is the total budget amount for all clubs?
SELECT SUM(annualBudget)
FROM sportClub;
sportClub (sport, contactNo, sponsor, president, annualBudget )
Query returns a table with one row with one column.
Hint: The NRL Salary Cap for 2011 is $4.3m for the 25 highest paid players at each club.
AVG(), MIN(), MAX()
Find the average, minimum and maximum of the clubs’ annual budgets
SELECT AVG(annualBudget), MIN(annualBudget), MAX(annualBudget)
FROM sportClub;
Query returns one row with three columns.
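The SUM, AVG, MIN and MAX queries can be tried together in one sketch. It assumes Python's built-in sqlite3 module; the budget figures are hypothetical and chosen so the results are easy to check by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sportClub (sport TEXT, contactNo TEXT, sponsor TEXT, "
    "president TEXT, annualBudget REAL)"
)
# Hypothetical budgets: 3000 + 1000 + 2000 = 6000.
conn.executemany(
    "INSERT INTO sportClub VALUES (?, ?, ?, ?, ?)",
    [
        ("Rugby", "5551", "Acme", "Ann", 3000.0),
        ("Hockey", "5552", "Acme", None, 1000.0),
        ("Rowing", "5553", "Bolt", "Raj", 2000.0),
    ],
)

# One row, one column.
total_budget = conn.execute(
    "SELECT SUM(annualBudget) FROM sportClub"
).fetchone()[0]

# One row, three columns.
avg_budget, min_budget, max_budget = conn.execute(
    "SELECT AVG(annualBudget), MIN(annualBudget), MAX(annualBudget) "
    "FROM sportClub"
).fetchone()

print(total_budget, avg_budget, min_budget, max_budget)
```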
Review: Column Name Aliases
Columns can be renamed in the result table using the AS clause to give more meaningful output
Also useful to avoid the display of system-generated column names for calculated columns (MS Access uses “Expr1”)
SELECT SUM(annualBudget) AS TotalBudget
FROM sportClub;
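The effect of an alias on the result table's column name can be inspected as follows. This is a sketch assuming Python's built-in sqlite3 module, where `cursor.description` exposes the result column names; the sportClub rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sportClub (sport TEXT, contactNo TEXT, sponsor TEXT, "
    "president TEXT, annualBudget REAL)"
)
# Hypothetical sample rows, for illustration only.
conn.executemany(
    "INSERT INTO sportClub VALUES (?, ?, ?, ?, ?)",
    [
        ("Rugby", "5551", "Acme", "Ann", 3000.0),
        ("Rowing", "5553", "Bolt", "Raj", 2000.0),
    ],
)

cursor = conn.execute("SELECT SUM(annualBudget) AS TotalBudget FROM sportClub")
# description holds one tuple per result column; item 0 is the column name.
column_name = cursor.description[0][0]
total_budget = cursor.fetchone()[0]

print(column_name, total_budget)  # TotalBudget 5000.0
```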
The Bike Database Revisited
Bike name*  Number of riders*  Centre of mass height
Harley      1                  0.724
Harley      2                  0.775
Honda       1                  0.831
Honda       2                  0.881

Road conditions*  Coefficient of friction
Icy               0.1
Wet               0.5
Dry               0.9

Scenario ID*  Bike name  Number of riders  Road conditions  Can stoppie
1             Harley     1                 Dry              false
2             Harley     2                 Dry              false
3             Honda      1                 Dry              true
4             Honda      2                 Dry              true

Bike name*  Wheelbase
Harley      1.588
Honda       1.458
Aliasing Examples
SELECT MIN(wheelbase) AS minWheelbase FROM Bikes;
SELECT MAX(scenarioID) AS maxScenarioID FROM Scenarios;
SELECT AVG(wheelbase) AS avgWheelbase FROM Bikes;
SELECT COUNT(wheelbase) AS smallWheelbases
FROM Bikes
WHERE wheelbase < 1.5;
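The wheelbase queries above can be run against the two wheelbase rows from the bike database. This sketch assumes Python's built-in sqlite3 module; the column name bikeName is an assumption, since the slides show only the heading "Bike name".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# bikeName is an assumed column name; wheelbase values are from the slides.
conn.execute("CREATE TABLE Bikes (bikeName TEXT PRIMARY KEY, wheelbase REAL)")
conn.executemany(
    "INSERT INTO Bikes VALUES (?, ?)",
    [("Harley", 1.588), ("Honda", 1.458)],
)

min_wheelbase = conn.execute(
    "SELECT MIN(wheelbase) AS minWheelbase FROM Bikes"
).fetchone()[0]
avg_wheelbase = conn.execute(
    "SELECT AVG(wheelbase) AS avgWheelbase FROM Bikes"
).fetchone()[0]
# WHERE filters the rows first; COUNT then counts what is left.
small_wheelbases = conn.execute(
    "SELECT COUNT(wheelbase) AS smallWheelbases FROM Bikes WHERE wheelbase < 1.5"
).fetchone()[0]

print(min_wheelbase, round(avg_wheelbase, 3), small_wheelbases)
```

Note that the WHERE clause here compares a plain column value, not an aggregate, so it is allowed.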
Aggregating Results
The GROUP BY clause is used together with the aggregate functions to group the result set by one or more columns.
E.g. to find the total value of all orders placed by each customer, we can use GROUP BY to group the rows by customer.
SELECT customer, SUM(orderPrice)
FROM Orders
GROUP BY customer
Orders(orderID, orderDate, orderPrice, customer)
Aggregating Results Solution
Orders
orderID  orderDate   orderPrice  customer
1        2008/11/12  1000        Hansen
2        2008/10/23  1600        Nilsen
3        2008/09/02  700         Hansen
4        2008/09/03  300         Hansen
5        2008/08/30  2000        Jensen
6        2008/10/04  100         Nilsen
Query result
customer  SUM(orderPrice)
Hansen    2000
Nilsen    1700
Jensen    2000
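The GROUP BY result can be reproduced with the Orders rows from the slide. This is a sketch assuming Python's built-in sqlite3 module; the results are collected into a dictionary because the order of groups is not guaranteed without ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (orderID INTEGER PRIMARY KEY, orderDate TEXT, "
    "orderPrice INTEGER, customer TEXT)"
)
# Rows copied from the Orders table in the lecture.
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?, ?)",
    [
        (1, "2008/11/12", 1000, "Hansen"),
        (2, "2008/10/23", 1600, "Nilsen"),
        (3, "2008/09/02", 700, "Hansen"),
        (4, "2008/09/03", 300, "Hansen"),
        (5, "2008/08/30", 2000, "Jensen"),
        (6, "2008/10/04", 100, "Nilsen"),
    ],
)

# One output row per distinct customer, with that customer's total.
totals = dict(
    conn.execute(
        "SELECT customer, SUM(orderPrice) FROM Orders GROUP BY customer"
    ).fetchall()
)

print(totals)
```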
Filtering Groups
Individual rows can be filtered using a WHERE clause
BUT groups must be filtered using a HAVING clause
E.g. suppose we only want to display customer order totals less than $2000:
SELECT customer, SUM(orderPrice)
FROM Orders
GROUP BY customer
HAVING SUM(orderPrice) < 2000
customer SUM(OrderPrice)
Nilsen 1700
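A sketch of the HAVING filter on the same Orders data, assuming Python's built-in sqlite3 module. HAVING is applied after the groups and their sums have been formed, which is why an aggregate is allowed there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (orderID INTEGER PRIMARY KEY, orderDate TEXT, "
    "orderPrice INTEGER, customer TEXT)"
)
# Rows copied from the Orders table in the lecture.
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?, ?)",
    [
        (1, "2008/11/12", 1000, "Hansen"),
        (2, "2008/10/23", 1600, "Nilsen"),
        (3, "2008/09/02", 700, "Hansen"),
        (4, "2008/09/03", 300, "Hansen"),
        (5, "2008/08/30", 2000, "Jensen"),
        (6, "2008/10/04", 100, "Nilsen"),
    ],
)

# Keep only the groups whose total is strictly below 2000.
small_totals = conn.execute(
    "SELECT customer, SUM(orderPrice) FROM Orders "
    "GROUP BY customer HAVING SUM(orderPrice) < 2000"
).fetchall()

print(small_totals)  # [('Nilsen', 1700)]
```

Hansen and Jensen both total exactly 2000, so neither satisfies the strict `< 2000` condition.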
In Class Exercise
What is the result of the following query on the Orders table?
SELECT customer, SUM(orderPrice)
FROM Orders
WHERE customer='Hansen' OR customer='Jensen'
GROUP BY customer
HAVING SUM(orderPrice) > 1500
orderID orderDate orderPrice customer
1 2008/11/12 1000 Hansen
2 2008/10/23 1600 Nilsen
3 2008/09/02 700 Hansen
4 2008/09/03 300 Hansen
5 2008/08/30 2000 Jensen
6 2008/10/04 100 Nilsen
2. Order Functions
Find the first value of the orderPrice column:
SELECT FIRST(orderPrice)
FROM Orders
Equivalent to:
SELECT orderPrice
FROM Orders ORDER BY orderID LIMIT 1
Find the last value of the orderPrice column:
SELECT LAST(orderPrice)
FROM Orders
Equivalent to:
SELECT orderPrice
FROM Orders ORDER BY orderID DESC LIMIT 1
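The ORDER BY ... LIMIT 1 forms can be tried in SQLite, which has no FIRST() or LAST() function (those are MS Access functions) but does support LIMIT. This is a sketch assuming Python's built-in sqlite3 module and the Orders rows from the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (orderID INTEGER PRIMARY KEY, orderDate TEXT, "
    "orderPrice INTEGER, customer TEXT)"
)
# Rows copied from the Orders table in the lecture.
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?, ?)",
    [
        (1, "2008/11/12", 1000, "Hansen"),
        (2, "2008/10/23", 1600, "Nilsen"),
        (3, "2008/09/02", 700, "Hansen"),
        (4, "2008/09/03", 300, "Hansen"),
        (5, "2008/08/30", 2000, "Jensen"),
        (6, "2008/10/04", 100, "Nilsen"),
    ],
)

# "First" and "last" only have a meaning relative to a sort order.
first_price = conn.execute(
    "SELECT orderPrice FROM Orders ORDER BY orderID LIMIT 1"
).fetchone()[0]
last_price = conn.execute(
    "SELECT orderPrice FROM Orders ORDER BY orderID DESC LIMIT 1"
).fetchone()[0]

print(first_price, last_price)  # 1000 100
```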
3. String Functions
Functions that operate on strings (varchars)
UCASE() – convert a string to upper case
LCASE() – convert a string to lower case
MID() – extract characters from the middle of a string
LEN() – find the length of a string
UCASE()
Persons
personID  lastName   firstName  address       city
1         Hansen     Ola        Timoteivn 10  Sandnes
2         Svendson   Tove       Borgvn 23     Sandnes
3         Pettersen  Kari       Storgt 20     Stavanger
SELECT UCASE(lastName) AS lastName, firstName FROM Persons
lastName   firstName
HANSEN     Ola
SVENDSON   Tove
PETTERSEN  Kari
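Function names for case conversion vary between database products. As a sketch, the same query can be run in SQLite through Python's built-in sqlite3 module, where the corresponding functions are UPPER() and LOWER() rather than UCASE() and LCASE(). The ORDER BY is added here only to make the row order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Persons (personID INTEGER PRIMARY KEY, lastName TEXT, "
    "firstName TEXT, address TEXT, city TEXT)"
)
# Rows copied from the Persons table in the lecture.
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Hansen", "Ola", "Timoteivn 10", "Sandnes"),
        (2, "Svendson", "Tove", "Borgvn 23", "Sandnes"),
        (3, "Pettersen", "Kari", "Storgt 20", "Stavanger"),
    ],
)

# UPPER() is SQLite's counterpart of UCASE().
upper_names = conn.execute(
    "SELECT UPPER(lastName) AS lastName, firstName "
    "FROM Persons ORDER BY personID"
).fetchall()

print(upper_names)
```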
MID()
SELECT MID(city,1,4) AS SmallCity FROM Persons
Arguments: column name (city), start character (1), number of characters to extract (4)
SmallCity
Sand
Sand
Stav
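The substring and length functions can be sketched in SQLite via Python's built-in sqlite3 module. SQLite names them SUBSTR() and LENGTH() instead of MID() and LEN(); SUBSTR takes a start position (counting from 1) and a number of characters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Persons (personID INTEGER PRIMARY KEY, lastName TEXT, "
    "firstName TEXT, address TEXT, city TEXT)"
)
# Rows copied from the Persons table in the lecture.
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Hansen", "Ola", "Timoteivn 10", "Sandnes"),
        (2, "Svendson", "Tove", "Borgvn 23", "Sandnes"),
        (3, "Pettersen", "Kari", "Storgt 20", "Stavanger"),
    ],
)

# SUBSTR(city, 1, 4): start at character 1 and take 4 characters.
rows = conn.execute(
    "SELECT SUBSTR(city, 1, 4) AS SmallCity, LENGTH(city) AS cityLength "
    "FROM Persons ORDER BY personID"
).fetchall()

print(rows)  # [('Sand', 7), ('Sand', 7), ('Stav', 9)]
```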
4. Date Functions
Functions for manipulating dates
NOW() – get the current system date and time
SELECT productName, unitPrice, NOW() AS perDate FROM Products
prod_Id productName unit unitPrice
1 Jarlsberg 1000 g 10.45
2 Mascarpone 1000 g 32.56
3 Gorgonzola 1000 g 15.67
productName unitPrice perDate
Jarlsberg 10.45 10/7/2008 11:25:02 AM
Mascarpone 32.56 10/7/2008 11:25:02 AM
Gorgonzola 15.67 10/7/2008 11:25:02 AM
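The name of the current-time function also differs between products. As a sketch, SQLite (used here through Python's built-in sqlite3 module) has no NOW() but provides CURRENT_TIMESTAMP, which returns the UTC date and time as text in the form YYYY-MM-DD HH:MM:SS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Products (prod_Id INTEGER PRIMARY KEY, productName TEXT, "
    "unit TEXT, unitPrice REAL)"
)
# Rows copied from the Products table in the lecture.
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?, ?)",
    [
        (1, "Jarlsberg", "1000 g", 10.45),
        (2, "Mascarpone", "1000 g", 32.56),
        (3, "Gorgonzola", "1000 g", 15.67),
    ],
)

# CURRENT_TIMESTAMP stands in for NOW(); every row gets a timestamp column.
rows = conn.execute(
    "SELECT productName, unitPrice, CURRENT_TIMESTAMP AS perDate "
    "FROM Products ORDER BY prod_Id"
).fetchall()

for product_name, unit_price, per_date in rows:
    print(product_name, unit_price, per_date)
```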