Advanced SQL Programming - Mark Holm, Centerfield Technology

Upload: verity-wright

Post on 17-Dec-2015


TRANSCRIPT

Page 1: Advanced SQL Programming Mark Holm Centerfield Technology

Advanced SQL Programming

Mark Holm

Centerfield Technology

Page 2: Advanced SQL Programming Mark Holm Centerfield Technology

Goals

• Introduce some useful advanced SQL programming techniques
• Show you how to let the database do more work to reduce programming effort
• Go over some basic techniques and tips to improve performance

Page 3: Advanced SQL Programming Mark Holm Centerfield Technology

Notes

• V4R3 and higher syntax used in examples
• Examples show only a small subset of what can be done!

Page 4: Advanced SQL Programming Mark Holm Centerfield Technology

Agenda

• Joining files - techniques, do’s and don’ts
• Query within a query - Subqueries
• Stacking data - Unions
• Simplifying data with Views
• Referential Integrity and constraints
• Performance, performance, performance

Page 5: Advanced SQL Programming Mark Holm Centerfield Technology

Joining files

• Joins are used to relate data from different tables
• Data can be retrieved with one “open file” rather than many
• Concept is identical to join logical files without an associated permanent object (except if the join is done with an SQL view)

Page 6: Advanced SQL Programming Mark Holm Centerfield Technology

Join types

• Inner Join
  – Used to find related data
• Left Outer (or simply Outer) Join
  – Used to find related data and ‘orphaned’ rows
• Exception Join
  – Used to only find ‘orphaned’ rows
• Cross Join
  – Join all rows to all rows
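
The deck gives statements for the inner, left outer, and exception joins but not for the cross join. A minimal sketch, assuming the Employee and Department sample tables used on the following slides and a release that accepts the CROSS JOIN keyword:

```sql
-- Cross join: no join condition, so every Employee row is paired with
-- every Department row (3 x 3 = 9 rows for the sample tables).
SELECT LastName, Area
  FROM Employee CROSS JOIN Department
```

Listing both tables in the FROM clause with no WHERE clause produces the same all-rows-to-all-rows result, which is why a forgotten join condition can be so expensive.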

Page 7: Advanced SQL Programming Mark Holm Centerfield Technology

Sample tables

Employee table

FirstName  LastName  Dept
John       Doe       397
Cindy      Smith     450
Sally      Anderson  250

Department table

Dept  Area
397   Development
550   Marketing
250   Sales

Page 8: Advanced SQL Programming Mark Holm Centerfield Technology

Inner Join

• Method #1 - Using the WHERE Clause

    SELECT LastName, Area
      FROM Employee, Department
      WHERE Employee.Dept = Department.Dept

• Method #2 - Using the JOIN Clause

    SELECT LastName, Area
      FROM Employee INNER JOIN Department
        ON Employee.Dept = Department.Dept

NOTE: Method #2 is useful if you need to influence the order in which the tables are joined for performance reasons. This only works on releases prior to V4R4.

Page 9: Advanced SQL Programming Mark Holm Centerfield Technology

Results

Result table

LastName  Area
Doe       Development
Anderson  Sales

• Return list of employees that are in a valid department.
• Employee ‘Smith’ is not returned because she is not in a department listed in the ‘Department’ table

Page 10: Advanced SQL Programming Mark Holm Centerfield Technology

Left Outer Join

    SELECT LastName, Area
      FROM Employee LEFT OUTER JOIN Department
        ON Employee.Dept = Department.Dept

• Must use Join Syntax

Page 11: Advanced SQL Programming Mark Holm Centerfield Technology

Results

Result table

LastName  Area
Doe       Development
Smith     -
Anderson  Sales

• Return list of employees even if they are not in a valid department
• Employee ‘Smith’ has a NULL Area because her Dept could not be associated with a valid department

Page 12: Advanced SQL Programming Mark Holm Centerfield Technology

Exception Join

    SELECT LastName, Area
      FROM Employee EXCEPTION JOIN Department
        ON Employee.Dept = Department.Dept

• Must use Join Syntax

Page 13: Advanced SQL Programming Mark Holm Centerfield Technology

Results

Result table

LastName  Area
Smith     -

• Return list of employees only if they are NOT in a valid department
• Employee ‘Smith’ is the only one without a valid department

Page 14: Advanced SQL Programming Mark Holm Centerfield Technology

WARNING!

• The order in which tables are listed in the FROM clause is important
• For OUTER and EXCEPTION joins, the database must join the tables in that order.
• The result may be horrible performance… more on this topic later

Page 15: Advanced SQL Programming Mark Holm Centerfield Technology

Observations

• Joins provide one way to bury application logic in the database
• Each join type has a purpose and can be used to not only get the data you want but identify “incomplete” information
• With some exceptions, if joined properly performance should be at least as good as an application

Page 16: Advanced SQL Programming Mark Holm Centerfield Technology

Subqueries

• Subqueries are a powerful way to select only the data you need without separate statements.
• Example: List employees making a higher than average salary

Page 17: Advanced SQL Programming Mark Holm Centerfield Technology

Subquery Example

    SELECT FNAME, LNAME
      FROM EMPLOYEE
      WHERE SALARY > (SELECT AVG(SALARY) FROM EMPLOYEE)

    SELECT FNAME, LNAME
      FROM EMPLOYEE
      WHERE SALARY > (SELECT AVG(SALARY) FROM EMPLOYEE
                        WHERE LNAME = 'JONES')

Page 18: Advanced SQL Programming Mark Holm Centerfield Technology

Subqueries - types

• Correlated
  – Inner select refers to part of the outer (parent) select (multiple evaluations)
• Non-Correlated
  – Inner select does not relate to outer query (one evaluation)
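
Both statements on the Subquery Example slide are non-correlated: the inner select never mentions the outer row. For contrast, here is a hedged sketch of a correlated subquery. It assumes the EMPLOYEE table also has a DEPT column, which the deck does not show for this table, and the correlation names E and E2 are illustrative.

```sql
-- Correlated subquery: the inner select refers to E.DEPT from the outer
-- row, so the average is (logically) evaluated once per outer row.
-- DEPT is an assumed column; E and E2 are illustrative correlation names.
SELECT E.FNAME, E.LNAME
  FROM EMPLOYEE E
  WHERE E.SALARY > (SELECT AVG(E2.SALARY)
                      FROM EMPLOYEE E2
                      WHERE E2.DEPT = E.DEPT)
```

This lists employees who earn more than the average for their own department rather than the company-wide average.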

Page 19: Advanced SQL Programming Mark Holm Centerfield Technology

Subquery Tips 1

• Subquery optimization (2nd statement will be faster)
  – SELECT name FROM employee WHERE salary > ALL (SELECT salary FROM salscale)
  – SELECT name FROM employee WHERE salary > (SELECT max(salary) FROM salscale)
• The two statements return the same rows only when salscale is not empty and contains no NULL salaries

Page 20: Advanced SQL Programming Mark Holm Centerfield Technology

Subquery Tips 2

• Subquery optimization (2nd statement will be faster)
  – SELECT name FROM employee WHERE salary IN (SELECT salary FROM salscale)
  – SELECT name FROM employee WHERE EXISTS (SELECT salary FROM salscale WHERE employee.salid = salscale.salid)

Page 21: Advanced SQL Programming Mark Holm Centerfield Technology

UNIONs

• Unions provide a way to append multiple row sets (files) in one statement
• Example: Process all of the orders from January and February

    SELECT * FROM JanOrders WHERE SKU = 199976
    UNION
    SELECT * FROM FebOrders WHERE SKU = 199976

Page 22: Advanced SQL Programming Mark Holm Centerfield Technology

Unions

• Each SELECT statement that is UNIONed together must have the same number of result columns and have compatible types
• Two forms of syntax
  – UNION ALL -- allow duplicate records
  – UNION -- return only distinct rows
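
To make the difference between the two forms concrete, here is the January/February example from the previous slide rewritten with UNION ALL. This is a sketch only; it assumes JanOrders and FebOrders have the same column layout, as SELECT * requires.

```sql
-- UNION ALL keeps every row from both selects, including rows that are
-- identical in January and February; plain UNION would return each
-- distinct row only once.
SELECT * FROM JanOrders WHERE SKU = 199976
UNION ALL
SELECT * FROM FebOrders WHERE SKU = 199976
```

When duplicates are impossible or harmless, UNION ALL spares the database the work of removing them.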

Page 23: Advanced SQL Programming Mark Holm Centerfield Technology

Views

• Views provide a convenient way to permanently store SQL logic
• Create once and use many times
• Also make the database more understandable to users
• Can put simple business rules into views to ensure consistency

Page 24: Advanced SQL Programming Mark Holm Centerfield Technology

Views

• Example: Make it easy for the human resources department to run a report that shows ‘new’ employees.

    CREATE VIEW HR/NEWBIES (EMPLOYEE_NAME, DEPARTMENT, HIRE_DATE) AS
      SELECT concat(concat(strip(last_name),','),strip(first_name)),
             department, hire_date
        FROM HR/EMPLOYEE
        WHERE (year(current date)-year(hire_date)) < 2
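
Once the view exists, a report can query it like any table. A minimal usage sketch, assuming the HR/NEWBIES view above has been created (the ordering is only illustrative):

```sql
-- The report no longer needs to know how the name is formatted or how
-- 'new' is defined; both rules live in the view.
SELECT EMPLOYEE_NAME, DEPARTMENT, HIRE_DATE
  FROM HR/NEWBIES
  ORDER BY HIRE_DATE DESC
```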

Page 25: Advanced SQL Programming Mark Holm Centerfield Technology

Performance

• SQL performance is harder to predict and tune than native I/O.
• SQL provides a powerful way to manipulate data but you have little control over HOW it does it.
• Query optimizer takes responsibility for doing it ‘right’.

Page 26: Advanced SQL Programming Mark Holm Centerfield Technology

Performance - diagnosis

• Getting information about how the optimizer processed a query is crucial
• Can be done via one or all of the following:
  – STRDBG: debug messages in job log
  – STRDBMON: optimizer info put in file
  – QAQQINI: can be used to force messages
  – CHGQRYA: messages put out when time limit set to 0

Page 27: Advanced SQL Programming Mark Holm Centerfield Technology

Performance tips

• Create indexes
  – Over columns that significantly limit data in WHERE clause
  – Over columns that join tables together
  – Over columns used in ORDER BY and GROUP BY clauses
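
Applied to the Employee and Department sample tables from the join slides, these guidelines might look like the sketch below. The index names are hypothetical, and whether each index pays off depends on the data and the queries actually run.

```sql
-- Join columns: Employee.Dept is matched against Department.Dept.
CREATE INDEX EmpDeptIx ON Employee (Dept);
CREATE INDEX DeptIx ON Department (Dept);

-- Selection / ordering column, e.g. for
--   WHERE LastName = 'Smith'   or   ORDER BY LastName
CREATE INDEX EmpNameIx ON Employee (LastName);
```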

Page 28: Advanced SQL Programming Mark Holm Centerfield Technology

Performance tips

• Create Encoded Vector Indexes (EVI’s)
  – Most useful in heavy query environments with a lot of data (e.g. large data warehouses)
  – Helps queries that process between 20-60% of a table’s data
  – Create over columns with a modest number of distinct values and those with data skew
  – EVI’s bridge the gap between traditional indexes and table scans
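
A hedged sketch of the statement that creates an EVI. It borrows the Locations table and its RegionCode column from the CASE slide purely to show the syntax; the index name is hypothetical, and the exact clauses should be checked against the SQL reference for your release.

```sql
-- RegionCode has only a few distinct values ('E', 'S', 'M', 'W'), the kind
-- of column the slide recommends for an EVI. The WITH ... DISTINCT VALUES
-- clause is an optional sizing hint.
CREATE ENCODED VECTOR INDEX LocRegionEvi
  ON Locations (RegionCode)
  WITH 4 DISTINCT VALUES
```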

Page 29: Advanced SQL Programming Mark Holm Centerfield Technology

Performance tips

• Encourage optimizer to use indexes
  – Use keyed columns in WHERE clause if possible
  – Use ANDed conditions as much as possible
  – OPTIMIZE FOR n ROWS
  – Don’t do things that eliminate index use
      • Data conversion (binary-key = 1.5)
      • LIKE clause w/leading wildcard (NAME LIKE ‘%JOE’)
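
The statements below sketch two of these tips against the EMPLOYEE table from the subquery slides. They assume an index exists over LNAME. Note that the first two statements are not equivalent: they match different names and only illustrate which predicate shape can use the index.

```sql
-- Leading wildcard: the search value has no known prefix, so an index
-- over LNAME cannot be used to position to the matching rows.
SELECT FNAME, LNAME FROM EMPLOYEE WHERE LNAME LIKE '%JOE';

-- Trailing wildcard: the known prefix 'JOE' lets the optimizer consider
-- the index over LNAME.
SELECT FNAME, LNAME FROM EMPLOYEE WHERE LNAME LIKE 'JOE%';

-- OPTIMIZE FOR n ROWS tells the optimizer only the first few rows are
-- needed soon, which tends to favor index access over a full scan.
SELECT FNAME, LNAME FROM EMPLOYEE WHERE LNAME LIKE 'JOE%'
  OPTIMIZE FOR 10 ROWS;
```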

Page 30: Advanced SQL Programming Mark Holm Centerfield Technology

Performance tips

• Keep statements simple
  – Complex statements are much more difficult to optimize
  – Provide more opportunity for the optimizer to choose a sub-optimal plan of attack

Page 31: Advanced SQL Programming Mark Holm Centerfield Technology

Performance tips

• Enable DB2 to use parallelism
  – Query processed by many tasks (CPU parallelism) or by getting data from many disks at once (I/O parallelism)
  – CPU parallelism requires IBM’s SMP feature and a machine with multiple processors
  – Enabled via the QQRYDEGREE system value, CHGQRYA, or the QAQQINI file

Page 32: Advanced SQL Programming Mark Holm Centerfield Technology

Other useful features

• CASE clause - conditional calculations
• ALIAS - access to multi-member files
• Primary/Foreign keys - referential integrity
• Constraints

Page 33: Advanced SQL Programming Mark Holm Centerfield Technology

CASE

• Conditional calculations with CASE

    SELECT Warehouse, Description,
           CASE RegionCode
             WHEN 'E' THEN 'East Region'
             WHEN 'S' THEN 'South Region'
             WHEN 'M' THEN 'Midwest Region'
             WHEN 'W' THEN 'West Region'
           END
      FROM Locations

Page 34: Advanced SQL Programming Mark Holm Centerfield Technology

CASE

• Avoiding calculation errors (e.g. division by 0)

    SELECT Warehouse, Description,
           CASE NumInStock
             WHEN 0 THEN NULL
             ELSE CaseUnits/NumInStock
           END
      FROM Inventory

Page 35: Advanced SQL Programming Mark Holm Centerfield Technology

ALIAS names

• The CREATE ALIAS statement creates an alias on a table, view, or member of a database file.
  – CREATE ALIAS alias-name FOR table (member)
• Example: Create an alias over the second member of a multi-member physical file
  – CREATE ALIAS February FOR MonthSales (February)

Page 36: Advanced SQL Programming Mark Holm Centerfield Technology

Referential Integrity

• Keeps two or more files in synch with each other
• Ensures that children rows have parents
• Can also be used to automatically delete children when parents are deleted

Page 37: Advanced SQL Programming Mark Holm Centerfield Technology

Referential Integrity Rules

• A row inserted into a child table must have a parent row (typically in another table).
• Parent rules
  – A parent row can not be deleted if there are dependent children (Restrict rule) OR
  – All children are also deleted (Cascade rule) OR
  – All children’s foreign keys are changed (Set Null and Set Default rules)

Page 38: Advanced SQL Programming Mark Holm Centerfield Technology

[Figure: a Parent table with its Primary Key linked to a Child table's Foreign Key. Primary key must be unique.]

Page 39: Advanced SQL Programming Mark Holm Centerfield Technology

Referential Integrity syntax

    ALTER TABLE Hr/Employee
      ADD CONSTRAINT EmpPK PRIMARY KEY (EmployeeId)

    ALTER TABLE Hr/Department
      ADD CONSTRAINT EmpFK FOREIGN KEY (EmployeeId)
      REFERENCES Hr/Employee (EmployeeId)
      ON DELETE CASCADE
      ON UPDATE RESTRICT

Page 40: Advanced SQL Programming Mark Holm Centerfield Technology

Check Constraints

• Rules which limit the allowable values in one or more columns:

    CREATE TABLE Employee
      (FirstName CHAR(20),
       LastName CHAR(30),
       Salary DECIMAL(9,2)   -- data type assumed; the slide omits it
              CHECK (Salary>0 AND Salary<200000))

Page 41: Advanced SQL Programming Mark Holm Centerfield Technology

Check Constraints

• Effectively does data checking at the database level.
• Data checking done with display files or application logic can now be done at the database level.
• Ensures that it is always done and closes “back doors” like DFU, ODBC, 3rd-party utilities…

Page 42: Advanced SQL Programming Mark Holm Centerfield Technology

Other resources

• Database Design and Programming for DB2/400 - book by Paul Conte
• SQL for Smarties - book by Joe Celko
• SQL Tutorial - www.as400network.com
• AS/400 DB2 web site at http://www.as400.ibm.com/db2/db2main.htm
• Publications at http://publib.boulder.ibm.com/pubs/html/as400/
• Our web site at http://www.centerfieldtechnology.com

Page 43: Advanced SQL Programming Mark Holm Centerfield Technology

Summary

• SQL is a powerful way to access and process data
• Used effectively, it can reduce the time it takes to build applications
• Once tuned, it can perform very close to (and sometimes better than) HLLs alone

Page 44: Advanced SQL Programming Mark Holm Centerfield Technology

Good Luck and

Happy SQLing