T15 Beyond SQL Rahimi

Upload: suman-etikala

Post on 05-Apr-2018


  • 7/31/2019 T15 Beyond SQL Rahimi

    1/127

  • 7/31/2019 T15 Beyond SQL Rahimi

    2/127

    Outline

    Data Warehousing concepts

    On-Line Analytical Processing (OLAP)

    Using SQL to perform analytic functions

    Oracle's advanced analytic functions

    Acknowledgement:

    Some of the material in this presentation is based on the book Databases and Transaction Processing, from Addison Wesley.

    Copyright 2011 Saeed Rahimi 2

  • 7/31/2019 T15 Beyond SQL Rahimi

    3/127

  • 7/31/2019 T15 Beyond SQL Rahimi

    4/127

    Data Warehouse Concepts

    OLTP vs. OLAP

    On-Line Transaction Processing (OLTP)
      Short burst transactions
      Frequent modifications: updates, inserts, deletes
      Normalized (3NF, BCNF or 4NF)
      Transactions access only a small fraction of the database

    On-Line Analytic Processing (OLAP)
      Main use is in decision support and business analysis
      Complex, aggregate and time-based queries
      Almost never updates (bulk update or load)
      Queries access a large portion of the database
      Queries usually take longer to run
      Normalized (1NF, 2NF)

  • 7/31/2019 T15 Beyond SQL Rahimi

    5/127

    Data Warehouse Concepts

    OLTP vs. OLAP example

    OLTP query:
      Update the balance of an account to show a deposit
      Insert into the item table to show the addition of new inventory

    Transaction:
      Check the balance of checking and savings
      Transfer $500 from savings to checking

    OLAP query:
      How many skis did we sell in the Northeast and Midwest regions of the US during the last quarter of the last five years?

  • 7/31/2019 T15 Beyond SQL Rahimi

    6/127

    Data Warehouse Concepts

    OLAP: Traditional vs. Newer Applications

    Traditional
      Uses data the enterprise gathers in its usual activities, perhaps in its OLTP system
      Queries are ad hoc, perhaps designed and carried out-

    Newer Applications
      Gather data (information) actively
        Buy it if you have to
      Mine the information to find better ways of serving the customer and/or selling more products
        Hire professionals to do so

  • 7/31/2019 T15 Beyond SQL Rahimi

    7/127

    Data Warehouse Concepts

    Example: Traditional vs. Newer Applications

    Traditional
      How many skis were sold in all Northeast warehouses in the years 2004 and 2005?

    Newer
      Prepare a profile of the skiers for the residents of the Northeast region
      Customize our advertising and marketing to actively sell the product types these residents would want

    The newer approach requires Data Mining
      Finding nuggets of gold in the vast sea of information collected in the warehouse

  • 7/31/2019 T15 Beyond SQL Rahimi

    8/127

    Data Warehouse Concepts

    Data Warehouse Life Cycle

    [Figure: Data Warehouse Life Cycle. Labels: Operational Systems, Extract, Transform, Load, Store, Analyze, Performance, Users, Warehouse, External tables, Multitable Insert, SQL merge, Materialized Views, Bitmap Join]

  • 7/31/2019 T15 Beyond SQL Rahimi

    9/127

    Data Warehouse Concepts

    Data Mining

    Data Mining is the art of knowledge discovery

    Knowledge is used to better the business

    Data mining vs. OLAP
      OLAP: What percentage of people who make over $50,000 defaulted on their mortgage in the year 2010?
      Data Mining: How can information about salary, net worth, and other historical data be used to predict who will default on their mortgage?

  • 7/31/2019 T15 Beyond SQL Rahimi

    10/127

    Data Warehouse Concepts

    Data warehouse as a database

    OLAP applications are based on a table called the fact table

    For example, a supermarket application might be based on the fact table Sales

      Sales (Market_Id, Product_Id, Time_Id, Sales_Amt)

    The table is viewed as multidimensional

    The first three columns are the dimensions, representing supermarkets, products and time intervals

    The fourth column, Sales_Amt, is a function of the other three

  • 7/31/2019 T15 Beyond SQL Rahimi

    11/127

    Data Warehouse Concepts

    A Cube

    The fact table can be viewed as a three-dimensional cube

    Each entry in this three-dimensional view represents a specific sales amount for a given market, a given product and a specific time period

  • 7/31/2019 T15 Beyond SQL Rahimi

    12/127

    Data Warehouse Concepts

    Dimension Tables

    The dimensions of the fact table can be further described with dimension tables

    Fact Table
      Sales (Market_Id, Product_Id, Time_Id, Sales_Amt)

    Dimension Tables
      Market (Market_Id, City, State, Region)
      Product (Product_Id, Name, Category, Price)
      Time (Time_Id, Week, Month, Quarter)

  • 7/31/2019 T15 Beyond SQL Rahimi

    13/127

    Data Warehouse Concepts

    Star Schema

    [Figure: star schema. Labels: Sales, Time, Product, Market]

  • 7/31/2019 T15 Beyond SQL Rahimi

    14/127

    Data Warehouse Concepts

    Schema of the Sales data warehouse

    [Figure: Sales Data Warehouse schema]

    Time:    Time_Id A2 (identifier), Week A10, Month A10, Quarter A8
    Market:  Market_Id A2 (identifier), City A20, State A20, Region A20
    Product: Product_Id A2 (identifier), Name A20, Category A20, Price N6,2
    Sales:   Sales_Amt N8,2 (links Market, Product and Time)

  • 7/31/2019 T15 Beyond SQL Rahimi

    15/127

    Data Warehouse Concepts

    Warehouse's dimension tables

    Market

    create table MARKET (
        MARKET_ID  CHAR(2) not null,
        CITY       CHAR(20),
        STATE      CHAR(20),
        REGION     CHAR(20),
        constraint PK_MARKET primary key (MARKET_ID)
    );

    Product

    create table PRODUCT (
        PRODUCT_ID CHAR(2) not null,
        NAME       CHAR(20),
        CATEGORY   CHAR(20),
        PRICE      NUMBER(6,2),
        constraint PK_PRODUCT primary key (PRODUCT_ID)
    );

    Time

    create table TIME (
        TIME_ID    CHAR(2) not null,
        WEEK       CHAR(10),
        MONTH      CHAR(10),
        QUARTER    CHAR(8),
        constraint PK_TIME primary key (TIME_ID)
    );

  • 7/31/2019 T15 Beyond SQL Rahimi

    16/127

    Data Warehouse Concepts

    Warehouse's fact table

    create table SALES (
        MARKET_ID  CHAR(2) not null,
        PRODUCT_ID CHAR(2) not null,
        TIME_ID    CHAR(2) not null,
        SALES_AMT  NUMBER(8,2),
        -- one row per (market, product, time period)
        constraint PK_SALES primary key (MARKET_ID, PRODUCT_ID, TIME_ID),
        -- each dimension key references its dimension table
        constraint FK_SALES_MARKET_MARKET foreign key (MARKET_ID)
            references MARKET (MARKET_ID),
        constraint FK_SALES_PRODUCT_PRODUCT foreign key (PRODUCT_ID)
            references PRODUCT (PRODUCT_ID),
        constraint FK_SALES_TIME_TIME foreign key (TIME_ID)
            references TIME (TIME_ID)
    );

  • 7/31/2019 T15 Beyond SQL Rahimi

    17/127

    Data Warehouse Concepts

    Star Schema of the Warehouse

    [Figure: star schema of the warehouse]

    Sales Fact Table:        Market_Id CHAR(2), Product_Id CHAR(2), Time_Id CHAR(2), Sales_Amt NUMBER(8,2)
    Time Dimension Table:    Time_Id CHAR(2), Week CHAR(10), Month CHAR(10), Quarter CHAR(8)
    Market Dimension Table:  Market_Id CHAR(2), City CHAR(20), State CHAR(20), Region CHAR(20)
    Product Dimension Table: Product_Id CHAR(2), Name CHAR(20), Category CHAR(20), Price NUMBER(6,2)

  • 7/31/2019 T15 Beyond SQL Rahimi

    18/127

    Data Warehouse Concepts

    The warehouse tables

    SQL> select * from sales;

    MI PI TI  SALES_AMT
    -- -- -- ----------
    M1 P1 T1       1000
    M1 P2 T1       2000
    M1 P3 T1       1500
    M1 P4 T1       2500
    M2 P1 T1        500
    M2 P2 T1        800
    M2 P3 T1          0
    M2 P4 T1       3333
    M3 P1 T1       5000
    M3 P2 T1       8000
    M3 P3 T1         10
    M3 P4 T1       3300
    M1 P1 T2       1001
    M1 P2 T2       2001
    M1 P3 T2       1501
    M1 P4 T2       2501
    M2 P1 T2        501
    M2 P2 T2        801
    ...

    36 rows selected.

    SQL> select * from Market;

    MI CITY                 STATE                REGION
    -- -------------------- -------------------- ----------
    M1 Stony Brook          New York             East
    M2 Newark               New Jersey           East
    M3 Oakland              California           West

    SQL> select * from Product;

    PI NAME                 CATEGORY                  PRICE
    -- -------------------- -------------------- ----------
    P1 Beer                 Drink                      1.98
    P2 Diapers              Soft Goods                 2.98
    P3 Cold Cuts            Meat                       3.98
    P4 Soda                 Drink                      1.25

    SQL> select * from Time;

    TI WEEK       MONTH      QUARTER
    -- ---------- ---------- --------
    T1 Wk-1       January    First
    T2 Wk-24      June       Second
    T3 Wk-52      December   Fourth

  • 7/31/2019 T15 Beyond SQL Rahimi

    19/127

    Data Warehouse Concepts

    Constellation (snow flake) schema

    A data warehouse may use more than one fact table

    These fact tables may share the same dimension tables, forming a schema that looks like a snowflake.

    [Figure: constellation schema. Labels: Time, Product, Market, Sales, Warehouse, Inventory]

  • 7/31/2019 T15 Beyond SQL Rahimi

    20/127

    An Introduction to

    OLAP Basic Operations

    using SQL

  • 7/31/2019 T15 Beyond SQL Rahimi

    21/127

    Data Warehouse Concepts

    OLAP Operations

    Aggregation

    OLAP queries usually total (aggregate) information in the fact table

    For example, to find the total sales of each product, in each market, for each quarter, we use the following:

    SELECT S.Market_Id, S.Product_Id, S.Time_Id, SUM(S.Sales_Amt) AS Total_Sale
    FROM Sales S
    GROUP BY S.Market_Id, S.Product_Id, S.Time_Id
    ORDER BY S.Time_Id;

  • 7/31/2019 T15 Beyond SQL Rahimi

    22/127

    Data Warehouse Concepts

    SELECT S.Market_Id, S.Product_Id, S.Time_Id, SUM(S.Sales_Amt) AS Total_Sale
    FROM Sales S
    GROUP BY S.Market_Id, S.Product_Id, S.Time_Id
    ORDER BY S.Time_Id;

    MA PR TI TOTAL_SALE
    -- -- -- ----------
    M1 P1 T1       1000
    M2 P1 T1        500
    M3 P1 T1       5000
    M1 P2 T1       2000
    M2 P2 T1        800
    M3 P2 T1       8000
    M1 P3 T1       1500
    M2 P3 T1          0
    M3 P3 T1         10
    M1 P4 T1       2500
    M2 P4 T1       3333
    M3 P4 T1       3300
    M1 P1 T2       1001
    M2 P1 T2        501
    M3 P1 T2       5001
    M1 P2 T2       2001
    M2 P2 T2        801
    M3 P2 T2       8001
    M1 P3 T2       1501

    Not all rows are shown

  • 7/31/2019 T15 Beyond SQL Rahimi

    23/127

    Data Warehouse Concepts

    OLAP Operations

    The query on the previous page returns a three-dimensional view of the results (a cube)

    We can collapse the time dimension and show sales for each product in each market. This is a two-dimensional view of the same result.

    SELECT S.Market_Id, S.Product_Id, SUM(S.Sales_Amt) AS Total_Sale
    FROM Sales S
    GROUP BY S.Market_Id, S.Product_Id;

    MI PI TOTAL_SALES
    -- -- -----------
    M1 P1        3003
    M1 P2        6003
    M1 P3        4503
    M1 P4        7503
    M2 P1        1503
    M2 P2        2403
    M2 P3           3
    M2 P4        7000
    M3 P1       15003
    M3 P2       24003
    M3 P3          33
    M3 P4        9903

  • 7/31/2019 T15 Beyond SQL Rahimi

    24/127

    Data Warehouse Concepts

    OLAP Operations

    The same result can be viewed as follows.

    This is a pivoted view of the same results.

                        Market_Id
    Total Sales     M1      M2      M3
    Product_Id
    P1            3003    1503   15003
    P2            6003    2403   24003
    P3            4503       3      33
    P4            7503    7000    9903

  • 7/31/2019 T15 Beyond SQL Rahimi

    25/127

    Data Warehouse Concepts

    Pivoted view of the same result

    -- One SUM(CASE ...) column per market: rows of other markets contribute NULL
    select Product_Id,
           sum(case when Market_Id = 'M1' then Sales_Amt else NULL end) as M1,
           sum(case when Market_Id = 'M2' then Sales_Amt else NULL end) as M2,
           sum(case when Market_Id = 'M3' then Sales_Amt else NULL end) as M3
    from Sales
    group by Product_Id;

    PR         M1         M2         M3
    -- ---------- ---------- ----------
    P1       3003       1503      15003
    P2       6003       2403      24003
    P3       4503          3         33
    P4       7503       7000       9903

  • 7/31/2019 T15 Beyond SQL Rahimi

    26/127

    Data Warehouse Concepts

    OLAP Operations

    We can now get the total sales of each market, over all products and all quarters, as:

    SELECT S.Market_Id, SUM(S.Sales_Amt) AS Total_Sale
    FROM Sales S
    GROUP BY S.Market_Id;

    MA TOTAL_SALE
    -- ----------
    M1      21012
    M2      10909
    M3      48942

    And, finally, get the total sales over all products for all markets for all time periods as:

    SELECT SUM(S.Sales_Amt) AS Total_Sale
    FROM Sales S;

    TOTAL_SALE
    ----------
         80863
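    The step-by-step collapsing of dimensions can be tried outside Oracle. The following is a minimal sketch that runs the same GROUP BY queries on SQLite through Python's sqlite3 module. The helper names load_sales and totals_by are hypothetical, and the 36 Sales rows are an assumption: they are reconstructed from the rows and totals printed on the slides, not copied from the author's database.

```python
import sqlite3


def load_sales(conn):
    """Create Sales with figures reconstructed from the slides (an assumption)."""
    t1_amounts = {"M1": (1000, 2000, 1500, 2500),
                  "M2": (500, 800, 0, 3333),
                  "M3": (5000, 8000, 10, 3300)}
    conn.execute("CREATE TABLE Sales (Market_Id TEXT, Product_Id TEXT, "
                 "Time_Id TEXT, Sales_Amt INTEGER)")
    for market, amounts in t1_amounts.items():
        for p, amount in enumerate(amounts, start=1):
            for t in range(3):
                # T2 and T3 add 1 and 2 to the T1 amount, except M2/P4/T3 = 333.
                value = 333 if (market, p, t) == ("M2", 4, 2) else amount + t
                conn.execute("INSERT INTO Sales VALUES (?, ?, ?, ?)",
                             (market, f"P{p}", f"T{t + 1}", value))


def totals_by(conn, *dims):
    """Sum Sales_Amt per combination of the given columns (none = grand total).

    dims are interpolated into the SQL text, so they must be trusted names.
    """
    if not dims:
        return conn.execute("SELECT SUM(Sales_Amt) FROM Sales").fetchall()
    cols = ", ".join(dims)
    return conn.execute(
        f"SELECT {cols}, SUM(Sales_Amt) FROM Sales "
        f"GROUP BY {cols} ORDER BY {cols}").fetchall()


conn = sqlite3.connect(":memory:")
load_sales(conn)
cube_3d = totals_by(conn, "Market_Id", "Product_Id", "Time_Id")  # one cell per row
view_2d = totals_by(conn, "Market_Id", "Product_Id")             # time collapsed
per_market = totals_by(conn, "Market_Id")                        # product collapsed too
grand_total = totals_by(conn)                                    # everything collapsed
print(per_market, grand_total)
```

    Each call drops one more column from the GROUP BY list, which is all that "collapsing a dimension" means in SQL terms.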

  • 7/31/2019 T15 Beyond SQL Rahimi

    27/127

    Data Warehouse Concepts

    OLAP Operations

    Drilling Down

    Some dimension tables represent a hierarchy

    For example:
      Market dimension has: City -> State -> Region
      Time dimension has: Week -> Month -> Quarter

    When we execute queries that move down a hierarchy (e.g., from aggregation over regions to aggregation over states) we are drilling down.

    We are adding more columns of the dimension to the query

    To be able to drill down, we must have access to more specific information.

  • 7/31/2019 T15 Beyond SQL Rahimi

    28/127

    Data Warehouse Concepts

    OLAP Operations

    Dimensions do not always form a hierarchy

    Some dimensions may have a lattice

    For example, the time dimension can be represented as a lattice
      Weeks are not contained in months
      We can roll up days into weeks or months, but we can only roll up weeks into quarters

  • 7/31/2019 T15 Beyond SQL Rahimi

    29/127

    Data Warehouse Concepts

    OLAP Operations

    Drilling Down Example:

    The first query aggregates total sales for products in each region

    The second query drills down to state level.

    SELECT S.Product_Id, M.Region, SUM(S.Sales_Amt)
    FROM Sales S, Market M
    WHERE M.Market_Id = S.Market_Id
    GROUP BY M.Region, S.Product_Id;

    SELECT S.Product_Id, M.State, SUM(S.Sales_Amt)
    FROM Sales S, Market M
    WHERE M.Market_Id = S.Market_Id
    GROUP BY M.State, S.Product_Id;
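    To see the drill-down produce more, finer-grained rows, here is a small sketch that runs both queries on SQLite via Python's sqlite3 module. The function names build_warehouse and sales_by are hypothetical. The Market rows come from the table listing in the deck; the Sales rows are an assumption, reconstructed from the totals shown on the slides.

```python
import sqlite3


def build_warehouse():
    """In-memory copy of Market and Sales (Sales figures are reconstructed)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Market (Market_Id TEXT, City TEXT, "
                 "State TEXT, Region TEXT)")
    conn.executemany("INSERT INTO Market VALUES (?, ?, ?, ?)", [
        ("M1", "Stony Brook", "New York", "East"),
        ("M2", "Newark", "New Jersey", "East"),
        ("M3", "Oakland", "California", "West")])
    conn.execute("CREATE TABLE Sales (Market_Id TEXT, Product_Id TEXT, "
                 "Time_Id TEXT, Sales_Amt INTEGER)")
    t1_amounts = {"M1": (1000, 2000, 1500, 2500),
                  "M2": (500, 800, 0, 3333),
                  "M3": (5000, 8000, 10, 3300)}
    for market, amounts in t1_amounts.items():
        for p, amount in enumerate(amounts, start=1):
            for t in range(3):
                # T2 and T3 add 1 and 2 to the T1 amount, except M2/P4/T3 = 333.
                value = 333 if (market, p, t) == ("M2", 4, 2) else amount + t
                conn.execute("INSERT INTO Sales VALUES (?, ?, ?, ?)",
                             (market, f"P{p}", f"T{t + 1}", value))
    return conn


def sales_by(conn, level):
    """Total sales per product at one level of the Market hierarchy."""
    if level not in ("Region", "State", "City"):
        raise ValueError(f"unknown hierarchy level: {level}")
    return conn.execute(
        f"SELECT S.Product_Id, M.{level}, SUM(S.Sales_Amt) "
        f"FROM Sales S, Market M WHERE M.Market_Id = S.Market_Id "
        f"GROUP BY M.{level}, S.Product_Id "
        f"ORDER BY M.{level}, S.Product_Id").fetchall()


conn = build_warehouse()
by_region = sales_by(conn, "Region")  # coarse view: 2 regions x 4 products
by_state = sales_by(conn, "State")    # drilled down: 3 states x 4 products
print(len(by_region), len(by_state))
```

    The only difference between the two calls is which column of the Market dimension appears in the GROUP BY, which is the point the slide makes.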

  • 7/31/2019 T15 Beyond SQL Rahimi

    30/127

    Data Warehouse Concepts

    OLAP Operations

    Rolling Up

    When we execute queries that move up the hierarchy (e.g., from states to regions) we are rolling up

    We can roll up in the hierarchy or use the results of previous queries from lower aggregates

  • 7/31/2019 T15 Beyond SQL Rahimi

    31/127

    Data Warehouse Concepts

    OLAP Operations

    Rolling Up:

    The following query creates a table containing the total sales for each state:

    CREATE TABLE State_Sales AS
    SELECT S.Product_Id, M.State, SUM(S.Sales_Amt) Sales_Amt
    FROM Sales S, Market M
    WHERE M.Market_Id = S.Market_Id
    GROUP BY M.State, S.Product_Id;

    Table created.

    select * from state_sales;

    PR STATE                 SALES_AMT
    -- -------------------- ----------
    P1 California                15003
    P2 California                24003
    P3 California                   33
    P4 California                 9903
    P1 New Jersey                 1503
    P2 New Jersey                 2403
    P3 New Jersey                    3
    P4 New Jersey                 7000
    P1 New York                   3003
    P2 New York                   6003
    P3 New York                   4503
    P4 New York                   7503

    12 rows selected.

  • 7/31/2019 T15 Beyond SQL Rahimi

    32/127

    Data Warehouse Concepts

    OLAP Operations

    Rolling Up: Example

    Then we can use the following to roll up the total sales for each region:

    SELECT T.Product_Id, R.Region, SUM(T.Sales_Amt)
    FROM State_Sales T,
         (SELECT DISTINCT M.Region, M.State FROM Market M) R
    WHERE R.State = T.State
    GROUP BY R.Region, T.Product_Id;

    PR REGION               SUM(T.SALES_AMT)
    -- -------------------- ----------------
    P1 East                             4506
    P2 East                             8406
    P3 East                             4506
    P4 East                            14503
    P1 West                            15003
    P2 West                            24003
    P3 West                               33
    P4 West                             9903

    8 rows selected.
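    A quick way to convince yourself that rolling up from the saved state totals loses nothing is to compare it with aggregating the fact table directly. The sketch below does that on SQLite through Python's sqlite3 module. The Sales figures are an assumption reconstructed from the totals on the slides, and the variable names rolled_up and direct are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Market (Market_Id TEXT, City TEXT, State TEXT, Region TEXT)")
conn.executemany("INSERT INTO Market VALUES (?, ?, ?, ?)", [
    ("M1", "Stony Brook", "New York", "East"),
    ("M2", "Newark", "New Jersey", "East"),
    ("M3", "Oakland", "California", "West")])
conn.execute("CREATE TABLE Sales (Market_Id TEXT, Product_Id TEXT, "
             "Time_Id TEXT, Sales_Amt INTEGER)")
t1_amounts = {"M1": (1000, 2000, 1500, 2500),
              "M2": (500, 800, 0, 3333),
              "M3": (5000, 8000, 10, 3300)}
for market, amounts in t1_amounts.items():
    for p, amount in enumerate(amounts, start=1):
        for t in range(3):
            # Reconstructed data: T2/T3 add 1 and 2, except M2/P4/T3 = 333.
            value = 333 if (market, p, t) == ("M2", 4, 2) else amount + t
            conn.execute("INSERT INTO Sales VALUES (?, ?, ?, ?)",
                         (market, f"P{p}", f"T{t + 1}", value))

# Step 1: save the lower-level (state) aggregate.
conn.execute("""
    CREATE TABLE State_Sales AS
    SELECT S.Product_Id, M.State, SUM(S.Sales_Amt) AS Sales_Amt
    FROM Sales S, Market M
    WHERE M.Market_Id = S.Market_Id
    GROUP BY M.State, S.Product_Id""")

# Step 2: roll the 12 state rows up to regions without touching Sales again.
rolled_up = conn.execute("""
    SELECT T.Product_Id, R.Region, SUM(T.Sales_Amt)
    FROM State_Sales T,
         (SELECT DISTINCT M.Region, M.State FROM Market M) R
    WHERE R.State = T.State
    GROUP BY R.Region, T.Product_Id
    ORDER BY R.Region, T.Product_Id""").fetchall()

# Reference: the same totals computed from the 36 fact rows.
direct = conn.execute("""
    SELECT S.Product_Id, M.Region, SUM(S.Sales_Amt)
    FROM Sales S, Market M
    WHERE M.Market_Id = S.Market_Id
    GROUP BY M.Region, S.Product_Id
    ORDER BY M.Region, S.Product_Id""").fetchall()

print(rolled_up == direct)
```

    This works because SUM is distributive: a sum of partial sums equals the sum over the underlying rows. The same shortcut would not be valid for an aggregate such as AVG without also carrying the counts.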

  • 7/31/2019 T15 Beyond SQL Rahimi

    33/127

    Data Warehouse Concepts

    OLAP Operations

    Pivot

    Pivoting is changing the orientation of the cube.

    Dimensions that we are pivoting on are used in the GROUP BY clause; aggregation (SUM) is used on the remaining attributes

    PR QUARTER  SUM(SALES_AMT)
    -- -------- --------------
    P3 Fourth             1516
    P1 First              6500
    P2 First             10800
    P2 Second            10803
    P2 Fourth            10806
    P1 Second             6503
    P3 Second             1513
    P1 Fourth             6506
    P3 First              1510
    P4 First              9133
    P4 Second             9136
    P4 Fourth             6137

    PR         Q1         Q2         Q4
    -- ---------- ---------- ----------
    P1       6500       6503       6506
    P2      10800      10803      10806
    P3       1510       1513       1516
    P4       9133       9136       6137

  • 7/31/2019 T15 Beyond SQL Rahimi

    34/127

    Data Warehouse Concepts

    OLAP Operations

    Product sales per quarter for all regions.

    SELECT S.Product_Id, T.Quarter, SUM(Sales_Amt)
    FROM Sales S, Time T
    WHERE T.Time_Id = S.Time_Id
    GROUP BY S.Product_Id, T.Quarter
    ORDER BY S.Product_Id, T.Quarter;

    PR QUARTER  SUM(SALES_AMT)
    -- -------- --------------
    P1 First              6500
    P1 Fourth             6506
    P1 Second             6503
    P2 First             10800
    P2 Fourth            10806
    P2 Second            10803
    P3 First              1510
    P3 Fourth             1516
    P3 Second             1513
    P4 First              9133
    P4 Fourth             6137
    P4 Second             9136

  • 7/31/2019 T15 Beyond SQL Rahimi

    35/127

    Data Warehouse Concepts

    OLAP Operations

    Pivoted results, so that we can see the sales of each product with one column per quarter.

    Note: T3 is Q4 and not Q3 in our time table

    select S.Product_Id,
           sum(case when S.Time_Id = 'T1' then Sales_Amt else NULL end) as Q1,
           sum(case when S.Time_Id = 'T2' then Sales_Amt else NULL end) as Q2,
           sum(case when S.Time_Id = 'T3' then Sales_Amt else NULL end) as Q4
    from Sales S
    group by S.Product_Id;

    PR         Q1         Q2         Q4
    -- ---------- ---------- ----------
    P1       6500       6503       6506
    P2      10800      10803      10806
    P3       1510       1513       1516
    P4       9133       9136       6137

  • 7/31/2019 T15 Beyond SQL Rahimi

    36/127

    Data Warehouse Concepts

    Pivot

    Oracle 11g has a pivot operation that can also be used

    Unpivot does exactly the opposite of pivot

    select * from (
      select Product_ID, Market_ID, Sales_Amt
      from Sales )
    pivot
    (
      Sum(Sales_Amt)
      for Market_ID in ('M1','M2','M3')
    )
    order by Product_ID;

    PR       'M1'       'M2'       'M3'
    -- ---------- ---------- ----------
    P1       3003       1503      15003
    P2       6003       2403      24003
    P3       4503          3         33
    P4       7503       7000       9903

  • 7/31/2019 T15 Beyond SQL Rahimi

    37/127

    Data Warehouse Concepts

    Pivot

    Can you write a pivot statement that generates the report on page 35?

  • 7/31/2019 T15 Beyond SQL Rahimi

    38/127

    Data Warehouse Concepts

    Pivot

    What if you do not know the exact number of values for Market_ID in the statement on page 36?
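    One possible answer, different from a built-in pivot feature, is to generate the pivot columns from the data at run time. The sketch below is an illustration of that dynamic-SQL idea on SQLite via Python's sqlite3 module, not Oracle's mechanism. The function pivot_by_market is hypothetical and the Sales figures are an assumption reconstructed from the totals on the slides.

```python
import sqlite3


def load_sales(conn):
    """Create Sales with figures reconstructed from the slides (an assumption)."""
    t1_amounts = {"M1": (1000, 2000, 1500, 2500),
                  "M2": (500, 800, 0, 3333),
                  "M3": (5000, 8000, 10, 3300)}
    conn.execute("CREATE TABLE Sales (Market_Id TEXT, Product_Id TEXT, "
                 "Time_Id TEXT, Sales_Amt INTEGER)")
    for market, amounts in t1_amounts.items():
        for p, amount in enumerate(amounts, start=1):
            for t in range(3):
                value = 333 if (market, p, t) == ("M2", 4, 2) else amount + t
                conn.execute("INSERT INTO Sales VALUES (?, ?, ?, ?)",
                             (market, f"P{p}", f"T{t + 1}", value))


def pivot_by_market(conn):
    """Build one SUM(CASE ...) column per market found in the data."""
    markets = [row[0] for row in conn.execute(
        "SELECT DISTINCT Market_Id FROM Sales ORDER BY Market_Id")]
    columns = []
    for market in markets:
        alias = market.replace('"', '""')  # escape quotes inside the alias
        # The compared value is bound as a parameter; only the alias is inlined.
        columns.append(
            f'SUM(CASE WHEN Market_Id = ? THEN Sales_Amt END) AS "{alias}"')
    cursor = conn.execute(
        f"SELECT Product_Id, {', '.join(columns)} FROM Sales "
        f"GROUP BY Product_Id ORDER BY Product_Id", markets)
    header = [column[0] for column in cursor.description]
    return header, cursor.fetchall()


conn = sqlite3.connect(":memory:")
load_sales(conn)
header, rows = pivot_by_market(conn)
print(header)
for row in rows:
    print(row)
```

    The trade-off is that the shape of the result (its column list) is only known after the first query has run, so the caller has to read the header from the cursor rather than hard-code it.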

  • 7/31/2019 T15 Beyond SQL Rahimi

    39/127

    Data Warehouse Concepts

    Pivot

    Oracle has the capability to deal with any number of values in the IN construct of the pivot statement and generate an XML report.

    SET LONG 99999

    select * from (
      select Product_ID, Market_ID, Sales_Amt
      from Sales )
    pivot xml (
      Sum(Sales_Amt)
      for Market_ID in (any)
    )
    order by Product_ID;

    See answer on next page

  • 7/31/2019 T15 Beyond SQL Rahimi

    40/127

    Data Warehouse Concepts

    PR MARKET_ID_XML
    -- --------------------------------------------------------------------------------
    P1 [XML markup not preserved] M1 3003, M2 1503, M3 15003
    P2 [XML markup not preserved] M1

  • 7/31/2019 T15 Beyond SQL Rahimi

    41/127

    Data Warehouse Concepts

    OLAP Operations

    Slice

    A slice is a subset of a cube corresponding to a single value for one or more members of the dimensions not in the subset

    When a query restricts a dimension to a single value, we are performing a slice

    Slicing the Sales cube in the time dimension: total sales of each product in Wk-1

    SELECT S.Product_Id, SUM(Sales_Amt)
    FROM Sales S, Time T
    WHERE T.Time_Id = S.Time_Id AND T.Week = 'Wk-1'
    GROUP BY S.Product_Id;
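    The slice can be checked with a short script. This is a sketch on SQLite via Python's sqlite3 module rather than Oracle. The Time rows are taken from the table listing in the deck, the Sales figures are an assumption reconstructed from the slide totals, and slice_week is a hypothetical helper. Note that the week label is a string, so it has to be passed as a quoted literal or, as here, a bound parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Time (Time_Id TEXT, Week TEXT, Month TEXT, Quarter TEXT)")
conn.executemany("INSERT INTO Time VALUES (?, ?, ?, ?)", [
    ("T1", "Wk-1", "January", "First"),
    ("T2", "Wk-24", "June", "Second"),
    ("T3", "Wk-52", "December", "Fourth")])
conn.execute("CREATE TABLE Sales (Market_Id TEXT, Product_Id TEXT, "
             "Time_Id TEXT, Sales_Amt INTEGER)")
t1_amounts = {"M1": (1000, 2000, 1500, 2500),
              "M2": (500, 800, 0, 3333),
              "M3": (5000, 8000, 10, 3300)}
for market, amounts in t1_amounts.items():
    for p, amount in enumerate(amounts, start=1):
        for t in range(3):
            # Reconstructed data: T2/T3 add 1 and 2, except M2/P4/T3 = 333.
            value = 333 if (market, p, t) == ("M2", 4, 2) else amount + t
            conn.execute("INSERT INTO Sales VALUES (?, ?, ?, ?)",
                         (market, f"P{p}", f"T{t + 1}", value))


def slice_week(conn, week):
    """Fix the time dimension to one week and total what remains per product."""
    return conn.execute(
        "SELECT S.Product_Id, SUM(S.Sales_Amt) "
        "FROM Sales S, Time T "
        "WHERE T.Time_Id = S.Time_Id AND T.Week = ? "
        "GROUP BY S.Product_Id ORDER BY S.Product_Id", (week,)).fetchall()


week_1 = slice_week(conn, "Wk-1")
print(week_1)
```

    With this data the slice for Wk-1 matches the Q1 column of the pivoted report, since T1 is the only time period in that week.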

  • 7/31/2019 T15 Beyond SQL Rahimi

    42/127

    Data Warehouse Concepts

    OLAP Operations

    Dice

    The dice operation is a slice on more than two dimensions of a cube (or more than two consecutive slices).

    When we use a GROUP BY clause in a query to specify part of a hierarchy, we are partitioning the multi-dimensional cube into sub-cubes. Therefore, we are dicing the cube

    Example:

    SELECT S.Product_Id, T.Quarter, SUM(Sales_Amt)
    FROM Sales S, Time T
    WHERE T.Time_Id = S.Time_Id
    GROUP BY T.Quarter, S.Product_Id;

  • 7/31/2019 T15 Beyond SQL Rahimi

    43/127

    Data Warehouse Concepts

    OLAP Operations

    Dice

    Dicing Sales in the time dimension: total sales for each product in each quarter.

    SELECT S.Product_Id, T.Quarter, SUM(Sales_Amt)
    FROM Sales S, Time T
    WHERE T.Time_Id = S.Time_Id
    GROUP BY T.Quarter, S.Product_Id
    ORDER BY Product_ID;

    PR QUARTER  SUM(SALES_AMT)
    -- -------- --------------
    P1 First              6500
    P1 Fourth             6506
    P1 Second             6503
    P2 First             10800
    P2 Fourth            10806
    P2 Second            10803
    P3 First              1510
    P3 Fourth             1516
    P3 Second             1513
    P4 First              9133
    P4 Fourth             6137
    P4 Second             9136

  • 7/31/2019 T15 Beyond SQL Rahimi

    44/127

    OLAP Basic Operations

    using Oracle's Analytic Functions

  • 7/31/2019 T15 Beyond SQL Rahimi

    45/127

    Data Warehouse Concepts

    OLAP Operations

    OLAP queries use the GROUP BY clause of SQL to get the answer

    Standard options for GROUP BY are limited
      It is not easy to formulate all OLAP needs in SQL-92

    SQL:1999 has extended SQL with additional aggregate functions to support OLAP needs

    Oracle 11g supports these extensions

    We will examine these functions next

  • 7/31/2019 T15 Beyond SQL Rahimi

    46/127

    Data Warehouse Concepts

    OLAP Operations

    The Cube Operator

    Suppose we want to obtain a tabular view of the information that contains:
      Total sales of each product in each market
      Total sales of each product over all markets
      Total sales of each market over all products
      And the grand total of all sales for all markets and all products!

    What we are after is depicted on the next page.

  • 7/31/2019 T15 Beyond SQL Rahimi

    47/127

    Data Warehouse Concepts

    OLAP Operations

    Sales application in the form of a spreadsheet

                                Market_Id
    Sum(Sales_Amt)          M1      M2      M3   Total
    Product_Id   P1       3003    1503   15003   19509
                 P2       6003    2403   24003   32409
                 P3       4503       3      33    4539
                 P4       7503    7000    9903   24406
                 Total   21012   10909   48942   80863

  • 7/31/2019 T15 Beyond SQL Rahimi

    48/127

    Data Warehouse Concepts

    OLAP Operations

    To create this sheet using the standard SQL operations, we need to use the following three queries:

    -- One query to calculate the entries, without the totals
    --
    SELECT S.Market_Id, S.Product_Id, SUM(S.Sales_Amt)
    FROM Sales S
    GROUP BY S.Market_Id, S.Product_Id;

    MA PR SUM(S.SALES_AMT)
    -- -- ----------------
    M1 P1             3003
    M1 P2             6003
    M1 P3             4503
    M1 P4             7503
    M2 P1             1503
    M2 P2             2403
    M2 P3                3
    M2 P4             7000
    M3 P1            15003
    M3 P2            24003
    M3 P3               33
    M3 P4             9903

  • 7/31/2019 T15 Beyond SQL Rahimi

    49/127

    Data Warehouse Concepts

    OLAP Operations

    -- One to calculate the row totals
    --
    SELECT S.Product_Id, SUM(Sales_Amt)
    FROM Sales S
    GROUP BY S.Product_Id;

    PR SUM(SALES_AMT)
    -- --------------
    P1          19509
    P2          32409
    P3           4539
    P4          24406

    -- And one to calculate the column totals
    --
    SELECT S.Market_Id, SUM(Sales_Amt)
    FROM Sales S
    GROUP BY S.Market_Id;

    MA SUM(SALES_AMT)
    -- --------------
    M1          21012
    M2          10909
    M3          48942

  • 7/31/2019 T15 Beyond SQL Rahimi

    50/127

    Data Warehouse Concepts

    Question:

    Can we use some of the queries we used before to generate the same results?

    Give it a try (hint: see the slide on page 35)

  • 7/31/2019 T15 Beyond SQL Rahimi

    51/127


  • 7/31/2019 T15 Beyond SQL Rahimi

    52/127

    Data Warehouse Concepts

    -- The inner query computes each (market, product) total once;
    -- the outer query pivots the markets into columns and adds a row total.
    select C.Product_Id,
           sum(case when Market_Id = 'M1' then C.Sales_Amt else NULL end) as M1,
           sum(case when Market_Id = 'M2' then C.Sales_Amt else NULL end) as M2,
           sum(case when Market_Id = 'M3' then C.Sales_Amt else NULL end) as M3,
           sum(C.Sales_Amt) as Total
    from (select S.Market_Id, S.Product_Id, sum(S.Sales_Amt) as Sales_Amt
          from Sales S
          group by S.Market_Id, S.Product_Id) C
    group by C.Product_Id;

    PR         M1         M2         M3      TOTAL
    -- ---------- ---------- ---------- ----------
    P4       7503       7000       9903      24406
    P1       3003       1503      15003      19509
    P2       6003       2403      24003      32409
    P3       4503          3         33       4539

  • 7/31/2019 T15 Beyond SQL Rahimi

    53/127

    Data Warehouse Concepts

    OLAP Operations

    Using three queries is wasteful
      The first query does much of the work of the other two
      If we could save the result of the first query and then aggregate over Market_Id and Product_Id, we could compute the other queries more efficiently

    The CUBE operator in SQL:1999 has been designed to help with these types of requirements in OLAP

    The CUBE function is used in the GROUP BY clause as

      GROUP BY CUBE (v1, v2, ..., vn)

    This is equivalent to a collection of GROUP BYs, one for each subset of {v1, v2, ..., vn}
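    To make the "one GROUP BY per subset" reading concrete, the sketch below emulates CUBE by gluing together one plain GROUP BY query per subset of the columns with UNION ALL. It is an emulation, not the CUBE operator itself: it runs on SQLite through Python's sqlite3 module, and SQLite has no CUBE. The function name cube is hypothetical and the Sales figures are an assumption reconstructed from the totals on the slides.

```python
import sqlite3
from itertools import combinations


def load_sales(conn):
    """Create Sales with figures reconstructed from the slides (an assumption)."""
    t1_amounts = {"M1": (1000, 2000, 1500, 2500),
                  "M2": (500, 800, 0, 3333),
                  "M3": (5000, 8000, 10, 3300)}
    conn.execute("CREATE TABLE Sales (Market_Id TEXT, Product_Id TEXT, "
                 "Time_Id TEXT, Sales_Amt INTEGER)")
    for market, amounts in t1_amounts.items():
        for p, amount in enumerate(amounts, start=1):
            for t in range(3):
                value = 333 if (market, p, t) == ("M2", 4, 2) else amount + t
                conn.execute("INSERT INTO Sales VALUES (?, ?, ?, ?)",
                             (market, f"P{p}", f"T{t + 1}", value))


def cube(conn, dims):
    """Emulate GROUP BY CUBE(dims): one GROUP BY per subset of dims.

    dims are interpolated into the SQL text, so they must be trusted names.
    """
    parts = []
    for size in range(len(dims), -1, -1):
        for kept in combinations(dims, size):
            # Columns left out of this grouping are reported as NULL.
            select = ", ".join(d if d in kept else f"NULL AS {d}" for d in dims)
            group_by = f" GROUP BY {', '.join(kept)}" if kept else ""
            parts.append(f"SELECT {select}, SUM(Sales_Amt) AS Total "
                         f"FROM Sales{group_by}")
    return conn.execute(" UNION ALL ".join(parts)).fetchall()


conn = sqlite3.connect(":memory:")
load_sales(conn)
rows = cube(conn, ["Market_Id", "Product_Id"])
print(len(rows))  # 12 cells + 3 market totals + 4 product totals + 1 grand total
```

    With n columns there are 2^n subsets, so the number of groupings doubles with every column added; that growth is the main cost to keep in mind when cubing many dimensions.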

  • 7/31/2019 T15 Beyond SQL Rahimi

    54/127

    Data Warehouse Concepts

    OLAP Operations

    Example using the CUBE

    -- Doing the three queries in one using the CUBE operator
    --
    SELECT S.Market_Id, S.Product_Id, SUM(S.Sales_Amt)
    FROM Sales S
    GROUP BY CUBE (S.Market_Id, S.Product_Id);

    MARK PROD SUM(S.SALES_AMT)
    ---- ---- ----------------
    NULL NULL            80863
    NULL P1              19509
    NULL P2              32409
    NULL P3               4539
    NULL P4              24406
    M1   NULL            21012
    M1   P1               3003
    M1   P2               6003
    M1   P3               4503
    M1   P4               7503
    M2   NULL            10909
    M2   P1               1503
    M2   P2               2403
    M2   P3                  3
    M2   P4               7000
    M3   NULL            48942
    M3   P1              15003
    M3   P2              24003
    M3   P3                 33
    M3   P4               9903

    Questions:
      What do the NULLs represent?
      How many of them are there?
      Why?

  • 7/31/2019 T15 Beyond SQL Rahimi

    55/127

    Data Warehouse Concepts

    OLAP Operations

    The ROLLUP Operator

    ROLLUP is similar to CUBE except that instead of aggregating over all subsets of the arguments, it creates subsets moving from right to left

    ROLLUP is also supported in SQL:1999

    ROLLUP does exactly what it sounds like
      It first finds the fine-grained aggregations of the dimensions,
      Then, it uses them to calculate coarse-grained aggregations, and
      Uses these aggregations to find the grand total

  • 7/31/2019 T15 Beyond SQL Rahimi

    56/127

    Data Warehouse Concepts

    OLAP Operations

    ROLLUP Example:

    SELECT S.Market_Id, S.Product_Id, SUM(S.Sales_Amt)
    FROM Sales S
    GROUP BY ROLLUP (S.Market_Id, S.Product_Id)

    It first aggregates with the finest granularity
      GROUP BY S.Market_Id, S.Product_Id

    Then with the next level of granularity, aggregating the product sales for each market
      GROUP BY S.Market_Id

    And finally, using the total sales of all products in each market, it figures out the grand total, which corresponds to an empty GROUP BY clause
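    The right-to-left behaviour can be emulated with plain GROUP BY queries, one per prefix of the column list. The sketch below does this on SQLite through Python's sqlite3 module; it is an emulation with UNION ALL, not the ROLLUP operator. The function name rollup is hypothetical and the Sales figures are an assumption reconstructed from the totals on the slides.

```python
import sqlite3


def load_sales(conn):
    """Create Sales with figures reconstructed from the slides (an assumption)."""
    t1_amounts = {"M1": (1000, 2000, 1500, 2500),
                  "M2": (500, 800, 0, 3333),
                  "M3": (5000, 8000, 10, 3300)}
    conn.execute("CREATE TABLE Sales (Market_Id TEXT, Product_Id TEXT, "
                 "Time_Id TEXT, Sales_Amt INTEGER)")
    for market, amounts in t1_amounts.items():
        for p, amount in enumerate(amounts, start=1):
            for t in range(3):
                value = 333 if (market, p, t) == ("M2", 4, 2) else amount + t
                conn.execute("INSERT INTO Sales VALUES (?, ?, ?, ?)",
                             (market, f"P{p}", f"T{t + 1}", value))


def rollup(conn, dims):
    """Emulate GROUP BY ROLLUP(dims): one GROUP BY per prefix of dims.

    dims are interpolated into the SQL text, so they must be trusted names.
    """
    parts = []
    for size in range(len(dims), -1, -1):
        kept = dims[:size]  # drop columns from the right, one at a time
        select = ", ".join(d if d in kept else f"NULL AS {d}" for d in dims)
        group_by = f" GROUP BY {', '.join(kept)}" if kept else ""
        parts.append(f"SELECT {select}, SUM(Sales_Amt) AS Total "
                     f"FROM Sales{group_by}")
    return conn.execute(" UNION ALL ".join(parts)).fetchall()


conn = sqlite3.connect(":memory:")
load_sales(conn)
two_level = rollup(conn, ["Market_Id", "Product_Id"])
three_level = rollup(conn, ["Market_Id", "Product_Id", "Time_Id"])
print(len(two_level), len(three_level))
```

    ROLLUP over n columns yields n + 1 groupings instead of the 2^n that CUBE produces, which is why the per-product totals (NULL market, given product) never appear in its output.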


  • 7/31/2019 T15 Beyond SQL Rahimi

    57/127

    Data Warehouse Concepts

    OLAP Operations

    Example of ROLLUP

    -- The ROLLUP operator
    --
    SELECT S.Market_Id, S.Product_Id, SUM(S.Sales_Amt)
    FROM Sales S
    GROUP BY ROLLUP (S.Market_Id, S.Product_Id);

    MARK PROD SUM(S.SALES_AMT)
    ---- ---- ----------------
    M1   P1               3003
    M1   P2               6003
    M1   P3               4503
    M1   P4               7503
    M1   NULL            21012
    M2   P1               1503
    M2   P2               2403
    M2   P3                  3
    M2   P4               7000
    M2   NULL            10909
    M3   P1              15003
    M3   P2              24003
    M3   P3                 33
    M3   P4               9903
    M3   NULL            48942
    NULL NULL            80863

  • 7/31/2019 T15 Beyond SQL Rahimi

    58/127

    Data Warehouse Concepts

    What does the following ROLLUP generate?

    SELECT S.Market_Id, S.Product_Id, S.Time_Id, SUM(S.Sales_Amt)
    FROM Sales S
    GROUP BY ROLLUP (S.Market_Id, S.Product_Id, S.Time_Id);

  • 7/31/2019 T15 Beyond SQL Rahimi

    59/127

    Data Warehouse Concepts

    MA PR TI SUM(S.SALES_AMT)
    -- -- -- ----------------
    M1 P1 T1             1000
    M1 P1 T2             1001
    M1 P1 T3             1002
    M1 P1                3003
    M1 P2 T1             2000
    M1 P2 T2             2001
    M1 P2 T3             2002
    M1 P2                6003
    M1 P3 T1             1500
    M1 P3 T2             1501
    M1 P3 T3             1502
    M1 P3                4503
    M1 P4 T1             2500
    M1 P4 T2             2501
    M1 P4 T3             2502
    M1 P4                7503
    M1                  21012

    And so on

  • 7/31/2019 T15 Beyond SQL Rahimi

    60/127

    Data Warehouse Concepts

    OLAP Operations

    ROLLUP vs. CUBE

    By contrast, the same query with CUBE first aggregates with the finest granularity
      GROUP BY S.Market_Id, S.Product_Id

    then with the next level of granularity (both subsets)
      GROUP BY S.Market_Id
      GROUP BY S.Product_Id

    then the grand total with an empty GROUP BY clause

  • 7/31/2019 T15 Beyond SQL Rahimi

    61/127

    Data Warehousing

    Oracle's Advanced OLAP Operations

  • 7/31/2019 T15 Beyond SQL Rahimi

    62/127

    Data Warehouse Concepts

    Advanced OLAP Operations

    For this portion of the presentation, we will use a general-purpose practice database

    This database has the following schema

  • 7/31/2019 T15 Beyond SQL Rahimi

    63/127

    Data Warehouse Concepts

    [Figure: schema of the practice database, with 1-to-1, 1-to-M and M-to-M relationships between the entities]

    Department: Department, Department_name
    Employee:   Payroll_number, Social_security_number, Fk_department, Last_name, First_name, Street, City, State, Phone, Current_position, Employment_date, Birth_date, Wages, Gender
    Glasses:    Fk_payroll_number, Purchase_date, Optician, Cost, Check_number
    Emp_tools:  Fk_payroll_number, Purchase_date, Tool_name, Tool_cost, Payroll_deduct, Payment, Last_payment, First_payment_dat
    Sectab:     Payroll_number, Security_option
    Other labels in the figure: Wge_maint, Tax_rate, Bottom_wage, Top_wage, Fk_department_number, Classification, Classification_date, Old_wages, New_wages

  • 7/31/2019 T15 Beyond SQL Rahimi

    64/127

    Data Warehouse Concepts

    Advanced OLAP Operations

    ROLLUP revisited

    ROLLUP is used with the GROUP BY clause

    GROUP BY ROLLUP (expr1, expr2)

    To compute it, Oracle first groups the data by expr2 within the different values of expr1

    It rolls up these aggregates to figure out sub-totals for each value of expr1

    And it adds up these sub-totals into a grand total
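The three steps above can be sketched as three ordinary GROUP BY queries glued together with UNION ALL. This is only an illustration of what ROLLUP (gender, department) computes against the practice schema, not a description of how Oracle executes it internally; the alias total_wages is introduced here for readability.

```sql
-- ROLLUP (gender, department) written as three plain aggregations.
SELECT gender, department, SUM(wages) AS total_wages   -- step 1: detail groups
FROM   department, employee
WHERE  department = fk_department
GROUP  BY gender, department
UNION ALL
SELECT gender, NULL, SUM(wages)                         -- step 2: sub-total per gender
FROM   department, employee
WHERE  department = fk_department
GROUP  BY gender
UNION ALL
SELECT NULL, NULL, SUM(wages)                           -- step 3: grand total
FROM   department, employee
WHERE  department = fk_department
ORDER  BY 1, 2;
```

The single GROUP BY ROLLUP form is shorter, and the query only has to be written once.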


    Advanced OLAP Operations

    GROUP BY vs. ROLLUP

    SQL> -- Group by - one expression
    SQL> --
    SQL> select department, SUM(wages)
      2  from department, employee
      3  where department = fk_department
      4  group by department
      5  order by 1;

    DEPA SUM(WAGES)
    ---- ----------
    INT       65000
    POL       87700
    WEL       52000

    SQL> -- Rollup - one expression
    SQL> --
    SQL> select department, SUM(wages)
      2  from department, employee
      3  where department = fk_department
      4  group by rollup(department)
      5  order by 1;

    DEPA SUM(WAGES)
    ---- ----------
    INT       65000
    POL       87700
    WEL       52000
             204700

    Advanced OLAP Operations

    GROUP BY vs. ROLLUP

    SQL> select gender, department, SUM(wages)
      2  from department, employee
      3  where department = fk_department
      4  group by department, gender
      5  order by 1,2;

    G DEPA SUM(WAGES)
    - ---- ----------
    F POL        9800
    F WEL        7000
    M INT       65000
    M POL       77900
    M WEL       45000

    SQL> -- Rollup - two expressions
    SQL> --
    SQL> select gender, department, SUM(wages)
      2  from department, employee
      3  where department = fk_department
      4  group by rollup(department, gender)
      5  order by 1,2;

    G DEPA SUM(WAGES)
    - ---- ----------
    F POL        9800
    F WEL        7000
    M INT       65000
    M POL       77900
    M WEL       45000
      INT       65000
      POL       87700
      WEL       52000
               204700

    Advanced OLAP Operations

    ROLLUP revisited

    Example: figuring out sub-totals for each gender and then the grand total

    SQL> select gender, department, sum(wages)
      2  from department, employee
      3  where department = fk_department
      4  group by rollup (gender, department)
      5  order by 1,2;

    GENDER DEPARTMENT SUM(WAGES)
    ------ ---------- ----------
    F      POL              9800   <- Aggregate value
    F      WEL              7000
    F                      16800   <- Rollup total female wages
    M      INT             65000
    M      POL             77900
    M      WEL             45000
    M                     187900   <- Rollup total male wages
                          204700   <- Rollup total wages

    Advanced OLAP Operations

    Partial ROLLUP

    If you place GROUP BY expressions outside the ROLLUP option, Oracle will:

    Aggregate values based on the expressions outside the ROLLUP

    Calculate ROLLUP sub-totals on the expressions within the ROLLUP parameter list

    Compute ROLLUP values for each unique occurrence of the expressions outside the ROLLUP

    NOT figure out the grand total
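One way to read the rules above is as an explicit list of grouping sets. The sketch below, written against the practice schema, shows what GROUP BY department, ROLLUP (gender) should be equivalent to in Oracle; it is an illustration, not the query used on the slides.

```sql
-- GROUP BY department, ROLLUP (gender) as explicit grouping sets.
SELECT gender, department, SUM(wages)
FROM   department, employee
WHERE  department = fk_department
GROUP  BY GROUPING SETS (
         (department, gender),  -- detail rows
         (department)           -- gender rolled up within each department
       )                        -- no () set, so no grand total row
ORDER  BY 1, 2;
```

Because department appears in every grouping set, no row aggregates across departments, which is why the grand total is missing.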


    Advanced OLAP Operations

    Partial ROLLUP example:

    Figuring out a partial rollup of wages per gender within a department

    SQL> select gender, department, sum(wages)
      2  from department, employee
      3  where department = fk_department
      4  group by rollup(gender), department
      5  order by 1,2;

    GENDER DEPARTMENT SUM(WAGES)
    ------ ---------- ----------
    F      POL              9800   <- Aggregate or grouped value for the POL department
    F      WEL              7000
    M      INT             65000
    M      POL             77900   <- Aggregate or grouped value for the POL department
    M      WEL             45000
           INT             65000
           POL             87700   <- Rolled-up value for the POL department
           WEL             52000

    Advanced OLAP Operations

    ROLLUP vs. Partial ROLLUP

    ROLLUP: group by rollup (gender, department)

    GENDER DEPARTMENT SUM(WAGES)
    ------ ---------- ----------
    F      POL              9800
    F      WEL              7000
    F                      16800
    M      INT             65000
    M      POL             77900
    M      WEL             45000
    M                     187900
                          204700

    Partial ROLLUP: group by rollup (gender), department

    GENDER DEPARTMENT SUM(WAGES)
    ------ ---------- ----------
    F      POL              9800
    F      WEL              7000
    M      INT             65000
    M      POL             77900
    M      WEL             45000
           INT             65000
           POL             87700
           WEL             52000

    What does this ROLLUP generate?

    select gender Gender, department, sum(wages)
    from department, employee
    where department = fk_department
    group by rollup(department, gender)
    order by 1,2;

    Advanced OLAP Operations

    ROLLUP vs. Partial ROLLUP

    What does this ROLLUP generate?

    select gender Gender, department, sum(wages)
    from department, employee
    where department = fk_department
    group by rollup(department, gender)
    order by 1,2;

    G DEPA SUM(WAGES)
    - ---- ----------
    F POL        9800
    F WEL        7000
    M INT       65000
    M POL       77900
    M WEL       45000
      INT       65000
      POL       87700
      WEL       52000
               204700

    Advanced OLAP Operations

    CUBE operator revisited

    Unlike ROLLUP, CUBE figures out sub-totals for all expressions inside the CUBE and then rolls them up

    SQL> select gender, department, sum(wages)
      2  from department, employee
      3  where department = fk_department
      4  group by cube (gender, department)
      5  order by 1,2;

    GENDER DEPARTMENT SUM(WAGES)
    ------ ---------- ----------
    F      POL              9800
    F      WEL              7000
    F                      16800
    M      INT             65000
    M      POL             77900
    M      WEL             45000
    M                     187900
           INT             65000
           POL             87700
           WEL             52000
                          204700

    11 rows selected.

    Advanced OLAP Operations

    Partial CUBE operator

    Similar to partial ROLLUP, partial CUBE will calculate rollup values for each unique occurrence of the expression(s) outside the CUBE

    SQL> select gender, department, sum(wages)
      2  from department, employee
      3  where department = fk_department
      4  group by cube (gender), department
      5  order by 1,2;

    GENDER DEPARTMENT SUM(WAGES)
    ------ ---------- ----------
    F      POL              9800
    F      WEL              7000
    M      INT             65000
    M      POL             77900
    M      WEL             45000
           INT             65000
           POL             87700
           WEL             52000

    8 rows selected.

    Why is this partial CUBE exactly the same as the partial ROLLUP?

    Advanced OLAP Operations

    CUBE vs. Partial CUBE

    Partial CUBE: group by cube (gender), department

    GENDER DEPARTMENT SUM(WAGES)
    ------ ---------- ----------
    F      POL              9800
    F      WEL              7000
    M      INT             65000
    M      POL             77900
    M      WEL             45000
           INT             65000
           POL             87700
           WEL             52000

    CUBE: group by cube (gender, department)

    GENDER DEPARTMENT SUM(WAGES)
    ------ ---------- ----------
    F      POL              9800
    F      WEL              7000
    F                      16800
    M      INT             65000
    M      POL             77900
    M      WEL             45000
    M                     187900
           INT             65000
           POL             87700
           WEL             52000
                          204700

    Advanced OLAP Operations

    CUBE vs. ROLLUP

    CUBE: group by cube (gender, department)

    GENDER DEPARTMENT SUM(WAGES)
    ------ ---------- ----------
    F      POL              9800
    F      WEL              7000
    F                      16800
    M      INT             65000
    M      POL             77900
    M      WEL             45000
    M                     187900
           INT             65000
           POL             87700
           WEL             52000
                          204700

    ROLLUP: group by rollup (gender, department)

    GENDER DEPARTMENT SUM(WAGES)
    ------ ---------- ----------
    F      POL              9800
    F      WEL              7000
    F                      16800
    M      INT             65000
    M      POL             77900
    M      WEL             45000
    M                     187900
                          204700

    Advanced OLAP Operations

    The next page shows side-by-side views of the results for CUBE, ROLLUP, Partial CUBE and Partial ROLLUP

    Any interesting observations?

    Advanced OLAP Operations

    CUBE

                      Department
    Sum(Wages)      INT     POL     WEL    Total
    Gender  F         0    9800    7000    16800
            M     65000   77900   45000   187900
    Total         65000   87700   52000   204700

    Partial CUBE

                      Department
    Sum(Wages)      INT     POL     WEL    Total
    Gender  F         0    9800    7000
            M     65000   77900   45000
    Total         65000   87700   52000

    ROLLUP

                      Department
    Sum(Wages)      INT     POL     WEL    Total
    Gender  F         0    9800    7000    16800
            M     65000   77900   45000   187900
    Total                                 204700

    Partial ROLLUP

                      Department
    Sum(Wages)      INT     POL     WEL    Total
    Gender  F         0    9800    7000
            M     65000   77900   45000
    Total         65000   87700   52000

    Advanced OLAP Operations

    GROUPING Function

    What is the problem with the CUBE and ROLLUP functions?

    Difficulty in identifying the rows that are sub-totals

    One way is to find the rows that contain NULL values

    Expressions that are sub-totaled will have a value in the column that determines the ROLLUP

    The other expressions will contain null values

    This works well if the database does not have any null values in it; otherwise, it will be confusing

    The GROUPING function can help with this.

    Advanced OLAP Operations

    GROUPING Function

    Usage: GROUPING (expression)

    More than one GROUPING function call is allowed in one SQL statement

    GROUPING will return a 1 if the row is a sub-total row for the expression

    GROUPING will return a 0 if the row is NOT a sub-total row for the expression

    Advanced OLAP Operations

    GROUPING Function

    Example:

    SQL> select gender, department, sum(wages),
      2  grouping(gender) as gdr,
      3  grouping(department) as dpt
      4  from department, employee
      5  where department = fk_department
      6  group by rollup(gender, department)
      7  order by 1,2;

    G DEPA SUM(WAGES)        GDR        DPT
    - ---- ---------- ---------- ----------
    F POL        9800          0          0
    F WEL        7000          0          0
    F           16800          0          1
    M INT       65000          0          0
    M POL       77900          0          0
    M WEL       45000          0          0
    M          187900          0          1
               204700          1          1
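GROUPING is not limited to the select list. As a small sketch against the same practice schema, it can also be used in a HAVING clause to keep only the rows that ROLLUP generated. With the data shown above, this should return the three rows whose DPT flag is 1 (the two per-gender sub-totals and the grand total).

```sql
-- Keep only the rolled-up rows; detail rows have GROUPING(department) = 0.
SELECT gender, department, SUM(wages)
FROM   department, employee
WHERE  department = fk_department
GROUP  BY ROLLUP (gender, department)
HAVING GROUPING(department) = 1
ORDER  BY 1, 2;
```

Filtering on the GROUPING flag rather than on department IS NULL keeps the result correct even if the department column itself contains nulls.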


    Advanced OLAP Operations

    Making the report more readable

    Use of the DECODE function

    DECODE translates a column value into another value

    It acts as a complex IF-THEN-ELSE or a CASE statement

    Advanced OLAP Operations

    Making the report more readable

    Example: Assume the following query results

    SQL> select empno, deptno

    2 from emp

    3 order by deptno;

    EMPNO DEPTNO

    ---------- ----------

    7782 10

    7839 10

    7934 10

    7369 20

    7876 20

    7902 20

    7788 20

    7566 20

    7499 30

    7698 30

    7654 30

    7900 30

    7844 30

    7521 30


    Advanced OLAP Operations

    Making the report more readable

    Now assume we want to print Ten for 10, Twenty for 20 and Thirty for 30 in the deptno column

    DECODE can do this very easily

    SQL> select empno, decode (deptno,
      2  10, 'Ten',
      3  20, 'Twenty',
      4  30, 'Thirty', 'OTHER')
      5  from emp
      6  order by deptno;

    EMPNO DECODE

    ---------- ------

    7782 Ten

    7839 Ten

    7934 Ten

    7369 Twenty

    7876 Twenty

    7902 Twenty

    7788 Twenty

    7566 Twenty

    7499 Thirty

    7698 Thirty

    7654 Thirty

    7900 Thirty

    7844 Thirty

    7521 Thirty
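Since DECODE acts like a CASE statement, the same report can be sketched with a standard CASE expression. The query below is an equivalent formulation for the emp table used above; the column alias dept_label is introduced here for illustration only.

```sql
-- CASE version of the DECODE example: map department numbers to words.
SELECT empno,
       CASE deptno
         WHEN 10 THEN 'Ten'
         WHEN 20 THEN 'Twenty'
         WHEN 30 THEN 'Thirty'
         ELSE 'OTHER'            -- plays the role of DECODE's default value
       END AS dept_label         -- illustrative alias, not from the slides
FROM   emp
ORDER  BY deptno;
```

One difference worth keeping in mind: DECODE treats two nulls as equal when matching, while a simple CASE comparison against NULL never matches.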


    Advanced OLAP Operations

    Use of DECODE function

    Example:

    SQL> select decode(grouping(gender),1, 'Total Wages', gender) gender,
      2  decode(grouping(department),1, 'Total Per Gender', department) department,
      3  sum(wages)
      4  from department, employee
      5  where department = fk_department
      6  group by rollup(gender, department)
      7  order by 1;

    GENDER DEPARTMENT SUM(WAGES)

    ----------- ---------------- ----------

    F POL 9800

    F WEL 7000

    F Total Per Gender 16800

    M INT 65000

    M POL 77900

    M WEL 45000

    M Total Per Gender 187900

    Total Wages Total Per Gender 204700


    How do we generate the following report?

    DEPARTMENT GENDER SUM(WAGES)

    ----------- -------------------- ----------

    INT Total Per Department 65000

    INT M 65000

    POL M 77900

    POL F 9800

    POL Total Per Department 87700

    WEL F 7000

    WEL M 45000

    WEL Total Per Department 52000

    Total Wages Total Per Department 204700


    How do we generate the following report?

    SQL> select decode(grouping(department),1, 'Total Wages', department) department,
      2  decode(grouping(gender),1, 'Total Per Department', gender) gender,
      3  sum(wages)
      4  from department, employee
      5  where department = fk_department
      6  group by rollup(department, gender);

    DEPARTMENT GENDER SUM(WAGES)

    ----------- -------------------- ----------

    INT Total Per Department 65000

    INT M 65000

    POL M 77900

    POL F 9800

    POL Total Per Department 87700

    WEL F 7000

    WEL M 45000

    WEL Total Per Department 52000

    Total Wages Total Per Department 204700


    Advanced OLAP Operations

    Use of DECODE function

    Example: To suppress the extra Fs and Ms

    SQL> break on gender;
    SQL> select decode(grouping(gender),1, 'Total Wages', gender) gender,
      2  decode(grouping(department),1, 'Total Per Gender', department) department,
      3  sum(wages)
      4  from department, employee
      5  where department = fk_department
      6  group by rollup(gender, department)
      7  order by 1;

    GENDER      DEPARTMENT       SUM(WAGES)
    ----------- ---------------- ----------
    F           POL                    9800
                WEL                    7000
                Total Per Gender      16800
    M           INT                   65000
                POL                   77900
                WEL                   45000
                Total Per Gender     187900
    Total Wages Total Per Gender     204700

    In the previous example, for the row that indicates the Total Wages, we should NOT print the Total Per Gender

    How do we do that?

    GENDER DEPARTMENT SUM(WAGES)

    ----------- ---------------- ----------

    F POL 9800

    WEL 7426.3

    Total Per Gender 17226.3

    M INT 65000

    POL 77900

    WEL 47740.5

    Total Per Gender 190640.5

    Total Wages 207866.8

    Do not print anything here!

    break on gender;
    select decode(grouping(gender),1, 'Total Wages', gender) gender,
           decode(grouping(department),1,
                  decode(grouping(gender),0,'Total Per Gender',' '), department) department,
           sum(wages)
    from department, employee
    where department = fk_department
    group by rollup(gender, department)
    order by 1;
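The nested DECODE above can become hard to read as more levels are added. As an alternative sketch, Oracle's GROUPING_ID function packs the individual GROUPING flags into one number, so each kind of row can be labelled with a single CASE. For ROLLUP (gender, department) the value is 0 for detail rows, 1 for per-gender sub-totals and 3 for the grand total.

```sql
-- Same report as the nested DECODE, driven by GROUPING_ID.
SELECT CASE GROUPING_ID(gender, department)
         WHEN 3 THEN 'Total Wages'       -- both columns rolled up
         ELSE gender
       END AS gender,
       CASE GROUPING_ID(gender, department)
         WHEN 0 THEN department          -- ordinary detail row
         WHEN 1 THEN 'Total Per Gender'  -- department rolled up only
         ELSE ' '                        -- grand total: print nothing here
       END AS department,
       SUM(wages)
FROM   department, employee
WHERE  department = fk_department
GROUP  BY ROLLUP (gender, department)
ORDER  BY 1;
```

As before, the SQL*Plus command break on gender can be issued first to suppress the repeated F and M values.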


    Advanced OLAP Operations

    The RANK Function

    Oracle 11g provides several functions for ranking rows returned from a SELECT statement

    The functions can calculate rankings, percentiles and n-tiles

    These functions are performed after the select statement returns the rows and prior to printing the results

    Advanced OLAP Operations

    The RANK Function Syntax

    Rank() over (
      [partition by expression, expression]
      Order by expression [collate clause] [asc | desc]
      [nulls first | nulls last])

    Only Rank and Order by are mandatory clauses

    That is because in order to rank rows, the result set must be sorted

    The expression in the order by clause is used for ranking

    Default sort order is ascending

    By default, NULL values are considered the largest

    You can change where NULL values appear using nulls first or nulls last

    Advanced OLAP Operations

    The RANK function

    Example:

    Departments CEN and TRF do not have employees, so the sum of their wages is null

    With a descending sort, nulls are by default ranked first

    There are employees without salary in the database as well

    SQL> select department DPT, sum(wages),
      2  rank() over (order by sum(wages) desc)
      3  as rank_all
      4  from department, employee
      5  where department = fk_department(+) -- Outer join department
      6  group by department;

    DPT  SUM(WAGES)   RANK_ALL
    ---- ---------- ----------
    CEN                      1   <- Rank does not advance for equal values
    TRF                      1
    POL       87700          3
    INT       65000          4
    WEL       52000          5

    Advanced OLAP Operations

    The RANK function

    Example continued: NULLs last

    SQL> select department DPT, sum(wages),
      2  rank() over (order by sum(wages) desc nulls last) as rank_all
      3  from department, employee
      4  where department = fk_department(+) -- Outer join department
      5  group by department;

    DPT  SUM(WAGES)   RANK_ALL
    ---- ---------- ----------
    POL       87700          1
    INT       65000          2
    WEL       52000          3
    CEN                      4
    TRF                      4

    Advanced OLAP Operations

    The RANK function

    Example continued

    We will use the function nvl to provide values where there are none

    Syntax: nvl(col_name, val)

    If the column does not have a value, then val is used instead

    In this example, we provide the value of 90,000 for null values of salary

    Note that the rank value of 4 is missing from the result

    SQL> select department, sum(nvl(wages,90000)) total_wages,

    2 rank() over (order by sum(nvl(wages,90000)) desc)

    3 as rank_all

    4 from department, employee

    5 where department = fk_department(+)

    6 group by department;

    DEPA TOTAL_WAGES RANK_ALL

    ---- ----------- ----------

    INT 155000 1

    WEL 142000 2

    CEN 90000 3

    TRF 90000 3

    POL 87700 5


    Advanced OLAP Operations

    The following does not look right, since on page 93 the INT department had total wages of 65000. Now it says 155000. What is going on?

    SQL> select department, sum(nvl(wages,90000)) total_wages,
      2  rank() over (order by sum(nvl(wages,90000)) desc)
      3  as rank_all
      4  from department, employee
      5  where department = fk_department(+)
      6  group by department;

    DEPA TOTAL_WAGES   RANK_ALL
    ---- ----------- ----------
    INT       155000          1
    WEL       142000          2
    CEN        90000          3
    TRF        90000          3
    POL        87700          5

    Advanced OLAP Operations

    The following does not look right, since on page 93 the INT department had total wages of 65000. Now it says 155000. What is going on?

    SQL> select payroll_number, wages
      2  from employee
      3  where fk_department = 'INT';

    PAYROLL_NUMBER      WAGES
    -------------- ----------
                25       9500
                46       9500
                36      14000
                33      13000
                29              <- One employee without salary. NVL replaces this with 90000
                28      11000
                22       8000

    7 rows selected.

    Advanced OLAP Operations

    The DENSE_RANK Function

    The DENSE_RANK function does the same thing as the RANK function, except that it does not skip rank values after a tie

    It makes sure that all ranks are used

    Here is the same example as before using DENSE_RANK

    SQL> select department, sum(nvl(wages, 90000)) total_wages,
      2  dense_rank() over (order by sum(nvl(wages,90000)) desc) as rank_dense
      3  from department, employee
      4  where department = fk_department(+)
      5  group by department;

    DEPA TOTAL_WAGES RANK_DENSE
    ---- ----------- ----------
    INT       155000          1
    WEL       142000          2
    CEN        90000          3
    TRF        90000          3
    POL        87700          4

    Value of 4 is NOT missing
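To see the ranking functions side by side, the following self-contained sketch ranks the five totals from the example above without needing the practice tables. The inline data set (named totals here) and the column aliases are illustrative only.

```sql
-- RANK, DENSE_RANK and ROW_NUMBER over the same ordering.
WITH totals AS (
  SELECT 'INT' AS dept, 155000 AS total_wages FROM dual UNION ALL
  SELECT 'WEL', 142000 FROM dual UNION ALL
  SELECT 'CEN',  90000 FROM dual UNION ALL
  SELECT 'TRF',  90000 FROM dual UNION ALL
  SELECT 'POL',  87700 FROM dual
)
SELECT dept, total_wages,
       RANK()       OVER (ORDER BY total_wages DESC) AS rnk,        -- leaves a gap after ties
       DENSE_RANK() OVER (ORDER BY total_wages DESC) AS dense_rnk,  -- no gap after ties
       ROW_NUMBER() OVER (ORDER BY total_wages DESC) AS row_num     -- always unique
FROM   totals
ORDER  BY total_wages DESC, dept;
```

For the POL row this should give rnk 5 and dense_rnk 4, matching the two slides. ROW_NUMBER breaks the CEN/TRF tie arbitrarily unless another sort key is added to its ORDER BY.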


    Advanced OLAP Operations

    Top-N and Bottom-N queries

    Rank functions rank the rows of the result set

    Top-N and Bottom-N queries return a portion of the ranked rows from the top or the bottom

    There are two steps required to do this:

    Create an inline view to develop the data and the rankings

    Use the RANK expression in the where clause to identify the number of Top and Bottom ranked records

    Advanced OLAP Operations

    The Top-N rows example:

    Display the top three salary-earning people in the company; if salary is missing, replace it with 0

    SQL> select last_name, first_name, wages, emp_wage_rank
      2  from (select last_name, first_name, wages,
      3  rank() over(order by nvl(wages,0) desc) as emp_wage_rank
      4  from employee)

    5 where emp_wage_rank
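The WHERE predicate of the query above is cut off in this transcript. A complete version might look like the sketch below; the cut-off value <= 3 is an assumption inferred from the phrase "top three", not text recovered from the slide.

```sql
-- Two-step Top-N: rank inside an inline view, then filter on the rank.
SELECT last_name, first_name, wages, emp_wage_rank
FROM  (SELECT last_name, first_name, wages,
              RANK() OVER (ORDER BY NVL(wages, 0) DESC) AS emp_wage_rank
       FROM   employee)
WHERE  emp_wage_rank <= 3;   -- assumed cut-off for "top three"
```

The inline view is needed because an analytic function cannot be referenced directly in the WHERE clause of the query that computes it. With RANK, ties at the cut-off can return more than three rows.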


    Advanced OLAP Operations

    How do we get the three lowest paid employees in the organization?

    LAST_NAME       FIRST_NAME           WAGES EMP_WAGE_RANK
    --------------- --------------- ---------- -------------
    EISENHOWER      DWIGHT                                 1
    ROOSEVELT       ELEANOR                                1
    ANTHONY         SUSANNE               7000             2
    JOHNSON         ANDREW                7500             3

    select last_name, first_name, wages, emp_wage_rank
    from (select last_name, first_name, wages,
          dense_rank() over(order by nvl(wages,0) asc) as emp_wage_rank
          from employee)
    where emp_wage_rank

    result set

    SQL> select last_name as LNAME, wages,
      2  row_number() over(order by wages) as Row_number
      3  from employee;

    LNAME                WAGES ROW_NUMBER
    --------------- ---------- ----------
    ANTHONY               7000          1
    JOHNSON               7500          2
    ROOSEVELT             8000          3
    TAFT                  8500          4
    COOLIDGE              9500          6
    MILLER                9500          7
    DWORCZAK              9800          8
    HOOVER               10000          9
    ROOSEVELT            10400         10
    TRUMAN               11000         11
    KENNEDY              11500         12
    JOHNSON              12000         13
    NIXON                12500         14
    FORD                 13000         15
    CARTER               13000         16
    REAGAN               13500         17
    BUSH                 14000         18
    CLINTON              15000         19
    EISENHOWER                         20
    ROOSEVELT                          21

    21 rows selected.

    Advanced OLAP Operations

    Top-N and RANK functions allow selecting the top n rows in a set of records that the select statement returns

    Consider the following:

    We need to print the top two paid employees within each department

    This requires ranking of employees based on their salaries within each department

    The PARTITION BY clause can achieve this!

    NOTE: This partitioning is NOT the same as physically partitioning tables for performance purposes

    This is a logical (temporary) partitioning that exists only in memory and is lost after the query executes

    Advanced OLAP Operations

    The use of the PARTITION clause in conjunction with the RANK function

    SQL> select department_name, last_name, first_name, wages, emp_wage_rank

    2 from (select department_name, last_name, first_name, wages,

    _ , _ _

    4 from department, employee where department = fk_department)

    5 where emp_wage_rank
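Part of the query above did not survive in this transcript (the ranking expression and the WHERE predicate). The following is a sketch of what a top-two-per-department query could look like, not a reconstruction of the slide: the PARTITION BY and ORDER BY expressions and the cut-off <= 2 are assumptions based on the stated goal of printing the top two paid employees within each department.

```sql
-- Rank employees inside each department, then keep the first two ranks.
SELECT department_name, last_name, first_name, wages, emp_wage_rank
FROM  (SELECT department_name, last_name, first_name, wages,
              RANK() OVER (PARTITION BY department_name        -- assumed partition key
                           ORDER BY NVL(wages, 0) DESC)        -- assumed ordering
                AS emp_wage_rank
       FROM   department, employee
       WHERE  department = fk_department)
WHERE  emp_wage_rank <= 2;   -- assumed cut-off for "top two"
```

The rank restarts at 1 for every department because PARTITION BY splits the result set before the ordering is applied.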


    Advanced OLAP Operations

    Windowing

    Oracle has functionality that allows you to calculate values based on a period of time (called a window)

    The functions in this class can be used to compute moving, cumulative and centered aggregates

    They include moving averages, moving sums, moving min/max, cumulative sum, and LAG/LEAD

    These functions create a value that is based on values that precede or follow the record

    The windowing functions can be used in the SELECT and ORDER BY clauses

    Advanced OLAP Operations

    The syntax

    {Sum | Avg | Max | Min | Count | Stddev | Variance | First_value | Last_value}
    ({expr | *})
    Over ([partition by expr [, ...]]
    Order by expr [collate clause] [asc | desc] [nulls first | nulls last] [, ...]
    Rows | range {{unbounded preceding | expr preceding} between {unbounded preceding |


    Advanced OLAP Operations

    Windowing functions clauses

    Over: Tells Oracle that the function will operate over a query result set

    Partition by: Determines how the data will be segmented for analysis

    Order by: Determines how the data will be sorted within the partition. Options are ASC (default), DESC, NULLS FIRST and NULLS LAST

    Rows | Range: These keywords determine the window used for the calculation. The rows keyword is used to specify the window as a set of rows. Range sets the window as a logical offset. This clause cannot be used unless the order by clause is used

    Between .. And: Determines the starting point and end point of the window. Omitting the between keyword and specifying only one end point will cause Oracle to consider that point as the starting point. The end point will then be the current row

    Unbounded Preceding: Sets the first row of the partition as the starting point of the window

    Unbounded Following: Sets the last row of the partition as the end point of the window

    Current Row: Sets the current row as the starting point or as the end point of the window
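A small self-contained sketch may help make the window clauses concrete. It uses three made-up rows (the inline data set costs and its values are illustrative, not from the practice database) and computes a cumulative sum with ROWS UNBOUNDED PRECEDING and a centered average with ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING.

```sql
-- Two different windows over the same ordering.
WITH costs AS (
  SELECT DATE '2001-01-01' AS purchase_date, 10 AS tool_cost FROM dual UNION ALL
  SELECT DATE '2001-02-01', 20 FROM dual UNION ALL
  SELECT DATE '2001-03-01', 60 FROM dual
)
SELECT purchase_date, tool_cost,
       -- only a start point given, so the window ends at the current row
       SUM(tool_cost) OVER (ORDER BY purchase_date
                            ROWS UNBOUNDED PRECEDING) AS running_total,
       -- window of up to three rows centered on the current row
       AVG(tool_cost) OVER (ORDER BY purchase_date
                            ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS centered_avg
FROM   costs
ORDER  BY purchase_date;
```

For these three rows the running total is 10, 30, 90 and the centered average is 15, 30, 40; the first and last rows average only two values because the window is clipped at the edges of the partition.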


    Advanced OLAP Operations

    Windowing function: cumulative aggregate function example

    Find the cumulative cost of tools purchased within the POL department

    SQL> select department, last_name, first_name, tool_cost, sum(tool_cost)
      2  over (order by purchase_date rows unbounded preceding) balance
      3  from department, employee, emp_tools
      4  where department = fk_department
      5  and payroll_number = fk_payroll_number
      6  and department = 'POL';

    DEPA LAST_NAME FIRST_NAME TOOL_COST BALANCE

    ---- --------------- --------------- ---------- ----------

    POL JOHNSON ANDREW 5.95 5.95

    POL JOHNSON ANDREW 10.75 16.7

    POL WILSON WOODROW 4.95 21.65

    POL WILSON WOODROW 100 121.65

    POL WILSON WOODROW 12 133.65

    POL ROOSEVELT FRANKLIN 12 145.65

    POL ROOSEVELT FRANKLIN 8 153.65

    POL NIXON RICHARD 12.75 166.4

    POL NIXON RICHARD 5.75 172.15

    Advanced OLAP Operations

    Windowing function: cumulative aggregate function example

    Find the cumulative cost of tools purchased within each department

    SQL> select department, last_name, first_name, tool_cost, sum(tool_cost)

    2 over (partition by department order by purchase_date rows unbounded preceding) balance

    3 from department, employee, emp_tools

    4 where department = fk_department

    5 and payroll_number = fk_payroll_number;

    DEPA LAST_NAME FIRST_NAME TOOL_COST BALANCE

    ---- --------------- --------------- ---------- ----------

    INT ROOSEVELT THEODORE 34 34

    INT ROOSEVELT THEODORE 290 324

    INT COOLIDGE CALVIN 25 349

    INT COOLIDGE CALVIN 10 359

    INT EISENHOWER DWIGHT 25 384

    INT EISENHOWER DWIGHT 200 584

    INT EISENHOWER DWIGHT 150 734

    INT FORD GERALD 12 746

    INT FORD GERALD 0 746

    INT FORD GERALD 0 746

    INT BUSH GEORGE 2.75 748.75

    INT BUSH GEORGE 35.95 784.7

    INT BUSH GEORGE 7.5 792.2

    INT MILLER KEVIN 100 892.2

    INT MILLER KEVIN 0 892.2

    POL JOHNSON ANDREW 5.95 5.95

    POL JOHNSON ANDREW 10.75 16.7

    POL WILSON WOODROW 4.95 21.65

    POL WILSON WOODROW 100 121.65

    POL WILSON WOODROW 12 133.65

    POL ROOSEVELT FRANKLIN 12 145.65

    POL ROOSEVELT FRANKLIN 8 153.65

    POL NIXON RICHARD 12.75 166.4

    POL NIXON RICHARD 5.75 172.15


    Advanced OLAP Operations

    Windowing

    Moving Averages

    Moving averages can be computed when the window is changed over time

    A moving average can be computed if several items are supplied to the windowing function:

    A range interval is needed to identify the number of values used

    A time unit is needed. Common time units are year, month and day

    The preceding and/or following keywords are needed to indicate which records in the ordered set will be used

    Advanced OLAP Operations

    Moving average function example

    This example computes the moving average of tool cost for the INT department over the preceding 20 years

    SQL> select department as DPT, to_char(purchase_date, 'YYYY-DD-MON'), tool_cost,
           avg(tool_cost) over (order by purchase_date
                                range interval '20' year preceding) as Average
         from department, employee, emp_tools
         where department = fk_department
           and payroll_number = fk_payroll_number
           and department = 'INT'
         order by purchase_date;

    DPT  TO_CHAR(PUR  TOOL_COST    AVERAGE
    ---- ----------- ---------- ----------
    INT  1903-01-FEB         34         34
    INT  1905-10-MAR        290        162
    INT  1922-01-OCT         25 116.333333
    INT  1923-01-FEB         10      89.75
    INT  1953-01-MAR         25         25
    INT  1953-31-MAR        200        125
    INT  1953-31-MAR        150        125
    INT  1974-01-JAN         12         12
    INT  1974-10-AUG          0          6
    INT  1977-23-MAR          0          4
    INT  1988-23-SEP       2.75     3.6875
    INT  1988-10-NOV      35.95      10.14
    INT  1989-23-FEB        7.5        9.7
    INT  2001-08-APR        100      36.55
    INT  2001-23-MAY          0      29.24

    Callouts on the slide: 0 records, 1 record, 2 records and 1 record in the previous 20 years


    Moving average function example

    This example computes the total wages of employees hired within a 40-year window (20 years before to 20 years after) around the hire date of the current employee

    SQL> select department as DPT, first_name, last_name,
           to_char(employment_date, 'YYYY-DD-MON') hire_date, wages,
           sum(wages) over (order by employment_date range between
                            interval '20' year preceding and
                            interval '20' year following) ctrd_sum
         from department, employee
         where department = fk_department
           and department = 'INT'
         order by 4;

    DPT  FIRST_NAME      LAST_NAME       HIRE_DATE        WAGES   CTRD_SUM
    ---- --------------- --------------- ----------- ---------- ----------
    INT  THEODORE        ROOSEVELT       1902-20-NOV       8000      17500
    INT  CALVIN          COOLIDGE        1921-07-AUG       9500      17500
    INT  HAROLD          TRUMAN          1945-15-APR      11000      11000
    INT  DWIGHT          EISENHOWER      1953-20-MAR                 11000
    INT  GERALD          FORD            1973-20-MAY      13000      27000
    INT  GEORGE          BUSH            1988-05-JAN      14000      36500
    INT  KEVIN           MILLER          2000-12-OCT       9500      23500

    7 rows selected.


    Questions


    Advanced OLAP Operations

    Question 1: Print the cost of tools per classification (position) within gender. Subtotal the costs for each gender

    GENDER CURRENT_POSITIO  Tool Cost
    ------ --------------- ----------
    F      ADMINISTRATOR
    F      SALESPERSON 2        88.85
    F      SYSTEM ANALYST       61.95
    F                           150.8
    M      CLERK 1                 20
    M      CLERK 2               46.2
    M      CONTROLLER             324
    M      COUNSELER 2
    M      GUARD 4                375
    M      JANITOR                 35
    M      LABORER 2               12
    M      LABORER 3
    M      MAINT. MAN 2            24
    M      MAINT. MAN 3        116.95
    M      PRESIDENT             28.7
    M      PROGRAMMER 1
    M      SALESPERSON 1         16.7
    M      TREASURER             18.5
    M      TREASURER CLERK
    M      VICE PRESIDENT         123
    M                         1140.05
                              1290.85

    22 rows selected.


    Advanced OLAP Operations

    Question 1: Print the cost of tools per classification (position) within gender. Subtotal the costs for each gender

    select gender, current_position, sum(tool_cost) "Tool Cost"
    from employee, emp_tools
    where payroll_number = fk_payroll_number(+)
    group by rollup (gender, current_position)
    order by 1, 2;


    Advanced OLAP Operations

    Question 2: Determine the two employees in each department who had the largest cost of eye glasses

    DEPARTMENT LAST_NAME||','||FIRST_NAME       EYEGLASS_COST LOWEST_EYEGLASS_COST_RANK
    ---------- -------------------------------- ------------- -------------------------
    INT        BUSH, GEORGE                                                           1
    INT        EISENHOWER, DWIGHT                          15                         2
    POL        CLINTON, WILLIAM                                                       1
    POL        KENNEDY, JOHN                                                          1
    POL        DWORCZAK, ALICE                                                        1
    WEL        HOOVER, HERBERT                                                        1
    WEL        ANTHONY, SUSANNE                           120                         2

    7 rows selected.


    Advanced OLAP Operations

    Question 2: Determine the two employees in each

    department who had the largest cost of eye glasses

    select *
    from (select department, last_name||', '||first_name, sum(cost) eyeglass_cost,
                 rank() over (partition by department
                              order by sum(cost) asc nulls first) Lowest_eyeglass_cost_rank
          from department, employee, glasses
          where department = fk_department
            and payroll_number = fk_payroll_number(+)
          group by department, last_name, first_name)
    where lowest_eyeglass_cost_rank
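    The ranking pattern in this query (rank rows inside each partition, then filter on the rank in an outer query) can be tried with a small, self-contained Python sketch using the standard sqlite3 module. Everything here is an assumption made for illustration: the eyeglass_totals table is a hypothetical pre-aggregated stand-in for the join, SQLite 3.25 or later is assumed for window-function support, and the cutoff of 2 is taken from the question's "two employees". SQLite sorts NULLs first in ascending order by default, so the explicit nulls first that Oracle needs is omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical pre-aggregated table: one row per employee with total eyeglass cost.
conn.execute(
    "create table eyeglass_totals (department text, name text, eyeglass_cost real)"
)
conn.executemany(
    "insert into eyeglass_totals values (?, ?, ?)",
    [
        ("INT", "BUSH, GEORGE", None),
        ("INT", "EISENHOWER, DWIGHT", 15),
        ("INT", "TRUMAN, HAROLD", 110),
        ("WEL", "HOOVER, HERBERT", None),
        ("WEL", "ANTHONY, SUSANNE", 120),
        ("WEL", "CARTER, JIMMY", 339),
    ],
)

# Rank within each department by ascending cost, then keep ranks 1 and 2.
ranked = conn.execute(
    """
    select department, name, eyeglass_cost, cost_rank
    from (select department, name, eyeglass_cost,
                 rank() over (partition by department
                              order by eyeglass_cost asc) as cost_rank
          from eyeglass_totals)
    where cost_rank <= 2
    order by department, cost_rank
    """
).fetchall()

for row in ranked:
    print(row)
```

    The window function cannot be referenced in the same query's where clause, which is why the rank is computed in an inline view and filtered one level up.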


    Advanced OLAP Operations

    Question 3: Create a checkbook style cumulative cost of eye glasses

    PURCHASE_ LAST_NAME||','||FIRST_NAME       EYEGLASS_COST    BALANCE
    --------- -------------------------------- ------------- ----------
    12-MAR-03 ROOSEVELT, THEODORE                        123        123
    06-MAY-04 ROOSEVELT, THEODORE                        145        268
    08-NOV-10 TAFT, WILLIAM                              145        413
    01-JAN-17 WILSON, WOODROW                            123        536
    15-NOV-23 COOLIDGE, CALVIN                           175        711
    03-JUN-33 ROOSEVELT, FRANKLIN                        129        840
    01-JUL-33 ROOSEVELT, ELEANOR                         134        974
    20-JUL-35 ROOSEVELT, ELEANOR                         143       1117
    12-AUG-40 ANTHONY, SUSANNE                           120       1237
    12-OCT-47 TRUMAN, HAROLD                             110       1347
    31-MAR-53 EISENHOWER, DWIGHT                          15       1362
    31-JAN-64 JOHNSON, LYNDON                            170       1532
    31-MAY-67 JOHNSON, ANDREW                            165       1697
    23-JUN-70 NIXON, RICHARD                             123       1820
    01-FEB-74 FORD, GERALD                               145       1965
    08-SEP-77 CARTER, JIMMY                              164       2129
    12-AUG-79 CARTER, JIMMY                              175       2304
    23-OCT-83 REAGAN, RONALD                             165       2469
    21-DEC-00 MILLER, KEVIN                              165       2634

    19 rows selected.


    Advanced OLAP Operations

    Question 3: Create a checkbook style cumulative cost of eye glasses

    select purchase_date, last_name||', '||first_name,
           cost eyeglass_cost,
           sum(cost) over (order by purchase_date
                           rows unbounded preceding) balance
    from employee, glasses
    where payroll_number = fk_payroll_number;
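    The running-balance window in this answer can be reproduced on a few rows with Python's standard sqlite3 module. This is only a sketch under stated assumptions: it needs SQLite 3.25 or later for window functions, it uses a single simplified glasses table with no join to employee, and the ISO dates assume that the two-digit years in the sample output belong to the 1900s.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical version of the glasses table.
conn.execute("create table glasses (purchase_date text, cost real)")
conn.executemany(
    "insert into glasses values (?, ?)",
    [
        ("1903-03-12", 123),
        ("1904-05-06", 145),
        ("1910-11-08", 145),
        ("1917-01-01", 123),
    ],
)

# ROWS UNBOUNDED PRECEDING sums every row from the start up to the current row.
running = conn.execute(
    """
    select purchase_date, cost,
           sum(cost) over (order by purchase_date
                           rows unbounded preceding) as balance
    from glasses
    order by purchase_date
    """
).fetchall()

for row in running:
    print(row)
```

    Each balance equals the previous balance plus the current cost, which is the checkbook behaviour the question asks for.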


    Advanced OLAP Operations

    Question 4: Write a SQL statement that counts the number of eyeglasses within one of the four cost classes: less than $100, $100 to $125, $126 to $150 and Above $150

    cost cat           COUNT
    ------------- ----------
    100 to 125             5
    126 to 150             6
    Above 150              7
    Less than 100          1


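    Question 4: Write a SQL statement that counts the number of eyeglasses within one of the four cost classes: less than $100, $100 to $125, $126 to $150 and Above $150

    select (case when cost < 100 then 'Less than 100'
                 when cost >= 100 and cost <= 125 then '100 to 125'
                 when cost >= 126 and cost <= 150 then '126 to 150'
                 when cost > 150 then 'Above 150' end) "cost cat",
           count(*) as count
    from glasses
    group by (case when cost < 100 then 'Less than 100'
                   when cost >= 100 and cost <= 125 then '100 to 125'
                   when cost >= 126 and cost <= 150 then '126 to 150'
                   when cost > 150 then 'Above 150' end);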
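    The bucketing idea behind Question 4 (map each cost to a label with a CASE expression, then group by that label) can be checked with a short Python sketch using the standard sqlite3 module. The assumptions are labeled here and in the code: the 19 costs are the eyeglass costs listed in the Question 3 output, the glasses table is reduced to a single column, the column alias cost_cat is hypothetical, and the class boundaries follow the wording of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table glasses (cost real)")  # simplified, single-column table

# Eyeglass costs as listed in the checkbook-style output of Question 3.
costs = [123, 145, 145, 123, 175, 129, 134, 143, 120, 110,
         15, 170, 165, 123, 145, 164, 175, 165, 165]
conn.executemany("insert into glasses values (?)", [(c,) for c in costs])

# CASE assigns a class label to each row; GROUP BY then counts rows per label.
buckets = conn.execute(
    """
    select case when cost < 100 then 'Less than 100'
                when cost <= 125 then '100 to 125'
                when cost <= 150 then '126 to 150'
                else 'Above 150' end as cost_cat,
           count(*) as n
    from glasses
    group by cost_cat
    order by cost_cat
    """
).fetchall()

for label, n in buckets:
    print(label, n)
```

    Because CASE branches are evaluated in order, each later branch only needs its upper bound; the explicit lower bounds in a two-sided condition are equivalent for whole-dollar costs.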


    Questions?

    Contact information

    [email protected] 962 5514


    SQL Statement with Group By


    Advanced OLAP Operations

    The LAG and LEAD Functions

    These functions are useful for computing the difference between values in different rows

    The LAG and LEAD functions return the value of a row preceding or following the current row

    Syntax:

    Lag (expression, record offset)

    Lead (expression, record offset)

    For example, offset 1 refers to the row immediately before the current row (for Lag) and the row immediately after the current row (for Lead)

    The use of Lag and Lead

    Example: find the difference in wages between each employee and the employees hired immediately before and after that employee

    SQL> select last_name, wages,
           lead(wages, 1) over (order by employment_date) - wages as "Lead Diff",
           lag(wages, 1) over (order by employment_date) - wages as "Lag Diff"
         from employee;

    LAST_NAME            WAGES  Lead Diff   Lag Diff
    --------------- ---------- ---------- ----------
    ROOSEVELT             8000        500
    TAFT                  8500        500       -500
    WILSON                9000        500       -500
    COOLIDGE              9500        500       -500
    HOOVER               10000                  -500
    ROOSEVELT
    ROOSEVELT            10400      -3400
    ANTHONY               7000       4000       3400
    TRUMAN               11000                 -4000
    EISENHOWER
    KENNEDY              11500        500
    JOHNSON              12000      -4500       -500
    JOHNSON               7500       5000       4500
    NIXON                12500        500      -5000
    FORD                 13000          0       -500
    CARTER               13000        500          0
    REAGAN               13500        500       -500
    BUSH                 14000       1000       -500
    CLINTON              15000      -5200      -1000
    DWORCZAK              9800       -300       5200
    MILLER                9500                   300


    Advanced OLAP Operations

    Question: Join table employee with itself (self join) to find out the name of the person hired immediately after each employee for department WEL

    We are looking for this:

    LAST_NAME       Hire Date  Next emp
    --------------- ---------- -----------
    TAFT            1908-06-01 HOOVER
    HOOVER          1928-04-06 ROOSEVELT
    ROOSEVELT       1932-03-20 ANTHONY
    ANTHONY         1940-03-30 CARTER
    CARTER          1976-07-10 REAGAN
    REAGAN          1980-03-03

    6 rows selected.


    Advanced OLAP Operations

    Question:

    Does the use of self join provide the right answer?

    If not, why?

    How do we get the right answer?

    Use of Lag function

    Repeat the work but this time use the Lag (or Lead) functions to get the right answer

    Prev emp

    -------------- --------------- -------------------------

    20 ANTHONY 19 ROOSEVELT

    35 REAGAN 34 CARTER
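    One way to approach the exercise is sketched here in Python with the standard sqlite3 module; it is an illustration under assumptions, not the deck's own answer. It assumes SQLite 3.25 or later for LAG and LEAD, a simplified employee table in which fk_department is stored on the employee row, and the WEL hire dates shown in the expected output, plus one INT row to show that the where clause is applied before the window is evaluated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical employee table with only the columns needed here.
conn.execute(
    "create table employee (last_name text, employment_date text, fk_department text)"
)
conn.executemany(
    "insert into employee values (?, ?, ?)",
    [
        ("TAFT", "1908-06-01", "WEL"),
        ("HOOVER", "1928-04-06", "WEL"),
        ("ROOSEVELT", "1932-03-20", "WEL"),
        ("ANTHONY", "1940-03-30", "WEL"),
        ("CARTER", "1976-07-10", "WEL"),
        ("REAGAN", "1980-03-03", "WEL"),
        ("BUSH", "1988-01-05", "INT"),  # filtered out before the window runs
    ],
)

# LEAD looks one row ahead and LAG one row back in hire-date order.
hires = conn.execute(
    """
    select last_name, employment_date,
           lead(last_name, 1) over (order by employment_date) as next_emp,
           lag(last_name, 1) over (order by employment_date) as prev_emp
    from employee
    where fk_department = 'WEL'
    order by employment_date
    """
).fetchall()

for row in hires:
    print(row)
```

    The first row has no previous employee and the last row has no next employee, so those positions come back as NULL (None in Python) rather than dropping the row, which is a common reason to prefer LAG and LEAD over an inner self join.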