olap functions part 1
Post on 08-Apr-2015
OLAP Analytics in Action
Patrice Bérubé, Technical Solution Architect
Teradata Canada
OLAP Analytics - Agenda
• Business view
• Options & solutions
• Academic examples
• Real Life examples
• Summary
Typical Business Questions
1. How much revenue did each market have in February, and what percent of total revenue was that?
2. What are the top 5 and bottom 5 Called Countries by market in February?
3. What are the top 5 Adjustment Reasons in February by system and by Consumer vs Business?
4. What is the average revenue and MOU for each decile of the consumer base in February?
5. Which customers had the biggest increases in their MOU from January to February?
6. What price plan is the final one for a given day when there are several changes over the course of the day?
Business needs beyond SQL (1 of 2)
• Data Warehouse power users know:
• Their business
• Their data
• SQL
• Data Warehouse power users want:
• Reduced SQL coding complexity
• Fewer “steps”
• Fewer derived tables
• To do more in Teradata SQL, less in Excel
Business needs beyond SQL (2 of 2)
• Data Warehouse power users often turn to Excel for:
• Computed figures available along detail rows
• Aggregated figures available along detail rows
• SUM, AVG, COUNT, MIN, MAX
• Target different window of data rows
• All rows
• A specific number of rows
• Row immediately before/after current row
• A row at a given offset (e.g. 7 or 13) before/after current row
• All rows before/after current row
After all, power users count on the Data Warehouse's power and usability to answer...
Any options… solutions…
• So!
• Any suggestions?
• Can Teradata help?
Business question #1 (1 of 3)
How much revenue did each market have in February, and what percent of total revenue was that?
OLD WAY
• Involves a derived table and a second pass of the table
SELECT
    bs.bill_mkt_id
   ,DT.CumRev
   ,SUM(bs.tot_amt) AS MktRev
   ,MktRev / (DT.CumRev / 1.0000) AS MktRevPerc
FROM bl_stmnt bs
CROSS JOIN
(
    SELECT
        SUM(bs.tot_amt) AS CumRev
    FROM bl_stmnt bs
    WHERE bs.bl_cyc_dt BETWEEN '2006-02-01' AND '2006-02-28'
) DT
WHERE bs.bl_cyc_dt BETWEEN '2006-02-01' AND '2006-02-28'
GROUP BY 1,2
ORDER BY MktRev DESC
OLAP WAY
SELECT
    bs.bill_mkt_id
   ,SUM(bs.tot_amt) AS MktRev
   ,SUM(MktRev)
        OVER (ROWS BETWEEN UNBOUNDED PRECEDING
              AND UNBOUNDED FOLLOWING ) AS CumRev
   ,MktRev / (CumRev / 1.0000) AS MktRevPerc
FROM bl_stmnt bs
WHERE bs.bl_cyc_dt BETWEEN '2006-02-01' AND '2006-02-28'
GROUP BY 1
ORDER BY MktRev DESC
OLAP Analytics – What are we talking about?
• Give it a name
• OLAP Functions (On Line Analytical Processing)
• Window Aggregate Functions
• Ordered Analytical Functions
• What are Ordered Analytical Functions?
• Like traditional aggregate functions, window aggregate functions operate on groups of rows and permit qualification and filtering of the group result. Unlike aggregations, OLAP functions also return individual detail rows, not just aggregations.
• How they work
• The window feature is ANSI SQL-99 compliant and provides a way to dynamically define a subset of data, or window, in an ordered relational database table. A window is specified by the OVER() phrase, which can include the following clauses inside the parentheses:
• PARTITION BY
• ORDER BY
• ROWS
Traditional SQL requests vs OLAP Functions
• Calculation
• Aggregation
• Ordered Analytical Functions
OLAP functions available in Teradata
Teradata specific Functions (V2R4.1)
•CSUM
•MSUM
•MAVG
•RANK
•MDIFF (composable of SUM)
•MLINREG (composable of SUM and COUNT)
•QUANTILE (composable of RANK and COUNT)
ANSI compliant Functions (V2R5)
•SUM (col) over (...)
•AVG (col) over (...)
•RANK () over (...)
•COUNT (col) over (...)
•MAX (col) over (...)
•MIN (col) over (...)
•PERCENT_RANK (composable of RANK and COUNT)
•ROW_NUMBER (composable of COUNT)
The V2R4 syntax, including the user-friendly functions, is still available for backward compatibility. ANSI syntax is preferred.
New R12 Features
VAR_SAMP: Returns the sample variance of a set of numbers.
VAR_POP: Returns the population variance of a set of numbers.
COVAR_POP: Returns the population covariance of a set of number pairs.
COVAR_SAMP: Returns the sample covariance of a set of number pairs.
CORR: Returns the coefficient of correlation of a set of number pairs.
STDDEV_POP: Computes the population STDDEV and returns the square root of the population variance.
STDDEV_SAMP: Computes the cumulative sample STDDEV and returns the square root of the sample variance.
REGR_AVGX: Evaluates the average of the independent variable of the regression line.
REGR_AVGY: Evaluates the average of the dependent variable of the regression line.
REGR_COUNT: Returns an integer that is the number of non-null number pairs used to fit the regression line.
REGR_INTERCEPT: Returns the Y-intercept of the regression line.
REGR_R2: Returns the coefficient of determination (R-Squared) for the regression.
REGR_SLOPE: Returns the slope of the line.
REGR_SXX: Auxiliary function used to compute various diagnostic statistics.
REGR_SXY: Auxiliary function used to compute various diagnostic statistics.
REGR_SYY: Auxiliary function used to compute various diagnostic statistics.
OLAP Functions Types
All window aggregate functions fall into one of four types:
• Group Window: aggregates based on a grouping of rows.
• Cumulative Window: aggregates based on a cumulation of rows.
• Moving Window: aggregates based on a moving window of rows.
• Remaining Window: aggregates based on the rows remaining outside of a defined window.
The type is determined by the ROWS clause defined in the query.
Each type is used with one of the following functions:
• AVG
• COUNT
• MAX
• MIN
• SUM
• RANK
• PERCENT_RANK
• ROW_NUMBER
OLAP Functions Permutations
Four categories x five aggregates
• Categories: Group Window, Cumulative Window, Moving Window, Remaining Window
• Aggregates: SUM ( ) OVER, COUNT ( ) OVER, AVG ( ) OVER, MIN ( ) OVER, MAX ( ) OVER
Group Window Function
• Use of keywords: ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
• Or no ROWS clause at all: the Group SUM examples in this deck return the group total from an OVER clause that has no ROWS clause
Remaining Window Function
• Use of keywords: ROWS BETWEEN ... AND UNBOUNDED FOLLOWING
• Absence of keywords: UNBOUNDED PRECEDING
Moving Window Function
• Use of keywords: ROWS BETWEEN # PRECEDING AND # FOLLOWING
• Absence of keywords: UNBOUNDED
Cumulative Window Function
• Use of keywords: ROWS BETWEEN UNBOUNDED PRECEDING AND ...
• Absence of keywords: UNBOUNDED FOLLOWING
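The four window types can be compared side by side on the salestbl rows used in this deck's examples. The sketch below is an illustration only: it runs on SQLite through Python's sqlite3 module rather than on Teradata, it spells out every ROWS clause explicitly (SQLite, following ANSI, treats ORDER BY without ROWS as a running frame, unlike the Group SUM examples in this deck), and it adds storeid as a tiebreaker in ORDER BY, which is an assumption made here to keep the row order deterministic.

```python
import sqlite3

# Rows copied from the salestbl examples in this deck: (storeid, prodid, sales).
SALES = [
    (1001, "F", 150000), (1001, "A", 100000), (1003, "B", 65000),
    (1001, "C", 60000), (1003, "D", 50000), (1002, "A", 40000),
    (1001, "D", 35000), (1002, "C", 35000), (1003, "A", 30000),
    (1002, "D", 25000), (1003, "C", 20000),
]

# One SUM per window type; only the ROWS clause differs.
QUERY = """
SELECT sales,
       SUM(sales) OVER (ORDER BY sales DESC, storeid
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS group_sum,
       SUM(sales) OVER (ORDER BY sales DESC, storeid
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cumulative_sum,
       SUM(sales) OVER (ORDER BY sales DESC, storeid
           ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_sum,
       SUM(sales) OVER (ORDER BY sales DESC, storeid
           ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS remaining_sum
FROM salestbl
ORDER BY sales DESC, storeid
"""


def window_sums():
    """Return (sales, group, cumulative, moving, remaining) per row."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE salestbl (storeid INTEGER, prodid TEXT, sales INTEGER)")
    con.executemany("INSERT INTO salestbl VALUES (?, ?, ?)", SALES)
    rows = con.execute(QUERY).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in window_sums():
        print(row)
```

The group, cumulative and moving columns reproduce the figures on the Group, Cumulative and Moving SUM slides; the remaining column is the mirror image of the cumulative one.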
OLAP - Diagram
SUM (qty) OVER (...) AS cum_qty
• PARTITION BY month
• ORDER BY day
• ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
month   day  qty  cum_qty
200401    1    2        2
200401   10    3        5
200401   15    1        6
200402    5    3        3
200402   20    5        8
200403    6    7        7
200403    7    2        9
200403    8    1       10
200404    4    5        5
200404   16    4        9
200404   27    5       14
Coding
• Options and Syntax
• Example:
SUM(qty)
OVER(PARTITION BY month ORDER BY day
     ROWS BETWEEN UNBOUNDED PRECEDING
     AND CURRENT ROW
    ) AS Cumulative_Quantity
• Window defined by
• PARTITION BY clause defining the “grouping” of data
• ORDER BY clause defining the sequence of data
• ROWS BETWEEN clause defining the window used for the calculation, e.g. following/preceding/current row, unbounded or relative row numbers
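To see the three clauses work together, the following sketch runs the same SUM(qty) OVER (...) expression on the month/day/qty values read from the diagram slide. It is a minimal illustration on SQLite via Python's sqlite3 module, not on Teradata; the table name daily_qty is a hypothetical name chosen here.

```python
import sqlite3

# (month, day, qty) rows as read from the diagram slide.
ROWS = [
    (200401, 1, 2), (200401, 10, 3), (200401, 15, 1),
    (200402, 5, 3), (200402, 20, 5),
    (200403, 6, 7), (200403, 7, 2), (200403, 8, 1),
    (200404, 4, 5), (200404, 16, 4), (200404, 27, 5),
]


def cumulative_qty():
    """Return (month, day, qty, cum_qty); cum_qty restarts in each month."""
    con = sqlite3.connect(":memory:")
    # daily_qty is a hypothetical table name used only for this sketch.
    con.execute("CREATE TABLE daily_qty (month INTEGER, day INTEGER, qty INTEGER)")
    con.executemany("INSERT INTO daily_qty VALUES (?, ?, ?)", ROWS)
    result = con.execute("""
        SELECT month, day, qty,
               SUM(qty) OVER (PARTITION BY month ORDER BY day
                              ROWS BETWEEN UNBOUNDED PRECEDING
                              AND CURRENT ROW) AS cum_qty
        FROM daily_qty
        ORDER BY month, day
    """).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in cumulative_qty():
        print(row)
```

PARTITION BY resets the running total at each month boundary, ORDER BY fixes the order in which days are accumulated, and the ROWS clause limits each sum to the rows from the start of the month up to the current day.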
Group SUM Window Function
SELECT storeid, prodid, sales, SUM(sales) OVER (ORDER BY sales DESC)
FROM salestbl;
storeid  prodid      sales  Group Sum(sales)
-------  ------  ---------  ----------------
1001     F       150000.00         610000.00
1001     A       100000.00         610000.00
1003     B        65000.00         610000.00
1001     C        60000.00         610000.00
1003     D        50000.00         610000.00
1002     A        40000.00         610000.00
1001     D        35000.00         610000.00
1002     C        35000.00         610000.00
1003     A        30000.00         610000.00
1002     D        25000.00         610000.00
1003     C        20000.00         610000.00
The window is defined as all rows - no PARTITION is specified.
The final column represents the total of all rows.
The default title of the last column indicates this is a Group function.
Group SUM Window Function
Note that the Group Sum reflects the total for each product.
SELECT storeid, prodid, sales, SUM(sales) OVER (PARTITION BY prodid ORDER BY sales DESC)
FROM salestbl;
storeid  prodid      sales  Group Sum(sales)
-------  ------  ---------  ----------------
1001     A       100000.00         170000.00
1002     A        40000.00         170000.00
1003     A        30000.00         170000.00
1003     B        65000.00          65000.00
1001     C        60000.00         115000.00
1002     C        35000.00         115000.00
1003     C        20000.00         115000.00
1003     D        50000.00         110000.00
1001     D        35000.00         110000.00
1002     D        25000.00         110000.00
1001     F       150000.00         150000.00
Cumulative SUM Window Function
SELECT storeid, prodid, sales, SUM(sales) OVER (ORDER BY sales DESC ROWS UNBOUNDED PRECEDING)
FROM salestbl;
storeid  prodid      sales  Cumulative Sum(sales)
-------  ------  ---------  ---------------------
1001     F       150000.00              150000.00
1001     A       100000.00              250000.00
1003     B        65000.00              315000.00
1001     C        60000.00              375000.00
1003     D        50000.00              425000.00
1002     A        40000.00              465000.00
1001     D        35000.00              500000.00
1002     C        35000.00              535000.00
1003     A        30000.00              565000.00
1002     D        25000.00              590000.00
1003     C        20000.00              610000.00
The Cumulative Sum reflects the sequential aggregation of all rows.
The default title of the last column indicates this is a Cumulative function.
Moving SUM Window Function
SELECT storeid, prodid, sales,
    SUM(sales) OVER (ORDER BY sales DESC ROWS 2 PRECEDING)
FROM salestbl;
• Each row computes a moving sum based on itself and 2 preceding rows.
• The 1st and 2nd rows compute their sums based on one and two rows respectively.
• The default title of the last column indicates this is a Moving function.
storeid  prodid      sales  Moving Sum(sales)
-------  ------  ---------  -----------------
1001     F       150000.00          150000.00
1001     A       100000.00          250000.00
1003     B        65000.00          315000.00
1001     C        60000.00          225000.00
1003     D        50000.00          175000.00
1002     A        40000.00          150000.00
1001     D        35000.00          125000.00
1002     C        35000.00          110000.00
1003     A        30000.00          100000.00
1002     D        25000.00           90000.00
1003     C        20000.00           75000.00
Remaining SUM Window Function
SELECT salesdate, itemid, sales,
    SUM(sales) OVER (ORDER BY salesdate ASC ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS "AMsum"
   ,CAST((AMsum - sales) AS DECIMAL(6,2)) AS "excl. current"
   ,SUM(sales) OVER (ORDER BY salesdate ASC ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED FOLLOWING) AS "AM1sum"
FROM daily_sales
WHERE EXTRACT(YEAR FROM salesdate) = 1997 AND EXTRACT(MONTH FROM salesdate) BETWEEN 1 AND 2
ORDER BY salesdate ASC;
salesdate itemid sales AMsum excl. current AM1sum
========== ============ ========== ========== ============= ==========
1997-01-01 10 350.00 4100.00 3750.00 3750.00
1997-01-02 10 100.00 3750.00 3650.00 3650.00
1997-01-03 10 250.00 3650.00 3400.00 3400.00
1997-01-05 10 350.00 3400.00 3050.00 3050.00
1997-01-10 10 450.00 3050.00 2600.00 2600.00
1997-01-21 10 250.00 2600.00 2350.00 2350.00
1997-01-25 10 300.00 2350.00 2050.00 2050.00
1997-01-31 10 100.00 2050.00 1950.00 1950.00
1997-02-01 10 550.00 1950.00 1400.00 1400.00
1997-02-03 10 350.00 1400.00 1050.00 1050.00
1997-02-06 10 150.00 1050.00 900.00 900.00
1997-02-17 10 250.00 900.00 650.00 650.00
1997-02-20 10 500.00 650.00 150.00 150.00
1997-02-27 10 150.00 150.00 .00 ?
Moving Difference – MAX
SELECT Session_ID, Txn_Time,
    MAX(Txn_Time)
        OVER (PARTITION BY Session_ID ORDER BY Txn_Time
              ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) -
    Txn_Time AS MovDiff,
    URL
FROM TXN_Table;
Session_ID  Txn_Time  MovDiff  URL
21          10:00     3        /url1.htm
21          10:03     4        /url2.htm
21          10:07     ?        /url3.htm
22          10:05     25       /urlx.htm
22          10:30     15       /urly.htm
22          10:45     ?        /urlz.htm
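The one-row frame ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING simply picks up the next row in the session, so MAX over it returns the next transaction time. A minimal sketch of that idea follows, using the six rows from the slide. It runs on SQLite via Python's sqlite3 module rather than Teradata, and it stores times as minutes since midnight (the txn_min column and the to_minutes helper are assumptions of this sketch) so that the subtraction is plain integer arithmetic.

```python
import sqlite3

# (session_id, txn_time, url) rows copied from the slide.
TXNS = [
    (21, "10:00", "/url1.htm"), (21, "10:03", "/url2.htm"), (21, "10:07", "/url3.htm"),
    (22, "10:05", "/urlx.htm"), (22, "10:30", "/urly.htm"), (22, "10:45", "/urlz.htm"),
]


def to_minutes(hhmm):
    """Convert 'HH:MM' to minutes since midnight (helper for this sketch)."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def moving_diff():
    """Return (session_id, url, minutes until the next hit in the session)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE txn_table (session_id INTEGER, txn_min INTEGER, url TEXT)")
    con.executemany("INSERT INTO txn_table VALUES (?, ?, ?)",
                    [(s, to_minutes(t), u) for s, t, u in TXNS])
    result = con.execute("""
        SELECT session_id, url,
               MAX(txn_min) OVER (PARTITION BY session_id ORDER BY txn_min
                                  ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING)
               - txn_min AS mov_diff
        FROM txn_table
        ORDER BY session_id, txn_min
    """).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in moving_diff():
        print(row)
```

The last row of each session has an empty frame, so MAX returns NULL and the difference is NULL as well, which corresponds to the `?` entries in the slide's result grid.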
Rank vs Row_Number (1 of 2)
SELECT itemid, salesdate, sales,
    RANK() OVER (ORDER BY sales DESC)
FROM daily_sales_2004
WHERE salesdate BETWEEN DATE '2004-01-01' AND DATE '2004-03-01';
itemid  salesdate   sales  Rank(sales)
------  ---------  ------  -----------
10      04/01/10   550.00            1
10      04/02/17   550.00            1
10      04/02/20   450.00            3
10      04/02/06   350.00            4
10      04/02/27   350.00            4
10      04/01/05   350.00            4
10      04/01/03   250.00            7
10      04/02/03   250.00            7
10      04/01/25   200.00            9
10      04/01/02   200.00            9
10      04/01/21   150.00           11
10      04/02/01   150.00           11
10      04/01/01   150.00           11
10      04/01/31   100.00           14
Tied rows share a rank, and the positions consumed by a tie are skipped rather than reused. For example, the seventh row in this list has a rank of seven, even though the previous row has a rank of four.
Rank vs Row_Number (2 of 2)
SELECT itemid, salesdate, sales,
    ROW_NUMBER() OVER (ORDER BY sales DESC)
FROM daily_sales_2004
WHERE salesdate BETWEEN DATE '2004-01-01' AND DATE '2004-03-01';
itemid  salesdate   sales  Row_Number()
------  ---------  ------  ------------
10      04/01/10   550.00             1
10      04/02/17   550.00             2
10      04/02/20   450.00             3
10      04/02/06   350.00             4
10      04/02/27   350.00             5
10      04/01/05   350.00             6
10      04/01/03   250.00             7
10      04/02/03   250.00             8
10      04/01/25   200.00             9
10      04/01/02   200.00            10
10      04/01/21   150.00            11
10      04/02/01   150.00            12
10      04/01/01   150.00            13
10      04/01/31   100.00            14
Ties are always assigned an incremented sequence number with ROW_NUMBER.
Ties are always assigned the same number with the RANK function.
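The difference between the two functions is easiest to see when both are computed over the same ordering. The sketch below does that for the fourteen sales values shown in the two result grids. It is an illustration on SQLite via Python's sqlite3 module, not Teradata, and it keeps only the sales column, which is a simplification made here.

```python
import sqlite3

# Sales values from the Rank vs Row_Number result grids, highest first.
SALES = [550, 550, 450, 350, 350, 350, 250, 250, 200, 200, 150, 150, 150, 100]


def rank_and_row_number():
    """Return (sales, rank, row_number) ordered by sales descending."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE daily_sales (sales INTEGER)")
    con.executemany("INSERT INTO daily_sales VALUES (?)", [(s,) for s in SALES])
    result = con.execute("""
        SELECT sales,
               RANK()       OVER (ORDER BY sales DESC) AS sales_rank,
               ROW_NUMBER() OVER (ORDER BY sales DESC) AS row_nbr
        FROM daily_sales
        ORDER BY row_nbr
    """).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in rank_and_row_number():
        print(row)
```

RANK repeats the value for tied rows and then jumps (1, 1, 3, 4, 4, 4, 7, ...), while ROW_NUMBER always counts 1 through 14. Which of two tied rows receives the lower ROW_NUMBER is not determined by this ORDER BY alone.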
Percent_Rank
SELECT storeid, prodid, sales,
    RANK() OVER (ORDER BY sales DESC) AS Rank_Sales,
    PERCENT_RANK() OVER (ORDER BY sales DESC) AS Pct_Rank_Sales
FROM salestbl;
storeid  prodid      sales  Rank_Sales  Pct_Rank_Sales
-------  ------  ---------  ----------  --------------
1001     F       150000.00           1        0.000000
1001     A       100000.00           2        0.100000
1003     B        65000.00           3        0.200000
1001     C        60000.00           4        0.300000
1003     D        50000.00           5        0.400000
1002     A        40000.00           6        0.500000
1001     D        35000.00           7        0.600000
1002     C        35000.00           7        0.600000
1003     A        30000.00           9        0.800000
1002     D        25000.00          10        0.900000
1003     C        20000.00          11        1.000000
PERCENT_RANK is always a value between 0.0 and 1.0 inclusive.
Its value represents the portion of rows in the answer set which precede any given row on the list.
Formula: PERCENT_RANK = (Row Ranking - 1) / (# of rows - 1)
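The formula can be checked directly by computing RANK, the row count and PERCENT_RANK in one query and comparing them. The sketch below does this for the eleven salestbl sales values from the slide. It runs on SQLite via Python's sqlite3 module rather than Teradata, and the n_rows column is a name introduced here for the check.

```python
import sqlite3

# Sales values from the salestbl result grid on the Percent_Rank slide.
SALES = [150000, 100000, 65000, 60000, 50000, 40000,
         35000, 35000, 30000, 25000, 20000]


def percent_ranks():
    """Return (sales, rank, percent_rank, row count) ordered by sales descending."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE salestbl (sales INTEGER)")
    con.executemany("INSERT INTO salestbl VALUES (?)", [(s,) for s in SALES])
    rows = con.execute("""
        SELECT sales,
               RANK()         OVER (ORDER BY sales DESC) AS rank_sales,
               PERCENT_RANK() OVER (ORDER BY sales DESC) AS pct_rank_sales,
               COUNT(*)       OVER ()                    AS n_rows
        FROM salestbl
        ORDER BY sales DESC
    """).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for sales, rank, pct, n in percent_ranks():
        # Recompute PERCENT_RANK from the formula on the slide.
        by_formula = (rank - 1) / (n - 1)
        print(sales, rank, round(pct, 6), round(by_formula, 6))
```

With eleven rows the denominator is 10, so the two rows tied at rank 7 both get 0.6 and no row gets 0.7.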
Business question #1
How much revenue did each market have in February, what percent of total revenue?
OLAP Way
SELECT
    bs.bill_mkt_id
   ,SUM(bs.tot_amt) AS MktRev
   ,SUM(MktRev) OVER (ROWS BETWEEN
                      UNBOUNDED PRECEDING
                      AND UNBOUNDED FOLLOWING
                     ) AS CumRev
   ,MktRev / (CumRev / 1.0000) AS MktRevPerc
FROM bl_stmnt bs
WHERE bs.bl_cyc_dt BETWEEN '2006-02-01' AND '2006-02-28'
GROUP BY 1
ORDER BY MktRev DESC
Market                        MktRev            GrpRev  Perc of Total  Cum Perc
North Pole            42,806,310.55$   231,945,644.49$          18.5%     18.5%
South Pole            14,017,714.23$   231,945,644.49$           6.0%     24.5%
Fairyland             12,427,672.94$   231,945,644.49$           5.4%     29.9%
Tir-na-nog            11,116,845.45$   231,945,644.49$           4.8%     34.7%
Shan-gri-la           10,770,807.94$   231,945,644.49$           4.6%     39.3%
Land Far-Far Away     10,726,915.78$   231,945,644.49$           4.6%     43.9%
Isle of Misfit Toys    7,410,941.03$   231,945,644.49$           3.2%     47.1%
Soder Island           7,224,536.83$   231,945,644.49$           3.1%     50.2%
Birdwell Island        7,210,494.65$   231,945,644.49$           3.1%     53.3%
Hazzard County         6,814,012.59$   231,945,644.49$           2.9%     56.3%
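The result grid carries both a percent of total and a cumulative percent. One way both columns could be produced with window sums is sketched below. This is an illustration only: it runs on SQLite via Python's sqlite3 module, where the alias reuse seen in the Teradata query is not available, so the per-market totals are computed first in a common table expression. The statement amounts are hypothetical round numbers, not the deck's figures.

```python
import sqlite3

# Hypothetical statement rows (market, amount); NOT the real figures in the grid.
STATEMENTS = [
    ("North Pole", 30), ("North Pole", 20),
    ("South Pole", 30),
    ("Fairyland", 15), ("Fairyland", 5),
]

QUERY = """
WITH mkt AS (
    SELECT bill_mkt_id, SUM(tot_amt) AS mkt_rev
    FROM bl_stmnt
    GROUP BY bill_mkt_id
)
SELECT bill_mkt_id,
       mkt_rev,
       -- group window: every row sees the grand total
       100.0 * mkt_rev / SUM(mkt_rev) OVER () AS pct_of_total,
       -- cumulative window divided by the same grand total
       100.0 * SUM(mkt_rev) OVER (ORDER BY mkt_rev DESC
                                  ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
             / SUM(mkt_rev) OVER () AS cum_pct
FROM mkt
ORDER BY mkt_rev DESC
"""


def market_share():
    """Return (market, revenue, percent of total, cumulative percent)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE bl_stmnt (bill_mkt_id TEXT, tot_amt INTEGER)")
    con.executemany("INSERT INTO bl_stmnt VALUES (?, ?)", STATEMENTS)
    rows = con.execute(QUERY).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in market_share():
        print(row)
```

The base table is still read only once; the window sums run over the already aggregated per-market rows.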
Business question #2
What are the top 5 and bottom 5 Called Countries by market in February?
SELECT
    calld_cntry_nam
   ,SUM(toll_dur_min) AS TollMin
   ,RANK() OVER (ORDER BY TollMin ) AS Cntry_Rnk_Asc
   ,RANK() OVER (ORDER BY TollMin DESC ) AS Cntry_Rnk_Desc
FROM bl_dtl_usge
WHERE bl_cyc_dt BETWEEN '2006-02-01' AND '2006-02-28'
  AND calld_cntry_nam <> ''
GROUP BY 1
QUALIFY Cntry_Rnk_Asc <= 5 OR Cntry_Rnk_Desc <= 5
ORDER BY 3 DESC
calld_cntry_nam         TollMin  Cntry_Rnk_Asc  Cntry_Rnk_Desc
CANADA              7,135,936.0            219               1
MEXICO              5,630,907.0            218               2
DOMINICAN REPUBLIC  3,660,275.0            217               3
UNITED KINGDOM      2,253,797.0            216               4
JAMAICA             1,159,594.0            215               5
LESOTHO                     2.0              2             214
BHUTAN                      2.0              2             214
SAINT HELENA                2.0              2             214
FAROE ISLANDS               2.0              2             214
ASCENSION                   2.0              2             214
TUVALU                      1.0              1             219
Business question #4 (2 of 2)
What is the average revenue and MOU for each decile of the consumer base in February?
SELECT
    DT.Acct_Decile
   ,DT.BillMonth
   ,COUNT(DT.id) AS Occur_Cnt
   ,SUM(DT.tot_amt) AS Dec_Chg_Amt
   ,AVG(DT.tot_amt) AS Avg_Chg_Amt
   ,MIN(DT.tot_amt) AS Min_Chg_Amt
   ,MAX(DT.tot_amt) AS Max_Chg_Amt
   ,SUM (Dec_Chg_Amt) OVER () AS Grp_Amt
   ,Dec_Chg_Amt / (Grp_Amt / 1.0000) AS Amt_Perc
FROM
(
    SELECT
        bs.bill_id
       ,bs.id
       ,bs.bl_cyc_id
       ,bs.bl_cyc_dt
       ,bs.tot_amt
       ,ROW_NUMBER () OVER (ORDER BY bs.tot_amt) AS Acct_Rnk
       ,(( (Acct_Rnk - 1) * 10 ) / COUNT(*) OVER() ) + 1 AS Acct_Decile
    FROM bl_stmnt bs
    WHERE bs.bl_ind = 'Y'
      AND bs.bl_cyc_dt BETWEEN '2006-02-01' AND '2006-02-28'
      AND bs.bill_id = 9999
) DT
GROUP BY 1,2
ORDER BY 1,2;
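The decile expression relies on integer division: (row number - 1) * 10 divided by the row count yields 0 through 9, and adding 1 gives deciles 1 through 10. The sketch below isolates that step on twenty hypothetical accounts. It runs on SQLite via Python's sqlite3 module rather than Teradata, the amounts (10 times the account id) are invented for illustration, and the row number and count are computed in a common table expression because SQLite does not allow the alias reuse seen in the Teradata query.

```python
import sqlite3


def decile_stats(n_accounts=20):
    """Return (decile, account count, average amount) per decile."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE bl_stmnt (id INTEGER, tot_amt INTEGER)")
    # Hypothetical amounts: account i is billed 10 * i.
    con.executemany("INSERT INTO bl_stmnt VALUES (?, ?)",
                    [(i, 10 * i) for i in range(1, n_accounts + 1)])
    rows = con.execute("""
        WITH ranked AS (
            SELECT id, tot_amt,
                   ROW_NUMBER() OVER (ORDER BY tot_amt) AS acct_rnk,
                   COUNT(*)     OVER ()                 AS acct_cnt
            FROM bl_stmnt
        )
        -- integer division maps ranks 1..N onto deciles 1..10
        SELECT ((acct_rnk - 1) * 10) / acct_cnt + 1 AS acct_decile,
               COUNT(*)     AS occur_cnt,
               AVG(tot_amt) AS avg_chg_amt
        FROM ranked
        GROUP BY acct_decile
        ORDER BY acct_decile
    """).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in decile_stats():
        print(row)
```

With twenty accounts each decile holds exactly two rows. When the row count is not a multiple of ten, the same expression produces deciles whose sizes differ by at most one row.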
Business Question #5 (1 of 2)
Which customers had the biggest increases in their revenue from January to February?
id bl_cyc_dt Cur_Amt Prior_Amt Perc_Change
16300006 1/16/2006 14.41$
16300006 2/16/2006 14.64$ 14.41$ 1.6%
16300012 1/20/2006 139.57$
16300012 2/20/2006 134.57$ 139.57$ -3.6%
16300018 1/25/2006 165.84$
16300018 2/25/2006 184.06$ 165.84$ 11.0%
16300028 1/2/2006 109.16$
16300028 2/2/2006 126.31$ 109.16$ 15.7%
16300036 1/2/2006 -$
16300036 2/2/2006 -$ -$
16300042 1/2/2006 -$
16300042 2/2/2006 -$ -$
16300046 1/2/2006 13.09$
16300046 2/2/2006 20.27$ 13.09$ 54.9%
16300078 1/2/2006 118.35$
16300078 2/2/2006 117.24$ 118.35$ -0.9%
16300082 1/2/2006 502.04$
16300082 2/2/2006 551.18$ 502.04$ 9.8%
16300088 1/2/2006 32.60$
16300088 2/2/2006 32.60$ 32.60$ 0.0%
Business question #5 (2 of 2)
Which customers had the biggest increases in their revenue from January to February?
SELECT
    bs.id
   ,bs.bl_cyc_dt
   ,bs.tot_amt AS Cur_Amt
   ,SUM (Cur_Amt) OVER (PARTITION BY bs.bill_mkt_id, bs.acct_id
                        ORDER BY bs.bl_cyc_dt
                        ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS Prior_Amt
   ,CASE
        WHEN ZEROIFNULL ( Prior_Amt ) > 0 THEN (Cur_Amt - Prior_Amt) / (Prior_Amt / 1.0000)
        ELSE NULL
    END AS Perc_Change
FROM bl_stmnt bs
WHERE bs.bl_ind = 'Y'
  AND bs.bl_cyc_dt BETWEEN '2006-01-01' AND '2006-02-28'
  AND bs.bill_id = 5554
ORDER BY 1,2,3,4
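The frame ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING contains only the previous bill, so a SUM over it returns the prior amount without a self join. The sketch below applies that to three accounts taken from the result grid of this business question. It is an illustration on SQLite via Python's sqlite3 module, not Teradata. Amounts are stored as integer cents, the data is partitioned by id alone, and COALESCE stands in for ZEROIFNULL; all three are simplifications assumed here.

```python
import sqlite3

# (id, bill cycle date, amount in cents) for three accounts from the result grid.
BILLS = [
    (16300006, "2006-01-16", 1441), (16300006, "2006-02-16", 1464),
    (16300036, "2006-01-02", 0),    (16300036, "2006-02-02", 0),
    (16300046, "2006-01-02", 1309), (16300046, "2006-02-02", 2027),
]

QUERY = """
WITH amounts AS (
    SELECT id, bl_cyc_dt, amt_cents AS cur_amt,
           SUM(amt_cents) OVER (PARTITION BY id ORDER BY bl_cyc_dt
                                ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS prior_amt
    FROM bl_stmnt
)
SELECT id, bl_cyc_dt, cur_amt, prior_amt,
       -- guard against a missing or zero prior amount before dividing
       CASE WHEN COALESCE(prior_amt, 0) > 0
            THEN 100.0 * (cur_amt - prior_amt) / prior_amt
       END AS perc_change
FROM amounts
ORDER BY id, bl_cyc_dt
"""


def month_over_month():
    """Return (id, date, current, prior, percent change) per bill."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE bl_stmnt (id INTEGER, bl_cyc_dt TEXT, amt_cents INTEGER)")
    con.executemany("INSERT INTO bl_stmnt VALUES (?, ?, ?)", BILLS)
    rows = con.execute(QUERY).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in month_over_month():
        print(row)
```

The January rows have no preceding bill, so their prior amount and percent change are NULL. The account with two zero bills also gets a NULL percent change because of the CASE guard.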
Business question #6 (1 of 2)
What price plan is the final one for a given day when there are several
changes over the course of the day?
srv_accs_id client_prc_pln_eff_dt client_prc_pln_end_dt client_prc_pln_seq_nbr prd_id Row_Nbr
808090 02/12/2003 02/12/2003 1 538042 6
808090 02/12/2003 02/12/2003 2 546478 5
808090 02/12/2003 02/12/2003 3 509834 4
808090 02/12/2003 02/12/2003 4 547262 3
808090 02/12/2003 02/12/2003 5 538042 2
808090 02/12/2003 02/12/2003 6 538038 1
808096 26/02/2003 26/02/2003 1 269338 2
808096 26/02/2003 26/02/2003 2 270708 1
809186 01/05/2003 01/05/2003 1 478658 2
809186 01/05/2003 01/05/2003 2 269332 1
809186 05/12/2003 05/12/2003 1 478658 3
809186 05/12/2003 05/12/2003 2 486812 2
809186 05/12/2003 05/12/2003 3 546478 1
813436 11/11/2003 11/11/2003 1 547310 2
813436 11/11/2003 11/11/2003 2 547308 1
813502 17/02/2005 17/02/2005 1853936 836964 2
813502 17/02/2005 17/02/2005 2093328 836956 1
Business question #6 (2 of 2)
SELECT
vpph.srv_accs_id
,vpph.client_prc_pln_eff_dt
,vpph.client_prc_pln_end_dt
,vpph.client_prc_pln_seq_nbr
,vpph.prd_id
,ROW_NUMBER () OVER (PARTITION BY vpph.srv_accs_id
,vpph.client_prc_pln_eff_dt
,vpph.client_prc_pln_end_dt
ORDER BY vpph.client_prc_pln_seq_nbr DESC
) AS Row_Nbr
FROM Vclient_PRC_PLN_HIST vpph
INNER JOIN client s
ON vpph.srv_accs_id = s.srv_accs_id
AND s.bill_srv_area_id = 1993
ORDER BY 1,2,3,4,5
QUALIFY COUNT(vpph.srv_accs_id)
OVER (PARTITION BY vpph.srv_accs_id, vpph.client_prc_pln_eff_dt, vpph.client_prc_pln_end_dt) > 1
What price plan is the final one for a given day when there are several
changes over the course of the day?
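Because Row_Nbr is numbered by sequence number descending within each service access id and day, the final price plan of a day is the row where Row_Nbr equals 1. A minimal sketch of that last filtering step follows, using seven rows from the result grid of this business question. It runs on SQLite via Python's sqlite3 module, which has no QUALIFY clause, so the filter is applied in an outer query. The table name prc_pln_hist, the shortened column names and the ISO date strings are assumptions of this sketch.

```python
import sqlite3

# (srv_accs_id, effective date, sequence number, prd_id) from the result grid.
PLANS = [
    (808096, "2003-02-26", 1, 269338), (808096, "2003-02-26", 2, 270708),
    (809186, "2003-05-01", 1, 478658), (809186, "2003-05-01", 2, 269332),
    (809186, "2003-12-05", 1, 478658), (809186, "2003-12-05", 2, 486812),
    (809186, "2003-12-05", 3, 546478),
]


def final_plans():
    """Return (srv_accs_id, effective date, prd_id) of the last plan per day."""
    con = sqlite3.connect(":memory:")
    # prc_pln_hist is a hypothetical, simplified stand-in for the history view.
    con.execute("""CREATE TABLE prc_pln_hist
                   (srv_accs_id INTEGER, eff_dt TEXT, seq_nbr INTEGER, prd_id INTEGER)""")
    con.executemany("INSERT INTO prc_pln_hist VALUES (?, ?, ?, ?)", PLANS)
    rows = con.execute("""
        SELECT srv_accs_id, eff_dt, prd_id
        FROM (
            SELECT srv_accs_id, eff_dt, prd_id,
                   ROW_NUMBER() OVER (PARTITION BY srv_accs_id, eff_dt
                                      ORDER BY seq_nbr DESC) AS row_nbr
            FROM prc_pln_hist
        ) AS numbered
        WHERE row_nbr = 1          -- highest sequence number of the day
        ORDER BY srv_accs_id, eff_dt
    """).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in final_plans():
        print(row)
```

On Teradata the same filter could be written as a QUALIFY condition on the ROW_NUMBER expression instead of a wrapping query.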
Real Life – Assign phone_no Attributes (1 of 3)
• Assign Country Code and Area Code to the phone number using the ‘best match’
• RANK (window function), QUALIFY clause
phone_ID  phone_no
100001    3723567890
100002    37250867890
100003    37255334234
100004    3725512543
Normalized  CountryC  AreaC  NetworkProvider
372         372       ?      Estonia
37250       372       50     Estonian Mobile
37255       372       55     Estonia Ritabell
372551      372       551    Estonia Ritabell Plus
Real Life – Assign phone_no Attributes (2 of 3)
SELECT pn.phone_ID, pn.phone_no, ac.CountryC, ac.AreaC
FROM phone_nos pn
INNER JOIN AreaCodes ac
    ON SUBSTRING (pn.phone_no FROM 1 FOR
       CHARACTERS(ac.Normalized)) = ac.Normalized;
phone_ID phone_no CountryC AreaC Chars
100001 3723567890 372 ? 3
100002 37250867890 372 ? 3
100002 37250867890 372 50 5
100003 37255334234 372 ? 3
100003 37255334234 372 55 5
100004 3725512543 372 ? 3
100004 3725512543 372 55 5
100004 3725512543 372 551 6
Real Life – Assign phone_no Attributes (3 of 3)
SELECT pn.phone_ID, pn.phone_no, ac.CountryC, ac.AreaC
FROM phone_nos pn
INNER JOIN AreaCodes ac
    ON SUBSTRING (pn.phone_no FROM 1 FOR
       CHARACTERS(ac.Normalized)) = ac.Normalized
QUALIFY RANK () OVER (PARTITION BY pn.phone_ID ORDER BY
    CHARACTERS(ac.Normalized) DESC) = 1;
phone_ID phone_no CountryC AreaC Chars
100001 3723567890 372 ? 3
100002 37250867890 372 50 5
100003 37255334234 372 55 5
100004 3725512543 372 551 6
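The 'best match' is a longest-prefix match: the join keeps every area code that is a prefix of the phone number, and ranking by prefix length within each phone_ID keeps only the longest one. The sketch below reproduces that on the four phone numbers and four area codes from these slides. It is an illustration on SQLite via Python's sqlite3 module, not Teradata: SUBSTR and LENGTH stand in for SUBSTRING and CHARACTERS, the QUALIFY condition becomes a filter in an outer query, and the lower-case table and column names are chosen here.

```python
import sqlite3

# Data copied from the "Assign phone_no Attributes" slides.
PHONES = [(100001, "3723567890"), (100002, "37250867890"),
          (100003, "37255334234"), (100004, "3725512543")]
# (normalized prefix, country code, area code); None is the '?' in the grid.
AREA_CODES = [("372", "372", None), ("37250", "372", "50"),
              ("37255", "372", "55"), ("372551", "372", "551")]

QUERY = """
SELECT phone_id, phone_no, country_c, area_c
FROM (
    SELECT pn.phone_id, pn.phone_no, ac.country_c, ac.area_c,
           RANK() OVER (PARTITION BY pn.phone_id
                        ORDER BY LENGTH(ac.normalized) DESC) AS match_rank
    FROM phone_nos pn
    INNER JOIN area_codes ac
        -- keep every area code that is a prefix of the phone number
        ON SUBSTR(pn.phone_no, 1, LENGTH(ac.normalized)) = ac.normalized
) AS ranked
WHERE match_rank = 1               -- longest matching prefix wins
ORDER BY phone_id
"""


def best_match():
    """Return (phone_id, phone_no, country code, area code) for the best match."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE phone_nos (phone_id INTEGER, phone_no TEXT)")
    con.execute("CREATE TABLE area_codes (normalized TEXT, country_c TEXT, area_c TEXT)")
    con.executemany("INSERT INTO phone_nos VALUES (?, ?)", PHONES)
    con.executemany("INSERT INTO area_codes VALUES (?, ?, ?)", AREA_CODES)
    rows = con.execute(QUERY).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in best_match():
        print(row)
```

RANK is used rather than ROW_NUMBER, so two area codes of equal length matching the same number would both be returned; with the prefixes in this data that cannot happen.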
Summary
• Functionality of Ordered Analytical Functions
• Supports a large subset of the SQL-99 Window Functions
• All combinations of (cumulative, moving, running) x (sum, count, min, max, avg)
• Any physical row-based window definition: preceding, following, current row, unbounded
• ANSI Row_Number, ANSI Rank
• Benefit
• Exceed Application Limits, get better Performance
• Process costly OLAP Functions within Teradata
• ANSI standard makes SQL support easier for application developers
• Tidy SQL
• fewer subselects
• replace self joins or multipass SQL
Remember, coding is one thing, what you want to achieve is another!
Thanks and Questions
• Questions???
• I can be reached at patrice.berube@Teradata.com
• Thanks
• Patrick R. McHugh
And Obviously…All of you !