A Brief Introduction to PROC TRANSPOSE prepared by Voytek Grus for SAS user group, Halifax November 26, 2009

What is data transposition?

• Matrix notation: X^T (the transpose of matrix X)

• Excel: copy/paste/transpose command or pivot tables.

• Example of a simple data transposition:

Col/Row  A    B     C     D     E
1        1    2     3     4     5
2        2    4     6     8     10
3        4    8     12    16    20
4        8    16    24    32    40
5        16   32    48    64    80
6        32   64    96    128   160
7        64   128   192   256   320
8        128  256   384   512   640
9        256  512   768   1024  1280
10       512  1024  1536  2048  2560

Row/Col  1  2   3   4   5   6    7    8    9     10
A        1  2   4   8   16  32   64   128  256   512
B        2  4   8   16  32  64   128  256  512   1024
C        3  6   12  24  48  96   192  384  768   1536
D        4  8   16  32  64  128  256  512  1024  2048
E        5  10  20  40  80  160  320  640  1280  2560
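A minimal sketch of the same simple transposition in SAS follows. The dataset names simple and simple_t and the NAME=Column option are assumptions made for this illustration and do not come from the slides. By default the transposed columns are named COL1 to COL10, because plain numbers are not valid SAS variable names under the default naming rules.

```sas
/* Hypothetical input: the 10 x 5 table shown above. */
data simple;
   input A B C D E;
   datalines;
1 2 3 4 5
2 4 6 8 10
4 8 12 16 20
8 16 24 32 40
16 32 48 64 80
32 64 96 128 160
64 128 192 256 320
128 256 384 512 640
256 512 768 1024 1280
512 1024 1536 2048 2560
;
run;

/* Rows become columns: one output row per variable A to E. */
/* NAME= renames the default _NAME_ column that holds the old variable names. */
proc transpose data=simple out=simple_t name=Column;
   var A B C D E;
run;

proc print data=simple_t noobs;
run;
```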

Examples of Complex data transpositions

group  subgroup  Unique_id  C1   C2    C3    C4    C5
R1     N1        no1        1    2     3     4     5
R1     N2        no2        2    4     6     8     10
R1     N3        no3        4    8     12    16    20
R2     N1        no4        8    16    24    32    40
R2     N2        no5        16   32    48    64    80
R2     N3        no6        32   64    96    128   160
R3     N1        no7        64   128   192   256   320
R3     N2        no8        128  256   384   512   640
R3     N3        no9        256  512   768   1024  1280
R3     N4        no10       512  1024  1536  2048  2560

Interleaved Dataset

group  Name  N1   N2   N3    N4
R1     C1    1    2    4
R1     C2    2    4    8
R1     C3    3    6    12
R1     C4    4    8    16
R1     C5    5    10   20
R2     C1    8    16   32
R2     C2    16   32   64
R2     C3    24   48   96
R2     C4    32   64   128
R2     C5    40   80   160
R3     C1    64   128  256   512
R3     C2    128  256  512   1024
R3     C3    192  384  768   1536
R3     C4    256  512  1024  2048
R3     C5    320  640  1280  2560

Full Transposition

group      R1   R1   R1   R2   R2   R2   R3   R3   R3    R3
subgroup   N1   N2   N3   N1   N2   N3   N1   N2   N3    N4
Unique_id  no1  no2  no3  no4  no5  no6  no7  no8  no9   no10
C1         1    2    4    8    16   32   64   128  256   512
C2         2    4    8    16   32   64   128  256  512   1024
C3         3    6    12   24   48   96   192  384  768   1536
C4         4    8    16   32   64   128  256  512  1024  2048
C5         5    10   20   40   80   160  320  640  1280  2560
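The two complex layouts on this slide could be produced roughly as follows. This is a sketch under stated assumptions: the dataset names grouped, by_group and full are hypothetical, and only the numeric variables C1 to C5 are transposed. Listing the character variables group and subgroup in VAR as well would turn every transposed column into a character column.

```sas
/* Hypothetical input: the grouped table from the slide. */
data grouped;
   input group $ subgroup $ Unique_id $ C1-C5;
   datalines;
R1 N1 no1 1 2 3 4 5
R1 N2 no2 2 4 6 8 10
R1 N3 no3 4 8 12 16 20
R2 N1 no4 8 16 24 32 40
R2 N2 no5 16 32 48 64 80
R2 N3 no6 32 64 96 128 160
R3 N1 no7 64 128 192 256 320
R3 N2 no8 128 256 384 512 640
R3 N3 no9 256 512 768 1024 1280
R3 N4 no10 512 1024 1536 2048 2560
;
run;

/* Transposition within BY groups: one row per group and C variable, */
/* one column per subgroup. N4 stays missing for groups R1 and R2.   */
proc transpose data=grouped out=by_group name=Name;
   by group;
   id subgroup;
   var C1-C5;
run;

/* Transposition of the whole table: one column per observation, */
/* named after Unique_id (no1 to no10).                          */
proc transpose data=grouped out=full name=Name;
   id Unique_id;
   var C1-C5;
run;
```

The BY statement expects the input to be sorted by group, which the sample data already is.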

Why do we transpose data?

– For data presentation purposes.

– To merge datasets with diverse structures

– To re-design databases
  • efficiency gains in programming code and processing time.

Transposing data to streamline SAS calculations and programming code

Cross-sectional data base design

Customer  Var1     Var2    Var3
A         $2.00    4.00    6
B         $16.00   32.00   48
C         $128.00  256.00  384

Data Step type of processing:

  Data test2;
    Set test1;
    Sum = var1 + var2 + var3;
  run;

Proc Transpose converts between the two designs.

Sequential data base design

Customer  Var   Values
A         Var1  $2.00
A         Var2  4.00
A         Var3  6
B         Var1  $16.00
B         Var2  32.00
B         Var3  48
C         Var1  $128.00
C         Var2  256.00
C         Var3  384

This db design is more amenable to sequential data processing using SAS procs with BY statements (PROC MEANS, PROC REG, PROC UNIVARIATE, etc.).

Data Step outline:

  Array def.
  Do loop
    If condition Then output;
  Run;
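The two designs can be converted into each other along these lines. This is only a sketch: the dataset names long, wide, wide_sum and long_again are assumptions, and the dollar formatting from the slide is left out. PROC TRANSPOSE handles the sequential-to-cross-sectional direction, and a DATA step with an array and a DO loop (one possible reading of the outline above) goes the other way.

```sas
/* Sequential (long) design: one row per customer and variable. */
data long;
   input Customer $ Var $ Values;
   datalines;
A Var1 2
A Var2 4
A Var3 6
B Var1 16
B Var2 32
B Var3 48
C Var1 128
C Var2 256
C Var3 384
;
run;

/* Long to wide: one row per customer, one column per value of Var. */
proc transpose data=long out=wide(drop=_name_);
   by Customer;
   id Var;
   var Values;
run;

/* In the wide design a row-wise sum is a single assignment. */
data wide_sum;
   set wide;
   Sum = Var1 + Var2 + Var3;
run;

/* Wide back to long with an array, a DO loop and a conditional OUTPUT. */
data long_again(keep=Customer Var Values);
   set wide;
   array v{3} Var1-Var3;
   length Var $ 32;
   do i = 1 to dim(v);
      Var = vname(v{i});      /* name of the current array element */
      Values = v{i};
      if not missing(Values) then output;
   end;
run;
```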

Syntax of Proc Transpose

PROC TRANSPOSE DATA=input-data-set <LABEL=label> <LET> <NAME=name>
               OUT=output-data-set(DROP=_LABEL_ _NAME_) <PREFIX=prefix>;
   BY <DESCENDING> variable-1;
   WHERE … (Conditions) … ;
   VAR variable(s);     /* a variable list or _ALL_; if VAR is omitted, all numeric variables not named in other statements are transposed */
   ID variable;
   IDLABEL variable;
   COPY variable(s);

– Beware of missing observations and duplicate values in the ID variable: duplicate ID values within a BY group stop the procedure with an error unless LET is specified.
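The duplicate-ID pitfall can be reproduced with a few rows. This is a sketch with made-up data, and the dataset names dup and dup_wide are hypothetical. Without LET the step below stops with an error because Var1 occurs twice for customer A. With LET the procedure keeps the last occurrence within the BY group.

```sas
/* Made-up data: customer A has two rows for Var1. */
data dup;
   input Customer $ Var $ Values;
   datalines;
A Var1 2
A Var1 3
A Var2 4
;
run;

/* LET allows duplicate ID values and transposes the last occurrence, */
/* so Var1 ends up as 3 for customer A.                               */
proc transpose data=dup out=dup_wide(drop=_name_) let;
   by Customer;
   id Var;
   var Values;
run;
```

LET silently discards the earlier rows, so it is usually worth checking why the duplicates exist before relying on it.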

SAS Help Tip

• Examples: TRANSPOSE Procedure

• Example 1: Performing a Simple Transposition
• Example 2: Naming Transposed Variables
• Example 3: Labeling Transposed Variables
• Example 4: Transposing BY Groups
• Example 5: Naming Transposed Variables When the ID Variable Has Duplicate Values
• Example 6: Transposing Data for Statistical Analysis

An example of proc transpose application in pricing and cost studies from the utility industry

• Use diverse data bases in pricing studies
  – Load Research (sample data of 15 min readings)
  – Power Production (hourly costs)
  – CIS (monthly records)

Real Life Example (slide 1)

bf  psmadj  hourlydate  customer  class        month  year  daytime  season    kwhslr  kwslr  pfslr  kvarhslr
M   1       1/1/2004    A         residential  1      2004  offpeak  _winter_  1,278   1,293  1      442
M   1       1/1/2004    A         residential  1      2004  offpeak  _winter_  1,227   1,259  1      419
M   1       1/1/2004    A         residential  1      2004  offpeak  _winter_  1,218   1,235  1      421
M   1       1/1/2004    A         residential  1      2004  offpeak  _winter_  1,171   1,186  1      421
M   1       1/1/2004    A         residential  1      2004  offpeak  _winter_  1,138   1,144  1      413
M   1       1/1/2004    A         residential  1      2004  offpeak  _winter_  1,198   1,219  1      425
M   1       1/1/2004    A         residential  1      2004  offpeak  _winter_  1,248   1,285  1      434
M   1       1/1/2004    A         residential  1      2004  offpeak  _winter_  1,274   1,288  1      434
M   1       1/1/2004    A         residential  1      2004  offpeak  _winter_  1,376   1,435  1      431
M   1       1/1/2004    A         residential  1      2004  offpeak  _winter_  1,465   1,493  1      455

class        year  customer  bf   psmadj  kwhs_summer_offpeak  kwhs_summer_peak  kwhs_winter_offpeak  kwhs_winter_peak  kws_summer_offpeak  kws_summer_peak  kws_winter_offpeak  kws_winter_peak
residential  2004  A         M    1       3,608,480            4,255,657         1,832,953            1,931,602         14,005              16,121           6,784               7,693
residential  2004  B         M    0       2,634,055            2,592,400         1,434,944            1,295,578         11,262              10,527           5,734               5,742
residential  2004  C         M    0       2,342,346            2,484,752         1,118,475            1,141,182         11,326              10,224           4,512               4,560
residential  2004  D         MTO  -1      9,888,882            11,998,166        4,487,063            4,920,141         37,964              46,354           16,690              18,288
residential  2005  A         M    1       3,605,804            4,147,944         1,801,159            1,867,800         13,936              16,533           7,230               7,549
residential  2005  B         M    0       2,646,993            2,533,870         1,441,147            1,268,540         10,889              12,050           5,957               5,683

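proc transpose data=bd_ets out=bd_ets2;
  var kwhs kws;
  /* _name_ is created by PROC TRANSPOSE in its output and is not a variable in bd_ets; */
  /* the next two slides build the combined names in separate steps. */
  id season daytime _name_;
  by year class customer bf psmadj;
run;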

Real Life Example (slide 2)

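proc transpose data=bd_ets out=bd_ets2;
  var kwhs kws;
  by year class customer bf psmadj season daytime;
run;

data bd_ets2;
  set bd_ets2;
  /* right() moves the trailing blanks of _name_ to the front; the outer left() shifts them back to the end */
  newname = left(right(_name_) || left(season) || left(daytime));
run;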

year  class       customer  bf   psmadj  season    daytime  kwhs       kws
2004  commercial  E         MTO  0       _summer_  offpeak  1,769,777  6,173
2004  commercial  E         MTO  0       _summer_  peak     1,964,920  7,812
2004  commercial  E         MTO  0       _winter_  offpeak  538,933    1,598

year  class       customer  bf   psmadj  season    daytime  _NAME_  COL1       newname
2004  commercial  E         MTO  0       _summer_  offpeak  kwhs    1,769,777  kwhs_summer_offpeak
2004  commercial  E         MTO  0       _summer_  offpeak  kws     6,173      kws_summer_offpeak
2004  commercial  E         MTO  0       _summer_  peak     kwhs    1,964,920  kwhs_summer_peak
2004  commercial  E         MTO  0       _summer_  peak     kws     7,812      kws_summer_peak

Real Life Example (slide 3)

proc sort data=bd_ets2 out=bd_ets3;
  by class year customer bf psmadj newname;
run;

proc transpose data=bd_ets3 out=bd_ets3(drop=_name_);
  var col1;
  id newname;
  by class year customer bf psmadj;
run;

year  class       customer  bf   psmadj  season    daytime  _NAME_  COL1       newname
2004  commercial  E         MTO  0       _summer_  offpeak  kwhs    1,769,777  kwhs_summer_offpeak
2004  commercial  E         MTO  0       _summer_  offpeak  kws     6,173      kws_summer_offpeak
2004  commercial  E         MTO  0       _summer_  peak     kwhs    1,964,920  kwhs_summer_peak
2004  commercial  E         MTO  0       _summer_  peak     kws     7,812      kws_summer_peak

class       year  customer  bf   psmadj  kwhs_summer_offpeak  kwhs_summer_peak  kwhs_winter_offpeak  kwhs_winter_peak  kws_summer_offpeak  kws_summer_peak  kws_winter_offpeak  kws_winter_peak
commercial  2004  E         MTO  0       1,769,777            1,964,920         538,933              574,945           6,173               7,812            1,598               2,117
commercial  2004  G         M    0       7,321,980            8,297,248         3,534,919            3,615,724         28,555              30,701           12,736              14,601
commercial  2004  H         M    1       6,427,165            7,344,063         3,569,152            3,809,118         27,026              28,505           14,536              16,684
commercial  2004  I         M    0       14,988,835           17,380,075        6,347,549            6,962,343         53,149              60,524           22,463              26,301
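The three slides can be condensed into one small runnable sketch, using the three commercial rows shown earlier as input. Several choices here are assumptions rather than the author's code: the intermediate names step1, step2 and wide are hypothetical, the informats in the INPUT statement are guesses at the column types, and CATS replaces the LEFT/RIGHT expression because it strips leading and trailing blanks from each piece before concatenating.

```sas
/* Hypothetical input: three rows of the aggregated dataset. */
data bd_ets;
   input year class :$12. customer $ bf $ psmadj
         season :$8. daytime :$8. kwhs :comma12. kws :comma12.;
   datalines;
2004 commercial E MTO 0 _summer_ offpeak 1,769,777 6,173
2004 commercial E MTO 0 _summer_ peak 1,964,920 7,812
2004 commercial E MTO 0 _winter_ offpeak 538,933 1,598
;
run;

/* Step 1: stack kwhs and kws into _NAME_ / COL1 pairs per BY group. */
proc transpose data=bd_ets out=step1;
   by year class customer bf psmadj season daytime;
   var kwhs kws;
run;

/* Step 2: build the target column name, e.g. kwhs_summer_offpeak. */
data step2;
   set step1;
   length newname $ 32;
   newname = cats(_name_, season, daytime);
run;

/* Step 3: spread COL1 into one column per newname. */
proc transpose data=step2 out=wide(drop=_name_);
   by year class customer bf psmadj;
   id newname;
   var col1;
run;
```

No PROC SORT is needed in this sketch because the sample rows are already in BY order. With real data, a sort like the one on this slide keeps the BY statement in the final step valid.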

Questions?