PROC TABULATE in Action
Susan Sepanik, New York Area SAS...
Post on 25-Aug-2019
2
Topics
What is PROC TABULATE?
Why use PROC TABULATE?
PROC TABULATE Syntax
PROC TABULATE Example
Additional Resources
3
What is PROC TABULATE?
PROC TABULATE generates customized tables of descriptive statistics.
It creates many of the same statistics as PROC MEANS and PROC FREQ.
But you can:
- Decide what goes in the rows and columns
- Do analysis on several variables in one table
- Decide how you want to classify variables
- Format it all into a ready-to-present table
4
Why use PROC TABULATE?
To examine raw data
- Checking counts and simple descriptive data in easy-to-read tables
To present data internally
- Answer specific analysis questions for a colleague or meeting
5
Sample Data
Original Data Set (abbreviated):
STUDID SCHOOL YEAR ATTRATE TSCORE TMISS
1 School A 2005 0.95 655 0
1 School A 2006 0.97 673 0
2 School B 2005 0.87 565 0
2 School B 2006 0.85 . 1
3 School C 2005 0.82 503 0
3 School B 2006 0.89 501 0
6
Analysis Questions
1. Do we have the correct number of students per school and school year?
2. Does any school have particularly high or low test scores in a given year?
3. Does any school or school year have particularly high levels of missing test scores?
4. Are the attendance rates at each school for each year what we would expect?
8
PROC TABULATE Syntax
PROC TABULATE DATA = dataset;
    VAR analysis-variable-list;
    CLASS classification-variable-list;
    TABLE page-dimension, row-dimension, column-dimension;
RUN;
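The examples in this deck use only the row and column dimensions. As a minimal sketch of the three-dimension form, assuming the schooldata data set introduced on the following slides, a page dimension could be added like this. When a TABLE statement has two dimensions they are read as row and column; with a single dimension it is read as the column.

```sas
/* Sketch only: assumes the schooldata data set used later in this deck */
/* (variables STUDID SCHOOL YEAR ATTRATE TSCORE TMISS).                 */
PROC TABULATE DATA = schooldata;
    VAR TSCORE;
    CLASS SCHOOL YEAR;
    TABLE YEAR,          /* page dimension: a separate table for each year */
          SCHOOL,        /* row dimension: one row per school              */
          TSCORE*MEAN;   /* column dimension: mean test score              */
RUN;
```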
9
Simple PROC TABULATE
PROC TABULATE DATA = schooldata;
    CLASS SCHOOL YEAR;
    TABLE SCHOOL, YEAR;
RUN;

              YEAR
            2005    2006
               N       N
SCHOOL
School A    3.00    3.00
School B    3.00    4.00
School C    4.00    3.00
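For comparison, the same school-by-year counts could probably be obtained from PROC FREQ. This is a sketch rather than part of the original deck; it assumes the same schooldata data set, and the options only suppress the row, column, and overall percentages that PROC FREQ prints by default.

```sas
/* Sketch only: cross-tabulated counts, roughly matching the simple */
/* PROC TABULATE table of SCHOOL by YEAR.                           */
PROC FREQ DATA = schooldata;
    TABLES SCHOOL*YEAR / NOROW NOCOL NOPERCENT;
RUN;
```

PROC FREQ fixes the layout for you; PROC TABULATE lets you choose it, which is what the next slides build on.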
10
Adding a New Variable: Syntax (TSCORE)
PROC TABULATE DATA = schooldata;
    VAR TSCORE;
    CLASS SCHOOL YEAR;
    TABLE SCHOOL, YEAR*N YEAR*TSCORE*MEAN;
RUN;
11
Adding a New Variable: Output (TSCORE)
                YEAR                YEAR
            2005    2006      2005      2006
               N       N    TSCORE    TSCORE
                              Mean      Mean
SCHOOL
School A    3.00    3.00    574.00    619.00
School B    3.00    4.00    527.00    533.67
School C    4.00    3.00    597.33    605.67
12
Adding Total: Syntax (ALL)
PROC TABULATE DATA = schooldata;
    VAR TSCORE;
    CLASS SCHOOL YEAR;
    TABLE SCHOOL ALL, YEAR*(N TSCORE*MEAN);
RUN;
13
Adding Total: Output (ALL)
                        YEAR
               2005              2006
               N    TSCORE       N    TSCORE
                      Mean              Mean
SCHOOL
School A    3.00    574.00    3.00    619.00
School B    3.00    527.00    4.00    533.67
School C    4.00    597.33    3.00    605.67
All        10.00    571.00   10.00    582.00
14
Adding New Statistics: Syntax (PCTN, NMISS)
PROC TABULATE DATA = schooldata;
    VAR TSCORE;
    CLASS SCHOOL YEAR;
    TABLE SCHOOL ALL, YEAR*(N COLPCTN TSCORE*(MEAN NMISS));
RUN;
15
Adding New Statistics: Output (PCTN, NMISS)
                                        YEAR
                       2005                              2006
                               TSCORE                            TSCORE
               N  ColPctN     Mean   NMiss       N  ColPctN     Mean   NMiss
SCHOOL
School A    3.00    30.00   574.00    0.00    3.00    30.00   619.00    1.00
School B    3.00    30.00   527.00    1.00    4.00    40.00   533.67    1.00
School C    4.00    40.00   597.33    1.00    3.00    30.00   605.67    0.00
All        10.00   100.00   571.00    2.00   10.00   100.00   582.00    2.00
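Since PROC TABULATE produces many of the same statistics as PROC MEANS, one way to cross-check the test score columns above is a PROC MEANS call on the same data. This is a sketch under the assumption that schooldata holds the records listed in the appendix; MAXDEC=2 is only there to match the two decimal places shown in the table.

```sas
/* Sketch only: cross-check of the TSCORE statistics by school and year. */
PROC MEANS DATA = schooldata N MEAN NMISS MAXDEC=2;
    CLASS SCHOOL YEAR;
    VAR TSCORE;
RUN;
```

Note that the N statistic requested here counts non-missing TSCORE values, while the "N Obs" column that PROC MEANS prints for each CLASS combination is the one that corresponds to the student counts in the PROC TABULATE table.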
17
Changing Columns and Rows: Syntax
PROC TABULATE DATA = schooldata;
    VAR TSCORE;
    CLASS YEAR SCHOOL;
    TABLE YEAR*(SCHOOL ALL), N PCTN<SCHOOL ALL> TSCORE*(MEAN NMISS);
RUN;
18
Changing Columns and Rows: Output
YEAR  SCHOOL           N     PctN      TSCORE
                                     Mean   NMiss
2005  School A      3.00    30.00   574.00    0.00
      School B      3.00    30.00   527.00    1.00
      School C      4.00    40.00   597.33    1.00
      All          10.00   100.00   571.00    2.00
2006  School A      3.00    30.00   619.00    1.00
      School B      4.00    40.00   533.67    1.00
      School C      3.00    30.00   605.67    0.00
      All          10.00   100.00   582.00    2.00
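The angle brackets after PCTN are a denominator definition. In the output above each year's All row shows 100.00, which suggests that PCTN<SCHOOL ALL> computes percentages within each YEAR. A small sketch, not from the original deck, that puts a plain PCTN next to it may make the difference easier to see; the two column labels are made up for illustration.

```sas
/* Sketch only: compare PCTN with and without a denominator definition. */
/* The labels in quotes are illustrative, not from the original slides. */
PROC TABULATE DATA = schooldata;
    CLASS YEAR SCHOOL;
    TABLE YEAR*(SCHOOL ALL),
          N
          PCTN='PctN of whole table'
          PCTN<SCHOOL ALL>='PctN within year';
RUN;
```

Without a denominator definition, PCTN is expected to use the total count for the whole table as its base, so the two percentage columns should differ whenever the table covers more than one year.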
19
Adding More Variables: Syntax (TMISS, ATTRATE)
PROC TABULATE DATA = schooldata;
    VAR TSCORE TMISS ATTRATE;
    CLASS SCHOOL YEAR;
    TABLE YEAR*(SCHOOL ALL),
          N PCTN<SCHOOL ALL> TSCORE*MEAN TMISS*MEAN ATTRATE*MEAN;
RUN;
20
Adding More Variables: Output (TMISS, ATTRATE)
YEAR  SCHOOL           N     PctN   TSCORE   TMISS  ATTRATE
                                      Mean    Mean     Mean
2005  School A      3.00    30.00   574.00    0.00     0.92
      School B      3.00    30.00   527.00    0.33     0.68
      School C      4.00    40.00   597.33    0.25     0.82
      All          10.00   100.00   571.00    0.20     0.81
2006  School A      3.00    30.00   619.00    0.33     0.95
      School B      4.00    40.00   533.67    0.25     0.90
      School C      3.00    30.00   605.67    0.00     0.86
      All          10.00   100.00   582.00    0.20     0.90
21
Changing Headings: Syntax
Removing unneeded headings:
    YEAR=' '
Changing headings:
    ATTRATE='Average Attendance Rate'
Changing or removing statistic headings:
    KEYLABEL N='Total Students' MEAN=' ';
Adding a table title (shown in the box above the row headings):
    /BOX = 'Average Test Scores and Attendance Rates by School and Year';
22
Changing Headings: Syntax
PROC TABULATE DATA = schooldata;
    VAR ATTRATE TSCORE TMISS;
    CLASS SCHOOL YEAR;
    TABLE YEAR=' '*(SCHOOL=' ' ALL='All Schools'),
          N PCTN<SCHOOL ALL>
          TSCORE='Average Test Score'*MEAN
          TMISS='Percentage of Missing Test Scores'*MEAN
          ATTRATE='Average Attendance Rate'*MEAN
          /BOX = 'Average Test Scores and Attendance Rates by School and Year';
    KEYLABEL N='Total Students' PCTN='Percentage of Students' MEAN=' ';
RUN;
23
Changing Headings: Output
Average Test Scores and Attendance Rates by School and Year
                       Total   Percentage      Average   Percentage of      Average
                    Students  of Students   Test Score    Missing Test   Attendance
                                                                Scores         Rate
2005  School A          3.00        30.00       574.00            0.00         0.92
      School B          3.00        30.00       527.00            0.33         0.68
      School C          4.00        40.00       597.33            0.25         0.82
      All Schools      10.00       100.00       571.00            0.20         0.81
2006  School A          3.00        30.00       619.00            0.33         0.95
      School B          4.00        40.00       533.67            0.25         0.90
      School C          3.00        30.00       605.67            0.00         0.86
      All Schools      10.00       100.00       582.00            0.20         0.90
24
Formatting Values: Syntax
Formatting all numeric cells (option on the PROC TABULATE statement):
    FORMAT=12.0
Formatting specific variables:
    *F=12.1    *F=PERCENT12.0
Specifying the number of spaces for all row headings:
    RTS=25
Creating your own formats:
    PROC FORMAT;
        PICTURE PCTPIC LOW-HIGH =' 000%';
    RUN;
    *F=PCTPIC.
25
Formatting Values: Syntax
* Define the picture format first so PROC TABULATE can find it;
PROC FORMAT;
    PICTURE PCTPIC LOW-HIGH =' 000%';
RUN;

PROC TABULATE DATA = schooldata FORMAT=12.0;
    VAR ATTRATE TSCORE TMISS;
    CLASS SCHOOL YEAR;
    TABLE YEAR=' '*(SCHOOL=' ' ALL='All Schools'),
          N
          PCTN<SCHOOL ALL>*F=PCTPIC.
          TSCORE='Average Test Score'*F=12.1*MEAN
          TMISS='Percent Missing Test Scores'*MEAN*F=PERCENT12.0
          ATTRATE='Average Attendance Rate'*MEAN*F=PERCENT12.1
          /BOX = 'Average Test Scores and Attendance Rates by School and Year' RTS=25;
    KEYLABEL PCTN='Percent of Students' N='Total Students' MEAN=' ';
RUN;
26
Formatting Values: Output
Average Test Scores and Attendance Rates by School and Year
                       Total   Percent of      Average Percent Missing      Average
                    Students     Students   Test Score     Test Scores   Attendance
                                                                               Rate
2005  School A             3          30%        574.0              0%        92.3%
      School B             3          30%        527.0             33%        67.7%
      School C             4          40%        597.3             25%        81.5%
      All Schools         10         100%        571.0             20%        80.6%
2006  School A             3          30%        619.0             33%        94.7%
      School B             4          40%        533.7             25%        90.3%
      School C             3          30%        605.7              0%        85.7%
      All Schools         10         100%        582.0             20%        90.2%
27
Creating an Excel File: Syntax
ODS HTML FILE = 'PROCTABULATE.XLS' STYLE = MINIMAL;
PROC TABULATE DATA = schooldata;
    VAR ATTRATE TSCORE TMISS;
    CLASS SCHOOL YEAR;
    TABLE YEAR=' '*(SCHOOL=' ' ALL='All Schools'),
          (N
           PCTN<SCHOOL ALL>*F=PCTPIC.
           TSCORE='Average Test Score'*F=12.1*MEAN
           TMISS='Percent Missing Test Scores'*MEAN*F=PERCENT12.0
           ATTRATE='Average Attendance Rate'*MEAN*F=PERCENT12.1)
          /BOX = 'Average Test Scores and Attendance Rates by School and Year' RTS=25;
    KEYLABEL PCTN='Percent of Students' N='Total Students' MEAN=' ';
RUN;
ODS HTML CLOSE;
28
Creating an Excel File: Output
Average Test Scores and Attendance Rates by School and Year
                       Total   Percent of      Average Percent Missing      Average
                    Students     Students   Test Score     Test Scores   Attendance
                                                                               Rate
2005  School A             3          30%          574              0%       92.30%
      School B             3          30%          527             33%       67.70%
      School C             4          40%        597.3             25%       81.50%
      All Schools         10         100%          571             20%       80.60%
2006  School A             3          30%          619             33%       94.70%
      School B             4          40%        533.7             25%       90.30%
      School C             3          30%        605.7              0%       85.70%
      All Schools         10         100%          582             20%       90.20%
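The slides write HTML to a file with an .XLS extension so that Excel will open it. On newer installations there is also a dedicated ODS EXCEL destination that writes a native .xlsx workbook. The sketch below is not from the original deck; it assumes SAS 9.4 or later, and the file name is only a placeholder.

```sas
/* Sketch only: assumes SAS 9.4 or later, where ODS EXCEL is available. */
/* The file name is hypothetical.                                       */
ODS EXCEL FILE = 'proctabulate.xlsx';
PROC TABULATE DATA = schooldata FORMAT=12.0;
    VAR TSCORE;
    CLASS SCHOOL YEAR;
    TABLE SCHOOL ALL, YEAR*(N TSCORE*MEAN);
RUN;
ODS EXCEL CLOSE;
```

Any of the TABLE statements from the earlier slides could be placed between the two ODS statements in the same way.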
29
Conclusion
PROC TABULATE generates customized tables of descriptive statistics.
You can format the output into ready-to-present Excel tables.
The best way to create a table is:
- Start simple with which variables you want in the columns and rows
- Add more statistics and variable relationships as needed
- Finish by formatting titles and values
30
Additional Resources
BASE SAS 9.1.3 Procedures Guide, Volume 3
“Making Sense of PROC TABULATE (Updated for SAS9)” by Jonas V. Bilenas, JP Morgan Chase, Paper 230-2007
http://www2.sas.com/proceedings/forum2007/230-2007.pdf
“Anyone Can Learn PROC TABULATE” by Lauren Haworth, Genentech, Inc., Paper 60-27
www2.sas.com/proceedings/sugi27/p060-27.pdf
PROC TABULATE by Example, by Lauren E. Haworth, SAS Institute Inc., 1999.
32
Appendix: Entire Original Data Set
STUDID SCHOOL YEAR ATTRATE TSCORE TMISS
1 School A 2005 0.95 655 0
1 School A 2006 0.97 673 0
2 School B 2005 0.87 565 0
2 School B 2006 0.85 . 1
3 School C 2005 0.82 503 0
3 School B 2006 0.89 501 0
4 School A 2005 0.9 524 0
4 School A 2006 0.91 . 1
5 School B 2005 0.27 489 0
5 School B 2006 0.95 522 0
33
Appendix: Entire Original Data Set (continued):
STUDID SCHOOL YEAR ATTRATE TSCORE TMISS
6 School C 2006 0.88 495 0
7 School C 2005 0.77 669 0
7 School C 2006 0.73 690 0
8 School C 2005 0.99 620 0
8 School C 2006 0.96 632 0
9 School C 2005 0.68 . 1
10 School A 2005 0.92 543 0
10 School A 2006 0.96 565 0
11 School B 2005 0.89 . 1
11 School B 2006 0.92 578 0
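To rerun the examples, the records listed in this appendix need to be loaded into a data set named schooldata. The slides do not show how that was done; the DATA step below is one possible sketch. It assumes list input with the & modifier so that the blank inside "School A" is read as part of the value, which is why each school name is followed by two spaces in the data lines.

```sas
/* Sketch only: builds schooldata from the appendix records.           */
/* The & modifier lets SCHOOL contain a single embedded blank; the     */
/* value ends at two consecutive blanks or at the declared length.     */
DATA schooldata;
    LENGTH SCHOOL $ 8;
    INPUT STUDID SCHOOL $ & YEAR ATTRATE TSCORE TMISS;
    DATALINES;
1 School A  2005 0.95 655 0
1 School A  2006 0.97 673 0
2 School B  2005 0.87 565 0
2 School B  2006 0.85 . 1
3 School C  2005 0.82 503 0
3 School B  2006 0.89 501 0
4 School A  2005 0.9 524 0
4 School A  2006 0.91 . 1
5 School B  2005 0.27 489 0
5 School B  2006 0.95 522 0
6 School C  2006 0.88 495 0
7 School C  2005 0.77 669 0
7 School C  2006 0.73 690 0
8 School C  2005 0.99 620 0
8 School C  2006 0.96 632 0
9 School C  2005 0.68 . 1
10 School A  2005 0.92 543 0
10 School A  2006 0.96 565 0
11 School B  2005 0.89 . 1
11 School B  2006 0.92 578 0
;
RUN;
```

A period in the TSCORE column is read as a missing value, which is what the NMISS statistic in the earlier tables counts.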