chapter 8 producing summary reports. section 8.1 introduction to summary reports
TRANSCRIPT
Chapter 8
Producing Summary Reports
Section 8.1
Introduction to Summary Reports
3 3
Objectives
  Identify the different report writing procedures.
  Create one-way and two-way frequency tables using the FREQ procedure.
  Restrict the variables processed by the FREQ procedure.
  Generate simple descriptive statistics using the MEANS procedure.
  Group observations of a SAS data set for analysis using the CLASS statement in the MEANS procedure.
4 4
Summary Reports
[Diagram labels: Small Data Set, Large Data Set, Report Writing Step, Summarize Data and Report Writing Step, Report]

Small Data Set
LastName    FirstName  Age
TORRES      JAN         23
LANGKAMM    SARAH       46
SMITH       MICHAEL     71
WAGSCHAL    NADJA       37
TOERMOEN    JOCHEN      16

Large Data Set
LastName    FirstName  Age
TORRES      JAN         23
LANGKAMM    SARAH       46
SMITH       MICHAEL     71
WAGSCHAL    NADJA       37
TOERMOEN    JOCHEN      16
. . .
Ingersol    Hans        32
Himelewski  Janice      87
5 5
Summary Report Procedures
Toolbox
PROC FREQ produces frequency counts.
PROC MEANS produces simple statistics.
PROC REPORT produces flexible detail and summary reports.
6 6
PROC FREQ Output
Distribution of Job Code Values

The FREQ Procedure

Job                               Cumulative    Cumulative
Code      Frequency     Percent    Frequency       Percent
----------------------------------------------------------
FLTAT1           14       20.29           14         20.29
FLTAT2           18       26.09           32         46.38
FLTAT3           12       17.39           44         63.77
PILOT1            8       11.59           52         75.36
PILOT2            9       13.04           61         88.41
PILOT3            8       11.59           69        100.00
7 7
PROC MEANS Output
Salary by Job Code

The MEANS Procedure

Analysis Variable : Salary

Job         N
Code      Obs     N        Mean     Std Dev     Minimum      Maximum
--------------------------------------------------------------------
FLTAT1     14    14    25642.86     2951.07    21000.00     30000.00
FLTAT2     18    18    35111.11     1906.30    32000.00     38000.00
FLTAT3     12    12    44250.00     2301.19    41000.00     48000.00
PILOT1      8     8    69500.00     2976.10    65000.00     73000.00
PILOT2      9     9    80111.11     3756.48    75000.00     86000.00
PILOT3      8     8    99875.00     7623.98    92000.00    112000.00
--------------------------------------------------------------------
8 8
PROC REPORT Output
Salary Analysis

Job Code  Home Base       Salary
--------------------------------
FLTAT1    CARY          $131,000
          FRANKFURT     $100,000
          LONDON        $128,000
FLTAT2    CARY          $245,000
          FRANKFURT     $181,000
          LONDON        $206,000
FLTAT3    CARY          $217,000
          FRANKFURT     $134,000
          LONDON        $180,000
PILOT1    CARY          $211,000
          FRANKFURT     $135,000
          LONDON        $210,000
PILOT2    CARY          $323,000
          FRANKFURT     $240,000
          LONDON        $158,000
PILOT3    CARY          $300,000
          FRANKFURT     $205,000
          LONDON        $294,000
                      ==========
                      $3,598,000
Section 8.2
Basic Summary Reports
1010
SAS Vocabulary
  PROC FREQ
  TABLES
  NLEVELS
  Crosstabular
  *
  PROC MEANS
  VAR
  CLASS
  MAXDEC=
1111
Goal Report 1
International Airlines wants to know how many employees are in each job code.

Distribution of Job Code Values

The FREQ Procedure

Job                               Cumulative    Cumulative
Code      Frequency     Percent    Frequency       Percent
----------------------------------------------------------
FLTAT1           14       20.29           14         20.29
FLTAT2           18       26.09           32         46.38
FLTAT3           12       17.39           44         63.77
PILOT1            8       11.59           52         75.36
PILOT2            9       13.04           61         88.41
PILOT3            8       11.59           69        100.00
1212
Goal Report 2
Categorize job code and salary values to determine how many employees fall into each group.

Salary Distribution by Job Codes

The FREQ Procedure

Table of JobCode by Salary

JobCode           Salary

Frequency        |
Percent          |
Row Pct          |
Col Pct          |Less tha|25,000 t|More tha|  Total
                 |n 25,000|o 50,000|n 50,000|
-----------------+--------+--------+--------+
Flight Attendant |      5 |     39 |      0 |     44
                 |   7.25 |  56.52 |   0.00 |  63.77
                 |  11.36 |  88.64 |   0.00 |
                 | 100.00 | 100.00 |   0.00 |
-----------------+--------+--------+--------+
Pilot            |      0 |      0 |     25 |     25
                 |   0.00 |   0.00 |  36.23 |  36.23
                 |   0.00 |   0.00 | 100.00 |
                 |   0.00 |   0.00 | 100.00 |
-----------------+--------+--------+--------+
Total                   5       39       25       69
                     7.25    56.52    36.23   100.00
1313
Creating a Frequency Report
PROC FREQ displays frequency counts of the data values in a SAS data set.
General form of a simple PROC FREQ step:
PROC FREQ DATA=SAS-data-set;
RUN;
Example:
proc freq data=ia.crew;
run;
1414
Creating a Frequency Report
By default, PROC FREQ
  analyzes every variable in the SAS data set
  displays each distinct data value
  calculates the number of observations in which each data value appears (and the corresponding percentage)
  indicates, for each variable, how many observations have missing values.
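As a minimal sketch of these defaults, the step below runs PROC FREQ with no other statements on a small made-up data set. The data set name work.demo and its values are assumptions for illustration and are not taken from ia.crew. Location is missing in one row, so its frequency table should count three values and report the missing one separately.

```sas
/* Made-up data: the second row has a missing Location.          */
/* With list input, a single period is read as a missing value.  */
data work.demo;
   input JobCode $ Location $;
   datalines;
PILOT1 LONDON
FLTAT1 .
PILOT1 CARY
FLTAT2 LONDON
;
run;

/* No other statements: one frequency table per variable. */
proc freq data=work.demo;
run;
```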
1515
Default Frequency Reports
proc freq data=ia.crew;
run;

ia.crew
HireDate   LastName    FirstName    Location   Phone  EmpID   JobCode  Salary
07NOV1992  BEAUMONT    SALLY T.     LONDON     1132   E00525  PILOT1    72000
12MAY1985  BERGAMASCO  CHRISTOPHER  CARY       1151   E02466  FLTAT3    41000
04AUG1988  BETHEA      BARBARA ANN  FRANKFURT  1163   E00802  PILOT2    81000

[Diagram: one frequency report per variable: Distribution of HireDate, Distribution of LastName, Distribution of FirstName, Distribution of Location, Distribution of Phone, Distribution of EmpID, Distribution of JobCode, Distribution of Salary]
1616
Variables to Analyze
PROC FREQ is appropriate for variables with only a few distinct values.
For example, if you have a class list with one row for each student, it would not be meaningful to analyze the student ID, because every value appears exactly once.
PROC FREQ enables you to choose the variables to analyze.
1818
Printing Selected Variables
SAS enables you to select the variables to display or analyze.
In PROC PRINT, what statement selected the variables for the output?
The VAR statement
2020
Printing Selected Variables
SAS enables you to select the variables to display or analyze.
In PROC FREQ, what statement selects the variables?
The TABLES statement

PROC     Statement to select variables
PRINT    VAR
2121
Printing Selected Variables
SAS enables you to select the variables to display or analyze.

PROC     Statement to select variables
PRINT    VAR
FREQ     TABLES
2222
One-Way Frequency Report
Use the TABLES statement to limit the variables included in the frequency counts.
These are typically variables that have a limited number of distinct values.
General form of a PROC FREQ step with a TABLES statement:
PROC FREQ DATA=SAS-data-set;
   TABLES SAS-variables < / options >;
RUN;
Ignore the option for now.
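To see why variables with few distinct values suit the TABLES statement, here is a small sketch on a hypothetical class list. The data set work.classlist and the variables StudentID and Major are made up for illustration. The first step produces a table in which every count is 1, and the second produces three rows with informative counts.

```sas
/* Hypothetical class list: one row per student (all values are made up). */
data work.classlist;
   input StudentID $ Major $;
   datalines;
S001 Math
S002 Biology
S003 Math
S004 History
S005 Biology
S006 Math
;
run;

/* Every StudentID occurs once, so each frequency is 1. */
proc freq data=work.classlist;
   tables StudentID;
run;

/* Major has only three distinct values, so the counts summarize the data. */
proc freq data=work.classlist;
   tables Major;
run;
```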
2323
One-Way Frequency Report
Use the TABLES statement to analyze JobCode.
For example:
proc freq data=ia.crew;
   tables JobCode;
run;
2424
Creating a Frequency Report – Example
title 'Distribution of Job Code Values';
proc freq data=ia.crew;
   tables JobCode;
run;

Distribution of Job Code Values

The FREQ Procedure

Job                               Cumulative    Cumulative
Code      Frequency     Percent    Frequency       Percent
----------------------------------------------------------
FLTAT1           14       20.29           14         20.29
FLTAT2           18       26.09           32         46.38
FLTAT3           12       17.39           44         63.77
PILOT1            8       11.59           52         75.36
PILOT2            9       13.04           61         88.41
PILOT3            8       11.59           69        100.00
2525
One-Way Frequency Report
You can select more than one variable to analyze by listing them all in the TABLES statement. Separate them with a space.
This creates one report for each variable.
For example:
proc freq data=ia.crew;
   tables JobCode Location;
run;
2626
Creating a Frequency Report – Example
title;   /* a null TITLE statement clears the earlier title */
proc freq data=ia.crew;
   tables JobCode Location;
run;

The FREQ Procedure

JobCode Report
Job                               Cumulative    Cumulative
Code      Frequency     Percent    Frequency       Percent
----------------------------------------------------------
FLTAT1           14       20.29           14         20.29
FLTAT2           18       26.09           32         46.38
FLTAT3           12       17.39           44         63.77
PILOT1            8       11.59           52         75.36
PILOT2            9       13.04           61         88.41
PILOT3            8       11.59           69        100.00

Location Report
                                     Cumulative    Cumulative
Location     Frequency     Percent    Frequency       Percent
-------------------------------------------------------------
CARY                27       39.13           27         39.13
FRANKFURT           19       27.54           46         66.67
LONDON              23       33.33           69        100.00
2727
Displaying the Number of Levels – Example
Use the NLEVELS option in the PROC FREQ statement to display the number of levels for the variables included in the frequency counts.
title 'Distribution of Location Values';
proc freq data=ia.crew nlevels;
   tables Location;
run;
2828
Creating a Frequency Report – Example
Distribution of Location Values

The FREQ Procedure

Number of Variable Levels

Variable    Levels
------------------
Location         3

                                     Cumulative    Cumulative
Location     Frequency     Percent    Frequency       Percent
-------------------------------------------------------------
CARY                27       39.13           27         39.13
FRANKFURT           19       27.54           46         66.67
LONDON              23       33.33           69        100.00
2929
Creating a Frequency Report
To display the number of levels without displaying the frequency counts, add the NOPRINT option to the TABLES statement.
proc freq data=ia.crew nlevels;
   tables JobCode Location / noprint;
   title 'Number of Levels for Job Code and Location';
run;

Number of Levels for Job Code and Location

The FREQ Procedure

Number of Variable Levels

Variable    Levels
------------------
JobCode          6
Location         3
3030
Creating a Frequency Report
To display the number of levels for all variables without displaying any frequency counts, use the _ALL_ keyword and the NOPRINT option in the TABLES statement.
(You must also use the NLEVELS option.)
title 'Number of Levels for All Variables';
proc freq data=ia.crew nlevels;
   tables _all_ / noprint;
run;
3131
Analyzing Categories of Values
International Airlines wants to use formats to categorize the flight crew by job code.

Stored values             Formatted values
FLTAT1  FLTAT2  FLTAT3    Flight Attendant
PILOT1  PILOT2  PILOT3    Pilot
3232
Analyzing Categories of Values – Example
proc format;
   value $codefmt 'FLTAT1'-'FLTAT3'='Flight Attendant'
                  'PILOT1'-'PILOT3'='Pilot';
run;
proc freq data=ia.crew;
   format JobCode $codefmt.;
   tables JobCode;
run;
3333
Analyzing Categories of Values – Example
Distribution of Job Code Values

The FREQ Procedure

                                           Cumulative    Cumulative
JobCode             Frequency     Percent   Frequency       Percent
-------------------------------------------------------------------
Flight Attendant           44       63.77          44         63.77
Pilot                      25       36.23          69        100.00

PROC FREQ automatically groups the data by the formatted value of a variable if a format is associated with that variable.
3434
This exercise reinforces the concepts discussed previously.
Exercise
3535
Exercises
1. Use the StudyAbroad2 delimited data file to create a data set called StudyLocations.
   The variables in order are Country, Cost, Time, and BeginDate.
   Format Cost to reflect currency values and BeginDate to a readable date value.
   Change the column headings to Country, Trip Cost, Length of Program, and Trip Begin Date.
2. Create a listing report to verify all the work for #1 above.
3. Use PROC FREQ to determine the frequencies for Country and Time.
3636
Exercises – A Solution
data StudyLocations;
   infile 'StudyAbroad2.csv' dsd;
   input Country :$15. Cost Time :$8. BeginDate :mmddyy10.;
   format BeginDate mmddyy10. Cost dollar8.;
   label Cost='Trip Cost'
         Time='Length of Program'
         BeginDate='Trip Begin Date';
run;
proc print data=StudyLocations noobs label;
run;
proc freq data=StudyLocations;
   tables Country Time;
run;
3737
Exercises
Partial PROC PRINT Output

                            Length
                    Trip    of          Trip Begin
Country             Cost    Program     Date

Germany           $4,200    Semester    09/01/2007
France            $8,162    Year        10/01/2007
Great Britain     $8,225    Year        09/01/2007
Australia         $7,500    Year        06/01/2007
Sweden            $5,286    Semester    12/01/2007
Spain             $3,500    Semester    09/01/2007
Mexico            $2,300    Semester    09/01/2007
France            $3,971    Semester    10/01/2007
Great Britain     $8,225    Year        09/01/2007
Sweden            $5,286    Semester    12/01/2007
Germany           $4,200    Semester    09/01/2007
Great Britain     $4,700    Semester    09/01/2007
Germany           $7,625    Year        09/01/2007
3838
Exercises
Partial PROC FREQ Output

The FREQ Procedure

                                          Cumulative    Cumulative
Country          Frequency     Percent     Frequency       Percent
------------------------------------------------------------------
Australia               11       20.75            11         20.75
France                   9       16.98            20         37.74
Germany                  7       13.21            27         50.94
Great Britain           10       18.87            37         69.81
Mexico                   4        7.55            41         77.36
Spain                    5        9.43            46         86.79
Sweden                   7       13.21            53        100.00

Length of Program

                                     Cumulative    Cumulative
Time        Frequency     Percent     Frequency       Percent
-------------------------------------------------------------
Semester           25       47.17            25         47.17
Year               28       52.83            53        100.00
3939
Crosstabular Frequency Reports
A two-way, or crosstabular, frequency report analyzes all possible combinations of the distinct values of two variables.
The asterisk (*) operator in the TABLES statement is used to cross variables.
General form of the FREQ procedure to create a crosstabular report:
PROC FREQ DATA=SAS-data-set;
   TABLES variable1*variable2;
RUN;
4040
Crosstabular Frequency Reports – Example
proc format;
   value $codefmt 'FLTAT1'-'FLTAT3'='Flight Attendant'
                  'PILOT1'-'PILOT3'='Pilot';
   value money low-<25000  ='Less than 25,000'
               25000-50000 ='25,000 to 50,000'
               50000<-high ='More than 50,000';
run;
proc freq data=ia.crew;
   tables JobCode*Salary;
   format JobCode $codefmt. Salary money.;
   title 'Salary Distribution by Job Codes';
run;
4141
Crosstabular Frequency Reports
Salary Distribution by Job Codes

The FREQ Procedure

Table of JobCode by Salary

JobCode           Salary

Frequency        |
Percent          |
Row Pct          |
Col Pct          |Less tha|25,000 t|More tha|  Total
                 |n 25,000|o 50,000|n 50,000|
-----------------+--------+--------+--------+
Flight Attendant |      5 |     39 |      0 |     44
                 |   7.25 |  56.52 |   0.00 |  63.77
                 |  11.36 |  88.64 |   0.00 |
                 | 100.00 | 100.00 |   0.00 |
-----------------+--------+--------+--------+
Pilot            |      0 |      0 |     25 |     25
                 |   0.00 |   0.00 |  36.23 |  36.23
                 |   0.00 |   0.00 | 100.00 |
                 |   0.00 |   0.00 | 100.00 |
-----------------+--------+--------+--------+
Total                   5       39       25       69
                     7.25    56.52    36.23   100.00

[Slide annotations: First Variable (JobCode) labels the rows; Second Variable (Salary) labels the columns.]
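Each cell of the crosstabulation stacks four statistics: Frequency, Percent, Row Pct, and Col Pct. The DATA step below is only a worked check of that arithmetic for the Flight Attendant / Less than 25,000 cell, using the counts printed in the table (5 in the cell, 44 in the row, 5 in the column, 69 overall). The variable names are made up for this sketch. The three results are written to the SAS log and should agree with the 7.25, 11.36, and 100.00 shown in the table.

```sas
/* Worked check of one crosstab cell; variable names are illustrative only. */
data _null_;
   cell     = 5;    /* Flight Attendants earning less than 25,000 */
   rowtotal = 44;   /* all Flight Attendants                      */
   coltotal = 5;    /* all employees earning less than 25,000     */
   total    = 69;   /* all employees                              */

   percent = 100 * cell / total;      /* share of the whole table */
   rowpct  = 100 * cell / rowtotal;   /* share of the row         */
   colpct  = 100 * cell / coltotal;   /* share of the column      */

   put percent= 6.2 rowpct= 6.2 colpct= 6.2;
run;
```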
4242
Crosstabular Frequency Reports – Example
To display the crosstabulation results in a listing form, add the CROSSLIST option to the TABLES statement.
proc freq data=ia.crew;
   tables JobCode*Location / crosslist;
   title 'Location Distribution for Job Codes';
run;
4343
Crosstabular Frequency Reports
Partial Output

Location Distribution for Job Codes

The FREQ Procedure

Table of JobCode by Location

Job                                              Row     Column
Code      Location     Frequency    Percent    Percent    Percent
-----------------------------------------------------------------
FLTAT1    CARY                 5       7.25      35.71      18.52
          FRANKFURT            4       5.80      28.57      21.05
          LONDON               5       7.25      35.71      21.74
          Total               14      20.29     100.00
-----------------------------------------------------------------
FLTAT2    CARY                 7      10.14      38.89      25.93
          FRANKFURT            5       7.25      27.78      26.32
          LONDON               6       8.70      33.33      26.09
          Total               18      26.09     100.00
-----------------------------------------------------------------
4444
This exercise reinforces the concepts discussed previously.
Exercise
4545
Exercises
Using the StudyLocations data set you created in a previous exercise, create a crosstabular frequency report using the CROSSLIST option. Display the length of the program by country.
4646
Exercises
proc freq data=StudyLocations;
   tables Time*Country / crosslist;
run;
4747
Exercises
4848
International Airlines wants to determine the minimum, maximum, and average salary for each job code.
Business Task
4949
Calculating Summary Statistics – Example
The MEANS procedure displays simple descriptive statistics for the numeric variables in a SAS data set.
General form of a simple PROC MEANS step:
PROC MEANS DATA=SAS-data-set;
RUN;
proc means data=ia.crew;
   title 'Salary Analysis';
run;
How many variables will be analyzed?
5050
Calculating Summary Statistics
Salary Analysis

The MEANS Procedure

Variable      N         Mean      Std Dev      Minimum       Maximum
--------------------------------------------------------------------
HireDate     69      9812.78      1615.44      7318.00      12690.00
Salary       69     52144.93     25521.78     21000.00     112000.00
--------------------------------------------------------------------

In this example, PROC MEANS analyzed two variables, HireDate and Salary.
5151
Calculating Summary Statistics
By default, PROC MEANS
  analyzes every numeric variable in the SAS data set
  prints the following statistics
    – N
    – MEAN
    – STD
    – MIN
    – MAX
  excludes missing values before calculating statistics.
5252
Summary Statistics
Default Statistics:
N       number of rows with nonmissing values
MEAN    arithmetic mean (or average)
STD     standard deviation
MIN     minimum value
MAX     maximum value
5353
Choosing Summary Statistics
Other Statistics:
RANGE     difference between lowest and highest values
MEDIAN    50th percentile value
SUM       total
NMISS     number of rows with missing values
For more information on other PROC MEANS options, refer to the SAS OnlineDoc.
5454
Choosing Summary Statistics
To see a different statistic, or to limit which of the default statistics are printed, list the statistics you want as options in the PROC MEANS statement.
You saw this in Chapter 2 when you worked on syntax errors.
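As a brief sketch of requesting non-default statistics, the step below asks for N, NMISS, SUM, RANGE, and MEDIAN on a small made-up data set. The data set work.pay and its values are assumptions for illustration. EmpID is character, so only Salary is analyzed. One Salary value is missing, which should give N=3 and NMISS=1, and the remaining values (21000, 30000, 45000) give a sum of 96000, a range of 24000, and a median of 30000.

```sas
/* Made-up data: the third Salary value is missing. */
data work.pay;
   input EmpID $ Salary;
   datalines;
E1 21000
E2 30000
E3 .
E4 45000
;
run;

/* Statistic keywords are listed as options in the PROC MEANS statement. */
proc means data=work.pay n nmiss sum range median;
run;
```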
5555
Choosing Summary Statistics
title 'Salary Analysis';
proc means data=ia.crew mean max min;
run;

Salary Analysis

The MEANS Procedure

Variable            Mean        Maximum        Minimum
------------------------------------------------------
HireDate         9812.78       12690.00        7318.00
Salary          52144.93      112000.00       21000.00
------------------------------------------------------

The statistics appear in the order in which they are listed in the PROC MEANS statement, not in the default order. Because max is listed before min, Maximum appears before Minimum in the output.
5656
Grouping Observations
PROC MEANS may not always print two digits to the right of the decimal point.
To control the maximum number of decimal places for PROC MEANS to use in printing results, use the MAXDEC= option in the PROC MEANS statement.
General form of the PROC MEANS statement with the MAXDEC= option:
PROC MEANS DATA=SAS-data-set MAXDEC=number;
RUN;
5757
Choosing Summary Statistics
title 'Salary Analysis';
proc means data=ia.crew mean max min maxdec=1;
run;

Salary Analysis

The MEANS Procedure

Variable           Mean       Maximum       Minimum
---------------------------------------------------
HireDate         9812.8       12690.0        7318.0
Salary          52144.9      112000.0       21000.0
---------------------------------------------------

Values are rounded to the specified number of decimals.
The range of values for the MAXDEC= option is 0-8.
The MAXDEC= option does not use format names to format the values.
5959
Printing Selected Variables
SAS enables you to select the variables to display and analyze.
In PROC MEANS, what statement selects the variables?
The VAR statement

PROC     Statement to select variables
PRINT    VAR
FREQ     TABLES
6060
Printing Selected Variables
SAS enables you to select the variables to display and analyze.

PROC     Statement to select variables
PRINT    VAR
FREQ     TABLES
MEANS    VAR
6161
Selecting Variables
The VAR statement restricts the variables processed by PROC MEANS.
General form of the VAR statement:
VAR SAS-variable(s);
6262
Selecting Variables – Example
proc means data=ia.crew;
   var Salary;
   title 'Salary Analysis';
run;

ia.crew
HireDate   LastName    FirstName    Location   Phone  EmpID   JobCode  Salary
07NOV1992  BEAUMONT    SALLY T.     LONDON     1132   E00525  PILOT1    72000
12MAY1985  BERGAMASCO  CHRISTOPHER  CARY       1151   E02466  FLTAT3    41000
04AUG1988  BETHEA      BARBARA ANN  FRANKFURT  1163   E00802  PILOT2    81000

Salary Analysis

The MEANS Procedure

Analysis Variable : Salary

 N            Mean         Std Dev         Minimum         Maximum
------------------------------------------------------------------
69        52144.93        25521.78        21000.00       112000.00
------------------------------------------------------------------
6363
Grouping Observations
The CLASS statement in the MEANS procedure groups the observations of the SAS data set for analysis.
General form of the CLASS statement:
CLASS SAS-variable(s);
6464
Grouping Observations – Example
title 'Salary by Job Code';
proc means data=ia.crew maxdec=2;
   var Salary;
   class JobCode;
run;

ia.crew
HireDate   LastName    FirstName    Location   Phone  EmpID   JobCode  Salary
07NOV1992  BEAUMONT    SALLY T.     LONDON     1132   E00525  PILOT1    72000
12MAY1985  BERGAMASCO  CHRISTOPHER  CARY       1151   E02466  FLTAT3    41000
04AUG1988  BETHEA      BARBARA ANN  FRANKFURT  1163   E00802  PILOT2    81000
6565
Grouping Observations
Salary by Job Code

The MEANS Procedure

Analysis Variable : Salary

Job         N
Code      Obs     N        Mean     Std Dev     Minimum      Maximum
--------------------------------------------------------------------
FLTAT1     14    14    25642.86     2951.07    21000.00     30000.00
FLTAT2     18    18    35111.11     1906.30    32000.00     38000.00
FLTAT3     12    12    44250.00     2301.19    41000.00     48000.00
PILOT1      8     8    69500.00     2976.10    65000.00     73000.00
PILOT2      9     9    80111.11     3756.48    75000.00     86000.00
PILOT3      8     8    99875.00     7623.98    92000.00    112000.00
--------------------------------------------------------------------
6666
Using Formats with PROC MEANS
You cannot format the statistics, but you can format the CLASS variables.
proc format;
   value $codefmt 'FLTAT1'-'FLTAT3'='Flight Attendant'
                  'PILOT1'-'PILOT3'='Pilot';
run;
title 'Salary by Job Code';
proc means data=ia.crew mean max min maxdec=2;
   var Salary;
   class JobCode;
   format JobCode $codefmt.;
run;
6767
Grouping Observations
Salary by Job Code

The MEANS Procedure

Analysis Variable : Salary

                      N
JobCode             Obs         Mean       Maximum       Minimum
----------------------------------------------------------------
Flight Attendant     44     34590.91      48000.00      21000.00
Pilot                25     83040.00     112000.00      65000.00
6868
This exercise reinforces the concepts discussed previously.
Exercise
6969
Exercises
Using the StudyLocations data set that you created in the previous section and PROC MEANS, create a report that includes the following:
  the minimum, maximum, and range of Cost
  the cost of the trip grouped by Country
  appropriately formatted values
7070
Exercises
proc means data=StudyLocations min max range;
   var Cost;
   class Country;
run;
7171
Exercises (Alternate Solution)
proc means data=StudyLocations min max range nonobs;
   var Cost;
   class Country;
run;
To remove the N Obs column (the NOBS statistic) from the output, use the NONOBS option in the PROC MEANS statement.
7272
This exercise reinforces the concepts discussed previously.
Exercise – Section 8.2
Section 8.3
The REPORT Procedure
7474
Objectives
  Use the REPORT procedure to create a listing report.
  Apply the ORDER usage type to sort the data in a listing report.
  Apply the SUM and GROUP usage types to create a summary report.
  Use the RBREAK statement to produce a grand total.
7575
SAS Vocabulary
  PROC REPORT
  WINDOWS|WD
  NOWINDOWS|NOWD
  PROC TABULATE
  COLUMN
  DEFINE
  FORMAT=
  WIDTH=
  ORDER
  GROUP
  RBREAK
  HEADLINE
  HEADSKIP
7676
REPORT Procedure Features
PROC REPORT enables you to
  create listing reports
    Rows are listed one line at a time (as in PROC PRINT output).
  create summary reports
    Data is grouped and many rows are combined in one line of output.

Data Set
LastName   FirstName    Date    Purchase
TORRES     JAN         16409      120.80
TORRES     JAN         16578      500.20
SMITH      MICHAEL     16614       82.25
SMITH      MICHAEL     15999       16.48
SMITH      MICHAEL     16080       25.45
YONKERS    JESSIE      16783      832.98
ZIMMEL     JIMMY       16999       48.32

Summarized Report
LastName   FirstName    Date    Purchase
TORRES     JAN         16409      621.00
SMITH      MICHAEL     16614      124.18
YONKERS    JESSIE      16783      832.98
ZIMMEL     JIMMY       16999       48.32
7777
REPORT Procedure Features
PROC REPORT enables you to
  create listing reports
    Rows are listed one line at a time (as in PROC PRINT output).
  create summary reports
    Data is grouped and many rows are combined in one line of output.
  enhance reports easily, for example, with formats, labels, and groups
  request separate subtotals and grand totals
  generate reports in an interactive point-and-click (default) or programming environment.
7878
PROC REPORT versus PROC PRINT

FEATURE                                    PROC REPORT   PROC PRINT
Detail Report                              Yes           Yes
Summary Report                             Yes           No
Crosstabular Report                        Yes           No
Grand Totals                               Yes           Yes
Subtotals                                  Yes           Yes, but not without Grand Total
Labels used automatically                  Yes           No
Can have data appear in sorted order
without sorting the data first             Yes           No
7979
Creating a List Report
General form of a simple PROC REPORT step:
PROC REPORT DATA=SAS-data-set <options>;
RUN;
Selected options:
WINDOWS | WD        invokes the procedure in an interactive REPORT window (default).
NOWINDOWS | NOWD    displays the report in the OUTPUT window.
proc report data=ia.crew nowd;
run;
8080
Creating a List Report
proc report data=ia.crew nowd;
run;

Output
                              JobCod
Location   Phone   EmpID      e         Salary
LONDON     2388    E01163     FLTAT2     34000
CARY       1381    E02102     FLTAT3     42000
LONDON     2553    E00710     FLTAT2     33000
CARY       2554    E01818     PILOT2     82000
CARY       2569    E03921     FLTAT3     47000
LONDON     2577    E03339     FLTAT2     35000
LONDON     2582    E03555     PILOT2     83000
CARY       2599    E02766     FLTAT2     32000
LONDON     2745    E03740     PILOT1     73000
FRANKFURT  1160    E01483     FLTAT2     33000
CARY       2779    E01384     FLTAT2     38000
FRANKFURT  2797    E00223     PILOT3    105000
FRANKFURT  1136    E04581     PILOT1     69000
FRANKFURT  1183    E00632     PILOT3    100000
FRANKFURT  2960    E03884     FLTAT2     38000
LONDON     2997    E00034     FLTAT3     44000
LONDON     1156    E03591     FLTAT3     47000
FRANKFURT  1194    E04064     FLTAT2     37000
FRANKFURT  1197    E01996     FLTAT1     26000
LONDON     1160    E04356     FLTAT2     34000
LONDON     1552    E01447     FLTAT3     45000
FRANKFURT  1553    E02679     FLTAT1     27000
CARY       1555    E02606     FLTAT2     36000
LONDON     1565    E03323     FLTAT1     22000
8181
Creating a List Report
What do you notice about JobCode?

Output
                              JobCod
Location   Phone   EmpID      e         Salary
LONDON     2388    E01163     FLTAT2     34000
CARY       1381    E02102     FLTAT3     42000
LONDON     2553    E00710     FLTAT2     33000
CARY       2554    E01818     PILOT2     82000
CARY       2569    E03921     FLTAT3     47000
LONDON     2577    E03339     FLTAT2     35000
LONDON     2582    E03555     PILOT2     83000
CARY       2599    E02766     FLTAT2     32000
LONDON     2745    E03740     PILOT1     73000
FRANKFURT  1160    E01483     FLTAT2     33000
CARY       2779    E01384     FLTAT2     38000
FRANKFURT  2797    E00223     PILOT3    105000
FRANKFURT  1136    E04581     PILOT1     69000
FRANKFURT  1183    E00632     PILOT3    100000
FRANKFURT  2960    E03884     FLTAT2     38000
LONDON     2997    E00034     FLTAT3     44000
LONDON     1156    E03591     FLTAT3     47000
FRANKFURT  1194    E04064     FLTAT2     37000
FRANKFURT  1197    E01996     FLTAT1     26000
LONDON     1160    E04356     FLTAT2     34000
LONDON     1552    E01447     FLTAT3     45000
FRANKFURT  1553    E02679     FLTAT1     27000
CARY       1555    E02606     FLTAT2     36000
LONDON     1565    E03323     FLTAT1     22000
8282
Creating a List Report
What happens if you forget the NOWD option?
proc report data=ia.crew;
run;
Try it.
Your instructor can show you how to change the width of JobCode, the format of Salary, and the color of Salary to green.
8383
Creating a List Report
What happens if you forget the NOWD option?
proc report data=ia.crew;
run;
An interactive window opens and you can make changes to the report interactively, rather than modifying the code.
You must close this window before you submit any other code. Otherwise, the code waits in the buffer until you close the window.
After the window is closed, any code submitted will be executed.
8484
The REPORT Procedure
The default listing displays
  each data value as it is stored in the data set, or the formatted value if a format is stored with the data
  variable names or labels as report column headings
  a default width for the report columns (The width that is used is discussed later.)
  character values left-justified
  numeric values right-justified
  observations in the order in which they are stored in the data set.
8686
Printing Selected Variables
SAS enables you to select the variables to display and analyze.
In PROC REPORT, what statement selects the variables?
The COLUMN statement

PROC     Statement to select variables
PRINT    VAR
FREQ     TABLES
MEANS    VAR
8787
Reference: Printing Selected Variables
SAS enables you to select the variables to display and analyze.

PROC     Statement to select variables
PRINT    VAR
FREQ     TABLES
MEANS    VAR
REPORT   COLUMN
8888
Printing Selected Variables
You can use a COLUMN statement in PROC REPORT to do the following:
  select the variables to appear in the report
  order the variables in the report
General form of the COLUMN statement:
COLUMN SAS-variables;
8989
Sample Listing Report – Example
title 'Salary Analysis';
proc report data=ia.crew nowd;
   column JobCode Location Salary;
run;

Partial SAS Output

Salary Analysis

JobCod
e       Location      Salary
PILOT1  LONDON         72000
FLTAT3  CARY           41000
PILOT2  FRANKFURT      81000
PILOT2  FRANKFURT      83000
FLTAT2  LONDON         36000
PILOT1  LONDON         65000
FLTAT2  FRANKFURT      35000
FLTAT2  FRANKFURT      38000
FLTAT1  LONDON         28000
FLTAT3  LONDON         44000
FLTAT2  CARY           37000
. . .
9090
The DEFINE Statement
You can enhance the report by using DEFINE statements to perform the following tasks:
  define how each variable is used in the report
  assign formats to variables
  specify report column headers and column widths
  change the order of the rows in the report
9191
The DEFINE Statement
General form of the DEFINE statement:
DEFINE variable / <usage> <attribute-list>;
The slash (/) is required.
Add a DEFINE statement to the PROC REPORT step only for the variables whose appearance you want to change from the default. You do not need a DEFINE statement for every variable.
9292
The DEFINE Statement
Selected attributes:
' report-column-header ' defines the report column header.
If there is a label stored in the descriptor portion of the data set, it is the default header. If one is not stored, SAS uses the variable name.
Example:
title 'Salary Analysis';
proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define Salary / 'Annual Salary';
run;
9393
Sample Listing Report – Example
title 'Salary Analysis';
proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define Salary / 'Annual Salary';
run;
The / is required.

Salary Analysis

JobCod                Annual
e       Location      Salary
PILOT1  LONDON         72000
FLTAT3  CARY           41000
PILOT2  FRANKFURT      81000
PILOT2  FRANKFURT      83000
FLTAT2  LONDON         36000
PILOT1  LONDON         65000
FLTAT2  FRANKFURT      35000
FLTAT2  FRANKFURT      38000
. . .
9494
The DEFINE Statement
Selected attributes:
FORMAT= assigns a format to a variable.
If there is a format stored in the descriptor portion of the data set, it is the default format.
Example:
title 'Salary Analysis';
proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define Salary / 'Annual Salary' format=dollar8.;
run;
9595
Sample Listing Report – Example
title 'Salary Analysis';
proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define Salary / 'Annual Salary' format=dollar8.;
run;
Use one /, followed by all attributes in any order.

Salary Analysis

JobCod                Annual
e       Location      Salary
PILOT1  LONDON       $72,000
FLTAT3  CARY         $41,000
PILOT2  FRANKFURT    $81,000
PILOT2  FRANKFURT    $83,000
FLTAT2  LONDON       $36,000
PILOT1  LONDON       $65,000
FLTAT2  FRANKFURT    $35,000
FLTAT2  FRANKFURT    $38,000
. . .
9696
The DEFINE Statement
Selected attributes:
WIDTH= controls the width of a report column.
The default width is
  the variable length for character variables
  9 for numeric variables
  the format width if there is a format stored in the descriptor portion of the data set.
The WIDTH= option enables you to change the width of JobCode so that the e is not on a separate line.
9797
Sample Listing Report – Example
title 'Salary Analysis';
proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define Salary / 'Annual Salary' format=dollar8.;
   define JobCode / width=8;
run;

Salary Analysis

                       Annual
JobCode   Location     Salary
PILOT1    LONDON      $72,000
FLTAT3    CARY        $41,000
PILOT2    FRANKFURT   $81,000
PILOT2    FRANKFURT   $83,000
FLTAT2    LONDON      $36,000
PILOT1    LONDON      $65,000
FLTAT2    FRANKFURT   $35,000
FLTAT2    FRANKFURT   $38,000
. . .

The order of the DEFINE statements does not matter.
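A small sketch of that last point, using a made-up three-row data set (work.crew_demo is an assumption for illustration; its rows copy the first three rows of the output above). Here the DEFINE statement for JobCode is written before the one for Salary, the reverse of the example, and the report layout should be unchanged because column order comes from the COLUMN statement.

```sas
/* Made-up stand-in for ia.crew with three rows. */
data work.crew_demo;
   input JobCode $ Location :$9. Salary;
   datalines;
PILOT1 LONDON 72000
FLTAT3 CARY 41000
PILOT2 FRANKFURT 81000
;
run;

title 'Salary Analysis';
proc report data=work.crew_demo nowd;
   column JobCode Location Salary;                   /* sets the column order     */
   define JobCode / width=8;                         /* DEFINE order is swapped,  */
   define Salary  / 'Annual Salary' format=dollar8.; /* the layout stays the same */
run;
```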
9898
Enhancing the Listing Report – Example
  Change column headings.
  Increase the column widths.
  Add a format to display Salary with dollar signs and commas.
proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define JobCode / width=8 'Job Code';
   define Location / 'Home Base';
   define Salary / format=dollar10.;
run;
9999
Enhancing the Listing Report – Example
Partial SAS Output

Job Code  Home Base      Salary
PILOT1    LONDON        $72,000
FLTAT3    CARY          $41,000
PILOT2    FRANKFURT     $81,000
PILOT2    FRANKFURT     $83,000
FLTAT2    LONDON        $36,000
PILOT1    LONDON        $65,000
FLTAT2    FRANKFURT     $35,000
FLTAT2    FRANKFURT     $38,000
FLTAT1    LONDON        $28,000
. . .
Enhancing the Listing Report – Example Change the report to group the pilots and flight attendants.
ORDER Usage Type
Selected attributes:
ORDER orders the rows in the report.
The ORDER usage
   orders the report in ascending order (include the DESCENDING option in the DEFINE statement to force descending order)
   suppresses repetitious printing of values
   does not need the data to be sorted previously.
ORDER Usage Type – Example
Display the data in order by JobCode.
proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define JobCode / order width=8 'Job Code';
   define Location / 'Home Base';
   define Salary / format=dollar10.;
run;
ORDER Usage Type – Example Partial SAS Output
Salary Analysis
Job Code Home Base Salary
FLTAT1 LONDON $28,000
FRANKFURT $25,000
CARY $23,000
. . .
FRANKFURT $27,000
LONDON $22,000
FLTAT2 LONDON $36,000
FRANKFURT $35,000
. . .
FRANKFURT $33,000
CARY $38,000
The repeated JobCode values (for example, FLTAT1 and FLTAT2) are not printed on every row; they are suppressed.
ORDER Usage Type – Example
Display the data in descending order by JobCode.
proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define JobCode / descending order width=8 'Job Code';
   define Location / 'Home Base';
   define Salary / format=dollar10.;
run;
The DESCENDING keyword can go anywhere in the DEFINE statement after the /. It cannot be abbreviated.
ORDER Usage Type – Example Partial SAS Output
Salary Analysis
Job Code Home Base Salary
PILOT3 LONDON $108,000
CARY $112,000
LONDON $94,000
. . .
PILOT2 FRANKFURT $81,000
FRANKFURT $83,000
. . .
PILOT1 LONDON $72,000
LONDON $65,000
CARY $71,000
. . .
ORDER Usage Type – Example What if you also want Location in sorted order?
Salary Analysis
Job Code Home Base Salary
FLTAT1 LONDON $28,000
FRANKFURT $25,000
CARY $23,000
. . .
FRANKFURT $27,000
LONDON $22,000
FLTAT2 LONDON $36,000
FRANKFURT $35,000
. . .
ORDER Usage Type – Example
Display the data in order by JobCode and Location.
proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define JobCode / order width=8 'Job Code';
   define Location / order 'Home Base';
   define Salary / format=dollar10.;
run;
ORDER Usage Type – Example
Output
Salary Analysis

Job Code  Home Base      Salary
FLTAT1    CARY          $23,000
                        $21,000
                        . . .
          FRANKFURT     $25,000
                        $22,000
                        . . .
          LONDON        $28,000
                        $29,000
                        $24,000
                        $25,000
                        $22,000
FLTAT2    CARY          $37,000
                        $34,000
                        $33,000
                        . . .
          FRANKFURT     $35,000
                        $38,000
                        . . .
ORDER Usage Type – Example
How did SAS know to group Location in JobCode?
proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define JobCode / order width=8 'Job Code';
   define Location / order 'Home Base';
   define Salary / format=dollar10.;
run;
Remember: The COLUMN statement selects and controls the order of the variables in the output. JobCode is listed before Location in the COLUMN statement, so Location is ordered within JobCode.
ORDER Usage Type – Example
To order JobCode within Location instead, specify Location before JobCode in the COLUMN statement.
proc report data=ia.crew nowd;
   column Location JobCode Salary;
   define JobCode / order width=8 'Job Code';
   define Location / order 'Home Base';
   define Salary / format=dollar10.;
run;
ORDER Usage Type – Example
Specify Location before JobCode in the COLUMN statement.
Salary Analysis

Home Base  Job Code      Salary
CARY       FLTAT1       $23,000
                        $21,000
                        $29,000
                        $30,000
                        $28,000
           FLTAT2       $37,000
                        $34,000
                        $33,000
                        . . .
FRANKFURT  FLTAT1       $25,000
                        $22,000
                        $26,000
                        $27,000
           FLTAT2       $35,000
                        $38,000
                        $33,000
                        . . .
This exercise reinforces the concepts discussed previously.
Exercise
Exercises
Using the StudyLocations data set that you created earlier and PROC REPORT, create the following report:
1. Display the variables in the following order: Country, Length of Program, Trip Begin Date, and Trip Cost.
2. Format the following variables appropriately:
   the Trip Cost as currency
   the Trip Begin Date so that December 1, 2007 will appear as 01/12/2007.
3. Give an appropriate width for other variables.
4. Order the rows by Country.
5. Title the report Study Abroad Options.
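As a hedged aside on item 2, the DDMMYY10. format writes a SAS date as dd/mm/yyyy, which is the layout the exercise asks for. The DATA step below is a minimal sketch for checking candidate formats in the log before using them in a DEFINE statement. The variable names and the cost value are hypothetical and are not taken from the StudyLocations data set.

```sas
/* Sketch only: TripBegin and TripCost are illustrative names. */
data _null_;
   TripBegin = '01DEC2007'd;   /* SAS date constant for December 1, 2007 */
   TripCost  = 4500;           /* made-up amount, not from the course data */
   /* Write both values to the log with the candidate formats. */
   put TripBegin= ddmmyy10. TripCost= dollar10.;
run;
```

The date should appear in the log as 01/12/2007. The same format names can then be supplied through FORMAT= in the DEFINE statements of the report.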
Exercises
title 'Study Abroad Options';
proc report data=StudyLocations nowd;
   column country time begindate cost;
   define beginDate / format=ddmmyy10.;
   define country / order;
run;
Exercises – Partial Output
Business Task
International Airlines wants to summarize Salary by JobCode for each Location.
Desired Report
Salary Analysis

Job Code  Home Base      Salary
FLTAT1    CARY         $131,000
          FRANKFURT    $100,000
          LONDON       $128,000
FLTAT2    CARY         $245,000
          FRANKFURT    $181,000
          LONDON       $206,000
FLTAT3    CARY         $217,000
          FRANKFURT    $134,000
          LONDON       $180,000
PILOT1    CARY         $211,000
          FRANKFURT    $135,000
          LONDON       $210,000
PILOT2    CARY         $323,000
          FRANKFURT    $240,000
          LONDON       $158,000
PILOT3    CARY         $300,000
          FRANKFURT    $205,000
          LONDON       $294,000
                     ==========
                     $3,598,000
You want one line for each combination of JobCode and Location (for example, one line for the FLTAT1 flight attendants from Cary) that shows the total salary.
The DEFINE Statement – Review
General form of the DEFINE statement in PROC REPORT:
DEFINE variable / <usage> <attribute-list>;
The USAGE attribute is necessary to produce this summary report.
The DEFINE Statement – USAGE Attribute
By default in PROC REPORT,
   character variables have a display usage and produce a listing report. (Each row is listed and there is no summarization or collapsing of rows.)
   numeric variables have an analysis usage and produce summary reports.

Variable Type   Default Usage   Report Produced
Character       Display         Listing
Numeric         Analysis        Summary
The DEFINE Statement – USAGE Attribute
The analysis usage for numeric variables
   uses a default statistic of SUM. (You can choose a different statistic.)
   does not summarize the data when the report also contains character variables with their default usage.
Character data has a display usage by default.
Important: If you have at least one column with a display usage, you get a listing report.
The DEFINE Statement – USAGE Attribute
If your report has at least one character display column, PROC REPORT produces a listing report by default, regardless of the number of numeric columns.

Variable Type   Default Usage   Report Produced
Character       Display         Listing
Numeric         Analysis        Summary
Character and Numeric Variables
JobCode: display usage type (character variable default)
Salary: analysis usage type (numeric variable default). The default statistic is SUM.

Original Data Set
JobCode   Salary
PILOT1     72000
FLTAT3     41000
PILOT2     81000
PILOT2     83000
FLTAT2     36000
PILOT1     65000

Report (Listing Report Produced)
JobCode   Salary
PILOT1     72000
FLTAT3     41000
PILOT2     81000
PILOT2     83000
FLTAT2     36000
PILOT1     65000
Numeric Variables Only
Salary: analysis usage, SUM statistic

Original Data Set
Salary
 72000
 41000
 81000
 83000
 36000
 65000

Report (Summary Report Produced)
Salary
378000   (sum of all Salary values in the data set)

Why did you get a summary report?
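One way to answer the question above: when the COLUMN statement lists only numeric variables, no column has a display usage, so PROC REPORT collapses every observation into a single summary row. The following is a minimal sketch, assuming the same ia.crew data set used in the other examples. The ANALYSIS and SUM keywords are written out only to make the defaults visible.

```sas
/* Sketch: a report with one numeric column and no display columns. */
proc report data=ia.crew nowd;
   column Salary;
   /* ANALYSIS and SUM are already the defaults for a numeric variable. */
   define Salary / analysis sum format=dollar12.;
run;
```

Adding a character column such as JobCode to the COLUMN statement, without a GROUP usage, should turn this back into a listing report.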
Defining Group Variables
To have character columns appear in the summarized report, use the GROUP attribute.
Defining Group Variables
In order for grouping to take effect, the word group must be placed in the DEFINE statement for every character variable.
Example:
proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define JobCode / group width=8 'Job Code';
   define Location / group 'Home Base';
   define Salary / format=dollar10.;
run;
Defining Group Variables
All observations whose group variables have the same values are collapsed into a single row in the report.
Defining Group Variables
Salary: analysis usage, SUM statistic

Original Data Set
JobCode   Salary
PILOT1     72000
FLTAT3     41000
PILOT2     81000
PILOT2     83000
FLTAT2     36000
PILOT1     65000

JobCode as Display Usage: Report (Listing Report Produced)
JobCode   Salary
PILOT1     72000
FLTAT3     41000
PILOT2     81000
PILOT2     83000
FLTAT2     36000
PILOT1     65000

JobCode as Group Usage: Report (Summary Report Produced)
JobCode   Salary
FLTAT2     36000
FLTAT3     41000
PILOT1    137000
PILOT2    164000

FLTAT2 is in both reports.
Defining Group Variables
As you saw with the ORDER usage, nesting of group variables is determined by the order of the variables in the COLUMN statement.
Example:
proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define JobCode / group width=8 'Job Code';
   define Location / group 'Home Base';
   define Salary / format=dollar10.;
run;
Summarizing the Data
Partial SAS Output
Salary Analysis

Job Code  Home Base      Salary
FLTAT1    CARY         $131,000
          FRANKFURT    $100,000
          LONDON       $128,000
FLTAT2    CARY         $245,000
          FRANKFURT    $181,000
          LONDON       $206,000
FLTAT3    CARY         $217,000
          FRANKFURT    $134,000
          LONDON       $180,000
PILOT1    CARY         $211,000
          FRANKFURT    $135,000
          LONDON       $210,000
PILOT2    CARY         $323,000
          FRANKFURT    $240,000
          LONDON       $158,000
PILOT3    CARY         $300,000
          FRANKFURT    $205,000
          LONDON       $294,000
Defining Group Variables
JobCode: group usage
Location: group usage
Salary: analysis usage, SUM statistic

Original Data Set
Location    JobCode   Salary
FRANKFURT   FLTAT3     48000
FRANKFURT   FLTAT3     45000
CARY        FLTAT2     34000
CARY        FLTAT3     44000
CARY        FLTAT3     41000
CARY        FLTAT2     33000

Report
Salary Analysis

JobCode   Location    Salary
FLTAT2    CARY         67000
FLTAT3    CARY         85000
          FRANKFURT    93000
Defining Group Variables
List Location before JobCode in the COLUMN statement.
Example:
proc report data=ia.crew nowd;
   column Location JobCode Salary;
   define JobCode / group width=8 'Job Code';
   define Location / group 'Home Base';
   define Salary / format=dollar10.;
run;
Defining Group Variables
Output

Home Base  Job Code      Salary
CARY       FLTAT1      $131,000
           FLTAT2      $245,000
           FLTAT3      $217,000
           PILOT1      $211,000
           PILOT2      $323,000
           PILOT3      $300,000
FRANKFURT  FLTAT1      $100,000
           FLTAT2      $181,000
           FLTAT3      $134,000
           PILOT1      $135,000
           PILOT2      $240,000
           PILOT3      $205,000
LONDON     FLTAT1      $128,000
           FLTAT2      $206,000
           FLTAT3      $180,000
           PILOT1      $210,000
           PILOT2      $158,000
           PILOT3      $294,000

Location appears first and JobCode is nested in Location.
Reference: Defining Group Variables
If you have a group variable, there must be no display or order variables.
Group variables produce summary reports (observations collapsed into groups).
Display and order variables produce listing reports (one row for each observation).
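To illustrate the first rule, here is a hedged sketch that mixes a group variable with a display variable. It reuses the ia.crew data set from the earlier examples. The explicit DISPLAY keyword is added only for emphasis, since display is already the default usage for character variables.

```sas
/* Sketch: JobCode asks for GROUP, but Location is left as DISPLAY. */
proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define JobCode / group width=8 'Job Code';
   define Location / display 'Home Base';   /* display usage blocks grouping */
   define Salary / format=dollar10.;
run;
```

Under the rule above, the rows are not expected to collapse into groups, and SAS typically writes a note about this to the log. Changing Location to GROUP should restore the summary report.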
Reference: Defining Analysis Variables
Default usage for numeric variables is analysis with a default statistic of SUM.
If the report contains group variables, then the report displays the sum of the numeric variables' values for each group.
If the report contains at least one display or order variable and no group variables, then the report lists all of the values of the numeric variable.
If the report contains only numeric variables, then the report displays grand totals for the numeric variables.
Defining Analysis Variables
Selected statistics include the following:
SUM    sum (default)
N      number of nonmissing values
MEAN   average
MAX    maximum value
MIN    minimum value
To specify a statistic other than SUM, type the name of the statistic after the slash in the DEFINE statement.
Example:
define Salary / mean format=dollar10.;
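If a report needs more than one statistic for the same variable, one possible approach is to list the variable twice in the COLUMN statement and give the second occurrence an alias. This is a sketch under assumptions: the alias name AvgSalary and the column headings are made up for illustration and do not come from the course materials.

```sas
/* Sketch: total and average Salary side by side for each job code. */
proc report data=ia.crew nowd;
   /* Salary=AvgSalary creates an alias so Salary can be used twice. */
   column JobCode Salary Salary=AvgSalary;
   define JobCode / group width=8 'Job Code';
   define Salary / sum 'Total Salary' format=dollar12.;
   define AvgSalary / mean 'Average Salary' format=dollar12.;
run;
```

Each DEFINE statement then carries its own statistic, so the two columns can be formatted and labeled independently.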
Summarizing the Data
Specify the MEAN statistic.
proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define JobCode / group width=8 'Job Code';
   define Location / group 'Home Base';
   define Salary / mean format=dollar10.;
run;
Summarizing the Data
Output

Job Code  Home Base      Salary
FLTAT1    CARY          $26,200
          FRANKFURT     $25,000
          LONDON        $25,600
FLTAT2    CARY          $35,000
          FRANKFURT     $36,200
          LONDON        $34,333
FLTAT3    CARY          $43,400
          FRANKFURT     $44,667
          LONDON        $45,000
PILOT1    CARY          $70,333
          FRANKFURT     $67,500
          LONDON        $70,000
PILOT2    CARY          $80,750
          FRANKFURT     $80,000
          LONDON        $79,000
PILOT3    CARY         $100,000
          FRANKFURT    $102,500
          LONDON        $98,000
This exercise reinforces the concepts discussed previously.
Exercise
Exercises
Using the CollegeStats data set, produce the following report:
Average SAT Scores and GPAs for Second Attempt
by Gender

             Average
          Second SAT  Average
Gender         Score   HS_GPA
Female         1,138     3.39
Male           1,089     3.29
Exercises – A Solution
proc format;
   value $gender 'm','M' = 'Male'
                 'f','F' = 'Female';
run;
title 'Average SAT Scores and GPAs for Second Attempt';
title2 'by Gender';
options nodate nonumber center ls=64;
proc report data=collegestats nowd;
   column gender SAT_Score_II HS_GPA;
   define gender / group width=6 format=$gender.;
   define SAT_Score_II / mean 'Average Second SAT Score' format=comma12.;
   define HS_GPA / mean 'Average HS_GPA' width=7;
run;
Printing Grand Totals
You can use an RBREAK statement to add the following:
   a grand total to the top or bottom of the report
   a line before the grand total
   a line after the grand total
General form of the RBREAK statement:
RBREAK BEFORE | AFTER </options>;
Printing Grand Totals
Selected options:
SUMMARIZE prints the total.
OL prints a single line above the total.
DOL prints a double line above the total.
UL prints a single line below the total.
DUL prints a double line below the total.
Refer to SAS OnlineDoc for more information about the RBREAK statement and other PROC REPORT options.
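As a small illustration of the BEFORE keyword and the UL option listed above, the sketch below places the grand total at the top of the report with a single line beneath it. It assumes the same ia.crew data set and DEFINE statements used in the other examples in this section.

```sas
/* Sketch: grand total at the top of the report, underlined once. */
proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define JobCode / group width=8 'Job Code';
   define Location / group 'Home Base';
   define Salary / format=dollar10.;
   /* BEFORE puts the summary line above the detail rows. */
   rbreak before / summarize ul;
run;
```

Swapping BEFORE for AFTER moves the same summary line to the bottom of the report.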
The RBREAK Statement
Use the RBREAK statement to display the grand total at the bottom of the report.
proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define JobCode / group width=8 'Job Code';
   define Location / group 'Home Base';
   define Salary / format=dollar10.;
   rbreak after / summarize dol;
run;
The SUMMARIZE option gives you the grand total.
The RBREAK Statement
Salary Analysis

Job Code  Home Base      Salary
FLTAT1    CARY         $131,000
          FRANKFURT    $100,000
          LONDON       $128,000
FLTAT2    CARY         $245,000
          FRANKFURT    $181,000
          LONDON       $206,000
FLTAT3    CARY         $217,000
          FRANKFURT    $134,000
          LONDON       $180,000
PILOT1    CARY         $211,000
          FRANKFURT    $135,000
          LONDON       $210,000
PILOT2    CARY         $323,000
          FRANKFURT    $240,000
          LONDON       $158,000
PILOT3    CARY         $300,000
          FRANKFURT    $205,000
          LONDON       $294,000
                     ==========
                     $3,598,000
Printing Subtotals
You can use a BREAK statement to add the following:
   a subtotal to the top or bottom of the report
   a line before the subtotal
   a line after the subtotal
General form of the BREAK statement:
BREAK BEFORE | AFTER VariableName </options>;
The BREAK Statement
Use the BREAK statement to display the subtotal at the end of each group in the report.
Example: You want subtotals after each JobCode.
proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define JobCode / group width=8 'Job Code';
   define Location / group 'Home Base';
   define Salary / format=dollar10.;
   rbreak after / summarize dol;
   break after JobCode / summarize ol;
run;
The BREAK Statement
Subtotals
The BREAK Statement
Use the SKIP option in the BREAK statement to add a blank line between the groups.
Example: Add a line between each group.
proc report data=ia.crew nowd;
   column JobCode Location Salary;
   define JobCode / group width=8 'Job Code';
   define Location / group 'Home Base';
   define Salary / format=dollar10.;
   rbreak after / summarize dol;
   break after JobCode / summarize ol skip;
run;
The BREAK Statement
Breaks
Enhancing the Report
To add a line under your column headings, use the HEADLINE option in the PROC REPORT statement.
proc report data=ia.crew nowd headline;
   column JobCode Location Salary;
   define JobCode / group width=8 'Job Code';
   define Location / group 'Home Base';
   define Salary / format=dollar10.;
   rbreak after / summarize dol;
run;
Enhancing the Report
Salary Analysis

Job Code  Home Base      Salary
-------------------------------
FLTAT1    CARY         $131,000
          FRANKFURT    $100,000
          LONDON       $128,000
FLTAT2    CARY         $245,000
          FRANKFURT    $181,000
          LONDON       $206,000
FLTAT3    CARY         $217,000
          FRANKFURT    $134,000
          LONDON       $180,000
PILOT1    CARY         $211,000
          FRANKFURT    $135,000
          LONDON       $210,000
PILOT2    CARY         $323,000
          FRANKFURT    $240,000
          LONDON       $158,000
PILOT3    CARY         $300,000
          FRANKFURT    $205,000
          LONDON       $294,000
                     ==========
                     $3,598,000
Headline
Enhancing the Report
To skip a line under your column headings, use the HEADSKIP option in the PROC REPORT statement.
proc report data=ia.crew nowd headline headskip;
   column JobCode Location Salary;
   define JobCode / group width=8 'Job Code';
   define Location / group 'Home Base';
   define Salary / format=dollar10.;
   rbreak after / summarize dol;
run;
Enhancing the Report
Salary Analysis

Job Code  Home Base      Salary
-------------------------------

FLTAT1    CARY         $131,000
          FRANKFURT    $100,000
          LONDON       $128,000
FLTAT2    CARY         $245,000
          FRANKFURT    $181,000
          LONDON       $206,000
FLTAT3    CARY         $217,000
          FRANKFURT    $134,000
          LONDON       $180,000
PILOT1    CARY         $211,000
          FRANKFURT    $135,000
          LONDON       $210,000
PILOT2    CARY         $323,000
          FRANKFURT    $240,000
          LONDON       $158,000
PILOT3    CARY         $300,000
          FRANKFURT    $205,000
          LONDON       $294,000
                     ==========
                     $3,598,000
Headskip
This exercise reinforces the concepts discussed previously.
Exercise
Exercises
Modify the previous exercise that uses the CollegeStats data set and the PROC REPORT step to create the following output:
Average SAT Scores and GPAs for Second Attempt
by Gender

             Average
          Second SAT  Average
Gender         Score   HS_GPA
-----------------------------

Female         1,138     3.39

Male           1,089     3.29

        ------------  -------
               1,114     3.34
        ============  =======
A blank line appears between genders.
Exercises
proc format;
value $gender 'm','M' = 'Male'
'f','F' = 'Female';
run;
title 'Average SAT Scores and GPAs for Second Attempt';
title2 'by Gender';
options nodate nonumber;
proc report data=collegestats nowd headline headskip;
column gender SAT_Score_II HS_GPA ;
define gender / group width=6 format=$gender.;
define SAT_Score_II / mean 'Average Second SAT Score' format=comma12.;
define HS_GPA/ mean 'Average HS_GPA' width=7;
rbreak after / summarize ol dul;
break after gender/ skip;
run;
This exercise reinforces the concepts discussed previously.
Exercise – Section 8.3