The Information Delivery Process

Upload: hayden

Post on 04-Jan-2016

TRANSCRIPT

Page 1: The Information Delivery Process

The Information Delivery Process

Data In

Information Out

Manage Organize Exploit

Page 2: The Information Delivery Process

Turning Data Into Information

[Diagram: Data flows through DATA steps into SAS data sets; PROC steps turn SAS data sets into information.]

Page 3: The Information Delivery Process

Turning Data Into Information

Process of delivering meaningful information:

80% Data-related:

• Access

• Scrub

• Transform

• Manage

• Store and retrieve

20% Analysis

Page 4: The Information Delivery Process

The Raw Data

Partial fixed-column raw data file:

         1    1    2    2    3    3    4    4
1---5----0----5----0----5----0----5----0----5
0031GOLDENBERG   DESIREE      PILOT1 50221.62
0040WILLIAMS     ARLENE M.    FLTAT1 23666.12
0071PERRY        ROBERT A.    FLTAT1 21957.71
0082MCGWIER-WATTSCHRISTINA    PILOT3 96387.39
0091SCOTT        HARVEY F.    FLTAT2 32278.40
0106THACKER      DAVID S.     FLTAT1 24161.14
0275GRAHAM       DEBORAH S.   FLTAT2 32024.93
0286DREWRY       SUSAN        PILOT1 55377.00
0309HORTON       THOMAS L.    FLTAT1 23705.12
0334DOWN         EDWARD       PILOT1 56584.87

Page 5: The Information Delivery Process

Browsing the Data Values

Listing of Flight Crew Employees

Obs  empid  lastname       firstname   jobcode     salary
  1  0031   GOLDENBERG     DESIREE     PILOT1    50221.62
  2  0040   WILLIAMS       ARLENE M.   FLTAT1    23666.12
  3  0071   PERRY          ROBERT A.   FLTAT1    21957.71
  4  0082   MCGWIER-WATTS  CHRISTINA   PILOT3    96387.39
  5  0091   SCOTT          HARVEY F.   FLTAT2    32278.40
  6  0106   THACKER        DAVID S.    FLTAT1    24161.14
  7  0275   GRAHAM         DEBORAH S.  FLTAT2    32024.93
  8  0286   DREWRY         SUSAN       PILOT1    55377.00
  9  0309   HORTON         THOMAS L.   FLTAT1    23705.12
 10  0334   DOWN           EDWARD      PILOT1    56584.87
 11  0347   CHERVENY       BRENDA B.   FLTAT2    38563.45
 12  0355   BELL           THOMAS B.   PILOT1    59803.16
 13  0366   GLENN          MARTHA S.   PILOT3   120202.38
 14  0385   HOLMAN         GREGORY A.  PILOT2    93001.09
 15  0390   NOE            BARBARA E.  FLTAT2    37101.32

Page 6: The Information Delivery Process

Reading a Raw Data File

Raw data file:

0031GOLDENBERG   DESIREE      PILOT1 50221.62
0040WILLIAMS     ARLENE M.    FLTAT1 23666.12
0071PERRY        ROBERT A.    FLTAT1 21957.71
0082MCGWIER-WATTSCHRISTINA    PILOT3 96387.39
0091SCOTT        HARVEY F.    FLTAT2 32278.40

SAS data set:

empid  lastname       firstname   jobcode     salary
0031   GOLDENBERG     DESIREE     PILOT1    50221.62
0040   WILLIAMS       ARLENE M.   FLTAT1    23666.12
0071   PERRY          ROBERT A.   FLTAT1    21957.71
0082   MCGWIER-WATTS  CHRISTINA   PILOT3    96387.39
0091   SCOTT          HARVEY F.   FLTAT2    32278.40

Page 7: The Information Delivery Process

Reading Raw Data Files

Raw Data File → DATA Step → SAS Data Set

Raw data file:

0031GOLDENBERG   DESIREE
0040WILLIAMS     ARLENE M.
0071PERRY        ROBERT A.
0082MCGWIER-WATTSCHRISTINA

DATA step:

data . . .;
   infile . . .;
   input . . .;
run;

SAS data set:

empid  lastname       firstname
0031   GOLDENBERG     DESIREE
0040   WILLIAMS       ARLENE M.
0071   PERRY          ROBERT A.
0082   MCGWIER-WATTS  CHRISTINA

Page 8: The Information Delivery Process

Reading Raw Data Files

In order to create a SAS data set from a raw data file, you must

• start a DATA step and name the SAS data set being created (DATA statement)

• identify the location of the raw data file to read (INFILE statement)

• describe how to read the data fields from the raw data file (INPUT statement).

Page 9: The Information Delivery Process

Creating a SAS Data Set with the DATA Statement

General form of the DATA statement:

DATA SAS-data-set(s);

This DATA statement creates a SAS data set called WORK.EMPDATA:

data work.empdata;

Page 10: The Information Delivery Process

Pointing to a Raw Data File with the INFILE Statement

General form of the INFILE statement:

INFILE 'filename' <options>;

Examples:

OS/390   infile 'edc.prog1.employee';

UNIX     infile '/user/prog1/employee.dat';

WIN      infile 'C:\workshop\winsas\prog1\employee.dat';

Page 11: The Information Delivery Process

Reading Raw Data Using Column Input

General form of column input:

INPUT variable <$> startcol-endcol ...;

To read raw data values with column input,

1. name the SAS variable you want to create

2. use a dollar sign, $, if the SAS variable is character

3. specify the starting column, a dash, and the ending column of the raw data field.

Page 12: The Information Delivery Process

Reading Raw Data Using Column Input

         1    1    2    2    3    3    4    4
1---5----0----5----0----5----0----5----0----5
0031GOLDENBERG   DESIREE      PILOT1 50221.62

input empid     $  1-4
      lastname  $  5-17

Page 13: The Information Delivery Process

Reading Raw Data Using Column Input

         1    1    2    2    3    3    4    4
1---5----0----5----0----5----0----5----0----5
0031GOLDENBERG   DESIREE      PILOT1 50221.62

input empid     $  1-4
      lastname  $  5-17
      firstname $ 18-30

Page 14: The Information Delivery Process

Reading Raw Data Using Column Input

         1    1    2    2    3    3    4    4
1---5----0----5----0----5----0----5----0----5
0031GOLDENBERG   DESIREE      PILOT1 50221.62

input empid     $  1-4
      lastname  $  5-17
      firstname $ 18-30
      jobcode   $ 31-36

Page 15: The Information Delivery Process

Reading Raw Data Using Column Input

         1    1    2    2    3    3    4    4
1---5----0----5----0----5----0----5----0----5
0031GOLDENBERG   DESIREE      PILOT1 50221.62

input empid     $  1-4
      lastname  $  5-17
      firstname $ 18-30
      jobcode   $ 31-36
      salary      37-45;
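The DATA, INFILE, and INPUT statements introduced so far can be assembled into one step. The following is a minimal sketch rather than the course's own program: the file name in the INFILE statement is a placeholder, so point it at wherever your copy of the raw file lives.

```sas
/* Sketch: read the fixed-column employee file with column input. */
data work.empdata;
   infile 'employee.dat';      /* placeholder path (assumption)        */
   input empid     $  1-4      /* character field, columns 1-4         */
         lastname  $  5-17
         firstname $ 18-30
         jobcode   $ 31-36
         salary      37-45;    /* numeric field, so no dollar sign     */
run;

/* List the new data set to check that the columns lined up. */
proc print data=work.empdata;
run;
```

If a value looks shifted or truncated in the listing, the usual cause is a start or end column that does not match the ruler.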

Page 17: The Information Delivery Process

Business Scenario

International Airlines is preparing to review its flight crew. The immediate goal is to read the Excel spreadsheet and create a SAS data set.

Excel Spreadsheet

SAS Data Set

Page 18: The Information Delivery Process

What is the Import Wizard?

A point-and-click graphical interface that enables you to create a SAS data set from several types of external files including

• dBASE file (*.DBF)

• Excel 97 Spreadsheet (*.XLS)

• Microsoft Access Table

• Delimited file (*.*)

• Comma Separated Values (*.CSV)
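The wizard is point-and-click, but the same kind of import can also be written as a PROC IMPORT step. This is a minimal sketch under stated assumptions: the file name and the output data set name are hypothetical, and a comma-separated file is used because reading Excel files directly typically requires an additional SAS/ACCESS product.

```sas
/* Sketch: import a delimited file without the wizard. */
proc import datafile='flightcrew.csv'   /* hypothetical file name      */
            out=work.flightcrew         /* hypothetical data set name  */
            dbms=csv
            replace;                    /* overwrite if it exists      */
   getnames=yes;                        /* first row holds the names   */
run;
```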

Page 19: The Information Delivery Process

The Raw Data

The aircraft data is stored in a fixed-column raw data file.

Partial data:

         1    1    2    2    3    3    4    4
1---5----0----5----0----5----0----5----0----5
JetCruise LF5200 030003 04/05/1994 03/11/2001
JetCruise LF5200 030005 02/15/1999 07/05/2001
JetCruise LF5200 030008 03/06/1996 04/02/2002
JetCruise LF5200 030009 10/14/1998 09/15/2001
JetCruise LF5200 030011 09/04/1998 08/31/2001
JetCruise LF5200 030012 01/02/1994 03/29/2001
JetCruise LF5200 030013 02/01/1996 11/23/2002
JetCruise LF5200 030015 06/24/1998 02/06/2001

Fields, left to right: aircraft model, aircraft ID, date in service, last maintenance date.

Page 20: The Information Delivery Process

Using Formatted Input

The raw data file will be read with formatted input.

Raw Data File → DATA Step → SAS Data Set

Raw data file:

JetCruise LF5200 030003 04/05/1994 03/11/2001
JetCruise LF5200 030005 02/15/1999 07/05/2001
JetCruise LF5200 030008 03/06/1996 04/02/2002

DATA step:

data sas-data-set-name;
   infile raw-filename;
   input pointer-control variable informat-name;
run;

SAS data set:

Obs  model             aircraftid  inservice  lastmaint
  1  JetCruise LF5200  030003      05APR1994  11MAR2001
  2  JetCruise LF5200  030005      15FEB1999  05JUL2001
  3  JetCruise LF5200  030008      06MAR1996  02APR2002

Page 21: The Information Delivery Process

What is a SAS Format?

A format is an instruction that the SAS System uses to write data values.

SAS formats have the following form:

<$>format<w>.<d>

Page 22: The Information Delivery Process

SAS Formats

Selected SAS formats:

w.d         standard numeric format

$w.         standard character format

COMMAw.d    commas in a number: 12,234.21

DOLLARw.d   dollar signs and commas in a number: $12,234.41

Page 23: The Information Delivery Process

SAS Formats

Stored Value Format Displayed Value

27134.2864 COMMA12.2 27,134.29

27134.2864 12.2 27134.29

27134.2864 DOLLAR12.2 $27,134.29

27134.2864 DOLLAR9.2 $27134.29

27134.2864 DOLLAR8.2 27134.29
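The table can be checked by writing one stored value to the SAS log with each format. This is a small sketch, not part of the original slides; it uses DATA _NULL_ so that no data set is created, and the variable name AMOUNT is arbitrary.

```sas
/* Sketch: write the same stored value with several formats. */
data _null_;
   amount = 27134.2864;
   put amount comma12.2;    /* commas, two decimals                  */
   put amount 12.2;         /* plain numeric, two decimals           */
   put amount dollar12.2;   /* dollar sign and commas                */
   put amount dollar9.2;    /* width too small for the commas        */
   put amount dollar8.2;    /* width too small for the dollar sign   */
run;
```

The last two lines illustrate the point of the table: when the width is too narrow, SAS drops the commas first and then the dollar sign rather than truncating the number.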

Page 24: The Information Delivery Process

Using Formatted Input

General form of the INPUT statement with formatted input:

INPUT pointer-control column informat ...;

Pointer control:

@n moves the pointer to column n.

+n moves the pointer n positions.

Page 25: The Information Delivery Process

Using Formatted Input

Formatted input can be used to read non-standard data values by

• moving the input pointer to the starting position of the field

• specifying a column name

• specifying an informat.

An informat specifies the width of the input field and how to read the data values that are stored in the field.

Page 26: The Information Delivery Process

Using Formatted Input

General form of an informat:

<$>informat-name<w>.<d>

$ indicates a character informat.

informat-name names the informat.

w is an optional field width.

. is the required delimiter.

d optionally specifies the number of decimal places for numeric informats.

Page 27: The Information Delivery Process

Selected Informats

7. or 7.0 reads seven columns of numeric data.

7.2 reads seven columns of numeric data and inserts a decimal point in the data value.

$5. reads five columns of character data and removes leading blanks.

$CHAR5. reads five columns of character data and preserves leading blanks.

Page 28: The Information Delivery Process

Selected Informats

COMMA7. reads seven columns of numeric data and removes selected nonnumeric characters, such as dollar signs and commas.

PD4. reads four columns of packed decimal data.

MMDDYY10. reads dates of the form 01/20/2000.
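Several of these informats can be tried on a single in-stream record. The sketch below is illustrative only: the data set name, variable names, and the record itself are made up, and DATALINES is used so that no external file is needed. Columns 1-7 are read twice to contrast 7. with 7.2, and columns 17-21 are read twice to contrast $5. with $CHAR5.

```sas
/* Sketch: one record, read with different informats. */
data work.informat_demo;
   input @1  whole   7.          /* 1234567                          */
         @1  implied 7.2         /* 12345.67 (decimal point implied) */
         @9  amount  comma7.     /* 12345 ($ and comma removed)      */
         @17 trimmed $5.         /* leading blanks removed           */
         @17 kept    $char5.     /* leading blanks preserved         */
         @23 start   mmddyy10.;  /* stored as a SAS date value       */
   datalines;
1234567 $12,345   ABC 01/20/2000
;
run;

proc print data=work.informat_demo;
run;
```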

Page 29: The Information Delivery Process

Working with Date Values

The raw data file contains date values. These date values will be read with the MMDDYY10. informat:

         1    1    2    2    3    3    4    4
1---5----0----5----0----5----0----5----0----5
Jetcruise LF5200 030003 04/05/1990 3/11/2001
Jetcruise LF5200 030005 02/15/1990 7/05/2001
Jetcruise LF5200 030008 03/06/1990 4/02/2002

Page 30: The Information Delivery Process

Converting Dates to SAS Date Values

SAS uses date informats to read and convert dates to SAS date values. For example,

Stored Value Informat Converted Value

10/29/1999 MMDDYY10. 14546

29OCT1999 DATE9. 14546

29/10/1999 DDMMYY10. 14546
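A SAS date value is the number of days since January 1, 1960, which is why all three rows convert to the same number. The sketch below reads the three representations from one made-up in-stream record; the data set and variable names are arbitrary.

```sas
/* Sketch: three spellings of the same date, one stored value. */
data work.date_demo;
   input @1  d1 mmddyy10.     /* 10/29/1999 */
         @12 d2 date9.        /* 29OCT1999  */
         @22 d3 ddmmyy10.;    /* 29/10/1999 */
   datalines;
10/29/1999 29OCT1999 29/10/1999
;
run;

/* With no format attached, each variable prints as 14546. */
proc print data=work.date_demo;
run;
```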

Page 31: The Information Delivery Process

SAS Formats

Selected SAS date formats:

MMDDYYw. 101692 (MMDDYY6.)

10/16/92 (MMDDYY8.)

10/16/1992 (MMDDYY10.)

DATEw. 16OCT92 (DATE7.)

16OCT1992 (DATE9.)
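To see these date formats side by side, one date value can be written to the log with each of them. This is a sketch, not from the slides; it uses a SAS date literal ('16OCT1992'd) to supply the value, and the variable name D is arbitrary.

```sas
/* Sketch: one SAS date value, five displayed forms. */
data _null_;
   d = '16OCT1992'd;    /* date literal: stored as days since 01JAN1960 */
   put d mmddyy6.;
   put d mmddyy8.;
   put d mmddyy10.;
   put d date7.;
   put d date9.;
run;
```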

Page 32: The Information Delivery Process

Locating and Browsing the Raw Data File

Browse the raw data file and determine the column layout and type.

Partial raw data file:

         1    1    2    2    3    3    4    4
1---5----0----5----0----5----0----5----0----5
JetCruise LF5200 030003 04/05/1994 03/11/2001
JetCruise LF5200 030005 02/15/1999 07/05/2001
JetCruise LF5200 030008 03/06/1996 04/02/2002
JetCruise LF5200 030009 10/14/1998 09/15/2001
JetCruise LF5200 030011 09/04/1998 08/31/2001
JetCruise LF5200 030012 01/02/1994 03/29/2001
JetCruise LF5200 030013 02/01/1996 11/23/2002
JetCruise LF5200 030015 06/24/1998 02/06/2001

Fields, left to right: aircraft model, aircraft ID, date in service, last maintenance date.

Page 33: The Information Delivery Process

Starting the DATA Step

Use the DATA statement to begin the DATA step and name the SAS data set:

data work.aircraft;
   other SAS statements
run;

Use the INFILE statement to identify the input raw data file:

data work.aircraft;
   infile 'aircraft.dat';
   other SAS statements
run;

Page 34: The Information Delivery Process

Writing the INPUT Statement

Use the INPUT statement and pointer control to read the record starting with the first column. Read the value with the $16. informat and assign it to the variable MODEL.

         1    1    2    2    3    3    4    4
1---5----0----5----0----5----0----5----0----5
JetCruise LF5200 030003 04/05/1994 03/11/2001

data work.aircraft;
   infile 'aircraft.dat';
   input @1 model $16.
   other SAS statements
run;

Page 35: The Information Delivery Process

Writing the INPUT Statement

Use the INPUT statement and pointer control to read the record starting with column 18. Read the value with the $6. informat and assign the value to AIRCRAFTID.

         1    1    2    2    3    3    4    4
1---5----0----5----0----5----0----5----0----5
JetCruise LF5200 030003 04/05/1994 03/11/2001

data work.aircraft;
   infile 'aircraft.dat';
   input @1  model      $16.
         @18 aircraftid $6.
   other SAS statements
run;

Page 36: The Information Delivery Process

Writing the INPUT Statement

Use the INPUT statement and pointer control to read the record starting with column 25. Read the value with the MMDDYY10. informat and assign the value to INSERVICE.

         1    1    2    2    3    3    4    4
1---5----0----5----0----5----0----5----0----5
JetCruise LF5200 030003 04/05/1994 03/11/2001

data work.aircraft;
   infile 'aircraft.dat';
   input @1  model      $16.
         @18 aircraftid $6.
         @25 inservice  mmddyy10.
   other SAS statements
run;

Page 37: The Information Delivery Process

Writing the INPUT Statement

Use the INPUT statement and pointer control to read the record starting with column 36. Read the value with the MMDDYY10. informat and assign the value to LASTMAINT.

         1    1    2    2    3    3    4    4
1---5----0----5----0----5----0----5----0----5
JetCruise LF5200 030003 04/05/1994 03/11/2001

data work.aircraft;
   infile 'aircraft.dat';
   input @1  model      $16.
         @18 aircraftid $6.
         @25 inservice  mmddyy10.
         @36 lastmaint  mmddyy10.;
run;

Page 38: The Information Delivery Process

SAS Syntax Rules

SAS statements are free-format.

• They can begin and end in any column.

• One or more blanks or special characters can be used to separate words.

• A single statement can span multiple lines.

• Several statements can be on the same line.

Unconventional spacing:

data work.mech_pilot; infile 'c:\coursedata\emplist.dat';
input lastname $ 1-20 firstname $ 21-30
jobtitle $ 36-43 salary 54-59;
run; proc means data=work.mech_pilot n mean; class jobtitle; var salary;
run;

Page 40: The Information Delivery Process

SAS Syntax Rules

SAS statements

• usually begin with an identifying keyword

• always end with a semicolon.

data work.mech_pilot;
   infile 'c:\coursedata\emplist.dat';
   input lastname $ 1-20 firstname $ 21-30
         jobtitle $ 36-43 salary 54-59;
run;

proc print data=work.mech_pilot;
run;

proc means data=work.mech_pilot n mean;
   class jobtitle;
   var salary;
run;

Page 41: The Information Delivery Process

Adding a New Variable

Create a new variable by extracting the four-digit year values from the SAS date values.

Aircraft Service Records

Obs  model             aircraftid  inservice  lastmaint  Yrbeg_service
  1  JetCruise LF5200  030003          12513      15045           1994
  2  JetCruise LF5200  030005          14290      15161           1999
  3  JetCruise LF5200  030008          13214      15432           1996
  4  JetCruise LF5200  030009          14166      15233           1998
  5  JetCruise LF5200  030011          14126      15218           1998
  6  JetCruise LF5200  030012          12420      15063           1994

Page 42: The Information Delivery Process

Using an Assignment Statement

An assignment statement evaluates an expression and assigns the resulting value to a variable.

General syntax of an assignment statement:

variable=expression;

Page 43: The Information Delivery Process

Using Operators

Selected operators for basic arithmetic calculations in an assignment statement:

Operator  Action           Example        Priority
+         Addition         sum=x+y;       III
-         Subtraction      diff=x-y;      III
*         Multiplication   mult=x*y;      II
/         Division         divide=x/y;    II
**        Exponentiation   raise=x**y;    I
-         Negative prefix  negative=-x;   I
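Priority I operators are evaluated first, then priority II, then priority III. A short sketch makes the effect visible; the variable names and values are made up for illustration, and the parentheses in the second assignment are standard SAS syntax for overriding the default order.

```sas
/* Sketch: operator priority in assignment statements. */
data _null_;
   x = 3;
   y = 4;
   result1 = 2 + x * y ** 2;     /* y**2 = 16, then 3*16 = 48, then 2+48 = 50 */
   result2 = (2 + x) * y ** 2;   /* addition forced first: 5*16 = 80          */
   put result1= result2=;
run;
```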

Page 44: The Information Delivery Process

Using SAS Functions

A SAS function is a routine that returns a value that is determined from specified arguments.

General syntax of a SAS function:

function-name(argument1,argument2, . . .)

Page 45: The Information Delivery Process

Using SAS Functions

SAS functions

• perform arithmetic operations

• compute statistics (for example, mean)

• manipulate SAS dates and process character values

• perform many other tasks.
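One function from each of these groups, called from assignment statements, gives a feel for the syntax. This is a sketch with made-up values; ROUND, MEAN, YEAR, and UPCASE are standard SAS functions, and the variable names are arbitrary.

```sas
/* Sketch: one function per category. */
data _null_;
   rounded = round(27134.2864, 0.01);   /* arithmetic: 27134.29      */
   average = mean(10, 20, 60);          /* statistic: 30             */
   yr      = year('29OCT1999'd);        /* SAS date: 1999            */
   name    = upcase('goldenberg');      /* character: GOLDENBERG     */
   put rounded= average= yr= name=;
run;
```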

Page 46: The Information Delivery Process

Creating a Vertical Bar Chart

Use the GCHART procedure and the VBAR statement to create a vertical bar chart.

proc gchart data=work.aircraft;
   vbar yrbeg_service;
   title 'Aircraft In Service, by Year';
run;
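Because YRBEG_SERVICE is numeric, GCHART treats it as continuous by default and groups the years into ranges labeled by midpoints. If one bar per distinct year is wanted, the DISCRETE option on the VBAR statement is one way to ask for it. This variant is a suggestion, not something shown in the original slides:

```sas
/* Sketch: one bar for each distinct year value. */
proc gchart data=work.aircraft;
   vbar yrbeg_service / discrete;   /* treat each year as its own category */
   title 'Aircraft In Service, by Year';
run;
quit;                               /* GCHART stays active until QUIT      */
```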

Page 47: The Information Delivery Process

Reading a Subset of Raw Data

Use the DATA step that was written earlier. Add a subsetting IF statement to process only the subset in which the value of AGE is at least 15.

data work.aircraft;
   infile 'aircraft.dat';
   input @1  model      $16.
         @18 aircraftid $6.
         @25 inservice  mmddyy10.
         @36 lastmaint  mmddyy10.;
   yrbeg_service=year(inservice);
   age=year(today())-yrbeg_service;
   if age>=15;
run;

Page 48: The Information Delivery Process

What Is a SAS Data Library?

[Graphic: DATA LIBRARY]

Page 49: The Information Delivery Process

What Is a SAS Data Library?

Regardless of which host operating system you use, you identify SAS data libraries by assigning each one a libref.

libref

Page 50: The Information Delivery Process

What Is a SAS Data Library?

By default, SAS creates two SAS data libraries:

• a temporary library called WORK

• a permanent library called SASUSER.

Page 51: The Information Delivery Process

SAS Data Libraries

FILES

LIBRARIES

You can think of a SAS data library as a drawer in a filing cabinet and a SAS data set as one of the file folders in the drawer.

Page 52: The Information Delivery Process

SAS Data Libraries

When you invoke SAS, you automatically have access to a temporary and a permanent SAS data library.

• WORK - temporary library

• SASUSER - permanent library

You can create and access your own permanent libraries.

• IA - permanent library
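A permanent library such as IA is made available by assigning its libref with a LIBNAME statement. The sketch below assumes a Windows folder; the path is hypothetical and should be replaced with the folder that actually holds the SAS files.

```sas
/* Sketch: assign the libref IA to a folder (path is an assumption). */
libname ia 'c:\workshop\winsas\prog1';

/* List the SAS files in the library without per-file details. */
proc contents data=ia._all_ nods;
run;
```

Once the libref is assigned, data sets in that folder can be referenced as IA.name for the rest of the SAS session.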

Page 53: The Information Delivery Process

Reading a SAS Data Set

Input data set (SET statement): a temporary or permanent SAS data set

Output data set (DATA statement): a temporary or permanent SAS data set
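In code, the DATA statement names the output data set and the SET statement names the input data set. The sketch below reads the temporary data set WORK.EMPDATA created earlier and keeps only the pilots; the output name WORK.PILOTS is made up for illustration.

```sas
/* Sketch: create one SAS data set from another. */
data work.pilots;                 /* output data set (hypothetical name) */
   set work.empdata;              /* input data set                      */
   if jobcode in ('PILOT1', 'PILOT2', 'PILOT3');   /* subsetting IF      */
run;
```

Replacing WORK with a permanent libref on either statement reads from or writes to a permanent library instead.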

Page 54: The Information Delivery Process

Two-level SAS Filenames

Every SAS file has a two-level name:

libref.filename

• The first name (libref) refers to the library.

• The second name (filename) refers to the file in the library.

The data set MECH_PILOT is a SAS file in the WORK library.

Page 55: The Information Delivery Process

Browsing the Data Portion

The PRINT procedure displays the data portion of a SAS data set.

By default, PROC PRINT displays

• all observations

• all variables

• OBS column on the left-hand side.

Page 56: The Information Delivery Process

Browsing the Data Portion

General form of the PRINT procedure:

PROC PRINT DATA=SAS-data-set;
RUN;

Example:

proc print data=work.empdata;
run;

Page 57: The Information Delivery Process

Objectives

• Generate list reports using the PRINT procedure.

• Display selected variables in a list report using the VAR statement.

• Display selected observations in a list report using the WHERE statement.

• Sort the observations in a SAS data set using the SORT procedure.
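The four objectives can be sketched in two short steps. This is an illustration under assumptions: the sort key (LASTNAME) and the salary cutoff are made up, and WORK.EMPSORT is used as the output name only because later slides refer to a data set with that name.

```sas
/* Sketch: sort, then print selected variables and observations. */
proc sort data=work.empdata out=work.empsort;
   by lastname;                          /* assumed sort key           */
run;

proc print data=work.empsort;
   var empid lastname jobcode salary;    /* selected variables         */
   where salary > 50000;                 /* selected observations      */
run;
```

The OUT= option writes the sorted copy to a new data set, so WORK.EMPDATA itself is left in its original order.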

Page 58: The Information Delivery Process

Creating a List Report

Input data set:

empid  lastname    firstname  jobcode    salary
0031   GOLDENBERG  DESIREE    PILOT1   50221.62
0040   WILLIAMS    ARLENE M.  FLTAT1   23666.12
0071   PERRY       ROBERT A.  FLTAT1   21957.71

PROC step:

proc print data=work.empdata;
   var empid salary jobcode;
run;

Output:

The SAS System

Obs  empid    salary  jobcode
  1  0031   50221.62  PILOT1
  2  0040   23666.12  FLTAT1
  3  0071   21957.71  FLTAT1

Page 59: The Information Delivery Process

Formatting Data Values

Input data set:

empid  lastname  firstname  jobcode     salary
0525   BEAUMONT  SALLY T.   PILOT3   112783.21
3370   BEDNAREK  AMY L.     PILOT3   122933.19
2291   BEECH     DAVID C.   PILOT2    73168.13

PROC step:

proc print data=work.empsort;
   format salary dollar11.2;
run;

Output:

The SAS System

Obs  empid  lastname  firstname  jobcode       salary
  1  0525   BEAUMONT  SALLY T.   PILOT3   $112,783.21
  2  3370   BEDNAREK  AMY L.     PILOT3   $122,933.19
  3  2291   BEECH     DAVID C.   PILOT2    $73,168.13

Page 60: The Information Delivery Process

Creating a Frequency Report

Input data set:

empid  lastname  firstname  jobcode     salary
0525   BEAUMONT  SALLY T.   PILOT3   112783.21
3370   BEDNAREK  AMY L.     PILOT3   122933.19
2291   BEECH     DAVID C.   PILOT2    73168.13

PROC step output:

The FREQ Procedure

Job Code

                                  Cumulative    Cumulative
jobcode    Frequency    Percent    Frequency       Percent
----------------------------------------------------------
FLTAT1            41      14.70           41         14.70
FLTAT2            70      25.09          111         39.78
FLTAT3            88      31.54          199         71.33
PILOT1            46      16.49          245         87.81
PILOT2            18       6.45          263         94.27
PILOT3            16       5.73          279        100.00

Page 61: The Information Delivery Process

Creating a Frequency Report

The FREQ procedure displays frequency counts of the data values in a SAS data set.

General form of a simple PROC FREQ step:

PROC FREQ DATA=SAS-data-set;
RUN;

Example:

proc freq data=work.empsort;
run;

Page 62: The Information Delivery Process

Creating a One-Way Frequency Report

Only variables listed on the TABLES statement are included in the frequency counts. These are typically variables that have a limited number of distinct values.

General form of a PROC FREQ step:

PROC FREQ DATA=SAS-data-set;
   TABLES SAS-variables;
RUN;
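Applied to the employee data, the general form might look like the sketch below. JOBCODE is a natural choice for the TABLES statement because it has only a handful of distinct values; the data set name follows the earlier PROC FREQ example.

```sas
/* Sketch: one-way frequency table for a single variable. */
proc freq data=work.empsort;
   tables jobcode;      /* count observations per job code */
run;
```

Listing a variable like SALARY here would produce one row per distinct salary, which is rarely useful without grouping the values first.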

Page 63: The Information Delivery Process

Calculating Job Code Frequencies

Job Code Frequency Report

The FREQ Procedure

Job Code

                                            Cumulative    Cumulative
Job_Code            Frequency    Percent     Frequency       Percent
--------------------------------------------------------------------
Flight Attendant          199      71.33           199         71.33
Pilot                      80      28.67           279        100.00

Page 64: The Information Delivery Process

Calculating Salary Frequencies

Salary Frequency Report

The FREQ Procedure

Annual Salary

                                              Cumulative    Cumulative
Salary                Frequency    Percent     Frequency       Percent
----------------------------------------------------------------------
Low to $25,000               41      14.70            41         14.70
$25,000 to $50,000          172      61.65           213         76.34
$50,000 and up               66      23.66           279        100.00

Page 65: The Information Delivery Process

Calculating Job Code/Salary Frequencies

The FREQ Procedure

Table of Job_Code by Salary

Job_Code (Job Code)    Salary (Annual Salary)

Cell contents: Frequency / Percent / Row Pct / Col Pct

                    Low to    $25,000 to    $50,000
                   $25,000       $50,000     and up     Total
-------------------------------------------------------------
Flight Attendant        41           158          0       199
                     14.70         56.63       0.00     71.33
                     20.60         79.40       0.00
                    100.00         91.86       0.00
-------------------------------------------------------------
Pilot                    0            14         66        80
                      0.00          5.02      23.66     28.67
                      0.00         17.50      82.50
                      0.00          8.14     100.00
-------------------------------------------------------------
Total                   41           172         66       279
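The slides do not show the code behind this table. One plausible way to get grouped rows and columns like these is to define formats with PROC FORMAT and apply them in PROC FREQ, as sketched below. Everything here is an assumption made for illustration: the format names, the exact range boundaries, and the use of JOBCODE and SALARY from WORK.EMPSORT.

```sas
/* Sketch: group values with user-defined formats (names are made up). */
proc format;
   value $jobgrp 'FLTAT1', 'FLTAT2', 'FLTAT3' = 'Flight Attendant'
                 'PILOT1', 'PILOT2', 'PILOT3' = 'Pilot';
   value salgrp  low   -< 25000 = 'Low to $25,000'
                 25000 -< 50000 = '$25,000 to $50,000'
                 50000 -  high  = '$50,000 and up';
run;

/* Two-way table: rows by job group, columns by salary range. */
proc freq data=work.empsort;
   tables jobcode*salary;
   format jobcode $jobgrp. salary salgrp.;
   label jobcode='Job Code' salary='Annual Salary';
run;
```

PROC FREQ counts the formatted values, so the underlying data set does not need to be changed to produce the grouped report.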

Page 66: The Information Delivery Process

Creating a Frequency Report

By default, PROC FREQ

• analyzes every variable in the SAS data set

• displays each distinct data value

• calculates the number of observations in which each data value appears (and corresponding percentage)

• indicates for each variable how many observations have missing values.

Page 67: The Information Delivery Process

Calculating Summary Statistics

The MEANS procedure displays simple descriptive statistics for the numeric variables in a SAS data set.

General form of a simple PROC MEANS step:

PROC MEANS DATA=SAS-data-set;
RUN;

Example:

proc means data=ia.aircraftcap;
run;

Page 68: The Information Delivery Process

Calculating Summary Statistics

Input data set:

model   aircraftid  inservice  totpasscap
MF4000  010012          10890         267
LF5200  030006          10300         207
LF5200  030008          11389         207

proc means data=ia.aircraftcap;
run;

The SAS System

The MEANS Procedure

Variable       N           Mean        Std Dev        Minimum        Maximum
----------------------------------------------------------------------------
inservice     64       10647.97        1966.95        3282.00       13125.00
totpasscap    64    163.4687500     47.2208485     97.0000000    290.0000000
----------------------------------------------------------------------------

Page 69: The Information Delivery Process

Calculating Summary Statistics

By default, PROC MEANS

• analyzes every numeric variable in the SAS data set

• prints the statistics N, MEAN, STD, MIN, and MAX

• excludes missing values before calculating statistics.

Page 70: The Information Delivery Process

Selecting Variables

Input data set:

model   aircraftid  inservice  totpasscap
MF4000  010012          10890         267
LF5200  030006          10300         207
LF5200  030008          11389         207

proc means data=ia.aircraftcap;
   var totpasscap;
run;

The SAS System

The MEANS Procedure

Analysis Variable : totpasscap

 N           Mean        Std Dev        Minimum        Maximum
--------------------------------------------------------------
64    163.4687500     47.2208485     97.0000000    290.0000000
--------------------------------------------------------------

Page 71: The Information Delivery Process

Grouping Observations

Input data set:

model   aircraftid  inservice  totpasscap
MF4000  010012          10890         267
LF5200  030006          10300         207
LF5200  030008          11389         207

proc means data=ia.aircraftcap maxdec=2;
   var totpasscap;
   class model;
run;

Page 72: The Information Delivery Process

Calculating Capacity Statistics for Each Type of Plane

The SAS System

The MEANS Procedure

Analysis Variable : totpasscap

size      N Obs     N      Mean    Std Dev    Minimum    Maximum
----------------------------------------------------------------
Large        16    16    230.13      32.39     207.00     290.00
Medium        9     9    178.56      11.40     165.00     188.00
Small        39    39    132.64      18.85      97.00     150.00
----------------------------------------------------------------