How to start using SAS. Topics: an overview of the SAS system; reading raw data and creating SAS data sets; combining and match-merging SAS data sets.

Post on 24-Dec-2015


TRANSCRIPT

<ul><li> Slide 1 </li>
<li> How to start using SAS </li>
<li> Slide 2 </li>
<li> Topics: An overview of the SAS system; Reading raw data and creating SAS data sets; Combining and match-merging SAS data sets; Formatting data; An introduction to some simple regression procedures; Summary report procedures </li>
<li> Slide 3 </li>
<li> Basic Screen Navigation. Main windows: Editor contains the SAS program to be submitted; Log contains information about the processing of the SAS program, including any warning and error messages; Output contains reports generated by SAS procedures and DATA steps. Side windows: Explorer navigates to other objects such as libraries; Results navigates your Output window. </li>
<li> Slide 4 </li>
<li> SAS programs. A SAS program is a sequence of steps that the user submits for execution. DATA steps are typically used to create SAS data sets. PROC steps are typically used to process SAS data sets (that is, to generate reports and graphs, edit data, sort data, and analyze data). </li>
<li> Slide 5 </li>
<li> SAS Data Libraries. A SAS data library is a collection of SAS files that SAS recognizes as a unit. A SAS data set is one type of SAS file stored in a data library. The Work library is a temporary library: when SAS is closed, all data sets in the Work library are deleted. To create a permanent SAS data set, store it in your own library. </li>
<li> Slide 6 </li>
<li> SAS Data Libraries. Identify SAS data libraries by assigning each a library reference name (libref) with the LIBNAME statement:
LIBNAME libref 'file-folder-location';
For example:
LIBNAME readData 'C:\temp\sas class\readData';
Rules for naming a libref: the name must be 8 characters or fewer; the name must begin with a letter or underscore; the remaining characters must be letters, numbers, or underscores.
</li>
<li> Slide 7 </li>
<li> Reading a raw data set into the SAS system. To create a SAS data set from a raw data file, you must: start a DATA step and name the SAS data set being created (DATA statement); identify the location of the raw data file to read (INFILE statement); describe how to read the data fields from the raw data file (INPUT statement). </li>
<li> Slide 8 </li>
<li> Reading an external raw data file into the SAS system.
LIBNAME readData 'C:\temp\sas class\readData';
DATA readData.wa80;
INFILE 'k:\census\stf2_wa80.txt';
INPUT @10 SUMRYLVL $2. @40 COUNTY $3. @253 TABA1 9.0 @271 TABA1 9.0;
RUN;
The LIBNAME statement assigns the libref readData to a data library. The DATA statement creates a permanent SAS data set named wa80. The INFILE statement points to a raw data file. The INPUT statement names the SAS variables, identifies each variable as character or numeric ($ indicates character data), and specifies the locations of the fields in the raw data; it can be written as column, formatted, list, or named input. The RUN statement signals the end of a step. </li>
<li> Slide 9 </li>
<li> Example 1: Reading raw data separated by spaces.
/* Create a permanent SAS data set named HighLow1; read the data file temperature1.dat using list input */
DATA readData.HighLow1;
INFILE 'C:\sas class\readData\temperature1.dat';
INPUT City $ State $ NormalHigh NormalLow RecordHigh RecordLow;
RUN;
/* The PROC PRINT step creates a listing report of the readData.HighLow1 data set */
PROC PRINT DATA = readData.highlow1;
TITLE 'High and Low Temperatures for July';
RUN;
temperature1.dat: Nome AK 55 44 88 29 Miami FL 90 75 97 65 Raleigh NC 88 68 105 50 </li>
<li> Slide 10 </li>
<li> Example 2: Reading multiple lines of raw data per observation.
/* Read the data file using the line pointers slash (/) and pound-n (#n). The slash (/) means go to the next line; #n means go to the nth line for that observation.
The slash (/) can be replaced by #2 here. */
DATA readData.highlow2;
INFILE 'C:\sas class\readData\temperature2.dat';
INPUT City $ State $ / NormalHigh NormalLow #3 RecordHigh RecordLow;
RUN;
PROC PRINT DATA = readData.highlow2;
TITLE 'High and Low Temperatures for July';
RUN;
temperature2.dat: Nome AK 55 44 88 29 Miami FL 90 75 97 65 Raleigh NC 88 68 105 50 </li>
<li> Slide 11 </li>
<li> Example 3: Reading multiple observations per line of raw data.
/* To read multiple observations per line of raw data, use double trailing at signs (@@) at the end of the INPUT statement */
DATA readData.highlow3;
INFILE 'C:\sas class\readData\temperature3.dat';
INPUT City $ State $ NormalHigh NormalLow RecordHigh RecordLow @@;
RUN;
PROC PRINT DATA = readData.highlow3;
TITLE 'High and Low Temperatures for July';
RUN;
temperature3.dat: Nome AK 55 44 88 29 Miami FL 90 75 97 65 Raleigh NC 88 68 105 50 </li>
<li> Slide 12 </li>
<li> Reading an external raw data file into the SAS system.
Reading raw data arranged in columns:
INPUT FILEID $ 1-5 RECTYP $ 6-9 SUMRYLVL $ 10-11 URBARURL $ 12-13 SMSACOM $ 14-15;
Reading raw data with mixed column and formatted input:
INPUT FILEID $ 1-5 @10 SUMRYLVL $2. @253 TABA1 9.0 @271 TABA1 9.0;
/* @n is the column pointer, where n is the number of the column SAS should move to. $w. reads standard character data, and w.d reads standard numeric data, where w is the total width and d is the number of decimal places. */ </li>
<li> Slide 13 </li>
<li> Reading Delimited or PC Database Files with the IMPORT Procedure. If your data file has the proper extension, use the simplest form of the IMPORT procedure:
PROC IMPORT DATAFILE = 'filename' OUT = data-set;
File types, extensions, and DBMS identifiers: Comma-delimited (.csv): CSV; Tab-delimited (.txt): TAB; Excel (.xls): EXCEL; Lotus files (.wk1, .wk3, .wk4): WK1, WK3, WK4; Delimiters other than commas or tabs: DLM.
Examples:
1. PROC IMPORT DATAFILE='c:\temp\sale.csv' OUT=readData.money; RUN;
2.
PROC IMPORT DATAFILE='c:\temp\bands.xls' OUT=readData.music; RUN; </li>
<li> Slide 14 </li>
<li> Reading Files with the IMPORT Procedure. If your file does not have the proper extension, or it uses a delimiter other than commas or tabs, you must use the DBMS= option and the DELIMITER= statement:
PROC IMPORT DATAFILE = 'filename' OUT = data-set DBMS = identifier;
DELIMITER = 'delimiter-character';
RUN;
Example:
PROC IMPORT DATAFILE = 'C:\sas class\readData\import2.txt' OUT = readData.sasfile DBMS = DLM;
DELIMITER = '&amp;';
RUN; </li>
<li> Slide 15 </li>
<li> Format in SAS data set. Standard formats (selected): Character: $w. Date, time, and datetime: DATEw., MMDDYYw., TIMEw.d. Numeric: COMMAw.d, DOLLARw.d. Use the FORMAT statement to apply them:
PROC PRINT DATA=sales;
VAR Name DateReturned CandyType Profit;
FORMAT DateReturned DATE9. Profit DOLLAR6.2;
RUN; </li>
<li> Slide 16 </li>
<li> Format in SAS data set. Create your own custom formats in two steps: create the format using PROC FORMAT and a VALUE statement, then assign the format to the variable using a FORMAT statement. General form of a simple PROC FORMAT step:
PROC FORMAT;
VALUE name range-1 = 'formatted-text-1' range-2 = 'formatted-text-2';
RUN;
The name in the VALUE statement is the name of the format you are creating. It cannot be longer than eight characters and must not start or end with a number. If the format is for character data, the name must start with a $. </li>
<li> Slide 17 </li>
<li> Format in SAS data set. Example:
/* Step 1: Create the format for certain variables */
PROC FORMAT;
VALUE genFmt 1 = 'Male' 2 = 'Female';
VALUE money low-</li></ul>
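The two-step custom format workflow from Slide 16 (create the format with PROC FORMAT, then assign it with a FORMAT statement) can be sketched as follows. This is a minimal illustration and not part of the original slides: the data set work.orders, the variables OrderID, Size and Shipped, and the format names sizefmt and $yesno are all hypothetical, and the data are read inline with DATALINES so the example does not depend on an external file.

```sas
/* Step 1: create two custom formats (names are hypothetical). */
PROC FORMAT;
    VALUE sizefmt            /* numeric format */
        1 = 'Small'
        2 = 'Medium'
        3 = 'Large';
    VALUE $yesno             /* character format: name starts with $ */
        'Y' = 'Yes'
        'N' = 'No';
RUN;

/* Illustrative temporary data set in the Work library. */
DATA work.orders;
    INPUT OrderID Size Shipped $;
    DATALINES;
101 1 Y
102 3 N
103 2 Y
;
RUN;

/* Step 2: assign the formats to variables with a FORMAT statement. */
PROC PRINT DATA = work.orders;
    FORMAT Size sizefmt. Shipped $yesno.;
    TITLE 'Orders with custom formats applied';
RUN;
```

Because the FORMAT statement appears inside the PROC PRINT step, the formats affect only how the values are displayed in that report; the stored values in work.orders stay 1, 2, 3 and Y, N.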