Introduction to SAS
BIO 226 – Spring 2011
Outline
• Windows and common rules
• Getting the data
  – The PRINT and CONTENTS Procedures
• Basic SAS procedures
  – The SORT Procedure
  – The MEANS Procedure
  – The UNIVARIATE Procedure
  – The FREQ Procedure
  – The CORR Procedure
  – The PLOT Procedure
• Manipulating the data, e.g., creating new variables
• Libraries
• Output in Word document
• References
• Practice
The different SAS windows
• Explorer: contains SAS files and libraries
• Editor: where you can open or type SAS programs
• Log: stores details about your SAS session (code run, dataset created, errors...)
• Results: table of contents for output of programs
• Output: printed results of SAS programs
Basic SAS rules (1)
• Variable names must:
  – be 1 to 32 characters long
  – begin with a letter (A-Z) or an underscore (_)
  – continue with any combination of numbers, letters or underscores
• A variable’s type is either character or numeric
• Missing values:
  – missing character data is left blank
  – missing numeric data is denoted by a period (.)
Basic SAS rules (2)
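• Two ways to make comments:
  – * write comment here;
  – /* write comment here */
• SAS is not case-sensitive: keywords, variable names and dataset names can be typed in any case (quoted character values are compared exactly as typed)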
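The rules above can be tried out in a small step that reads a few records typed directly into the program. This is only a sketch: the data set name rules_demo, the variable names and the values are made up for illustration. With this style of INPUT, a lone period stands for a missing value in both numeric and character columns.

```sas
* Statement-style comment that ends at the semicolon;
/* Block-style comment that can span
   several lines */
DATA rules_demo;                /* hypothetical data set name */
  INPUT id name $ weight_lb;    /* $ marks name as a character variable */
  DATALINES;
1 Ann 150
2 . 172
3 Bob .
;
RUN;

/* Lower-case keywords and an upper-case data set name work the same way */
proc print data=RULES_DEMO;
run;
```

In the printed output, the missing name appears blank and the missing weight appears as a period.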
Basic programming rules (1)
• SAS programs are composed of statements, organized in DATA steps and PROC steps:
  – DATA step: gives the dataset a name and manipulates it
  – PROC step: the procedure or analysis you want SAS to carry out
• SAS reads code statement by statement, and every statement ends with a semicolon.
• End each DATA or PROC step with RUN; so that SAS executes it.
• Quotes can be single or double, as long as the opening and closing quotes match (use straight quotes, not curly ones).
Basic programming rules (2)
• SAS statements are free-format:
  – Can begin and end in any column
  – One statement can continue over several lines
  – Several statements can be on one line
• To submit a program, highlight the code to run and click on the submit button (running silhouette)
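As an illustration of free-format statements, the three steps below should produce the same output. The sketch assumes a data set named mydata with variables id, age and weight already exists (it is created on the Loading data slide).

```sas
/* One statement per line */
PROC PRINT DATA=mydata (OBS=5);
  VAR id age weight;
RUN;

/* Same step, several statements on one line */
PROC PRINT DATA=mydata (OBS=5); VAR id age weight; RUN;

/* Same step, statements continued over several lines */
PROC PRINT
  DATA=mydata (OBS=5);
  VAR id
      age
      weight;
RUN;
```

The first layout is usually the easiest to read and debug.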
Loading data
• If you have a SAS data set (sasintro.sas7bdat), you can double click on it and it will load itself.
• If you have a text file instead (sasintro.txt), and its first row contains the variable names, you can import it using File > Import Data… and specify the directory.
• Or you can use the following code:
DATA mydata;
  INFILE 'g:\shared\bio226\sasintro.txt';
  INPUT weight bmi id age activity education smoking;
RUN;
• Setting your current directory: on the bottom line of the main SAS window, you should see it set to C:\WINDOWS\system32. Double click on it to change it.
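The File > Import Data… route can also be written as code. The sketch below assumes that sasintro.txt is blank-delimited and that its first row holds the variable names; neither detail is confirmed here, so check the file before relying on it.

```sas
/* Option 1: let SAS take the variable names from the first row */
PROC IMPORT DATAFILE='g:\shared\bio226\sasintro.txt'
            OUT=mydata
            DBMS=DLM
            REPLACE;
  DELIMITER=' ';   /* assumption: values are separated by blanks */
  GETNAMES=YES;    /* assumption: first row contains the names */
RUN;

/* Option 2: name the variables yourself and skip the header row */
DATA mydata;
  INFILE 'g:\shared\bio226\sasintro.txt' FIRSTOBS=2;
  INPUT weight bmi id age activity education smoking;
RUN;
```

If the file has no header row, drop FIRSTOBS=2 in option 2; otherwise the first record would be skipped.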
How to view the loaded data?
• In the Explorer window, double click on Libraries, then Work, then the data set (mydata if you used the DATA step code)
• To view general information about the data set, like variable names and types:
PROC CONTENTS DATA=mydata;
RUN;
• Use the PRINT procedure to view the first 25 records:
PROC PRINT DATA=mydata (OBS=25);
RUN;
Variables from sasintro.txt
#  Variable   Type  Unit / coding
5  activity   Num   kcal/week
4  age        Num   years
2  bmi        Num   kg/m^2
6  education  Num   years
3  id         Num
7  smoking    Num   1: current smoker, 0: non-smoker
1  weight     Num   lbs
Manipulating data (1)
• Selecting a subset of rows
DATA mydata_s;
  SET mydata;
  IF smoking=1;
RUN;
• Deleting a column (or columns)
DATA mydata2;
  SET mydata;
  DROP weight education;
RUN;
Manipulating data (2)
• Adding a column (or columns)
DATA mydata3;
  SET mydata;
  weight_kg=weight*0.453;
  IF age <= 60 THEN agegroup=1;
  ELSE IF age<=70 THEN agegroup=2;
  ELSE agegroup=3;
  /*drop age;*/
RUN;
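One caveat with the grouping code above: SAS treats a missing numeric value as smaller than any number, so a record with a missing age satisfies age <= 60 and lands in agegroup 1. A minimal sketch of a variant that keeps such records missing is shown below; the data set name mydata3_safe is hypothetical, and whether sasintro.txt contains any missing ages is not stated here.

```sas
DATA mydata3_safe;                       /* hypothetical data set name */
  SET mydata;
  weight_kg = weight*0.453;
  IF age = . THEN agegroup = .;          /* keep missing ages missing */
  ELSE IF age <= 60 THEN agegroup = 1;
  ELSE IF age <= 70 THEN agegroup = 2;
  ELSE agegroup = 3;
RUN;
```

If age has no missing values, this step gives the same agegroup values as mydata3.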
Sorting data
PROC SORT DATA=mydata OUT=mydata4;
  BY ID age weight;
RUN;
PROC PRINT DATA=mydata (OBS=5);
RUN;
PROC PRINT DATA=mydata4 (OBS=5);
RUN;
Summarizing data (1)
• Summarizing weight:
PROC MEANS DATA=mydata;
  VAR weight;
RUN;
• Summarizing weight in the youngest agegroup:
PROC MEANS DATA=mydata3;
  VAR weight;
  WHERE agegroup=1;
RUN;
Summarizing data (2)
• Summarizing weight by smoking status (two possible ways to code it):
PROC SORT DATA=mydata OUT=mydata5;
  BY smoking;
RUN;
PROC MEANS DATA=mydata5;
  VAR weight;
  BY smoking;
RUN;

PROC MEANS DATA=mydata;
  CLASS smoking;
  VAR weight;
RUN;
• The BY version needs data sorted by smoking first; the CLASS version does not.
• PROC UNIVARIATE also reports all of these summary measures.
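As a sketch of the PROC UNIVARIATE route, the step below summarizes weight separately for smokers and non-smokers, assuming mydata has been loaded as on the earlier slides. The CLASS statement plays the same role as in PROC MEANS, so no sorting is needed.

```sas
PROC UNIVARIATE DATA=mydata;
  CLASS smoking;   /* one set of summaries per smoking status */
  VAR weight;
RUN;
```

The output is much longer than that of PROC MEANS, since it also includes quantiles and extreme observations.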
Categorical data and correlation
• Summarizing categorical data
PROC FREQ DATA=mydata3;
  TABLES smoking*agegroup / CHISQ EXACT;
RUN;
• Examining correlation
PROC CORR DATA=mydata;
  VAR weight;
  WITH bmi age;
RUN;
Basic procedures: plots
• Bar charts
PROC CHART DATA=mydata3;
  VBAR agegroup / DISCRETE;
RUN;
• Scatterplot
PROC PLOT DATA=mydata3;
  PLOT bmi*weight='*';
RUN;
• Histogram, Boxplot, Normal Probability Plot
PROC UNIVARIATE DATA=mydata3 PLOT;
  VAR weight;
RUN;
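The PLOT option draws its plots with text characters. PROC UNIVARIATE also has HISTOGRAM and PROBPLOT statements that request graphical versions; the sketch below assumes your SAS installation is able to produce such graphics, which depends on the version and licensed products.

```sas
PROC UNIVARIATE DATA=mydata3;
  VAR weight;
  HISTOGRAM weight / NORMAL;                       /* histogram with a fitted normal curve */
  PROBPLOT weight / NORMAL(MU=EST SIGMA=EST);      /* normal probability plot */
RUN;
```

If the graphs do not appear, the text plots from the PLOT option remain available.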
Libraries
• A library is the directory where your SAS dataset is stored.
• The default library is named Work and stores your SAS datasets temporarily: they will be deleted when you end your SAS session.
• If you want to save your SAS datasets and use them again later, create your own library:
LIBNAME SAS_Lab 'p:\BIO226\SAS';
DATA SAS_Lab.mydata;
  INFILE 'g:\shared\bio226\sasintro.txt';
  INPUT weight bmi id age activity education smoking;
RUN;
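In a later SAS session the saved dataset can be used again without re-reading the text file. A minimal sketch, assuming the same directory as above still holds the dataset: the LIBNAME statement has to be submitted again, because library references do not carry over between sessions.

```sas
LIBNAME SAS_Lab 'p:\BIO226\SAS';          /* re-declare the library in the new session */

PROC PRINT DATA=SAS_Lab.mydata (OBS=5);   /* two-level name: library.dataset */
RUN;
```

A one-level name such as mydata always refers to the temporary Work library.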
SAS output and Word
• To send your SAS output to a Word document:
ODS RTF FILE='p:output.RTF' STYLE=minimal;
PROC CORR DATA=mydata;
  VAR weight;
  WITH bmi age;
RUN;
ODS RTF CLOSE;
• Other styles: Journal, Analysis, Statistical
For further references
• SAS9 Documentation on the Web: http://support.sas.com/onlinedoc/913/docMainpage.jsp
• Applied Statistics and the SAS Programming Language (5th Edition) Ron P. Cody and Jeffrey K. Smith
• The Little SAS Book, L.D. Delwiche and S.J. Slaughter
• See SAS_help.doc on course website
Try your own
• Find the summary statistics (mean, mode, standard deviation,…) for education with PROC UNIVARIATE, as well as a histogram for years of education.
• Create a new variable educ_group which breaks years of education into four groups (0-10, 10-15, 15-18, 18-25). Put this new variable in a new data set and drop the education variable, as well as weight, bmi and age.
• Find the number of smokers per education group.
• Find the mean physical activity in each education group.
Recap of different datasets created
Data name   Description
mydata      original imported data
mydata_s    only smokers
mydata2     dropped weight, education
mydata3     added weight_kg, agegroup (the DROP age statement is commented out)
mydata4     original data sorted by id, age and weight
mydata5     original data sorted by smoking status