5.sas codes extra
SAS codes for graphs:
For vertical bar graphs:
proc chart data = <dataset name>; vbar <variable>; run;
For horizontal bar graphs:
proc chart data = <dataset name>; hbar <variable>; run;
For vertical bar graphs with sub-groups:
proc chart data = <dataset name>; vbar <variable1> / subgroup = <variable2>; run;
For vertical bar graphs (measuring variable1 not in terms of its frequency of occurrence but in terms of variable2):
proc chart data = <dataset name>; vbar <variable1> / sumvar = <variable2>; run;
proc chart data = <dataset name>; vbar <variable1> / subgroup = <variable2> sumvar = <variable3>; run;
For pie charts:
proc chart data = <dataset name>; pie <variable>; run;
proc gchart data = <dataset name>; pie <variable> / discrete value = inside percent = inside slice = outside; run;
For histograms:
proc univariate data = <dataset name> normal; var <variable1> <variable2>; histogram; run;
proc univariate data = <dataset name>; var <variable1> <variable2>; histogram / normal; run;
For scatter plot:
proc gplot data = <dataset name>; plot <variable1> * <variable2>; run;
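As an illustration of the bar-chart templates above, the following sketch builds a small made-up dataset and draws a vertical bar chart with sub-groups and a summed variable. The dataset name sales and the variables region, product and amount are hypothetical and only stand in for the placeholders.

```sas
/* Hypothetical sample data: one row per sale. */
data sales;
  input region $ product $ amount;
  datalines;
North A 120
North B 80
South A 95
South B 140
East A 60
East B 110
;
run;

/* One bar per region; bar height is the total amount (sumvar),
   and each bar is split by product (subgroup). */
proc chart data = sales;
  vbar region / subgroup = product sumvar = amount;
run;
```

Replacing vbar with hbar in the same step should give the horizontal version of the chart.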
Descriptive Statistics:
proc univariate data = <dataset> plots; var <variable1>; run;
Distributions:
Binomial:
For cdf of a binomial distribution:
CDF('BINOMIAL', m, p, n);
m – is an integer random variable that counts the number of successes.
p – is a numeric probability of success.
n – is an integer parameter that counts the number of independent Bernoulli trials.
The CDF function for the binomial distribution returns the probability that an observation from a binomial distribution, with parameters p and n, is less than or equal to m.
For probability from a binomial distribution:
PROBBNML(p, n, m);
The PROBBNML function returns the probability that an observation from a binomial distribution, with probability of success p and number of trials n, is less than or equal to m successes. To compute the probability that an observation is equal to a given value m, compute the difference of two probabilities from the binomial distribution for m and m-1 successes.
For pdf of a binomial distribution:
PDF('BINOMIAL', m, p, n);
The PDF function for the binomial distribution returns the probability density function of a binomial distribution, with parameters p and n, which is evaluated at the value m.
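The binomial functions above can be compared side by side in a short DATA step. This is a minimal sketch with illustrative values (10 trials, success probability 0.5, 3 successes) that are not taken from the text.

```sas
data _null_;
  n = 10;   /* number of independent Bernoulli trials (illustrative) */
  p = 0.5;  /* probability of success (illustrative) */
  m = 3;    /* number of successes */

  cum_cdf   = cdf('BINOMIAL', m, p, n);   /* P(X <= 3) */
  cum_prob  = probbnml(p, n, m);          /* same probability via PROBBNML */
  exact_pdf = pdf('BINOMIAL', m, p, n);   /* P(X = 3) */
  exact_dif = probbnml(p, n, m) - probbnml(p, n, m - 1);  /* P(X = 3) as a difference */

  /* Write the four values to the SAS log. */
  put cum_cdf= cum_prob= exact_pdf= exact_dif=;
run;
```

With these values the two cumulative results should both be 176/1024 = 0.171875, and the two exact results should both be 120/1024 = 0.1171875, up to floating-point rounding.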
Poisson:
For cdf of a Poisson distribution:
CDF('POISSON', n, m);
n – is an integer random variable.
m – is a numeric mean parameter.
The CDF function for the Poisson distribution returns the probability that an observation from a Poisson distribution, with mean m, is less than or equal to n.
For probability from a Poisson distribution:
POISSON(m, n);
The POISSON function returns the probability that an observation from a Poisson distribution, with mean m, is less than or equal to n. To compute the probability that an observation is equal to a given value n, compute the difference of two probabilities from the Poisson distribution for n and n-1.
For pdf of a Poisson distribution:
PDF('POISSON', n, m);
The PDF function for the Poisson distribution returns the probability density function of a Poisson distribution, with mean m. The PDF function is evaluated at the value n.
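The Poisson functions can be checked against each other in the same way. The sketch below assumes an illustrative mean of 2 and evaluates everything at n = 1; note that the POISSON function takes the mean first, whereas CDF and PDF take the value first.

```sas
data _null_;
  m = 2;  /* mean of the Poisson distribution (illustrative) */
  n = 1;  /* value at which the functions are evaluated */

  cum_cdf   = cdf('POISSON', n, m);   /* P(X <= 1) */
  cum_fn    = poisson(m, n);          /* same probability via POISSON */
  exact_pdf = pdf('POISSON', n, m);   /* P(X = 1) */
  exact_dif = poisson(m, n) - poisson(m, n - 1);  /* P(X = 1) as a difference */

  /* Write the four values to the SAS log. */
  put cum_cdf= cum_fn= exact_pdf= exact_dif=;
run;
```

For a mean of 2, the cumulative probability is 3·exp(-2), roughly 0.406, and the exact probability is 2·exp(-2), roughly 0.271.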
Normal distribution:
For cdf of a normal distribution:
CDF('NORMAL', x, θ, λ);
x – is a numeric random variable.
θ – is a numeric location parameter (mean). Default – 0.
λ – is a numeric scale parameter (standard deviation). Default – 1.
The CDF function for the normal distribution returns the probability that an observation from the normal distribution, with the location parameter θ and the scale parameter λ, is less than or equal to x.
For pdf of a normal distribution:
PDF('NORMAL', x, θ, λ);
The PDF function for the normal distribution returns the probability density function of a normal distribution, with the location parameter θ and the scale parameter λ. The PDF function is evaluated at the value x.
To calculate the probability from the standard normal distribution:
PROBNORM(x);
The PROBNORM function returns the probability that an observation from the standard normal distribution is less than or equal to x.
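A short DATA step can show how the normal-distribution calls relate. The values used here (1.96, and a distribution with mean 100 and standard deviation 10) are illustrative assumptions, not part of the original notes.

```sas
data _null_;
  /* Standard normal: location and scale default to 0 and 1. */
  p_std = cdf('NORMAL', 1.96);          /* P(Z <= 1.96) */
  p_fn  = probnorm(1.96);               /* same probability via PROBNORM */

  /* General normal with mean 100 and standard deviation 10. */
  p_gen = cdf('NORMAL', 110, 100, 10);  /* P(X <= 110), i.e. P(Z <= 1) */

  /* Density of the standard normal at 0, i.e. 1/sqrt(2*pi). */
  dens  = pdf('NORMAL', 0);

  /* Write the values to the SAS log. */
  put p_std= p_fn= p_gen= dens=;
run;
```

The first two results should agree at roughly 0.975, the third should be roughly 0.841, and the density at 0 roughly 0.399.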
Confidence limit:
proc means data = <dataset> clm alpha = <value>; run;
proc univariate data = <dataset> cibasic; run;
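The two confidence-limit requests can be tried on a small made-up sample. In this sketch the dataset scores, the variable score, the 0.05 significance level and the extra VAR statements are assumptions added for illustration.

```sas
/* Hypothetical sample of ten scores. */
data scores;
  input score @@;
  datalines;
52 48 55 61 47 50 58 53 49 56
;
run;

/* Mean with 95% confidence limits (alpha = 0.05). */
proc means data = scores n mean clm alpha = 0.05;
  var score;
run;

/* Basic confidence limits for the mean, standard deviation and variance. */
proc univariate data = scores cibasic;
  var score;
run;
```

A smaller alpha, for example alpha = 0.01, should give wider limits for the same data.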
t test
Single sample t test
proc ttest data = <dataset> H0=50; var <variable1>; run;
Dependent group t test (Paired t test)
proc ttest data = <dataset>; paired <variable1>*<variable2>; run;
Independent group t test
proc ttest data = <dataset>; class <variable1>; var <variable2>; run;
(the class statement names the variable that identifies the two independent groups; this variable must have exactly two levels)
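The three t test forms above can be run on one small dataset. The sketch below uses hypothetical names (trial, group, before, after) and made-up values; only the H0=50 value comes from the template above.

```sas
/* Hypothetical data: two groups, each subject measured before and after. */
data trial;
  input group $ before after;
  datalines;
A 50 54
A 47 49
A 55 60
A 52 53
B 49 48
B 51 55
B 46 47
B 53 58
;
run;

/* Single sample t test: is the mean of "before" equal to 50? */
proc ttest data = trial h0 = 50;
  var before;
run;

/* Paired t test: compares before and after within each subject. */
proc ttest data = trial;
  paired before*after;
run;

/* Independent group t test: compares "after" between groups A and B. */
proc ttest data = trial;
  class group;
  var after;
run;
```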
ANOVA
proc glm data = <dataset>; class <variable1> <variable2> <variable3>; model <variable4> = <variable1> <variable2> <variable3>; run;
(the independent groups should be categorized in the class statement)
(Suppose we want to compare mean nutrition content between Complan, Horlicks and Bournvita. Then the dataset should be categorized by these three groups in the class statement. Variables 1, 2 and 3 are the three groups and variable 4 is the nutrition content on which we are making the comparison.)
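One common way to lay out such a comparison, assumed here rather than taken from the notes above, is to store the brand in a single class variable with three levels and fit a one-way model. The dataset drinks, the variables brand and nutrition, and all values are hypothetical.

```sas
/* Hypothetical data: one row per sample, brand as a single grouping variable. */
data drinks;
  length brand $ 10;   /* long enough to hold "Bournvita" */
  input brand $ nutrition;
  datalines;
Complan 12.1
Complan 11.8
Complan 12.5
Horlicks 10.9
Horlicks 11.2
Horlicks 10.7
Bournvita 9.8
Bournvita 10.1
Bournvita 9.6
;
run;

/* One-way ANOVA: does mean nutrition differ across the three brands? */
proc glm data = drinks;
  class brand;
  model nutrition = brand;
run;
quit;
```

Listing several variables in the class and model statements, as in the template above, instead fits a model with one effect per class variable.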
Chi-square test
proc freq data = <dataset>; tables <variable1>*<variable2> / chisq; run;
Correlation:
proc corr data = <dataset>; var <variable1> <variable2> <variable3> <variable4>; run;
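The chi-square test can be tried on a small two-way table. This sketch assumes the data are already summarised as cell counts, so it adds a WEIGHT statement that is not in the template above; the names survey, gender, preference and count are hypothetical.

```sas
/* Hypothetical 2 x 2 table entered as one row per cell. */
data survey;
  input gender $ preference $ count;
  datalines;
M Tea 20
M Coffee 30
F Tea 35
F Coffee 15
;
run;

/* WEIGHT tells PROC FREQ that each row stands for "count" observations. */
proc freq data = survey;
  weight count;
  tables gender*preference / chisq;
run;
```

If the dataset has one row per respondent instead of one row per cell, the WEIGHT statement is not needed.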
Regression
proc reg data = <dataset>; model <dependent variable> = <independent variable1> <independent variable2> <independent variable3> / clb; run;
(CLB computes 100(1-α)% confidence limits for the parameter estimates.)
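As a sketch of the regression template with the CLB option, the following step fits two predictors on made-up data. The names houses, price, area and rooms are hypothetical, and the ALPHA= option on the MODEL statement is added here to make the confidence level explicit.

```sas
/* Hypothetical data: price explained by area and number of rooms. */
data houses;
  input price area rooms;
  datalines;
150 80 3
200 110 4
180 95 3
260 140 5
120 60 2
230 125 4
170 90 4
210 115 3
;
run;

/* CLB with alpha = 0.05 requests 95% confidence limits for each estimate. */
proc reg data = houses;
  model price = area rooms / clb alpha = 0.05;
run;
quit;
```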
Logistic Regression:
proc logistic data = <dataset> desc; model <dependent variable> = <independent variable1> <independent variable2> <independent variable3> / lackfit ctable selection = stepwise; run;
(SELECTION= takes a single method; use selection = forward for forward selection instead of stepwise.)
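A minimal sketch of the logistic regression step on made-up data follows. The dataset exam and the variables passed (0/1) and hours are hypothetical, and the sample is far too small for the goodness-of-fit output to be meaningful; it only shows the shape of the call. The SELECTION= option is left out because there is a single predictor.

```sas
/* Hypothetical data: hours studied and whether the exam was passed (1) or not (0). */
data exam;
  input hours passed;
  datalines;
1 0
2 0
2 1
3 0
3 1
4 0
4 1
5 0
5 1
6 0
6 1
7 0
7 1
8 1
;
run;

/* DESC models the probability that passed = 1 rather than passed = 0. */
proc logistic data = exam desc;
  model passed = hours / lackfit ctable;
run;
```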
Cluster Analysis:
proc cluster data = <dataset> method = <method name> outtree = <output data>;
id <variable>;
var <variable1> <variable2>;
run;
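The cluster analysis template can be filled in as follows. This is only a sketch: the dataset cities, the variables city, x and y, the output dataset tree and the choice of Ward's method are all assumptions made for illustration.

```sas
/* Hypothetical data: six labelled points forming two loose groups. */
data cities;
  input city $ x y;
  datalines;
A 1.0 1.2
B 1.1 0.9
C 0.8 1.0
D 5.0 5.2
E 5.3 4.9
F 4.8 5.1
;
run;

/* Hierarchical clustering on x and y; the cluster history is saved in "tree". */
proc cluster data = cities method = ward outtree = tree;
  id city;   /* labels each observation in the output */
  var x y;   /* variables used to compute distances */
run;
```

Other values of method=, such as average or single, can be substituted in the same place.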