sas random number generation
Random Number Generation in SAS
SAS functions for common distributions
• PROBNORM(x) – returns the cumulative probability (the probability of a value less than or equal to x) from the standard normal distribution
• PROBBNRM(x,y,r) – returns the cumulative probability from the standardized bivariate normal distribution with correlation r
• POISSON(m,n) – returns the cumulative probability from a Poisson distribution with mean m
• PROBBNML(p,n,m) – returns the cumulative probability from a binomial distribution with success probability p and n trials
• PROBCHI(x,df<,nc>) – returns the cumulative probability from a chi-squared distribution
• PROBF(x,ndf,ddf<,nc>) – returns the cumulative probability from an F distribution
• PROBT(x,df<,nc>) – returns the cumulative probability from a Student's t distribution
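As a quick illustration, the DATA step below evaluates several of these functions and writes the results to the log. The argument values are arbitrary choices for illustration and do not come from the slides.

```sas
/* Sketch: evaluate a few cumulative probability functions.  */
/* All argument values are illustrative assumptions.         */
data _null_;
  p_norm = probnorm(1.96);        /* P(Z <= 1.96), about 0.975             */
  p_chi  = probchi(3.84, 1);      /* P(chi-square with 1 df <= 3.84)       */
  p_t    = probt(2.0, 10);        /* P(t with 10 df <= 2.0)                */
  p_f    = probf(4.0, 2, 20);     /* P(F with 2 and 20 df <= 4.0)          */
  p_bin  = probbnml(0.5, 10, 4);  /* P(X <= 4), X ~ binomial(n=10, p=0.5)  */
  p_poi  = poisson(2, 3);         /* P(X <= 3), X ~ Poisson(mean 2)        */
  put p_norm= p_chi= p_t= p_f= p_bin= p_poi=;
run;
```

An upper-tail probability, such as a p-value, can be obtained as 1 minus the returned value.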
Random number generation
• How do we randomly draw one observation from a given distribution?
• Pseudo-random variables are numbers generated by a computer algorithm to simulate real random numbers.
– Commonly used in Monte Carlo simulation
• SAS has built-in functions that generate pseudo-random variables from several standard distributions.
• The algorithm SAS uses to produce these pseudo-random variables must be provided with an initial seed.
Uniform distribution
• UNIFORM(SEED) - generates values from a uniform distribution between 0 and 1
DATA random1;
  x = UNIFORM(-1);
  IF x>.5 THEN coin = 'heads';
  ELSE coin = 'tails';
RUN;
PROC PRINT DATA=random1;
  VAR x coin;
RUN;
The statements IF x>.5 THEN coin = 'heads' and ELSE coin = 'tails' create a variable called coin that takes the values 'heads' and 'tails'. The data set random1 uses a seed value of -1. A seed that is zero or negative tells SAS to take the starting value from the computer clock, so different random numbers are generated on each run.
RANUNI() also works!
Normal distribution
• NORMAL(SEED) - generates values from a normal distribution with mean 0 and standard deviation 1
DATA random2;
  x = UNIFORM(-1);
  y = 50 + 3*NORMAL(-1);
  IF x>.5 THEN coin = 'heads';
  ELSE coin = 'tails';
RUN;
PROC PRINT DATA=random2;
  VAR x y coin;
RUN;
RANNOR() also works!
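The expression 50 + 3*NORMAL(seed) shifts and scales a standard normal draw, giving values with mean 50 and standard deviation 3. A minimal sketch to check this empirically is shown below; the data set name normal_check, the sample size of 10000, and the seed are assumptions made for illustration.

```sas
/* Sketch: draw many values of y = 50 + 3*Z and summarize them.  */
/* normal_check, 10000, and the seed are illustrative choices.   */
data normal_check;
  do i = 1 to 10000;
    y = 50 + 3*rannor(123456);   /* mean 50, standard deviation 3 */
    output;
  end;
run;

proc means data=normal_check n mean std;
  var y;
run;
```

The sample mean and standard deviation reported by PROC MEANS should be close to 50 and 3, though not exactly equal, because they are estimates from a finite sample.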
Seed
• Sometimes we want to generate the same set of random numbers so that we can debug our programs or compare models.
– We use the same positive number as the seed value.
– The data set random3 illustrates how to generate the same results each time.
DATA random3;
  x = UNIFORM(123456);
  y = 50 + 3*NORMAL(123456);
  IF x>.5 THEN coin = 'heads';
  ELSE coin = 'tails';
RUN;
PROC PRINT DATA=random3;
  VAR x y coin;
RUN;
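One way to confirm that a fixed seed reproduces the same numbers is to run the same DATA step twice and compare the results. The sketch below does this with PROC COMPARE; the data set names run_a and run_b and the choice of five draws are assumptions for illustration.

```sas
/* Sketch: two DATA steps with the same positive seed.  */
/* run_a and run_b are hypothetical data set names.     */
data run_a;
  do i = 1 to 5;
    x = ranuni(123456);
    output;
  end;
run;

data run_b;
  do i = 1 to 5;
    x = ranuni(123456);
    output;
  end;
run;

/* Compare the two data sets value by value. */
proc compare base=run_a compare=run_b;
run;
```

Because both steps start from the same seed, PROC COMPARE should report no differing values. Replacing 123456 with -1 in both steps would make the two data sets differ.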
Generating multiple runs and getting statistics
DATA random4;
  DO i=1 to 100;
    x = UNIFORM(123456);
    IF x>.5 THEN coin = 'heads';
    ELSE coin = 'tails';
    OUTPUT;
  END;
RUN;
PROC FREQ DATA=random4;
  TABLES coin;
RUN;
                               Cumulative   Cumulative
COIN     Frequency   Percent    Frequency      Percent
-------------------------------------------------------
heads        48        48.0         48           48.0
tails        52        52.0        100          100.0
Getting Statistics
proc univariate data=random4 plot;
  var x;
run;
Other common distributions
• Exponential – RANEXP(seed)
• Binomial – RANBIN(seed,n,p) – returns a random variate from a binomial distribution
• Poisson – RANPOI(seed,m) – returns a random variate from a Poisson distribution
• Gamma distribution – RANGAM(seed,a)
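The functions listed above can be used in a DATA step in the same way as UNIFORM and NORMAL. The following is a minimal sketch; the data set name random5 and all parameter values are assumptions chosen for illustration.

```sas
/* Sketch: draw five values from each distribution.            */
/* random5 and the parameter values are illustrative choices.  */
data random5;
  seed = 123456;
  do i = 1 to 5;
    e  = ranexp(seed);            /* exponential with mean 1            */
    e5 = 5*ranexp(seed);          /* rescaled: exponential with mean 5  */
    b  = ranbin(seed, 10, 0.5);   /* binomial, n = 10 trials, p = 0.5   */
    p  = ranpoi(seed, 3);         /* Poisson with mean 3                */
    g  = rangam(seed, 2);         /* gamma with shape parameter 2       */
    output;
  end;
run;

proc print data=random5;
  var e e5 b p g;
run;
```

RANEXP and RANGAM return variates with a scale of 1, so other means or scales are obtained by multiplying the result, as in the e5 line.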
Simulate Student's t-distributions
data student;
  seedz = 314159;
  seedw = 271828;
  array zz[40] _TEMPORARY_;
  array ww[40] _TEMPORARY_;
  do r=1 to 1000;                     /* 1000 simulated samples */
    sum_z = 0; sum_zsq = 0;
    sum_w = 0; sum_wsq = 0;
    do i=1 to 40;                     /* sample size n = 40 */
      zz[i] = rannor(seedz);          /* standard normal */
      ww[i] = ranexp(seedw)-1;        /* exponential, shifted to mean 0 */
      sum_z = sum_z + zz[i];
      sum_zsq = sum_zsq + zz[i]*zz[i];
      sum_w = sum_w + ww[i];
      sum_wsq = sum_wsq + ww[i]*ww[i];
    end;
    /* one-sample t statistic: mean / (s / sqrt(n)) */
    t_z = sum_z/sqrt(40)/sqrt((sum_zsq-sum_z*sum_z/40)/39);
    t_w = sum_w/sqrt(40)/sqrt((sum_wsq-sum_w*sum_w/40)/39);
    output;
  end;
run;
title 'SIMULATED T FOR NORMAL AND EXPONENTIAL DATA';
proc univariate data=student;
  var t_z t_w;
run;
Cont'd
proc gchart data=student;
  title "Histogram of simulated t for exponential data";
  vbar t_w / LEVELS=30 type=percent;
RUN;
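One possible follow-up, not part of the original slides, is to check how often each simulated t statistic falls beyond the usual two-sided 5% critical value for 39 degrees of freedom. The sketch below assumes the data set student created earlier already exists; the names reject, crit, reject_z, and reject_w are hypothetical.

```sas
/* Sketch: empirical rejection rates at a nominal 5% level.  */
/* Assumes data set STUDENT with variables t_z and t_w.      */
data reject;
  set student;
  crit = tinv(0.975, 39);          /* upper 2.5% point of t with 39 df */
  reject_z = (abs(t_z) > crit);    /* 1 if rejected, 0 otherwise */
  reject_w = (abs(t_w) > crit);
run;

/* The mean of a 0/1 indicator is the proportion of rejections. */
proc means data=reject mean;
  var reject_z reject_w;
run;
```

A rejection proportion near 0.05 suggests the t distribution describes the statistic well; a noticeably different proportion for t_w would reflect the skewness of the exponential data.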