Source: torsas.ca/attachments/File/20190920/2-ej_iml_2019q3.pdf


Enter the Matrix: Proc IML for Neophytes

September 20, 2019

Toronto Area SAS Society

Erik Johnson

Senior Analyst

Credit Risk Analytics


Motivation

This is your last chance. After this, there is no turning back. You take the blue pill - the story ends, you wake up in your bed and believe whatever you want to believe. You take the red pill - you stay in Wonderland and I show you how deep the rabbit-hole goes.

Morpheus, The Matrix (1999)

TASS | Proc IML for beginners 2


Welcome to Wonderland

• SAS user mantra: “there’s a PROC for that”
• Until there isn’t …
• Enter Proc IML
– Solving a set of equations
– Statistical analysis
– Custom algorithms
– Matrix algebra makes the above much easier
– Incorporate R code into SAS ☺


Trigger Warning

There will be math …


But maybe you took the blue pill?

• It’s a Base-SAS world, we’re just living in it
– https://www.lexjansen.com/pharmasug/2010/CC/CC15.pdf

• You can still do matrix algebra …


You’ll be fine ..


Oh crap ...


Shut the front door


What is the matrix?

• “The matrix is a system …”

• Or a collection of numbers structured into rows and columns (which can impart meaning)

$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$$
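To make the rows-and-columns idea concrete, here is a minimal sketch of entering a 2 x 2 matrix as a literal in PROC IML and reading one element back. The values and the names A, a12, and dims are illustrative only and do not come from the slides.

```sas
proc iml;
/* Commas separate rows, blanks separate columns */
A = {1 2,
     3 4};
a12  = A[1,2];               /* element in row 1, column 2 */
dims = nrow(A) || ncol(A);   /* number of rows and columns */
print A, a12 dims;
quit;
```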


Back to Wonderland


• Importing/Exporting data to/from IML and SAS

• Matrix Algebra

• Statistics in IML

• Simulations

• If you want me back for “advanced” IML

– (Custom) Model Validation

–Optimization Methods in IML

–Advanced Statistics in IML


Basic IML syntax

proc iml <symsize=n1> <worksize=n2>;

<iml/sas code>;

<iml/sas code>;

<iml/sas code>;

quit;


A word on worksize

• If you don’t specify WORKSIZE, SAS uses a host-dependent default; the value is given in kilobytes
• SYMSIZE allocates memory (also in kilobytes) to PROC IML’s symbol space
– This is where the names of matrices are stored
– There are two kinds of symbol space: local and global
– A local symbol space is created for each module that has arguments
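As a rough sketch of how these options are passed, the call below sets both on the PROC IML statement. The sizes 512 and 2048 are arbitrary illustrative values, not recommendations, and SHOW SPACE is used here only to report workspace and symbol-space usage.

```sas
/* Illustrative values only; both options are given in kilobytes */
proc iml symsize=512 worksize=2048;
x = j(3, 3, 1);   /* small matrix so something is stored */
show space;       /* report workspace and symbol space usage */
quit;
```

In most sessions the defaults are adequate, so these options are usually left off.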


Reading datasets into IML from SAS I

/* Importing data into SAS/IML */

proc iml;

use work.my_data;

read all var _ALL_ into matrix[colname=varNames];

close work.my_data;

print matrix;

quit;


Reading datasets into IML from SAS II


matrix

A B C

1 5 3

5 1 5

3 3 1

matrix_2

D E F

3 7 0

7 3 7

0 0 3
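The slides do not show how the input data sets were built, so the following is a sketch under assumptions: it creates two small data sets holding the values printed above and reads each into IML. The data set name work.my_data_2 and the name varNames_2 are hypothetical; only work.my_data, matrix, and varNames appear in the original code.

```sas
/* Hypothetical input data sets holding the values shown on the slide */
data work.my_data;
   input A B C;
   datalines;
1 5 3
5 1 5
3 3 1
;
run;

data work.my_data_2;
   input D E F;
   datalines;
3 7 0
7 3 7
0 0 3
;
run;

proc iml;
/* Read every numeric variable into a matrix and keep the column names */
use work.my_data;
read all var _NUM_ into matrix[colname=varNames];
close work.my_data;

use work.my_data_2;
read all var _NUM_ into matrix_2[colname=varNames_2];
close work.my_data_2;

print matrix[colname=varNames], matrix_2[colname=varNames_2];
quit;
```

Matrices live only for the duration of one PROC IML session, so the later snippets that use matrix and matrix_2 need a read step like this inside the same PROC IML block.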


Exporting datasets from IML to SAS code

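/* Exporting data from IML to SAS */
proc iml;
/* matrix must exist in this session; here it is re-entered as a literal */
matrix = {1 5 3, 5 1 5, 3 3 1};
varNames = {A B C};
/* CREATE defines the data set, APPEND writes the rows */
create my_data_is_back from matrix [colname=varNames];
append from matrix;
close my_data_is_back;
quit;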


Matrix Transposition Code

/* Transposition of a matrix */

proc iml;

transposed=T(matrix);

transposed_2=matrix_2`;

print transposed transposed_2;

quit;


Matrix Transposition Results


transposed transposed_2

1 5 3 3 7 0

5 1 3 7 3 0

3 5 1 0 7 3


Matrix Addition Code

/* Matrix Addition */

proc iml;

matrix_add=matrix+matrix_2;

print matrix_add;

quit;


Matrix Addition Results


matrix_add

4 12 3

12 4 12

3 3 4


Matrix Multiplication Code

/* Matrix Multiplication */

proc iml;

matrix_mult=matrix*matrix_2;

matrix_mult_2=matrix_2*matrix;

print matrix_mult[rowname= {row1,row2,row3}

colname={A B C}]

matrix_mult_2[colname={co1 col2 col3}];

quit;


Matrix Multiplication Results

matrix_mult
      A   B   C
ROW1  38  22  44
ROW2  22  38  22
ROW3  30  30  24

matrix_mult_2
CO1  COL2  COL3
38   22    44
43   59    43
9    9     3


Matrix Element-Wise Powers Code

/* Matrix Element-wise powers */

proc iml;

matrix_e_power=matrix##3;

matrix_e_power_2=matrix_2##3;

print matrix_e_power matrix_e_power_2 ;

quit;


Matrix Element-Wise Operation Results


matrix_e_power matrix_e_power_2

1 125 27 27 343 0

125 1 125 343 27 343

27 27 1 0 0 27


Other Matrix Operators Code

/* Other Matrix Operators */

proc iml;

matrix_inv=inv(matrix);

matrix_trace=trace(matrix);

matrix_det=det(matrix);

matrix_logic=matrix>=matrix_2;

print matrix_inv matrix_logic;

print matrix_trace matrix_det;

quit;


Other Matrix Operators Results I


matrix_inv matrix_logic

-0.194444 0.0555556 0.3055556 0 0 1

0.1388889 -0.111111 0.1388889 0 0 0

0.1666667 0.1666667 -0.333333 1 1 0


Other Matrix Operators Results II


matrix_trace matrix_det

3 72
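One way to sanity-check these results is to multiply the inverse back against the original matrix and to expand the determinant by hand. The sketch below re-enters matrix as a literal; the names check, max_err, and det_by_hand are illustrative.

```sas
proc iml;
matrix = {1 5 3, 5 1 5, 3 3 1};
matrix_inv = inv(matrix);

/* inverse times original should be close to the 3 x 3 identity */
check   = matrix_inv * matrix;
max_err = max(abs(check - I(3)));   /* largest deviation from identity */

/* cofactor expansion along the first row: -14 + 50 + 36 = 72 */
det_by_hand = 1*(1*1 - 5*3) - 5*(5*1 - 5*3) + 3*(5*3 - 1*3);

print check, max_err det_by_hand;
quit;
```

Because of floating-point rounding, max_err is typically tiny but not exactly zero.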


Matrix Reduction Operations Code

/* Matrix reduction operations */

proc iml;

matrix_row_red=matrix_e_power[+,];

matrix_row_red_2=(matrix_e_power_2[+,])[,<>];

print matrix_row_red matrix_row_red_2;

quit;


Matrix Reduction Operations Results


matrix_row_red matrix_row_red_2

153 153 153 370


Matrix Algebra in IML Review


Matrix Operation IML Shortcut

Addition +

Subtraction -

Division, element wise /

Multiplication, element wise #

Multiplication, matrix *

Power, element wise ##

Power, Matrix **
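The element-wise and matrix versions of multiplication and power are easy to confuse, so here is a small sketch that runs both on the two example matrices. The names A, B, and the result variables are illustrative.

```sas
proc iml;
A = {1 5 3, 5 1 5, 3 3 1};
B = {3 7 0, 7 3 7, 0 0 3};

elem_prod = A # B;    /* multiplies matching elements */
mat_prod  = A * B;    /* row-by-column matrix product */
elem_quot = B / A;    /* element-wise division */
elem_sq   = A ## 2;   /* squares each element */
mat_sq    = A ** 2;   /* matrix power, i.e. A*A */

print elem_prod mat_prod, elem_quot, elem_sq mat_sq;
quit;
```

Note that A*B and B*A generally differ, which is what the multiplication results slide shows for these matrices.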


Matrix Operators Review


Matrix Function IML Alias

Transpose ` or T(matrix)

Determinant Det(matrix)

Inverse Inv(matrix)

Trace Trace(matrix)

Logicals >, >=, =, <, <=, ^=

Identity I(n)

Constant matrix J(nrow,ncol,a)

Reshape a matrix into a row vector (row-major order) rowvec(matrix)
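A short sketch of the constructor and reshaping functions from the table, applied to the example matrix. The variable names are illustrative.

```sas
proc iml;
matrix = {1 5 3, 5 1 5, 3 3 1};

ident  = I(3);                    /* 3 x 3 identity matrix */
ones   = J(2, 3, 1);              /* 2 x 3 matrix filled with 1 */
r      = rowvec(matrix);          /* 1 x 9 vector read row by row */
is_sym = all(matrix = matrix`);   /* 1 only if every element equals its transpose */

print ident ones, r, is_sym;
quit;
```

Here is_sym is 0 because matrix[2,3] is 5 while matrix[3,2] is 3.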


Matrix Reduction Operators Review


Operation IML Alias

Addition +

Multiplication #

Minimum ><

Maximum <>

Index of minimum >:<

Index of maximum <:>

Mean :

Sum of squares ##

Concatenate || (horizontal) // (vertical)
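Subscript reduction operators go inside the brackets, and the position (before or after the comma) decides which dimension is collapsed. The sketch below uses the element-wise cube of the example matrix; m and the result names are illustrative.

```sas
proc iml;
m = {1 125 27, 125 1 125, 27 27 1};   /* matrix_e_power from the slides */

col_sums   = m[+, ];     /* collapse rows: one sum per column (153 153 153) */
row_sums   = m[ ,+];     /* collapse columns: one sum per row */
total      = m[+];       /* sum of every element */
col_max    = m[<>, ];    /* maximum of each column */
col_mean   = m[:, ];     /* mean of each column */
row_argmax = m[ ,<:>];   /* column index of the maximum in each row */

print col_sums, row_sums total, col_max col_mean, row_argmax;
quit;
```

So the slide's matrix_e_power[+,] sums down the rows and returns one value per column.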


Statistics in SAS/IML

• The standard method for evaluating the relationship between a variable of interest (dependent variable) and potential explanatory variables (independent variables) is Ordinary Least Squares (OLS) regression

• For the linear model below, the OLS estimator of the coefficients has a closed-form matrix solution

$$y = X\beta + u, \qquad \hat{\beta} = (X'X)^{-1}X'y$$
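For readers who want the missing step, the estimator follows from minimizing the sum of squared residuals. This is the textbook derivation, written with an objective function S that is not named on the slide, and it assumes X'X is invertible:

$$S(\beta) = (y - X\beta)'(y - X\beta) = y'y - 2\beta'X'y + \beta'X'X\beta$$

$$\frac{\partial S}{\partial \beta} = -2X'y + 2X'X\beta = 0 \;\Rightarrow\; X'X\hat{\beta} = X'y \;\Rightarrow\; \hat{\beta} = (X'X)^{-1}X'y$$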


OLS IML Code I

/* Statistics in IML (Ordinary Least Squares

Regression) */

/* Read SASHELP.CARS into work and create an

interaction with foreign and MPG */

data cars;

set sashelp.cars;

if origin eq "USA" then foreign = 0 ;

else foreign = 1;

mpg_x_foreign=foreign*mpg_highway;

run;


OLS IML Code II

/* Building model in IML */

proc iml;

/* Reading in data from SAS */

use work.cars;

read all var {'MPG_Highway' 'Weight' 'Foreign'

'mpg_x_foreign'} into X;

read all var {'MSRP'} into Y;

close cars;


OLS IML Code III

/* Transforming the Data for a regression */

timer=J(2,1,0);

n=nrow(X);

X=J(n,1,1) || X;

k=ncol(X);

/* Estimating Beta_hat */

t0=time();

beta_hat=(inv(X`*X))*X`*y;

timer[1,1]=time()-t0;

u_hat=y-X*beta_hat; /* residuals: observed minus fitted values */
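As an alternative to forming the inverse explicitly, the normal equations can be handed to the SOLVE function. This is a sketch, not the presenter's code: it uses a reduced two-regressor model on SASHELP.CARS, and the names b_inv, b_solve, and max_diff are illustrative.

```sas
proc iml;
use sashelp.cars;
read all var {'MPG_Highway' 'Weight'} into X;
read all var {'MSRP'} into y;
close sashelp.cars;

n = nrow(X);
X = j(n, 1, 1) || X;             /* prepend the intercept column */

b_inv   = inv(X`*X) * X`*y;      /* explicit inverse, as on the slide */
b_solve = solve(X`*X, X`*y);     /* solves (X`X) b = X`y directly */

u_hat    = y - X*b_solve;        /* residuals: observed minus fitted */
max_diff = max(abs(b_inv - b_solve));

print b_inv b_solve max_diff;
quit;
```

Both approaches should agree up to rounding; solving the system avoids computing an inverse that is only needed as an intermediate step.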


OLS IML Code IV

/* Use Matrix Algebra to Calculate OLS statistics */

SSE=y`*y-beta_hat`*X`*y;

MSE=sse/(n-k);

Y_bar=sum(Y)/n;

ESS=beta_hat`*X`*y-n*y_bar**2;

MSR=ESS/(k-1);

F=MSR/MSE;

SST=ESS+SSE;

R_2=ESS/SST;

/* Note: with k counting the intercept column, the usual adjusted R-square is 1-(n-1)/(n-k)*(1-R_2); the statement below divides by (n-k+1) */

Adj_R_2=1-(n-1)/(n-k+1)*(1-R_2);


OLS IML Code V

/* Calculate Hypothesis Testing Components */

SE=sqrt(vecdiag(inv(X`*X))#MSE);

T=beta_hat/se;

p_stats=2*(1-CDF('T',ABS(T),n-k));

timer[2,1]=time()-t0;

reg_stats=(k||ESS||MSR||F) // (n-k||SSE||MSE||{.});

coefs=beta_hat || SE || T || p_stats;


OLS IML Code VI

/* Clean up the results to print */

print 'OLS Statistics for regression of Car Prices';

print reg_stats (|Colname={DF SS MS F} rowname={Model

Residuals} format=8.4|);

print 'Parameter estimates';

print coefs (|Colname={Coef SE T p_stat} rowname={INT

MPG Weight Foreign MPG_x_Foreign} format=8.4|);

print " ";

print 'The Adjusted R-Square is ' Adj_R_2;

print 'The time to invert X`*X was' (timer[1,1]);

print 'The time to calculate all statistics was'

(timer[2,1]);

quit;


OLS Regression of Car Prices in IML part I


Regression Statistics

DF SS MS F

MODEL 5.0000 4.684E10 1.171E10 43.2981

RESIDUALS 423.0000 1.144E11 2.7044E8 .

Adj_R_2

The Adjusted R-Square is 0.2854775


OLS Regression of Car Prices in IML part II


Parameter Estimates

COEF SE T P_STAT

INT -9863.91 14813.12 -0.6659 0.5058

MPG 37.0433 352.3656 0.1051 0.9163

WEIGHT 9.8881 1.8064 5.4739 0.0000

FOREIGN 32324.29 8557.125 3.7775 0.0002

MPG_X_FOREIGN -835.176 314.5948 -2.6548 0.0082


The proof is in the pudding


/* Check the IML results with PROC REG */

%let timer_start = %sysfunc(datetime());

proc reg data=work.cars;

model msrp = mpg_highway weight foreign mpg_x_foreign;

run;

data _null_;

dur=datetime()-&timer_start;

put 30*'-' / ' Total Duration' dur MMSS13.6 / 30*'-';

run;


PROC REG Output for Car Prices


SAS/IML Results one more time


Does “vectorizing” help? Yes.


Does “vectorizing” help 2.0
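The timing slides were images and did not survive extraction, so here is a separate, minimal sketch of what vectorizing means in IML: the same sum of squares computed once with a DO loop and once with a single reduction. The vector length, seed, and variable names are arbitrary choices, and actual timings depend on the machine.

```sas
proc iml;
call randseed(153);
n = 100000;
x = j(n, 1, .);
call randgen(x, 'Normal');        /* standard normal draws */

/* Loop version: accumulate one element at a time */
t0 = time();
ss_loop = 0;
do i = 1 to n;
   ss_loop = ss_loop + x[i]**2;
end;
t_loop = time() - t0;

/* Vectorized version: one reduction over the whole vector */
t0 = time();
ss_vec = x[##];
t_vec = time() - t0;

print ss_loop ss_vec, t_loop t_vec;
quit;
```

The two sums should agree up to rounding; the difference is in how much work is done in interpreted loop iterations versus a single matrix operation.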


Compare to Proc Reg again


• IML appears to be running at least 5 times faster than PROC REG

• Caveat: in-memory calculations rely on RAM


Why bother – that was a lot of code?

• “Vectorizing” your code –> avoid loops

• Customization is key here

• Extensions to the standard statistics models can be implemented to your delight with IML

– Newey-West Variance-Covariance Matrix (SC and HC)

– Spatial Correlation (models with geographic components)

– Clustering (one-way and two-way)

• There is a PROC for that (sometimes)

– PROC SURVEYREG, PROC MODEL, PROC REG (w/ acov)

• For the thrills!
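As one small illustration of the kind of extension meant here, the sketch below computes heteroskedasticity-consistent standard errors with a White-type (HC0) sandwich formula. This is not the Newey-West estimator named above and it is not from the slides; it assumes a reduced two-regressor model on SASHELP.CARS, and every variable name is illustrative.

```sas
proc iml;
use sashelp.cars;
read all var {'MPG_Highway' 'Weight'} into X;
read all var {'MSRP'} into y;
close sashelp.cars;

n = nrow(X);
X = j(n, 1, 1) || X;                  /* prepend the intercept column */

XtX_inv  = inv(X`*X);
beta_hat = XtX_inv * X`*y;
u_hat    = y - X*beta_hat;            /* OLS residuals */

/* "Meat" of the sandwich: sum over i of u_i^2 * x_i * x_i` */
meat   = X` * (X # (u_hat##2));       /* each row of X scaled by its squared residual */
V_hc0  = XtX_inv * meat * XtX_inv;    /* HC0 covariance matrix */
se_hc0 = sqrt(vecdiag(V_hc0));

print beta_hat se_hc0;
quit;
```

Results from a hand-rolled estimator like this are worth cross-checking against a validated procedure before use.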


Simulation with SAS/IML

• An empirically robust method for constructing test statistics is known as the “bootstrap”

– You can be agnostic about the underlying data-generating process (DGP)

• The idea behind the bootstrap is to:

– Calculate the statistic of interest from your sample

– Resample the data B times to create B bootstrap samples

– Re-calculate the statistics for each sample

– Use the bootstrap distribution to obtain parameters of interest (confidence intervals, standard errors, etc.)
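In symbols, the steps above can be written as follows. The notation is an assumption made for this note: s(·) is the statistic of interest, x*_b is the b-th resample drawn with replacement, and q_p is the p-th quantile of the bootstrap estimates.

$$\hat{\theta} = s(x), \qquad \hat{\theta}^{*}_{b} = s(x^{*}_{b}), \quad b = 1, \dots, B$$

$$\bar{\theta}^{*} = \frac{1}{B}\sum_{b=1}^{B}\hat{\theta}^{*}_{b}, \qquad \widehat{SE}_{boot} = \sqrt{\frac{1}{B-1}\sum_{b=1}^{B}\left(\hat{\theta}^{*}_{b} - \bar{\theta}^{*}\right)^{2}}$$

$$CI_{1-\alpha} = \left[\, q_{\alpha/2},\; q_{1-\alpha/2} \,\right]$$

The last line is the percentile confidence interval, which takes its endpoints directly from the quantiles of the bootstrap distribution.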


Sample Kurtosis

• A measure of how “heavy” the tails of a distribution are (relative to the standard normal)

$$\text{kurtosis} = \frac{\mu_4}{\mu_2^{2}} - 3$$

$$\frac{n(n+1)}{(n-1)(n-2)(n-3)} \sum_{i=1}^{n} w_i^{2} \left( \frac{x_i - \bar{x}_w}{s_w} \right)^{4} - \frac{3(n-1)^{2}}{(n-2)(n-3)}$$
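The sample formula can be checked against the built-in KURTOSIS function. This sketch assumes unit weights (all w_i equal to 1) and uses the Weight column of SASHELP.CARS; the names z, k_manual, and k_builtin are illustrative.

```sas
proc iml;
use sashelp.cars;
read all var {'Weight'} into x;
close sashelp.cars;

n = nrow(x);
z = (x - mean(x)) / std(x);          /* standardized values, sample std dev */

/* Sample excess kurtosis with all weights equal to 1 */
k_manual  = n*(n+1) / ((n-1)*(n-2)*(n-3)) * sum(z##4)
            - 3*(n-1)##2 / ((n-2)*(n-3));
k_builtin = kurtosis(x);

print k_manual k_builtin;
quit;
```

The two values should match up to rounding if the function uses the same bias-corrected definition.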


Estimating the Kurtosis of car weights

/* The bootstrap in IML */

ods listing gpath="<output_directory>";

ods graphics on / imagename="iml_results_" ;

proc univariate data=cars;

var weight;

histogram weight;

inset N Kurtosis (8.4) / position=NE;

run;


Sample Histogram of car weights


SAS/IML Code for Bootstrap of Kurtosis I

proc iml;

/* Create module to estimate kurtosis */

start BootStat(A);

return kurtosis(A);

finish;

/* Set Crit Value and number of bootstrap samples */

alpha=0.05;

B=10000;

/* Read in cars data */

use work.cars;

read all var "Weight";

close;


SAS/IML Code for Bootstrap of Kurtosis II

/* Resample Weight data and recalculate kurtosis */

call randseed(153);

est=BootStat(weight);

s=sample(weight, B // nrow(weight));

bStat=T(BootStat(s));

bootEst=mean(bStat);

SE=std(bStat);

call qntl(CI, bStat, alpha/2 || 1-alpha/2);


SAS/IML Code for Bootstrap of Kurtosis III

/* Summarize results of Bootstrap procedure */

R=Est || BootEst || SE || CI` ;

print R[format=8.4 L="95% Bootstrap CI" c={"Obs"

"BootEst" "StdErr" "Lower" "Upper"}];

/* Output the results as a graph */

call symputx('BootEst', round(BootEst, 1e-4));

call symputx('Lower', round(CI[1], 1e-4));

call symputx('Upper', round(CI[2], 1e-4));


SAS/IML Code for Bootstrap of Kurtosis IV

refStmt = 'refline &BootEst / axis=x

lineattrs=(color=red) name="BootEst"

legendlabel="Bootstrap Statistic = &BootEst";'

+ 'refline &Lower &Upper / axis=x

lineattrs=(color=blue) name="CI" legendlabel="95% Pctl

CI";'

+ 'keylegend "BootEst" "CI";';

title "Bootstrap Distribution";

call histogram(bStat) label="Kurtosis" other=refStmt;

quit;

ods graphics off;

ods _all_ close;


SAS/IML Output for Bootstrap of Kurtosis


Bootstrap Distribution of Car Weights Kurtosis, 95% Bootstrap CI

Obs BootEst StdErr Lower Upper

1.6888 1.5977 0.7653 0.3489 3.2050


SAS/IML Histogram of Bootstrap Kurtosis


Useful IML commands for Statistics

IML Function | Purpose
CALL RANDSEED(n) | Set the seed for the random number generator
CALL RANDGEN(A, 'distname' <, parm1> …) | Generate pseudo-random numbers from the 'distname' distribution
VECDIAG(A) | Creates a column vector of the elements on the main diagonal of a matrix
CDF('dist', q <, parm1, … parmk>) | Returns a value from the cumulative probability distribution of 'dist'
CALL QNTL(q, A <, probs> <, method>) | Computes sample quantiles of A in q
CALL HISTOGRAM(x <,*>) | Calls SGPLOT to plot the histogram of vector x
SHAPE(A, nrow <, ncol> <, pad-val>) | Creates a new matrix from the data in A of size nrow by ncol
SAMPLE(A <, n> <, method> <, prob>) | Creates a random sample of the elements of A
TIME() | Returns the current time; useful for timing code
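A few of these functions used together, as a minimal sketch. The seed, sample size, and variable names are arbitrary, and the example is not taken from the slides.

```sas
proc iml;
call randseed(153);

x = j(1000, 1, .);                  /* allocate, then fill with random draws */
call randgen(x, 'Normal', 0, 1);

call qntl(q, x, {0.025 0.975});     /* empirical 2.5% and 97.5% quantiles */
p = cdf('Normal', 1.96);            /* standard normal CDF at 1.96 */
m = shape(1:6, 2, 3);               /* 1..6 laid out row by row as 2 x 3 */
d = vecdiag(I(3));                  /* diagonal of the identity as a column */

print q, p, m, d;
quit;
```

With a reasonably large sample, the empirical quantiles in q should land near -1.96 and 1.96.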


IML Examples Review


• Importing/Exporting data to/from IML and SAS

• Matrix Algebra

• Statistics in IML

• Simulations in IML

• Next time in IML

–Custom Model Validation

–Advanced Statistics

–Optimization


Conclusion


• SAS/IML offers a lot of flexibility and vectorization missing from base SAS

•Caution: SAS/IML code isn’t validated the way PROCs are (have to say it)

• Feel free to reach out with questions


•Thank you!


References


• Ajmani, V. B. (2009). Applied Econometrics Using the SAS System. Hoboken, NJ: Wiley.

• Baesens, B., Roesch, D., & Scheule, H. (2016). Credit Risk Analytics: Measurement Techniques, Applications and Examples in SAS. Hoboken, NJ: Wiley. http://www.creditriskanalytics.net/

• Wicklin, R. (2010). Statistical Programming with SAS/IML Software. Cary, NC: SAS Institute Inc.

• Wicklin, R. (2013). Simulating Data with SAS. Cary, NC: SAS Institute Inc.

– https://support.sas.com/content/dam/SAS/support/en/books/simulating-data-with-sas/65378_Appendix_A_A_SAS_IML_Primer.pdf

– https://blogs.sas.com/content/iml/2013/11/25/twelve-advantages-to-calling-r-from-the-sasiml-language.html

• Wooldridge, J. (2019). Introductory Econometrics: A Modern Approach (7th ed.). Mason, OH: South-Western.