Post on 17-Jan-2016

Divide and Conquer: Writing Parallel SAS® Code to Speed Up Your SAS Program

Doug Haigh, SAS Institute Inc.

Introduction

Have you ever wanted to:

Text and drive at the same time?
Watch the big game and read a book at the same time?
Be on vacation at the beach and get work done at the office?

Humans are not good at doing two things at the same time, but your SAS code can be.

Introduction

Parallel code is when two or more streams of execution occur at nearly the same time.

[Figure: streams of execution along a Time axis]

Introduction

Parallel SAS code requires SAS/CONNECT
» One CONNECT client to many CONNECT servers

Parallel SAS code using SAS/CONNECT is created by
» SAS Data Integration Studio
» SAS Enterprise Miner
» SAS Enterprise Guide / PROC SCAPROC

Background

SIGNON / SIGNOFF
Establish/terminate a connection to a CONNECT server on
» Same machine
» Remote machine
» SAS Grid machine

RSUBMIT / ENDRSUBMIT
Sends SAS code to the CONNECT server for processing
May or may not wait for the code to complete

Simple SIGNON

Define the session in one of three ways (same machine, remote machine, or SAS Grid):

1) OPTIONS SASCMD="!SASCMD";
2) %let mySess=mySpawnerHost.myDomain.com 1234;
3) %let rc=%sysfunc(grdsvc_enable(mySess,server=SASApp));

SIGNON mySess;

RSUBMIT mySess;
  data _NULL_;
    rc=sleep(5,1);
  run;
ENDRSUBMIT;

SIGNOFF mySess;

Simple SIGNON

[Figure: timeline of SIGNON, RSUBMIT, and SIGNOFF across the CONNECT client and CONNECT server(s)]

Multiple SIGNONs: Synchronous

SIGNON mySess1;
SIGNON mySess2;

RSUBMIT mySess1;
  data _NULL_;
    rc=sleep(5,1);
  run;
ENDRSUBMIT;

RSUBMIT mySess2;
  data _NULL_;
    rc=sleep(5,1);
  run;
ENDRSUBMIT;

SIGNOFF mySess1;
SIGNOFF mySess2;

Multiple SIGNONs: Synchronous

[Figure: timeline of SIGNON, RSUBMIT, and SIGNOFF for the synchronous case]

Multiple SIGNONs: Asynchronous

SIGNON mySess1 SIGNONWAIT=NO;
SIGNON mySess2 SIGNONWAIT=NO;

RSUBMIT mySess1 WAIT=NO;
  data _NULL_;
    rc=sleep(5,1);
  run;
ENDRSUBMIT;

RSUBMIT mySess2 WAIT=NO;
  data _NULL_;
    rc=sleep(5,1);
  run;
ENDRSUBMIT;

SIGNOFF _ALL_;
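
One way to see the effect of WAIT=NO is to time the two RSUBMITs from the client. The following is a minimal sketch and not part of the original slides. It assumes SAS/CONNECT sessions can be started on the local machine through SASCMD, and the macro variable name startTime is made up for illustration. The SIGNONs are left synchronous so that only the RSUBMIT time is measured. With two 5-second sleeps running side by side, the elapsed time should be closer to 5 seconds than to 10.

```sas
/* Sketch only: assumes SAS/CONNECT is licensed and sessions can be
   spawned on the local machine via SASCMD. */
OPTIONS SASCMD="!SASCMD";

SIGNON mySess1;
SIGNON mySess2;

/* startTime is a hypothetical name used only in this sketch */
%let startTime=%sysfunc(datetime());

RSUBMIT mySess1 WAIT=NO;
  data _NULL_;
    rc=sleep(5,1);
  run;
ENDRSUBMIT;

RSUBMIT mySess2 WAIT=NO;
  data _NULL_;
    rc=sleep(5,1);
  run;
ENDRSUBMIT;

/* Block until both asynchronous RSUBMITs have finished */
WAITFOR _ALL_ mySess1 mySess2;

%put NOTE: Elapsed seconds: %sysevalf(%sysfunc(datetime()) - &startTime);

SIGNOFF _ALL_;
```

Removing WAIT=NO from both RSUBMIT statements gives the synchronous timing for comparison.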

Multiple SIGNONs: Asynchronous

[Figure: timeline of SIGNON, RSUBMIT, and SIGNOFF for the asynchronous case]


Reusing a Session

SIGNON mySess1 SWAIT=NO;
SIGNON mySess2 SWAIT=NO;

RSUBMIT mySess1 WAIT=NO;
  data _NULL_;
    rc=sleep(10,1);
  run;
ENDRSUBMIT;

RSUBMIT mySess2 WAIT=NO;
  data _NULL_;
    rc=sleep(5,1);
  run;
ENDRSUBMIT;

WAITFOR _ALL_ mySess1;

RSUBMIT mySess1 WAIT=NO;
  data _NULL_;
    rc=sleep(5,1);
  run;
ENDRSUBMIT;

SIGNOFF _ALL_;

Reusing a Session

[Figure: timeline of SIGNON, RSUBMIT, and SIGNOFF when a session is reused]


Reusing an Available Session

SIGNON mySess1 SWAIT=NO;
SIGNON mySess2 SWAIT=NO;

RSUBMIT mySess1 WAIT=NO CMACVAR=myVar1;
  data _NULL_;
    rc=sleep(10,1);
  run;
ENDRSUBMIT;

RSUBMIT mySess2 WAIT=NO CMACVAR=myVar2;
  data _NULL_;
    rc=sleep(5,1);
  run;
ENDRSUBMIT;

WAITFOR _ANY_ mySess1 mySess2;

%determineAvailableSession(2);

RSUBMIT mySess&openSess WAIT=NO;
  data _NULL_;
    rc=sleep(5,1);
  run;
ENDRSUBMIT;

SIGNOFF _ALL_;

Reusing an Available Session

%macro determineAvailableSession(numSessions);
  %global openSess;

  %do sess=1 %to &numSessions;
    %if (&&myVar&sess eq 0) %then %do;
      %let openSess=&sess;
      %let sess=&numSessions;
    %end;
  %end;
%mend;

Reusing an Available Session

[Figure: timeline of SIGNON, RSUBMIT, and SIGNOFF when the first available session is reused]


Reusing the Best Available Session

SIGNON mySess1 SWAIT=NO CMACVAR=mySignonVar1;
…
SIGNON mySessN SWAIT=NO CMACVAR=mySignonVarN;

%waitForAvailableSession(N);
RSUBMIT mySess&openSess WAIT=NO CMACVAR=myVar&openSess;
  data _NULL_;
    rc=sleep(10,1);
  run;
ENDRSUBMIT;

%waitForAvailableSession(N);
RSUBMIT mySess&openSess WAIT=NO CMACVAR=myVar&openSess;
  data _NULL_;
    rc=sleep(1,1);
  run;
ENDRSUBMIT;

…
SIGNOFF _ALL_;

Reusing the Best Available Session

%macro waitForAvailableSession(numSessions);
  %global openSess;

  %let sessFound=0;
  %do %while (&sessFound eq 0);
    %do sess=1 %to &numSessions;
      %if (&&mySignonVar&sess eq 0) %then
        %if (&&myVar&sess eq 0) %then %do;
          %let openSess=&sess;
          %let sess=&numSessions;
          %let sessFound=1;
        %end;
    %end;
    %if (&sessFound eq 0) %then %let rc=%sysfunc(sleep(1,1));
  %end;
%mend;
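
The macro reads both mySignonVar1…N and myVar1…N. The SIGNON statements name the former in CMACVAR=, but myVarN is not named until the first RSUBMIT to session N, so it may not exist when the macro first checks it. The following is a minimal sketch, assuming nothing else has already initialized these variables. %initSessionVars is a hypothetical helper and is not part of the original slides. It sets each myVarN to 0, the value the macro treats as "available", so that a session with no work yet can be picked.

```sas
/* Hypothetical helper: give every session's RSUBMIT status variable
   an initial value of 0, the value the macro above tests for. */
%macro initSessionVars(numSessions);
  %local sess;
  %do sess=1 %to &numSessions;
    %global myVar&sess;
    %let myVar&sess=0;
  %end;
%mend initSessionVars;

/* Call once, before the first %waitForAvailableSession.
   4 is an example session count, not a value from the slides. */
%initSessionVars(4);
```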

Reusing the Best Available Session

[Figure: timeline of SIGNON, RSUBMIT, and SIGNOFF when the best available session is reused]

How about a macro to do all of this…

Perform SIGNONs as needed
» SASCMD or Grid

Retry SIGNONs if one fails

Manage RSUBMITs to available hosts

Retry RSUBMITs if one fails

Display progress of RSUBMITs

SIGNOFF hosts when no more work exists

Email user when done

%Distribute

Determine code that needs to be executed once when SIGNON completes
» LIBNAME, FILENAME

Create code that can run at each iteration
» Base iteration differences on the macro variables provided: Rem_Host, Rem_iHost, Rem_Seed, Rem_NIterAll, Rem_Niter, Rem_JobIters, Rem_JobID, GlobalNSub

Set up %Distribute parameters

Run

%Distribute

Signing on...

GridDistribute: Maximum number of nodes is 4

Processing...

GridDistribute: Signing on to Host #1

GridDistribute: Signing on to Host #2

GridDistribute: Signing on to Host #3

GridDistribute: Signing on to Host #4

Stat: [0:00:00] ???? (0/0) 100000

GridDistribute: Host #1 is host2.mydomain.com

GridDistribute: Host #2 is host4.mydomain.com

GridDistribute: Host #3 is host1.mydomain.com

GridDistribute: Host #4 is host3.mydomain.com

Stat: [0:00:02] !!!. (0/0) 100000

Stat: [0:00:02] .... (8000/0) 100000

Stat: [0:00:05] !!!! (8000/8000) 100000

<similar lines deleted>

Stat: [0:00:14] ...! (100000/94000) 100000

Summary

Writing parallel SAS code can significantly speed up processing

Some SAS products will do it for you

See the paper for discussion of additional considerations
» Information movement
» Data movement
» Output management
» RSUBMIT and the SAS Macro Facility


Questions?


Session ID #1935

Additional Considerations

Information movement
» %SYSLPUT / %SYSRPUT for macro variables

%SYSLPUT remVar=&localVar /REMOTE=mySess1;

RSUBMIT mySess1;
  …
  %SYSRPUT localVar=&remVar;
ENDRSUBMIT;

Additional Considerations

Data Movement
» Shared file system / RDBMS
» PROC UPLOAD/DOWNLOAD
» RLS
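
The two SAS/CONNECT options in this list can be sketched as follows. This is an illustration under stated assumptions, not code from the original slides: mySess1 is assumed to be signed on already, and the data set names (work.localData, work.remoteData, work.result) and the libref remWork are hypothetical.

```sas
/* Sketch only: assumes mySess1 is already signed on and that
   work.localData exists in the client session (hypothetical names). */

/* Option 1: copy data explicitly with PROC UPLOAD / PROC DOWNLOAD */
RSUBMIT mySess1;
  /* Client data set -> server WORK library */
  proc upload data=work.localData out=work.remoteData;
  run;

  /* Placeholder for the real remote processing */
  data work.result;
    set work.remoteData;
  run;

  /* Server data set -> client WORK library */
  proc download data=work.result out=work.result;
  run;
ENDRSUBMIT;

/* Option 2: Remote Library Services (RLS). The client reads the
   server's WORK library in place through a libref. */
LIBNAME remWork SLIBREF=work SERVER=mySess1;
```

With a shared file system or an RDBMS, neither step is needed, because client and server sessions can point their own LIBNAME statements at the same location.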

Output Movement
» Log and List files
» LOG=, LIST=
» PROC PRINTTO
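
As an illustration of the PROC PRINTTO approach, the sketch below routes a server session's log to its own file so that output from parallel sessions does not interleave. It is not from the original slides. It assumes mySess1 is already signed on, and the path /tmp/mySess1.log is a placeholder that must be writable on the server machine.

```sas
/* Sketch only: the log path is a placeholder, and mySess1 is
   assumed to be signed on. */
RSUBMIT mySess1 WAIT=NO;
  /* Route the server session's log to its own file */
  proc printto log="/tmp/mySess1.log" new;
  run;

  data _NULL_;
    rc=sleep(5,1);
  run;

  /* Restore the default log destination */
  proc printto;
  run;
ENDRSUBMIT;

WAITFOR _ALL_ mySess1;

/* Retrieve any log and output still spooled for the
   asynchronous RSUBMIT into the client session */
RGET mySess1;
```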

Additional Considerations

RSUBMIT and the SAS Macro Facility

RSUBMIT mySess1;
  %SYSRPUT localVar=&remVar;
ENDRSUBMIT;

needs to be quoted

RSUBMIT mySess1;
  %NRSTR(%%)SYSRPUT localVar=&remVar;
ENDRSUBMIT;

or wrapped in a macro

RSUBMIT mySess1;
  %MACRO updateVar;
    %SYSRPUT localVar=&remVar;
  %MEND;
  %updateVar;
ENDRSUBMIT;
