Post on 17-Jan-2016
Divide and Conquer: Writing Parallel SAS® Code to Speed Up Your SAS Program
Doug Haigh, SAS Institute Inc.
2
Introduction
Have you ever wanted to
» Text and drive at the same time?
» Watch the big game and read a book at the same time?
» Be on vacation at the beach and get work done at the office?
Humans are not good at doing two things at the same time, but your SAS code can be.
3
Introduction
Parallel code is when two or more streams of execution occur at nearly the same time.
[Diagram: parallel streams of execution along a time axis]
4
Introduction
Parallel SAS code requires SAS/CONNECT
» One CONNECT client to many CONNECT servers
Parallel SAS code using SAS/CONNECT is created by
» SAS Data Integration Studio
» SAS Enterprise Miner
» SAS Enterprise Guide / PROC SCAPROC
5
Background
SIGNON / SIGNOFF
Establish/terminate a connection to a CONNECT server on
» Same machine
» Remote machine
» SAS Grid machine
RSUBMIT / ENDRSUBMIT
» Sends SAS code to a CONNECT server for processing
» May or may not wait for the code to complete
6
Simple SIGNON
1) OPTIONS SASCMD="!SASCMD";
2) %let mySess=mySpawnerHost.myDomain.com 1234;
3) %let rc=%sysfunc(grdsvc_enable(mySess,server=SASApp));

SIGNON mySess;

RSUBMIT mySess;
   data _NULL_;
      rc=sleep(5,1);   /* sleep for 5 seconds */
   run;
ENDRSUBMIT;

SIGNOFF mySess;
7
Simple SIGNON
[Timeline diagram of the CONNECT client and CONNECT server(s) over time; legend: SIGNON, RSUBMIT, SIGNOFF]
8
Multiple SIGNONs: Synchronous
SIGNON mySess1;
SIGNON mySess2;

RSUBMIT mySess1;
   data _NULL_; rc=sleep(5,1); run;
ENDRSUBMIT;

RSUBMIT mySess2;
   data _NULL_; rc=sleep(5,1); run;
ENDRSUBMIT;

SIGNOFF mySess1;
SIGNOFF mySess2;
9
Multiple SIGNONs: Synchronous
[Timeline diagram; legend: SIGNON, RSUBMIT, SIGNOFF]
10
Multiple SIGNONs: Asynchronous
SIGNON mySess1 SIGNONWAIT=NO;
SIGNON mySess2 SIGNONWAIT=NO;

RSUBMIT mySess1 WAIT=NO;
   data _NULL_; rc=sleep(5,1); run;
ENDRSUBMIT;

RSUBMIT mySess2 WAIT=NO;
   data _NULL_; rc=sleep(5,1); run;
ENDRSUBMIT;

SIGNOFF _ALL_;
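One way to see the effect of running the two sessions asynchronously is to time the whole sequence. The sketch below is an illustration rather than part of the original slides: it assumes SASCMD sign-ons on the local machine, adds a WAITFOR _ALL_ so the timing covers both remote steps, and uses a hypothetical macro variable named start. It does not state an expected elapsed time, since that depends on how long the sign-ons take.

```sas
/* Sketch only: assumes local SASCMD sessions; "start" is a hypothetical name. */
options sascmd="!sascmd";

%let start=%sysfunc(datetime());

signon mySess1 signonwait=no;
signon mySess2 signonwait=no;

rsubmit mySess1 wait=no;
   data _null_; rc=sleep(5,1); run;
endrsubmit;

rsubmit mySess2 wait=no;
   data _null_; rc=sleep(5,1); run;
endrsubmit;

/* Block until both asynchronous RSUBMITs have finished. */
waitfor _all_ mySess1 mySess2;

%put NOTE: Elapsed seconds: %sysevalf(%sysfunc(datetime()) - &start);

signoff _all_;
```

Running the same steps without SIGNONWAIT=NO and WAIT=NO gives a baseline to compare against.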
11
Multiple SIGNONs: Asynchronous
[Timeline diagram; legend: SIGNON, RSUBMIT, SIGNOFF]
13
Reusing a Session
SIGNON mySess1 SWAIT=NO;
SIGNON mySess2 SWAIT=NO;

RSUBMIT mySess1 WAIT=NO;
   data _NULL_; rc=sleep(10,1); run;
ENDRSUBMIT;

RSUBMIT mySess2 WAIT=NO;
   data _NULL_; rc=sleep(5,1); run;
ENDRSUBMIT;

WAITFOR _ALL_ mySess1;

RSUBMIT mySess1 WAIT=NO;
   data _NULL_; rc=sleep(5,1); run;
ENDRSUBMIT;

SIGNOFF _ALL_;
14
Reusing a Session
[Timeline diagram; legend: SIGNON, RSUBMIT, SIGNOFF]
16
Reusing an Available Session
SIGNON mySess1 SWAIT=NO;
SIGNON mySess2 SWAIT=NO;

RSUBMIT mySess1 WAIT=NO CMACVAR=myVar1;
   data _NULL_; rc=sleep(10,1); run;
ENDRSUBMIT;

RSUBMIT mySess2 WAIT=NO CMACVAR=myVar2;
   data _NULL_; rc=sleep(5,1); run;
ENDRSUBMIT;

WAITFOR _ANY_ mySess1 mySess2;

%determineAvailableSession(2);

RSUBMIT mySess&openSess WAIT=NO;
   data _NULL_; rc=sleep(5,1); run;
ENDRSUBMIT;

SIGNOFF _ALL_;
17
Reusing an Available Session
%macro determineAvailableSession(numSessions);
   %global openSess;
   %do sess=1 %to &numSessions;
      /* a CMACVAR value of 0 means the RSUBMIT has completed */
      %if (&&myVar&sess eq 0) %then %do;
         %let openSess=&sess;
         %let sess=&numSessions;   /* end the loop */
      %end;
   %end;
%mend;
18
Reusing an Available Session
[Timeline diagram; legend: SIGNON, RSUBMIT, SIGNOFF]
20
Reusing the Best Available Session
SIGNON mySess1 SWAIT=NO CMACVAR=mySignonVar1;
…
SIGNON mySessN SWAIT=NO CMACVAR=mySignonVarN;

%waitForAvailableSession(N);
RSUBMIT mySess&openSess WAIT=NO CMACVAR=myVar&openSess;
   data _NULL_; rc=sleep(10,1); run;
ENDRSUBMIT;

%waitForAvailableSession(N);
RSUBMIT mySess&openSess WAIT=NO CMACVAR=myVar&openSess;
   data _NULL_; rc=sleep(1,1); run;
ENDRSUBMIT;

…
SIGNOFF _ALL_;
21
Reusing the Best Available Session
%macro waitForAvailableSession(numSessions);
   %global openSess;
   %let sessFound=0;
   %do %while (&sessFound eq 0);
      %do sess=1 %to &numSessions;
         /* the session qualifies when its SIGNON and its last RSUBMIT have both completed */
         %if (&&mySignonVar&sess eq 0) %then
            %if (&&myVar&sess eq 0) %then %do;
               %let openSess=&sess;
               %let sess=&numSessions;   /* end the inner loop */
               %let sessFound=1;
            %end;
      %end;
      /* nothing available yet: wait one second and check again */
      %if (&sessFound eq 0) %then %let rc=%sysfunc(sleep(1,1));
   %end;
%mend;
22
Reusing the Best Available Session
[Timeline diagram; legend: SIGNON, RSUBMIT, SIGNOFF]
23
How about a macro to do all of this…
Perform SIGNONs as needed (SASCMD or Grid)
Retry SIGNONs if one fails
Manage RSUBMITs to available hosts
Retry RSUBMITs if one fails
Display progress of RSUBMITs
SIGNOFF hosts when no more work exists
Email user when done
24
%Distribute
Determine code that needs to be executed once when SIGNON completes
» LIBNAME, FILENAME
Create code that can run at each iteration
Base iteration differences on the macro variables provided
» Rem_Host, Rem_iHost, Rem_Seed, Rem_NIterAll, Rem_Niter, Rem_JobIters, Rem_JobID, GlobalNSub
Set up %Distribute parameters
Run
25
%Distribute
Signing on...
GridDistribute: Maximum number of nodes is 4
Processing...
GridDistribute: Signing on to Host #1
GridDistribute: Signing on to Host #2
GridDistribute: Signing on to Host #3
GridDistribute: Signing on to Host #4
Stat: [0:00:00] ???? (0/0) 100000
GridDistribute: Host #1 is host2.mydomain.com
GridDistribute: Host #2 is host4.mydomain.com
GridDistribute: Host #3 is host1.mydomain.com
GridDistribute: Host #4 is host3.mydomain.com
Stat: [0:00:02] !!!. (0/0) 100000
Stat: [0:00:02] .... (8000/0) 100000
Stat: [0:00:05] !!!! (8000/8000) 100000
<similar lines deleted>
Stat: [0:00:14] ...! (100000/94000) 100000
26
Summary
Writing parallel SAS code can significantly speed up processing
Some SAS products will do it for you
See the paper for discussion of additional considerations
» Information movement
» Data movement
» Output management
» RSUBMIT and the SAS Macro Facility
27
Questions?
28
Session ID #1935
30
Additional Considerations
Information movement
» %SYSLPUT / %SYSRPUT for macro variables

%SYSLPUT remVar=&localVar /REMOTE=mySess1;

RSUBMIT mySess1;
   …
   %SYSRPUT localVar=&remVar;
ENDRSUBMIT;
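The round trip of macro variables can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not code from the presentation: it assumes a local SASCMD session, runs in open code (not inside a macro), and the names nRows, rowCount, and work.sample are hypothetical.

```sas
/* Sketch only: nRows, rowCount, and work.sample are hypothetical names. */
options sascmd="!sascmd";
signon mySess1;

/* Client side: push a value to the server session. */
%let nRows=100;
%syslput nRows=&nRows / remote=mySess1;

rsubmit mySess1;
   /* Server side: use the value that %SYSLPUT created. */
   data work.sample;
      do i=1 to &nRows;
         output;
      end;
   run;

   proc sql noprint;
      select count(*) into :rowCount trimmed from work.sample;
   quit;

   /* Server side: send the result back to the client session. */
   %sysrput rowCount=&rowCount;
endrsubmit;

/* Client side: rowCount is now a client macro variable. */
%put NOTE: Remote session created &rowCount rows.;

signoff mySess1;
```

The RSUBMIT here is synchronous, so the client reads rowCount only after the remote code has finished. With WAIT=NO, a WAITFOR would be needed before using the value.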
31
Additional Considerations
Data Movement
» Shared file system / RDBMS
» PROC UPLOAD/DOWNLOAD
» RLS
Output Movement
» Log and List files
» LOG=, LIST=
» PROC PRINTTO
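As an illustration of the PROC UPLOAD/DOWNLOAD option, the sketch below copies a small table to the server session, processes it there, and brings the result back. It is a hedged example rather than the presenter's code: it assumes a local SASCMD session, and the data set names work.input and work.result are hypothetical.

```sas
/* Sketch only: work.input and work.result are hypothetical names. */
options sascmd="!sascmd";
signon mySess1;

/* Client side: build a small table in the client WORK library. */
data work.input;
   do x=1 to 10;
      output;
   end;
run;

rsubmit mySess1;
   /* DATA= names the client table; OUT= names the server copy. */
   proc upload data=work.input out=work.input;
   run;

   /* Server side: do the actual processing. */
   data work.result;
      set work.input;
      y=x*x;
   run;

   /* DATA= names the server table; OUT= names the client copy. */
   proc download data=work.result out=work.result;
   run;
endrsubmit;

signoff mySess1;
```

Each session has its own WORK library, which is why the table has to be moved explicitly. A shared file system or RLS avoids the copy at the cost of concurrent access to the same storage.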
32
Additional Considerations
RSUBMIT and the SAS Macro Facility

RSUBMIT mySess1;
   %SYSRPUT localVar=&remVar;
ENDRSUBMIT;

needs to be quoted

RSUBMIT mySess1;
   %NRSTR(%%)SYSRPUT localVar=&remVar;
ENDRSUBMIT;

or wrapped in a macro

RSUBMIT mySess1;
   %MACRO updateVar;
      %SYSRPUT localVar=&remVar;
   %MEND;
   %updateVar;
ENDRSUBMIT;