1
What’s New in CMS Pipelines
Rob van der Heij
Velocity Software, Inc.
rvdheij @ velocity-software.com
Session V65
IBM System z Technical Conference, München 2007
2
Agenda
• Short History of CMS Pipelines
• CMS Pipelines Runtime Library
• New stages and enhancements in RTL
• And some neat tricks I found
• There’s much more than shown here
3
Short History of CMS Pipelines
• Started by John Hartmann in the early 80’s
  • Internally in IBM on the VMTOOLS disk
  • Program Offering 5785-RAC in Europe
  • PRPQ P81059, 5799-DKF in US
• Integrated into VM/ESA 1.1
  • Additional new function with VM/ESA 2.3
  • After that functionally frozen, only bug fixes
• Continued development of CMS/TSO Pipelines
  • Available for VM customers as the CMS Pipelines Runtime Library
5
CMS Pipelines Runtime Library
• Available from Pipelines Home Page
  • http://vm.marist.edu/~pipeline
  • Initially hosted by Princeton University
Jan 2007
• New PIPELINE MODULE (110B0013)
• Revised Author’s Guide
• Updated HELPLIB
6
New stages and enhancements
• Input ranges – substring
• Special cases of spec
• lookup – floor and ceiling options
• track* – reading and writing tracks
• digest – checksum computation
• cipher – encryption with block ciphers
• IUCV stages
• Enhancements to spec
• Timestamp conversion
• Internet-related stages
Note: This is just my selection of the enhancements. Check the revision codes in The Book for the full list.
7
Input Ranges
Many stages take an “input range” as an argument
• Often based on columns
  • Starting position and width: 72.8
  • Starting and end position: 72-79
  • Negative numbers for offset to end of record
  • Use semicolon to separate start and end: -8;-1
• Based on words
  • Selecting 3rd and 4th word: word 3.2 or word 3-4
  • Shortcut notation (w3.2)
  • Also negative numbers for words from the right-hand side
  • You can set the separator with wordseparator (ws)
• Based on fields
  • Similar to selecting words: field 3.2 or field 3-4
  • Default field separator (fs) is the field mark character
8
Input Ranges
Selecting fields is similar to selecting words
• Uses different separators
• Can be combined in a single spec stage
• Consecutive separators treated differently
  • Consecutive field separators cause empty fields
  • Consecutive word separators (blanks) ignored
  • Consecutive blanks are retained in a word range

Input record                                  Range        Result
One two three four five                       word 2       two
One two three four five                       word 4.2     four five
http://www.vm.ibm.com/siteinfo/change.html    fs / f1.3    http://www.vm.ibm.com
http://www.vm.ibm.com/siteinfo/change.html    ws / w3      siteinfo
http://www.vm.ibm.com/siteinfo/change.html    fs / f3      www.vm.ibm.com
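
The difference between word and field ranges can be modelled outside CMS. The Python below is only a rough sketch of the rules on this slide, not how CMS Pipelines implements them; the function names `column_range`, `word_range` and `field_range` are made up for illustration, and only positive start.count ranges are handled.

```python
import re

def column_range(record, start, width):
    """Columns start.width (1-based), like the range 72.8."""
    return record[start - 1:start - 1 + width]

def word_range(record, first, count=1, sep=" "):
    """Words first.count; a run of separators counts as one separator."""
    words = list(re.finditer("[^" + re.escape(sep) + "]+", record))
    chosen = words[first - 1:first - 1 + count]
    if not chosen:
        return ""
    # Slice the original record so separators inside the range are retained.
    return record[chosen[0].start():chosen[-1].end()]

def field_range(record, first, count, sep):
    """Fields first.count; every separator ends a field, so empty fields count."""
    fields = record.split(sep)
    return sep.join(fields[first - 1:first - 1 + count])

url = "http://www.vm.ibm.com/siteinfo/change.html"
print(word_range("One two three four five", 4, 2))  # word 4.2
print(field_range(url, 1, 3, "/"))                  # fs / f1.3
print(word_range(url, 3, 1, "/"))                   # ws / w3
print(field_range(url, 3, 1, "/"))                  # fs / f3
```

With the slide's two input records this prints the same results as the table: `four five`, `http://www.vm.ibm.com`, `siteinfo` and `www.vm.ibm.com`. The empty field between the two slashes in `http://` is what makes `f3` and `w3` differ.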
9
Input Ranges
• Selection of a substring
  • Similar to REXX substr( ) function
  • Can be repeated when necessary
  • Can be combined with the other selections
  • Avoids a lot of record manipulation
• Input ranges are not limited to the spec stage
  • Can be used in almost all stages
  • For selection stages zone can also be used

Input record: http://www.vm.ibm.com/siteinfo/change.html

Range                                            Result
substr 1.2 of substr fs . field -2 of fs / f3    ib
substr fs . f-2 of fs / f3                       ibm
substr fs . f3 of fs / f3                        ibm
substr 5.2 of ws / w2                            vm
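
A nested "substr ... of ..." range reads from right to left: the innermost range is taken first, and each "substr" then selects within that result. As a hedged illustration (plain Python, with the hypothetical helpers `field`, `word` and `columns`; this is not the stage's implementation), the slide's examples can be retraced like this:

```python
def field(record, n, sep):
    """Field n of record; a negative n counts from the end (-1 is the last field)."""
    fields = record.split(sep)
    return fields[n - 1] if n > 0 else fields[n]

def word(record, n, sep):
    """Word n of record; consecutive separators are ignored."""
    words = [w for w in record.split(sep) if w]
    return words[n - 1]

def columns(record, start, width):
    """Columns start.width, 1-based."""
    return record[start - 1:start - 1 + width]

url = "http://www.vm.ibm.com/siteinfo/change.html"
host = field(url, 3, "/")                       # fs / f3
print(field(host, -2, "."))                     # substr fs . f-2 of fs / f3
print(columns(field(host, -2, "."), 1, 2))      # substr 1.2 of substr fs . field -2 of fs / f3
print(columns(word(url, 2, "/"), 5, 2))         # substr 5.2 of ws / w2
```

The three lines printed are `ibm`, `ib` and `vm`, matching the results on the slide.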
10
Special cases of spec
• For some things spec is overkill
  • insert – Insert string in records
    • Can insert before or after the specified range
  • addrdw – Prefix Record Descriptor Word
    • Alternative for spec v2c conversion
  • substr – Write substring of record
• Provide same function at lower cost
  • Simpler to write
  • Fewer resources used during execution

substr form                spec form                  Other
substr 73.8                spec 73.8                  not chop 72
substr fs / field 2.3      spec fs / field 2.3 1
substr 1.72                spec 1.72 1
11
Measuring your Pipelines
• Measure your pipelines with Rita
  • Shipped on the MAINT 193 disk
  • Use RITA instead of PIPE
  • Produces detailed CPU and memory usage per stage
  • Works best with “named” pipelines
  • See Author’s Edition for restrictions
rita (name substr) < dmsgpi maclib | substr 72.8 | count bytes | cons
5663.807 ( 5663.807) ms total in "substr" (1 invocation)
14117.884 ms total.
rita (name spec ) < dmsgpi maclib | spec 72.8 | count bytes | cons
12712.117 ( 12712.117) ms total in "spec" (1 invocation)
21367.047 ms total.
rita (name notchop) < dmsgpi maclib | not chop 72 | count bytes | cons
2947.352 ( 2947.352) ms total in "not" (1 invocation)
5547.178 ( 2599.826) ms total in "notchop" (1 invocation)
13705.775 ms total.
12
lookup
• Reference Table Lookup
  • First reads all master records to build a table
  • Tries the detail records against the table one by one
  • Outputs either a match or a missing match
  • Finally outputs the unused master records
• Conceptually pretty easy
  • Requires multi-stream pipelines (in many cases)
  • lookup alone makes it worth learning multi-stream pipelines
[Figure: the lookup stage with inputs "detail" and "master" and outputs "matched", "unmatched detail" and "unused masters"]
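
The data flow of a plain lookup can be summarised in a few lines. This is a conceptual Python sketch rather than the stage itself: the function `lookup` and its `key` argument are hypothetical, it assumes the first master wins when keys repeat, and it writes only the detail record for a match (the real stage has options for all of these).

```python
def lookup(masters, details, key):
    """Return (matched, unmatched_details, unused_masters)."""
    table = {}
    for master in masters:                 # 1. read all masters and build the table
        table.setdefault(key(master), master)
    referenced = set()
    matched, unmatched = [], []
    for detail in details:                 # 2. try each detail against the table
        k = key(detail)
        if k in table:
            matched.append(detail)
            referenced.add(k)
        else:
            unmatched.append(detail)
    # 3. at end of input, report the masters that were never referenced
    unused = [m for k, m in table.items() if k not in referenced]
    return matched, unmatched, unused

first_word = lambda record: record.split()[0]
print(lookup(["a 1", "b 2", "c 3"], ["b x", "d y", "b z"], first_word))
```

The three lists returned correspond to the three output streams in the picture, which is why the stage is normally used in a multi-stream pipeline.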
13
lookup
• Advanced features
  • Counting the number of references
    • Set the initial counter value
    • Take increment from the detail record
  • Dynamic changes to the reference table
    • Adding masters
    • Removing masters
    • Automatic insertion of masters
  • Multi-stream plumbing gone berserk
    • 4 input streams
    • 7 output streams
See http://vm.marist.edu/~pipeline/rmhlup.pdf
14
lookup
• Options floor and ceiling
  • Match against nearest master
  • Handy to search in a load map
  • Remember to convert addresses for ordering
/* LOOKMAP EXEC Find entry point and offset in load map */
/* Author: Rob van der Heij, 27 Feb 2007 */
arg addr fn rest
if fn = '' then exit 24
fspec = fn rest subword('MAP *', words(rest)+1)
'PIPE (end \ name LOOKMAP.EXEC:5)',
'\ var addr ',
'| spec pad 0 w1 1.8 r',
'| spec w1 x2c 1.4',
'| l: lookup floor 1.4 master detail',
'| spec 5.8 1 1.4 c2d nw read 1.4 c2d nw ',
'| spec 1.8 1 a: w2 - b: w3 -',
'print (b-a) d2x nw w3 d2x nw',
'| cons',
'\ <' fspec '| locate 10.4 , SD , | spec w3 x2c 1.4 r w1 n | l:'
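
What "floor" means for a load map can be shown with a sorted list and a binary search. The sketch below is an illustration in Python, not a description of how lookup is implemented; `lookup_floor` and the sample map entries are invented. It also shows why the addresses are converted first: they have to be compared as numbers (in EBCDIC the letters A-F sort before the digits, so printable hex does not sort numerically).

```python
import bisect

def lookup_floor(load_map, address):
    """Return (name, offset) for the nearest entry at or below address, else None."""
    entries = sorted(load_map)                    # order by numeric start address
    starts = [start for start, _name in entries]
    i = bisect.bisect_right(starts, address) - 1  # last entry with start <= address
    if i < 0:
        return None                               # address lies below the first entry
    start, name = entries[i]
    return name, address - start

# Made-up load map entries: (start address, entry point name)
load_map = [(0x20000, "MODA"), (0x20400, "MODB"), (0x21000, "MODC")]
name, offset = lookup_floor(load_map, 0x20410)
print(name, format(offset, "X"))
```

For the made-up map this prints `MODB 10`: the address falls 0x10 bytes past the start of MODB. The ceiling option would be the mirror image, taking the nearest entry at or above the address.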
15
track* - reading and writing tracks
• Possible applications
  • Copying disks (including non-CMS)
      \ trackread 200 0 0 *
      | trackwrite 400 RACFDB 0 24
    • Like DDR but more flexible and easy to automate
    • Even works over a network
      • Bruce Hayden’s PIPEDDR package
  • Backup of non-CMS data
    • Options for incremental (track wise) backup
    • Could be useful for Linux on z/VM cloning and backup
  • Processing of non-CMS data
    • Reading spool data straight from disk
16
track* - reading and writing tracks
• Core stages
  Process one disk track per pipeline record
  • trackread – Read full tracks from ECKD disks
  • trackwrite – Write full tracks to ECKD disks
• Helper stages
  • trackblock – Build track record
  • trackdeblock – Deblock track
  • ckddeblock – Deblock track data record
  • tracksquish – Squish (compress) track
  • trackexpand – Expand squished track
  • trackverify – Verify track format
17
ECKD Topology
[Figure: ECKD disk layout. A 3390 cylinder has 15 tracks; the track shown holds 12 records; each record has Count, Key and Data areas.]
Number of records per track depends on block size. A track holds 12 records of 4K (common CMS and Linux format).
[Figure labels: trackread, trackwrite, trackblock, trackdeblock, ckddeblock; 1 disk record per pipeline record]
18
digest – Compute a Message Digest
• Computes “digest” or “hash” over pipeline records
  • Verifies that data has not been modified
  • Similar to existing crc stage (16 bit checksum)
• New digest types create longer checksum
  • Supports popular cryptographic hash standards
    • MD5 (128 bit, RFC 1321)
    • SHA1 (160 bit, RFC 3174)
    • SHA256, SHA384 and SHA512 (FIPS 180)
  • Some use hardware support (if available)
• Useful to interface with Internet applications
• Long checksum attractive for use in CMS as well
pipe < pipeline news | digest md5 | spec 1-* c2x 1 | cons
661913BF6328DD9A5B29C3A93CA60B70
pipe < pipeline news | digest sha512 | spec 1-* c2x 1 | cons
42FEF021EDB48AEBD1DB42071198E8241224A9F1E23DC15AC4958C837AF8FC62...
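
For comparison with other platforms, the same digests can be computed with Python's standard `hashlib`. This is a sketch only: `digest_hex` is a hypothetical helper, it hashes one byte string, and it does not model how the digest stage frames pipeline records. A file hashed on CMS contains EBCDIC bytes, so it only matches a digest computed elsewhere if the other side hashes the same bytes.

```python
import hashlib

def digest_hex(algorithm, data):
    """Hex digest of data (bytes), upper case like the c2x output on the slide."""
    return hashlib.new(algorithm, data).hexdigest().upper()

for algorithm in ("md5", "sha1", "sha256", "sha384", "sha512"):
    value = digest_hex(algorithm, b"abc")
    print(algorithm, len(value) * 4, "bits", value)
```

The lengths printed (128, 160, 256, 384 and 512 bits) correspond to the digest types listed on the slide.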
19
cipher – Encrypt and Decrypt
• Supports modern block ciphers
  • AES
  • Blowfish
  • DES
  • Triple-DES
• Use Hardware support when available
  • Almost all ciphers provide software fall-back
The cipher stage is not in the CMS Pipelines Runtime Library
20
IUCV Stages
• IUCV connection
  • iucvlisten – Wait for connection request
  • iucvdata – Transfer data on connection
  • iucvclient – Communicate with server
• Stages are not documented in The Book
  • Primarily connection between pipelines
  • Not a generic interface for IUCV
  • Conceptually very similar to TCP/IP support
  • Brief description in PIPELINE NEWS
21
IUCV Stages
• Sample IUCV Server
  • Listens for new requests
  • Subroutine spawns worker thread
• Real server should also
  • Deal with errors
  • Check for authorization
/* DEMOSRV EXEC Sample IUCV server */
'PIPE (end \)',
'\ iucvlisten pipsample',
'| demosrv',
'| cons'
return rc
/* DEMOSRV REXX Spawn IUCV worker */
signal on error; signal on novalue
do forever
'peekto req' /* Get conn req */
'addpipe (end \)',
'\*.output:',
'| i: fanin',
'| iucvdata',
'| cms', /* Action process */
'| elastic',
'| i:'
'callpipe (end \)',
'\ *: ',
'| take ', /* Pass conn req */
'| c: count lines',
'| *:',
'\ c: | var ok'
'sever output' /* Cut server loose */
if \ok then leave
end
error: return rc * ( rc <> 12 )
pipe literal l demo* * (alloc | iucvclient demosrv name pipsampl | cons
FILENAME FILETYPE FM FORMAT LRECL RECS BLOCKS
DEMOSRV REXX A1 V 47 22 1
DEMOSRV EXEC A1 V 44 8 1
Ready; T=0.01/0.03 01:27:13
22
Enhancements to spec
• Positioning of the output field
  • Position specified by expression
  • Expression in parentheses
  • Can be computed from data in the record
  • Avoids use of c2v conversion
  • Practical to arrange cells in sparse table
  • Can also be used to pad to defined length
pipe literal 1 a/2 b/3 c |split /
|spec a:w1 - w2 (a*2)
|cons
 a
   b
     c
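
In the example above the output column of w2 is computed as a*2, where a is the first word of the record, so the cells land in columns 2, 4 and 6. A small Python model of that placement (the helper `place` is hypothetical and only mimics the effect, not the spec stage):

```python
def place(text, column, pad=" "):
    """Return a record with text starting at the 1-based column, padded on the left."""
    return pad * (column - 1) + text

for record in "1 a/2 b/3 c".split("/"):    # same records as: literal 1 a/2 b/3 c | split /
    a, cell = record.split()
    print(place(cell, int(a) * 2))         # like: spec a:w1 - w2 (a*2)
```

Each record is indented two columns further than the previous one, which is the kind of sparse-table layout the slide refers to.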
23
Enhancements to spec
• Additional time stamp conversion routines
  • Similar to other REXX-like conversions
  • c2t – 8 byte TOD to human readable form
  • t2c – human readable form to TOD format
  • Watch out for the time zone
• Also practical with built-in TOD field
  • Like time stage but more flexible
pipe literal test | spec tod c2t 1 1-* nw | cons
2007-03-24 13:57:01.535484 test
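
The arithmetic behind such a conversion is simple: the TOD clock counts from 1900-01-01 and bit 51 ticks once per microsecond, so shifting the 64-bit value right by 12 bits gives microseconds since 1900. The Python below is a sketch of that arithmetic under stated assumptions (it ignores leap seconds and applies no time zone offset); `tod_to_timestamp` is a made-up name, not the c2t routine.

```python
from datetime import datetime, timedelta

TOD_EPOCH = datetime(1900, 1, 1)

def tod_to_timestamp(tod):
    """Format a 64-bit TOD clock value (as an int) in the style of the slide's output."""
    microseconds = tod >> 12               # drop the 12 bits below the microsecond bit
    moment = TOD_EPOCH + timedelta(microseconds=microseconds)
    return moment.strftime("%Y-%m-%d %H:%M:%S.%f")

print(tod_to_timestamp(0x7D91048BCA000000))
```

The value used here is the TOD clock at the start of 1970, so the sketch prints `1970-01-01 00:00:00.000000`. Going the other way (the job of t2c) is the same arithmetic in reverse: microseconds since 1900 shifted left by 12 bits.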
24
Enhancements to spec
• Additional functions
  • Numeric functions
    • exact( ) sqrt( ) min( ) max( ) abs( )
    • c2d( )
  • String functions
    • datatype( ) string( ) strip( )
  • Statistical functions
    • average( ) variance( ) stddev( )
25
Timestamp Conversion
greg2sec – Convert Timestamp to Seconds
sec2greg – Convert seconds to Timestamp
Timestamp conversion is also available in spec
(the value to the right of a stage is the record that stage produces)
pipe literal
| spec tod c2t 1               2007-03-24 16:29:17.028198
| dateconvert w1 iso rexxs     20070324 16:29:17.028198
| space 0 string /: /          20070324162917.028198
| chop 14                      20070324162917
| greg2sec
| cons
1174753757
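
The result above is consistent with counting seconds from 1970-01-01 00:00:00 without any time zone adjustment. As a cross-check, here is a Python sketch of the same conversion; the function names are borrowed from the stages purely for readability, and the 14-digit yyyymmddhhmmss input format is taken from the example.

```python
import calendar
import time

def greg2sec(stamp):
    """Seconds since 1970-01-01 00:00:00 for a yyyymmddhhmmss string."""
    return calendar.timegm(time.strptime(stamp, "%Y%m%d%H%M%S"))

def sec2greg(seconds):
    """Inverse conversion, back to a yyyymmddhhmmss string."""
    return time.strftime("%Y%m%d%H%M%S", time.gmtime(seconds))

print(greg2sec("20070324162917"))
print(sec2greg(1174753757))
```

This prints `1174753757` and `20070324162917`, the same pair of values as in the pipeline above.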
26
Internet related stages
• Interface with Internet applications
  Some of these are meant to operate on ASCII data
  • httpsplit – Split HTTP Datastream
  • qpdecode – Decode Quoted-printable
  • qpencode – Encode Quoted-printable
  • 64decode – Decode MIME Base-64 Format
  • 64encode – Encode MIME Base-64 Format
pipe literal Hello, World!| xlate e2a | qpencode | xlate a2e | cons
Hello, World=5D
pipe literal Hello, World!| 64encode | cons
yIWTk5ZrQOaWmZOEWg==
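
Note that the 64encode example encodes the record as it is, which on CMS means EBCDIC bytes. A quick way to see this off the mainframe is to encode the same text in an EBCDIC code page first. The Python sketch below uses code page 037 as a stand-in for the code page of the CMS session; `encode64_ebcdic` and `decode64_ebcdic` are hypothetical helper names.

```python
import base64

def encode64_ebcdic(text):
    """Base-64 of the EBCDIC (code page 037) bytes of text."""
    return base64.b64encode(text.encode("cp037")).decode("ascii")

def decode64_ebcdic(encoded):
    """Reverse of encode64_ebcdic."""
    return base64.b64decode(encoded).decode("cp037")

print(encode64_ebcdic("Hello, World!"))
print(decode64_ebcdic("yIWTk5ZrQOaWmZOEWg=="))
```

This reproduces `yIWTk5ZrQOaWmZOEWg==` from the slide. An Internet peer that expects ASCII text would need an xlate before the encode, as is done in the qpencode example.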
27
Internet related stages
• httpsplit – Split HTTP Datastream
  • Operates by default on ASCII data (optional EBCDIC)
  • Separates the header and body in HTTP connection
    • Header lines to primary output
    • Body to secondary output
  • Supports Content-Length header
  • Pretty hard to do with pure pipeline stages
'PIPE (end \ name WGET.EXEC:20)',
'\ var url',
'| spec ,GET, 1 1-* nw ,HTTP/1.0, nw',
'| append literal Host:' url_host,
'| append literal Connection: close',
'| append strliteral //',
'| xlate from 1047 to 850', /* EBCDIC to ASCII */
'| insert x0d0a after', /* .. and CRLF */
'| d: fanin',
'| tcpclient' host 'linger 10',
'| h: httpsplit', /* Split header and body */
'| xlate from 850 to 1047', /* ASCII to EBCDIC */
'| w: wget' out '(' option, /* Read body to output file */
'| cons',
'| hole',
'| d:', /* Keep it alive */
'\ h: | elastic | w:' /* Process the body */
http://www.rvdheij.nl/download/wget.vma
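
What httpsplit has to do can be outlined as follows. This Python function is a simplified model working on a complete response held in memory, whereas the stage works on a stream of records of arbitrary size; `http_split` is an invented name and the handling shown is an assumption, not the stage's code.

```python
def http_split(stream):
    """Split raw HTTP response bytes (ASCII) into (header_lines, body)."""
    head, _sep, rest = stream.partition(b"\r\n\r\n")   # blank line ends the header
    header_lines = head.split(b"\r\n")
    length = None
    for line in header_lines[1:]:                      # skip the status line
        name, _colon, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip().decode("ascii"))
    body = rest if length is None else rest[:length]   # honour Content-Length if present
    return header_lines, body

response = b"HTTP/1.0 200 OK\r\nContent-Length: 5\r\n\r\nhello and more"
headers, body = http_split(response)
print(headers, body)
```

In the pipeline above the two results go to different streams: header lines continue on the primary output and the body is routed through the secondary output to the stage that writes the file.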
28
More New Stages
• space – Space words like REXX
• stsi – Store System Information
• threeway – Split record three ways
• trfread – Read a Trace File
• wildcard – Select Record Matching a Pattern
29
space
• Space words like REXX function space( )
  • Isolates words in the record
  • Specify word separator (default blank)
  • Writes record with new separator between words
  • Separator string and number of copies can be specified
  • Also handy to parse on special characters
PIPE literal one two three four | space /*/ | cons
one*two*three*four
pipe literal one two three four | space 2 /(*)/ | cons
one(*)(*)two(*)(*)three(*)(*)four
pipe literal 12:34:56.123456 | space 1 anyof /:./ | cons
12 34 56 123456
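
The three examples follow one rule: split the record into words, then join the words with a number of copies of the new separator. A Python model of that rule follows; the function `space` and its parameter names are illustrative only and do not reflect the stage's operand syntax.

```python
import re

def space(record, count=1, pad=" ", separators=" "):
    """Join the words of record with count copies of pad between them."""
    # Any run of separator characters ends a word; empty words are dropped.
    words = [w for w in re.split("[" + re.escape(separators) + "]+", record) if w]
    return (pad * count).join(words)

print(space("one two three four", pad="*"))
print(space("one two three four", count=2, pad="(*)"))
print(space("12:34:56.123456", separators=":."))
```

The output matches the slide: `one*two*three*four`, `one(*)(*)two(*)(*)three(*)(*)four` and `12 34 56 123456`.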
30
threeway
• Convenient shortcut for “split cascade”
• Record to 3 outputs based on input range
  • 0: portion before the range
  • 1: the range itself
  • 2: portion after the range
/* 3WAY EXEC The 3way split */
'PIPE (end \)',
'\ literal one two three four',
'| x: 3way w2.2',
'| insert ,0:, | cons',
'\ x: | insert ,1:, | cons',
'\ x: | insert ,2:, | cons'
3way
0:one
1:two three
2: four
/* 3LIGHT REXX Brother of zone */
parse arg range stage
'addpipe (end \ name 3LIGHT.REXX:6)',
'\ *:',
'| w: 3way' range ,
'| copy',
'| s: spec 1-* 1 select 1 1-* n select 2 1-* n',
'| *:',
'\ w:',
'|' stage ,
'| copy',
'| s:',
'\ w: | s:'
pipe literal one two three four | 3light w2.2 reverse | cons
one eerht owt four
31
trfread
• Reads Trace Files (TRF)
  • Function similar to TRACERED
• With data in the pipeline you can do more things
  • Easier than trying to post-process the TRACERED output
  • Look at DTFBK for decoding the format
  • No need to manipulate temporary files
• Sample program TRF2TCPD
  • http://www.vm.ibm.com/download/packages/descript.cgi?TRF2TCPD
  • Reads VM TCP/IP Packet Trace to produce libpcap format
  • Same format as used by tcpdump and other Linux tools
  • Recent z/VM releases provide a sniffer interface for Linux
32
wildcard
• Performs pattern matching like LISTFILE
  • When dealing with file names in another way
  • Allow application to select resources via wildcards
PIPE listpds fplgpi maclib | chop 8| wildcard /PIPC*/ | member fplgpi maclib | …
/* WQN EXEC QUERY NAMES with wildcard */
arg wild . ; parse value wild '*' with wild .
'PIPE (end \ name WQN.EXEC:7)',
'\ cp query names',
'| split , ',
'| strip leading',
'| pad 15',
'| wildcard substr w1 of 1.8 /'wild'/',
'| join 3 /, /',
'| cons'
wqn ESB*
ESBMON - DSC , ESBMAP - DSC , ESBTCP - DSC , ESBWRITE - DSC
ESBSERVE - DSC
Ready; T=0.02/0.04 01:11:48
wqn LNX*
LNX00C04 - DSC , LNX00C03 - DSC , LNX00C02 - DSC , LNX00C01 - DSC
LNX00C00 - DSC
Ready; T=0.02/0.03 01:11:58
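
The matching rule used in these examples, where "*" stands for any run of characters, can be modelled with a regular expression. The sketch below covers only the "*" shown on the slides; any other pattern characters the stage may support are not modelled, and `wildcard_match` is a hypothetical name.

```python
import re

def wildcard_match(name, pattern):
    """True if name matches pattern, where '*' matches any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, name) is not None

users = ["ESBMON", "LNX00C04", "ESBTCP", "OPERATOR"]   # made-up sample names
print([u for u in users if wildcard_match(u, "ESB*")])
```

For the sample list this selects `ESBMON` and `ESBTCP`, in the same way that `wqn ESB*` above lists only the ESB users.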
33
Many other enhancements
• pad – Expand Short Records
  • Provides modulo and offset
• pick – Select Lines that Satisfy a Relation
  • Supports numeric comparison as well
• collate – Collate Streams
  • Supports “stop anyeof”
• fillup – Pass Records to Output Streams
  • Variation on fanout and deal
• strliteral – Write an argument string
  • Additional options for conditional or default
• locate – Select Lines that Contain a String
  • Enhancements to test individual bits through mask
• reader – Read from Virtual Card Reader
  • Additional KEEP and PURGE options
• rexxvars – Retrieve Variables from REXX variable pool
  • Suppress error message when environment does not exist
• tolabel – Select records up to Label
  • Option to also select matching record
34
CMS Pipelines Home Page
http://vm.marist.edu/~pipeline
• Latest Runtime Library
• Author’s Reference
• PIPELINE HELPLIB
• Papers about CMS Pipelines
• Tools and pipeline stages
CMS Pipelines mailing list
listserv @ vm.marist.edu
http://www.rvdheij.nl/Presentations/