![Page 1: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/1.jpg)
Automating a Vendor File Load Process with Perl and Shell Scripting
Roy Zimmer
Western Michigan University
![Page 5: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/5.jpg)
We needed to get Promptcat approval files from OCLC’s ftp site.
Historically, we’ve done file retrieval and processing via shell scripting, with some supporting Perl software.
In this case, we started out with a mostly manual process and some programmatic support.
We kind of snuck up on the final method of retrieval and processing.
![Page 6: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/6.jpg)
The ftp site has quite a few files, including a number of different types: LBL, RPT, APPR, FIRM…
![Page 7: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/7.jpg)
Let’s use a representative sample for this presentation…
![Page 9: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/9.jpg)
Out of this large number of files, only one or a few will be of interest: in this case, the files for May 7.
How do we pick them out?
![Page 10: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/10.jpg)
This is where Perl comes to the rescue.
With Perl, you can do many things.
![Page 14: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/14.jpg)
Code details, main program, ftp stuff
ftppcatappr.pl
Self-explanatory
Required when using ftp within Perl
Site password is stored here.
- Site URL
- Username
- Directory where the files are
- Transfer mode
![Page 16: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/16.jpg)
Code details, main program, ftp stuff
ftppcatappr.pl
Site password is stored in a file.
Setting up for FTP
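For readers who think in shell rather than Perl, the same setup can be sketched as a command script for the classic `ftp` client. This is not the code in ftppcatappr.pl. The function name `ftp_commands` and the host, username, and directory in the usage note are hypothetical, and binary transfer mode is an assumption (the slide only says "transfer mode").

```shell
#!/usr/bin/env bash
# Sketch only: build the command script for a classic ftp client session.
# Hypothetical helper, not the author's code; all names are illustrative.
ftp_commands() {
    local host=$1 user=$2 pwfile=$3 dir=$4 password
    # The site password lives in a file; read its first line.
    IFS= read -r password < "$pwfile" || [ -n "$password" ] || return 1
    # Site URL, username/password, directory, transfer mode, then list files.
    # "binary" is an assumed transfer mode for MARC files.
    printf 'open %s\nuser %s %s\ncd %s\nbinary\nls\nbye\n' \
        "$host" "$user" "$password" "$dir"
}
```

One way to use it would be `ftp_commands ftp.example.org someuser ./pwfile /some/dir | ftp -n > listing.txt`, where `-n` turns off auto-login so the `user` line supplies the credentials.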
![Page 20: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/20.jpg)
Code details, main program, ftp stuff
ftppcatappr.pl
Retrieve ftp site file listing into a variable as an array of directory entries.
Prepare each line to be split on the space character, then split it.
The last piece in each line will be the filename. Split this into pieces based on the period.
Look for the one(s) that correspond(s) with yesterday’s date and keep those.
![Page 21: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/21.jpg)
Code details, main program, ftp stuff
ftppcatappr.pl
We want the files to be processed in order.
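The listing steps (split each line on spaces, split the filename on periods, keep the matches, sort them) can be sketched in shell as well. This is an illustration, not the Perl in ftppcatappr.pl. The function name `pick_files` and the filename layout used in the usage note (a date piece such as `D080507`) are assumptions; the real file naming is not shown in these slides.

```shell
#!/usr/bin/env bash
# Sketch only: read an "ls -l"-style listing on stdin and print the names
# of files that have the given date stamp as one period-separated piece.
# Hypothetical helper; the date-stamp layout is an assumption.
pick_files() {
    local stamp=$1 line name piece
    local -a pieces
    while IFS= read -r line; do
        line=${line%$'\r'}                    # FTP listings may end lines with CR
        name=${line##* }                      # last space-separated piece is the filename
        IFS=. read -r -a pieces <<< "$name"   # split the filename on the period
        for piece in "${pieces[@]}"; do
            if [[ $piece == "$stamp" ]]; then
                printf '%s\n' "$name"         # keep this one
                break
            fi
        done
    done | sort                               # so the files get processed in order
}
```

Something like `pick_files D080507 < listing.txt` would then print the matching names one per line. With GNU `date` (not every Unix has it), yesterday's stamp in this assumed layout could be produced by `date -d yesterday +D%y%m%d`.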
![Page 22: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/22.jpg)
Code details, main program, processing each file
ftppcatappr.pl
Get the files
![Page 23: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/23.jpg)
Code details, main program, processing each file
Records will need some editing…
(Thanks to Birong Ho, our systems librarian, for originally supplying this editing code.)
Get the records
ftppcatappr.pl
![Page 24: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/24.jpg)
Code details, main program, processing each file
ftppcatappr.pl
Records will need some editing…
Get the records
“grab” the fields of interest
![Page 25: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/25.jpg)
Code details, main program, processing each file
ftppcatappr.pl
Records will need some editing…
Some fields are deleted…
![Page 26: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/26.jpg)
Code details, main program, processing each file
ftppcatappr.pl
Records will need some editing…
…and others are edited.
More edits than this are performed; the basic syntax is the same for each of them.
![Page 28: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/28.jpg)
Code details, main program, processing each file
File will need some splitting…
Split each file up based on the invoice number found in field 980 |f
The next program takes care of this…
I did say we snuck up on this, didn’t I?
![Page 29: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/29.jpg)
Code details, helper program, processing each file
oclc980.pl
File will need some splitting…
Split each file up based on the invoice number found in field 980 |f
Rather than using the familiar LF, the MARC format ends each record with a different character: the record terminator, hex 1D.
![Page 30: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/30.jpg)
Code details, helper program, processing each file
oclc980.pl
File will need some splitting…
Split each file up based on the invoice number found in field 980 |f
This section reads each MARC record, looking for the 980 field.
![Page 34: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/34.jpg)
Code details, helper program, processing each file
oclc980.pl
File will need some splitting…
Split each file up based on the invoice number found in field 980 |f
Get the subfields into an array.
Look for subfield f and read it to get the invoice number.
Determine if it’s a new or “existing” invoice number. This also lets us count records for each invoice.
Use append mode to open, write a record, and close the file for each invoice number.
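As a rough illustration of the splitting logic, here is a shell sketch. It is not the Perl in oclc980.pl. It assumes standard MARC structure (24-byte leader, 12-byte directory entries, hex 1E field terminator, hex 1F subfield delimiter) and the NNN.marc output naming from the review slide. The function name `split_by_invoice` is hypothetical, and record counting is left out.

```shell
#!/usr/bin/env bash
# Sketch only: split a MARC file into one file per invoice number (980 |f).
# Hypothetical helper, not the author's code.
split_by_invoice() {
    local infile=$1 outdir=$2
    local LC_ALL=C                       # index the record by bytes, not characters
    local rec base dir i len start field sub invoice
    local -a subs
    # Each MARC record ends with the record terminator (hex 1D), not LF.
    while IFS= read -r -d $'\x1d' rec; do
        base=$((10#${rec:12:5}))         # leader bytes 12-16: offset of the field data
        dir=${rec:24:base-25}            # directory entries follow the 24-byte leader
        invoice=
        for ((i = 0; i + 12 <= ${#dir}; i += 12)); do
            [[ ${dir:i:3} == 980 ]] || continue
            len=$((10#${dir:i+3:4}))     # 10# keeps zero-padded numbers decimal
            start=$((10#${dir:i+7:5}))
            field=${rec:base+start:len}
            field=${field%$'\x1e'}       # drop the field terminator
            # Get the subfields into an array (element 0 holds the indicators).
            IFS=$'\x1f' read -r -a subs <<< "$field"
            for sub in "${subs[@]:1}"; do
                [[ ${sub:0:1} == f ]] && invoice=${sub:1}
            done
        done
        [[ -n $invoice ]] || continue    # skip records without a 980 |f
        # Append mode: the first record creates the file, later ones add to it.
        printf '%s\x1d' "$rec" >> "$outdir/$invoice.marc"
    done < "$infile"
}
```

Because every write appends, the function never needs to know in advance how many invoices a file contains; a new invoice number simply produces a new file.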
![Page 36: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/36.jpg)
Code details, helper program, processing each invoice file
importall.sh
There are usually several files after splitting the file being processed. Each one must be further processed and loaded into Voyager.
This is controlled via a small shell script.
It calls another shell script for preprocessing and bulk loading of each of the invoice files.
(Thanks to Keith Kelley, director of systems, for creating this script.)
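A controlling loop of this kind might look like the sketch below. It is not the actual importall.sh. The function name `import_all` and its parameters are hypothetical, and placing the 1.5-minute wait in this loop (rather than inside the per-file script) is an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: run an import command on every invoice file in a directory.
# Hypothetical helper, not the author's script.
import_all() {
    local dir=$1 importer=$2 pause=${3:-90} f   # 90 seconds = 1.5 minutes
    for f in "$dir"/*.marc; do
        [ -e "$f" ] || continue                 # no matches: the glob stays literal
        "$importer" "$f" || echo "import failed for $f" >&2
        sleep "$pause"                          # give each load time to finish
    done
}
```

A call such as `import_all . ./prodimport.script` would then hand each split file to the per-file script in turn.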
![Page 39: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/39.jpg)
Code details, helper program, importing each invoice file
$1 is the first positional parameter passed to the script. Let’s use a more descriptive variable.
Let’s also drop the filename extension, so that we can “reuse” the filename.
Get ready for the prebulk processing.
prodimport.script
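The two shell idioms mentioned here (copy $1 into a named variable, strip the extension) can be sketched as follows. The function name `derive_names` is hypothetical, and the mapping of the `.preimp` and `.imp` extensions to particular steps is an assumption; the slides only say that such interim files exist.

```shell
#!/usr/bin/env bash
# Sketch only: name the first parameter and reuse the filename stem.
# Hypothetical helper; the extensions' roles are assumed.
derive_names() {
    local marcfile=$1            # more descriptive than $1
    local stem=${marcfile%.*}    # drop the last ".extension"
    printf '%s\n' "$stem.preimp" "$stem.imp"
}
```

For example, `derive_names 12345.marc` prints `12345.preimp` and `12345.imp` on separate lines.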
![Page 40: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/40.jpg)
Code details, helper program, importing each invoice file
prodimport.script
This is the first use of the file referenced by $1, so using $1 directly is OK here.
![Page 42: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/42.jpg)
Code details, helper program, importing each invoice file
Start with some final edits…
We’ll use marcedit.pl to replace the contents of field 981 |a, as illustrated:
There are 59 such edits possible.
prodimport.script
![Page 44: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/44.jpg)
Code details, helper program, importing each invoice file
Start with some final edits…
Prep for bulkimport, too
Self-explanatory
prodimport.script
![Page 46: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/46.jpg)
Code details, helper program, importing each invoice file
Prebulk output is bulkimport input.
Perform the bulkimport
The final step for each file is to clean up and move the files to the loaded directory.
prodimport.script
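The cleanup step might be sketched like this. The function name `archive_interim` and its two directory parameters are hypothetical; the three extensions come from the review slide.

```shell
#!/usr/bin/env bash
# Sketch only: move interim files for a finished load into a "loaded" directory.
# Hypothetical helper, not the author's script.
archive_interim() {
    local workdir=$1 loaded=$2 f
    mkdir -p "$loaded"
    for f in "$workdir"/*.marc "$workdir"/*.preimp "$workdir"/*.imp; do
        if [ -e "$f" ]; then     # skip patterns that matched nothing
            mv -- "$f" "$loaded"/
        fi
    done
}
```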
![Page 47: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/47.jpg)
Password maintenance
The ftp site requires us to change our password every 90 days.
We wanted all of this to run hands-off, so that had to be automated as well.
The password gets changed every two months, which keeps us inside the 90-day limit.
![Page 48: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/48.jpg)
Password maintenance, getpromptcatpw.ksh
![Page 49: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/49.jpg)
Password maintenance, getpromptcatpw.ksh
![Page 54: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/54.jpg)
Password maintenance, pwgen.pl
Want an 8-character password
Password length defaults to 10
Password consists of these characters
Seed the random number generator
Generate the password
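The same steps (length with a default of 10, a fixed character set, seeding, generating) can be sketched in shell. This is not the code in pwgen.pl. The function name `gen_password` and the alphanumeric character set are assumptions, since the actual characters appear only on the slide image, and bash's `$RANDOM` is a simple generator that is not suitable for high-security secrets.

```shell
#!/usr/bin/env bash
# Sketch only: generate a random password from a fixed character set.
# Hypothetical helper, not the author's code; the character set is assumed.
gen_password() {
    local length=${1:-10}                # password length defaults to 10
    local chars='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'
    local pw='' i
    RANDOM=$(( $(date +%s) ^ $$ ))       # seed the random number generator
    for ((i = 0; i < length; i++)); do
        pw+=${chars:RANDOM % ${#chars}:1}   # pick one character at random
    done
    printf '%s\n' "$pw"
}
```

Calling `gen_password 8` gives the 8-character password wanted here; calling it with no argument falls back to 10 characters.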
![Page 55: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/55.jpg)
Review
Run ftppcatappr.pl
- Log in to OCLC ftp site for promptcat
- Find desired files and retrieve them
- Do process each file
  - remove unwanted 6xx, 938, 948 fields
  - edit some 856 fields
  - run oclc980.pl
    - do process each record in the current file
      - look at the 980 |f (contains the invoice number)
      - if it contains invoice NNN, (create and) put this record in file NNN.marc, etc.
    - end do
  - run importall.sh
    - do process each file created by oclc980.pl
      - run prodimport.script
        - use marcedit.pl to process 981 |a replacements (59 possible edits)
        - prebulk
        - bulk import
      - wait 1.5 minutes before continuing
    - end do
  - move all interim .marc, .preimp, and .imp files to /loaded
- End do
- Move all RCD* files to /loaded
![Page 57: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/57.jpg)
Resources
The files listed below are available at
http://homepages.wmich.edu/~zimmer/files/eugm2008
- fileload.ppt: this presentation
- ftppcatappr.pl: gets the files and controls the processing
- oclc980.pl: splits retrieved files based on invoice number
- pwgen.pl: generates a password
- importall.sh: ensures that each “split file” for a particular retrieved file is processed
- prodimport.ksh: does the actual processing of each file
- getpromptpw.ksh: handles all the details of a password change

except for
- marcedit.pl: enables batch editing of MARC files

which is at http://homepages.wmich.edu/~zimmer/marc_index.html
![Page 58: Automating a Vendor File Load Process with Perl and Shell Scripting](https://reader033.vdocuments.mx/reader033/viewer/2022061207/548505c9b4af9f730d8b4d30/html5/thumbnails/58.jpg)
Resources
CPAN: http://cpan.org
Net::FTP: http://search.cpan.org/~gbarr/libnet-1.22/Net/FTP.pm
I’m not sure whether the FTP module is supplied on Voyager boxes. If you don’t have it, go to the above URL, which also has good documentation on the module.