TRANSCRIPT
Art Tabachneck, President, myQNA, Inc.
September 28, 2012
Next: Copy and Paste Almost Anything
Art holds a PhD from Michigan State University, has been a SAS user for more than 38 years, is president of the Toronto Area SAS Society, was the 2009 SAS Customer Value Award winner, was inducted into the SAS-L Hall of Fame in 2011 and, in 2012, was made a member of the SAS Circle of Excellence. He is also one of the top contributors to the SAS Discussion Forums and sasCommunity.org.
Presenter: Arthur Tabachneck
Copy and Paste Almost Anything
Arthur Tabachneck
Richard DeVenezia
Nate Derby
John King
Ben Powell
Randy Herbison
The problem:
you find a table or form on a web page you want/need to have in a SAS dataset
or
you want to import an Excel workbook, but you don't license SAS/Access for PC file formats
or
you want to import an XLSX file, but are on an older version of SAS (or don't license SAS/Access for PC file formats)
or
when importing you need more control than is available with proc import
The problem: the data you want to import might be in the form of:
an HTML page
a PDF document
a workbook
a Word document
a page from a wiki
or any other form that you can copy to your system's clipboard
The problem (continued): things that can complicate copying and pasting:
some columns may not have variable names
Table A. Total tax revenue as percentage of GDP
1975 1985 1990 1995 2000 2004 2005 2006 2007
Provisional
Korea 15.1 16.4 18.9 19.4 23.6 24.6 25.5 26.8 28.7
New Zealand 28.5 31.1 37.4 36.6 33.6 35.3 37.5 36.7 36.0
Austria 36.7 40.9 39.6 41.2 42.6 42.8 42.1 41.7 41.9
Belgium 39.5 44.4 42.0 43.6 44.9 44.8 44.8 44.5 n.a.
Czech Republic 37.5 35.3 37.8 37.5 36.9 36.4
Denmark1 38.4 46.1 46.5 48.8 49.4 49.0 50.7 49.1 48.9
Source: http://www.oecd.org/dataoecd/48/27/41498733.pdf
variable names may take up more than one row
table may contain one or more blank rows
rows may have some missing values
data may contain subscript or superscript values (e.g., the footnote marker in Denmark1)
you may want to name or rename some variables (e.g., Country)
you may want to add a variable name prefix/suffix (e.g., Revenue in front of each year, 1975 through 2006)
you might want to add a variable label (e.g., Guesstimate for 2007)
you might want to specify missing values (e.g., n.a.)
you may want to multiply a variable by a constant (e.g., 38.4 becomes .384)
you may want to specify formats or informats (e.g., 37.5%)
you may want to specify which row(s) should be used to guess formats and informats
you may want some data converted to upper case
you may not want all of the data
you may want to drop one or more columns
you might want merged cells to apply to more than one variable (before and after figures not reproduced)
the table might need to be transposed (Have and Need figures not reproduced)
the table might copy in the following form. Have:
ProjNum 1234 ClaimNum 419129 PreAdv 11 (continued)
(Need figure not reproduced)
the data might not even be in tabular form (Have and Need figures not reproduced)
Note: Ensure that use of these methods does not violate a site's terms of use
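To make a few of the complications listed above concrete, here is a minimal sketch of handling them by hand in a DATA step. It is only an illustration and not the code presented in this paper: the two rows are taken from Table A (2006 and 2007 columns only), the | delimiter is a stand-in for the tabs a real paste would contain, and the names rev2006, rev2007 and c2007 are made up for the example.

```sas
data revenue;
  infile cards dlm='|' dsd truncover;
  length country $20 c2007 $8;
  input country rev2006 c2007;
  /* remove a trailing footnote digit, such as the 1 in Denmark1 */
  country=prxchange('s/\d+\s*$//', 1, country);
  /* leave rev2007 missing when the cell holds n.a.;              */
  /* otherwise convert it and rescale the percentage (38.4 -> .384 style) */
  if c2007 ne 'n.a.' then rev2007=input(c2007, 8.)*.01;
  rev2006=rev2006*.01;
  /* display the rescaled values as percentages */
  format rev2006 rev2007 percent8.1;
  drop c2007;
  cards;
Belgium|44.5|n.a.
Denmark1|49.1|48.9
;
run;
```

Even for two columns this takes a line of code per complication, which is the motivation for driving the same choices from a handful of macro variables.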
Would you like to know how to get such data into SAS correctly, quickly and painlessly?
"Excuse me. Is this the Society for Asking Stupid Questions?"
© goddard cartoons
Solution: you can highlight and copy the table to your system's clipboard
and, if only a simple extract is needed, and SAS/Access to PC-Files is licensed, you might be able to paste it into an Excel file and then use proc import
but what if you don't have SAS/Access to PC-Files?
or if the task requires some features that aren't currently offered with proc import?
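For reference, the Excel route described above might look like the following sketch. The file path, sheet name and output dataset name are hypothetical, and it assumes a SAS release that supports DBMS=XLSX with SAS/Access to PC-Files licensed.

```sas
/* Hypothetical path and sheet name; adjust to wherever the table was pasted. */
proc import datafile="C:\temp\revenue.xlsx"
            out=work.revenue
            dbms=xlsx
            replace;
  sheet="Sheet1";
  getnames=yes;
run;
```

This covers only the simple-extract case; none of the renaming, relabeling or header handling discussed in this presentation is available through these options.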
proc import currently doesn't provide a way to:
access the system clipboard
assign or rename variable names
parse a structured document
assign missing values
indicate which data rows to select
specify formats and/or informats
import multiple row variable names
transpose data
account for merged cells in variable names
add a prefix or suffix to variable names
specify variable labels
change a variable's unit of measurement
upcase any variable's values
specify the rows to use to guess (in)formats
drop one or more columns
A solution
but, for tables that you can paste, the code presented in this paper includes all of those options and capabilities
our Truth in Advertising commitment
WARNING: The code/method presented in this paper:
may not work on all systems
is NOT production quality
is NOT a substitute for proc import
should NOT be used if such use violates any copyright or terms of agreement
IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT. The authors shall not be liable whatsoever for any damages arising out of the use of this documentation or code, including any direct, indirect, or consequential damages. In addition, the authors will provide no support for the materials contained herein.
things that can complicate copying and pasting:
what you actually copy is extremely software and system dependent
actual data from a recent SAS Forums post
copied and pasted from IE to Windows Notepad:
every variable name and data point on a separate row
blank line between every variable name and data point
two blank lines between every record
copied and pasted from Chrome to Notepad
our proposed solution
first, declare needed macro variables
%let spaces=" ";
%let hrows=2;
%let var_share=3~2 5~4;
%let var_renames=;
%let var_prefix=;
%let var_suffix=;
%let var_labels=;
%let first_data_row=3;
%let var_missing=10~n.a.;
%let var_formats=2-5~best12.;
%let var_informats=2-5~best12.;
%let var_units=;
%let var_drop=;
%let var_upcase=;
%let guessingrows=3-3;
%let outfile=revenue;
spaces: the clipbrd method lets you "paste" data from your system's clipboard, but translates tabs into spaces, and different systems translate tabs into different numbers of spaces
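As a rough sketch of what the spaces setting is for, the step below reads whatever is on the clipboard, turns each run of spaces back into a tab, and reports how many fields each line holds. The fileref name clippy and the tranwrd/countc calls mirror the code shown later in this presentation; the three-space value is purely an assumption about one particular system, and the step presumes the clipbrd access method is available in your session.

```sas
/* Assumption: the CLIPBRD access method is available in this SAS session. */
filename clippy clipbrd;

/* Assumption: on this system each pasted tab arrives as three spaces. */
%let spaces="   ";

data _null_;
  infile clippy;
  input;
  /* turn each run of spaces back into a tab character */
  _infile_=tranwrd(_infile_, &spaces., '09'x);
  /* fields on this line = number of tabs + 1 */
  var_count=countc(_infile_, , 'h')+1;
  put _n_= var_count=;
run;
```

If the field counts written to the log differ from the number of columns you copied, the spaces value probably does not match what your system substitutes for a tab.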
hrows: account for multiple row variable names
var_share: account for merged cells in variable names
var_renames: add and rename variable names
var_prefix, var_suffix: add a prefix or suffix to any variable name
var_labels: specify desired variable labels
first_data_row: account for rows between data and variable names
var_missing: specify missing values for any variables
var_formats, var_informats: specify formats and informats
var_units: if needed, indicate amount data should be multiplied by
var_drop: drop any columns that you don't want
var_upcase: upcase any variables
guessingrows: specify which row(s) should be considered in guessing formats and informats
outfile: specify desired output filename
entries can specify one variable (e.g., 10~n.a.), a range of variables (e.g., 2-5~best12.), or any combination you might need, e.g.,
%let var_informats= 1~$20. 2-5~best12. 6~comma8. 7~percent8. 8~anydtdte21. 9~trailsgn8. ;
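To show how a specification of this shape can be read, here is a hypothetical sketch that unpacks each entry into one row per variable number. It is not the %expandr macro used in the code (which is not shown in these slides); the dataset and variable names (informat_list, infmt, lo, hi, varnum) are invented for the illustration, and the specification is shortened to three entries.

```sas
/* Hypothetical illustration only; this is not the expandr macro. */
%let var_informats=1~$20. 2-5~best12. 6~comma8.;

data informat_list;
  length entry range infmt $40;
  do k=1 to countw("&var_informats.", ' ');
    entry=scan("&var_informats.", k, ' ');   /* e.g. 2-5~best12. */
    range=scan(entry, 1, '~');               /* e.g. 2-5         */
    infmt=scan(entry, 2, '~');               /* e.g. best12.     */
    lo=input(scan(range, 1, '-'), 8.);       /* first variable number */
    hi=input(scan(range, -1, '-'), 8.);      /* last variable number  */
    do varnum=lo to hi;                      /* one row per variable  */
      output;
    end;
  end;
  keep varnum infmt;
run;

proc print data=informat_list noobs;
run;
```

With the three entries above, the listing has six rows: variable 1 with $20., variables 2 through 5 with best12., and variable 6 with comma8.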
Note: there are additional macro variables to address structured layouts and files that need to be transposed
%let transpose=YES;
%let columns=5;
%let rows=80;
%let data_form=YES;
our proposed solution
in the code, instructions are shown as comments, e.g.
%let var_prefix=;
/* specify any string that you want added before any variable name.
   A ~ must be used to separate variable number(s) and variable prefixes,
   and either a space or separate line to represent additional entries.
   If you want the same prefix used for a range of variables, specify the
   range as #-#. E.g., if variables 2 thru 4 are named 1996, 1997 and 1998,
   and you want them to be named Revenue_1996, Revenue_1997 and Revenue_1998
   you would specify: %let var_prefix=2-4~Revenue_; */
the code
%flipfile
data _null_;
  length hold_rec $32767;
  infile clippy;
  input;
  _infile_=tranwrd(_infile_, &spaces., '09'x);
  var_count=countc(_infile_,,"H")+1;
  call symput('var_count',strip(put(var_count,8.)));
  %expandr("var_formats",&var_formats)
  %expandr("var_informats",&var_informats)
  %expandr("var_missing",&var_missing.)
  %expandr("var_units",&var_units)
  %expandr("var_prefix",&var_prefix)
  %expandr("var_suffix",&var_suffix)
  %expandr("var_upcase",&var_upcase.)
  %expandr("var_drop",&var_drop.)
  %expandr("var_labels",&var_labels.)
  %expandr("var_share",&var_share.)
  stop;
run;
macro flipfile preprocesses the data
%macro flipfile;
  %if &columns. gt 0 and &rows. gt 0 %then %do;
    %if &transpose. eq YES %then %do;
      data temp;
        infile clippy;
        length temp $32767;
        input;
        _infile_=tranwrd(_infile_, &spaces., '09'x);
        j=_n_;
        do i=1 to &rows.;
          temp=strip(scan(_infile_,i,,"HM"));
          output;
        end;
      run;
    %end;
    %else %do;
      data temp;
        infile clippy;
        length temp $32767;
        input;
        temp=_infile_;
        if _n_ eq 1 then do;
          i=0; j=1;
        end;
        i+1;
        output;
        if i eq &rows.+&hrows. then do;
          j+1; i=0;
        end;
      run;
    %end;
macro flipfile preprocesses the data (continued)
    proc sort data=temp; by i j; run;
    data _null_;
      length holdrec $32767;
      retain holdrec;
      file clippy;
      set temp;
      if mod(_n_,&columns.) eq 1 then holdrec=strip(temp);
      else holdrec=cat(strip(holdrec),"09"x,strip(temp));
      if mod(_n_,&columns.) eq 0 then put holdrec;
    run;
    proc delete data=work.temp; run;
  %end;
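The split, sort and rejoin logic of this branch can be tried without a clipboard. The sketch below is a scaled-down illustration under stated assumptions: it reads two made-up lines from cards instead of the clippy fileref, uses | as a stand-in for tabs (converted with translate), writes the result to the log rather than back to the clipboard, and sets rows and columns to hypothetical values that fit the sample.

```sas
%let rows=3;      /* hypothetical: cells on each pasted line   */
%let columns=2;   /* hypothetical: number of pasted lines      */

data temp;
  infile cards truncover;
  length line temp $200;
  input line $char200.;
  line=translate(line, '09'x, '|');      /* stand-in for real tabs */
  j=_n_;                                 /* original line number   */
  do i=1 to &rows.;                      /* original cell position */
    /* h: tab is the delimiter; m: keep empty cells */
    temp=strip(scan(line, i, , 'hm'));
    output;
  end;
  keep i j temp;
  cards;
ProjNum|ClaimNum|PreAdv
1234|419129|11
;
run;

/* ordering by cell position first is what flips rows and columns */
proc sort data=temp; by i j; run;

data _null_;
  length holdrec $200;
  retain holdrec;
  set temp;
  if mod(_n_, &columns.) eq 1 then holdrec=strip(temp);
  else holdrec=cat(strip(holdrec), '09'x, strip(temp));
  if mod(_n_, &columns.) eq 0 then put holdrec;
run;
```

The log should show three tab-separated lines, each pairing a name with its value (ProjNum with 1234, ClaimNum with 419129, PreAdv with 11), which is the transposed layout.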
macro flipfile preprocesses the data (continued)
  %if &data_form. eq YES %then %do;
    proc sql noprint;
      select varname into :var_names separated by "~"
        from form_varnames;
    quit;
    %let var_cnt=&sqlobs.;
    data _null_;
      infile clippy;
      file revised lrecl=32767;
      length holdrec temp $32767;
      array varids(&var_cnt.) $32.;
      array findhead(&var_cnt.);
      array varpreskip(&var_cnt.);
      array varpostskip(&var_cnt.);
      retain varids findhead varpreskip varpostskip newrec holdrec;
the form_varnames dataset describes the layout:
data form_varnames;
  informat varname $50.;
  input varname &;
  cards;
1~Title~0~1~0
2~by~1~0~1
3~Type~0~0~0
4~Language:~1~0~0
5~Publisher:~1~0~1
;
in each entry, e.g. 1~Title~0~1~0:
variable #: 1
variable name: Title
use variable name as search string: 0
lines to skip before and after variable: 1~0
macro flipfile preprocesses the data (continued)
      input;
      _infile_=tranwrd(_infile_, &spaces., ' ');
      if _n_ eq 1 then do;
        do j=1 to &var_cnt.*5-4 by 5;
          i=input(scan("&var_names.",j,"~"),best12.);
          varids(i)=strip(scan("&var_names.",j+1,"~"));
          if i eq 1 then holdrec=strip(scan("&var_names.",j+1,"~"));
          else holdrec=cat(strip(holdrec),"09"x,strip(scan("&var_names.",j+1,"~")));
          findhead(i)=scan("&var_names.",j+2,"~");
          varpreskip(i)=input(scan("&var_names.",j+3,"~"),3.);
          varpostskip(i)=scan("&var_names.",j+4,"~");
          if i eq &var_cnt. then put holdrec;
        end;
      end;
macro flipfile preprocesses the data (continued)
      var_counter+1;
      do i=1 to varpreskip(var_counter);
        input;
        _infile_=tranwrd(_infile_, &spaces., ' ');
      end;
      if findhead(var_counter) then do;
        y=index(_infile_,strip(varids(var_counter)));
        z=y+length(strip(varids(var_counter)));
      end;
      else z=1;
      temp=strip(substr(_infile_,z));
      if var_counter eq 1 then holdrec=temp;
      else holdrec=catx("09"x,holdrec,temp);
      do i=1 to varpostskip(var_counter);
        input;
        _infile_=tranwrd(_infile_, &spaces., ' ');
      end;
      if var_counter eq &var_cnt. then do;
        put holdrec; var_counter=0;
      end;
    run;
macro flipfile preprocesses the data (continued)
data _null_;
  file clippy;
  infile revised lrecl=32767;
  input;
  put _infile_;
run;
%end;
%mend flipfile;
read clipboard (or revised clipboard)
rearrange clipboard, read clipboard, count variables, and expand macro variable contents
%flipfile
data _null_;
  length hold_rec $32767;
  infile clippy;
  input;
  _infile_=tranwrd(_infile_, &spaces., '09'x);
  var_count=countc(_infile_,,"H")+1;
  call symput('var_count',strip(put(var_count,8.)));
  %expandr("var_formats",&var_formats)
  %expandr("var_informats",&var_informats)
  %expandr("var_missing",&var_missing.)
  %expandr("var_units",&var_units)
  %expandr("var_prefix",&var_prefix)
  %expandr("var_suffix",&var_suffix)
  %expandr("var_upcase",&var_upcase.)
  %expandr("var_drop",&var_drop.)
  %expandr("var_labels",&var_labels.)
  %expandr("var_share",&var_share.)
  stop;
run;

%macro expandr (type,string);
  i=1;
  hold_rec="";
  do while (scan("&string.",i," ") ne "");
    if scan(scan(scan("&string.",i," "),1,"~"),2,"-") ne "" then do;
      start=scan(scan(scan("&string.",i," "),1,"~"),1,"-");
      end=scan(scan(scan("&string.",i," "),1,"~"),2,"-");
    end;
    else do;
      start=scan(scan("&string.",i," "),1,"~");
      end=scan(scan("&string.",i," "),1,"~");
    end;
    do j=start to end;
      hold_rec=catx(" ",hold_rec,cat(strip(j)||"~"||
               strip(scan(scan("&string.",i," "),2,"~"))));
    end;
    i+1;
  end;
  call symput(&type.,strip(hold_rec));
%mend expandr;
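As a rough illustration of what expandr does with a range, the standalone step below expands a single specification into one entry per column. It is a simplified sketch rather than the authors' macro: the variable names (spec, range, fmt, expanded) and the sample value 2-4~percent8.1 are made up for the example.

```sas
data _null_;
  length spec range fmt $64 expanded $200;
  spec  = "2-4~percent8.1";            /* hypothetical var_formats value */
  range = scan(spec, 1, "~");          /* the part before the tilde */
  fmt   = scan(spec, 2, "~");          /* the part after the tilde */
  first = input(scan(range, 1, "-"), 8.);
  last  = input(scan(range, 2, "-"), 8.);
  if missing(last) then last = first;  /* a single column, e.g. 5~date9. */
  do col = first to last;
    /* one column~value pair per column in the range */
    expanded = catx(" ", expanded, cats(col, "~", fmt));
  end;
  put expanded=;
run;
```

With this input the log should show one pair for each of columns 2, 3 and 4, which is the form the later array-filling code scans.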
Main Datastep
data _null_;
  file revised lrecl=32767;
  infile clippy end=eof;
  array headers(%sysfunc(max(&hrows.,1))) $32767.;
  array varnames(&var_count.) $32.;
  array formats(&var_count.) $32.;
  array informats(&var_count.) $32.;
  array renames(&var_count.) $32.;
  array prefix(&var_count.) $32.;
  array suffix(&var_count.) $32.;
  array labels(&var_count.) $32.;
  array miss(&var_count.) $255.;
  array upcases(&var_count.) $3.;
  array drops(&var_count.) $3.;
  array units(&var_count.) $32.;
  array share(&var_count.) $32.;
  array varlens(&var_count.);
  array vartypes(&var_count.);
  length hold_rec temp ivartype fvartype var_units
         var_names var_labels var_drop $32767;
  length missval $255;
  retain headers renames varnames vartypes varlens formats
         informats units prefix suffix labels drops miss
         upcases share grows_start grows_end;
parse the header row(s) and macro variables
  input;
  _infile_=tranwrd(_infile_, &spaces., '09'x);
  if _n_ le &hrows. then headers(_n_)=
    tranwrd(tranwrd(tranwrd(_infile_, '%', 'percent'),'-','_to_'),'–','_to_');
  if _n_ eq &hrows. or (_n_ eq 1 and &hrows eq 0) then do;
    grows_start=scan("&guessingrows.",1,'-');
    if missing(grows_start) then grows_start=&first_data_row.;
    grows_end=scan("&guessingrows.",2,'-');
    if missing(grows_end) then grows_end=999999;
    do i=1 to &var_count.;
      %filarray(renames,&var_renames.)
      %filarray(prefix,&var_prefix.)
      %filarray(suffix,&var_suffix.)
      %filarray(units,&var_units.)
      %filarray(formats,&var_formats.)
      %filarray(informats,&var_informats.)
      %filarray(upcases,&var_upcase.)
      %filarray(drops,&var_drop.)
      %filarray(labels,&var_labels.)
      %filarray(miss,&var_missing.)
      %filarray(share,&var_share.)
      if &hrows. eq 0 then varnames(i)=cat("Col"||strip(i));

%macro filarray (type,string);
  if scan("&string.",i," ") ne "" then
    &type(scan(scan("&string.",i," "),1,"~"))=
      scan(scan("&string.",i," "),2,"~");
%mend filarray;
obtain and assign variable names
      else do;
        varnames(i)="";
        do j=1 to &hrows.;
          if j eq 1 and share(i) ne "" then do;
            if strip(scan(headers(j),share(i),,"HM")) ne "" then
              varnames(i)=strip(scan(headers(j),share(i),,"HM"));
          end;
          else do;
            if strip(scan(headers(j),i,,"HM")) ne "" then do;
              if strip(varnames(i)) ne "" then varnames(i)=
                strip(varnames(i))||"_"||strip(scan(headers(j),i,,"HM"));
              else varnames(i)=strip(scan(headers(j),i,,"HM"));
            end;
          end;
          if j eq &hrows. and varnames(i) eq "" then
            varnames(i)=cat("Col"||strip(i));
        end;
      end;
modify variable names and labels
      if renames(i) ne "" then varnames(i)=renames(i);
      if prefix(i) ne "" then varnames(i)=
        strip(prefix(i))||strip(varnames(i));
      if suffix(i) ne "" then varnames(i)=
        strip(varnames(i))||strip(suffix(i));
      if strip(labels(i)) eq "" then labels(i)=strip(varnames(i));
      else labels(i)=tranwrd(strip(labels(i)), '^', ' ');
      varnames(i)=tranwrd(strip(varnames(i)),'%', 'percent');
      varnames(i)=tranwrd(strip(varnames(i)),'-','_to_');
      varnames(i)=tranwrd(strip(varnames(i)),'–','_to_');
      varnames(i)=tranwrd(strip(varnames(i)),'#', 'number');
      varnames(i)=tranwrd(strip(varnames(i)), ' ', '_');
      varnames(i)=compress(varnames(i),,'kn');
      if anydigit(substr(varnames(i),1,1)) then
        varnames(i)=cat("_",strip(varnames(i)));
      var_names=catx(" ",var_names,strip(varnames(i)));
      var_labels=cat(strip(var_labels)||"label "||
        strip(varnames(i))||"="||quote(strip(labels(i)))||";");
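To make the effect of these clean-up statements concrete, here is a small standalone sketch that applies the same sequence of functions to one header. The header text and the variable name (name) are hypothetical, and the step leaves out the rename, prefix, suffix and label handling.

```sas
data _null_;
  length name $64;
  name = "2007 Provisional %";                 /* hypothetical header text */
  name = tranwrd(strip(name), '%', 'percent'); /* spell out symbols */
  name = tranwrd(strip(name), '-', '_to_');
  name = tranwrd(strip(name), '#', 'number');
  name = tranwrd(strip(name), ' ', '_');       /* embedded blanks become underscores */
  name = compress(name, , 'kn');               /* keep only letters, digits, underscores */
  /* a SAS name cannot start with a digit, so prepend an underscore */
  if anydigit(substr(name, 1, 1)) then name = cat("_", strip(name));
  put name=;
run;
```

The result should be a valid SAS variable name that still resembles the original heading.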
create macro variables for needed changes
      if units(i) ne "" then var_units=
        catx(" ",var_units,strip(varnames(i))||"="||
             strip(varnames(i))||"*"||strip(units(i))||";");
      if drops(i) eq "YES" then var_drop=
        catx(" ",var_drop,strip(varnames(i)));
    end;
    if var_drop ne "" then var_drop="(drop="||strip(var_drop)||")";
    call symput('varnames',var_names);
    call symput('varlabls',var_labels);
    call symput('varunits',var_units);
    call symput('vardrop',var_drop);
  end;
  if _n_ ge &first_data_row. then do;
    if countc(_infile_,,"H")+1 eq &var_count. then do;
      do i=1 to &var_count.;
        temp=strip(scan(_infile_,i,,"HM"));
        if upcase(upcases(i)) eq "YES" then temp=upcase(temp);
        if strip(temp) ne "" then do;
          if miss(i) ne "" then do;
            k=1;
accomplish guessing rows and assign vartypes
            do while (scan(miss(i),k,"` ") ne "");
              missval=tranwrd(strip(scan(miss(i),k,"` ")),'^',' ');
              temp=tranwrd(strip(temp),strip(missval), '');
              k+1;
            end;
          end;
          if grows_start LE _n_ and grows_end GE _n_ then do;
            call missing(vartype);
            in_test = input(temp, ?? best12.);
            if not missing(in_test) then vartype=0;
            else do;
              in_test = input(temp, ?? anydtdte21.);
              if not missing(in_test) then vartype=2;
              else do;
                if index(temp,"$") then
                  in_test = input(temp, ?? dollar21.);
                if not missing(in_test) then vartype=4;
                else do;
                  if index(temp,",") then
                    in_test = input(temp, ?? comma21.);
                  if not missing(in_test) then vartype=5;
accomplish guessing rows
                  else do;
                    if index(temp,"%") then
                      in_test = input(temp, ?? percent21.);
                    if not missing(in_test) then vartype=3;
                    else vartype=1;
                  end;
                end;
              end;
            end;
            if missing(vartypes(i)) then vartypes(i)=vartype;
            else if vartype ne vartypes(i) then vartypes(i)=1;
            if missing(varlens(i)) or length(temp) gt varlens(i)
              then varlens(i)=length(temp);
          end;
        end;
        if i eq 1 then hold_rec=strip(temp);
        else hold_rec=cat(strip(hold_rec),"09"x,strip(temp));
      end;
      put hold_rec;
    end;
assign vartypes
    if eof then do;
      ivartype="";
      fvartype="";
      do i=1 to &var_count.;
        if vartypes(i)=1 then do;
          itempvar=cat("$",strip(put(varlens(i),3.)),".");
          ftempvar=itempvar;
        end;
        else if vartypes(i)=2 then do;
          itempvar="anydtdte21.";
          ftempvar="date9.";
        end;
        else if vartypes(i)=3 then do;
          itempvar="percent.";
          ftempvar="percent8.2";
        end;
        else if vartypes(i)=4 then do;
          itempvar="dollar.";
          ftempvar=cat("dollar",strip(put(varlens(i),3.)),".");
        end;
        else if vartypes(i)=5 then do;
          itempvar="comma.";
          ftempvar=cat("comma",strip(put(varlens(i),3.)),".");
        end;
        else do;
          itempvar="best12.";
          ftempvar="best12.";
        end;
create macro variables for formats and informats
        if strip(informats(i)) ne "" then itempvar=strip(informats(i));
        if strip(formats(i)) ne "" then ftempvar=strip(formats(i));
        ivartype=catx(" ",ivartype,"informat",varnames(i),itempvar,";");
        fvartype=catx(" ",fvartype,"format",varnames(i),ftempvar,";");
      end;
      call symput('informt',ivartype);
      call symput('formt',fvartype);
    end;
  end;
run;
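The type guessing above relies on the INPUT function with the ?? modifier, which returns a missing value instead of writing an error to the log when a value does not fit an informat. The step below is a pared-down sketch of that idea with only three outcomes; the sample values and the variable names (temp, guess) are illustrative and are not part of the authors' code.

```sas
data _null_;
  length temp $21 guess $9;
  do temp = "36.7", "28SEP2012", "n.a.";
    /* try the most specific reading first and fall back to character */
    if not missing(input(temp, ?? best12.)) then guess = "numeric";
    else if not missing(input(temp, ?? anydtdte21.)) then guess = "date";
    else guess = "character";
    put temp= guess=;
  end;
run;
```

In the full data step the same test is repeated for every cell within the guessing rows, and a column whose cells disagree on type is read as character.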
put everything together and read/write file
data &outfile. &vardrop.;
  infile revised lrecl=32767 dsd delimiter="09"x;
  &informt.;
  &formt.;
  &varlabls.;
  input &varnames.;
  &varunits.;
run;

proc delete data=work.form_varnames;
run;

filename clippy clear;
filename revised clear;
and to complete the task
highlight the table
click copy
enter desired macro variable settings
run the code
however, it may not be THAT simple in all cases!
not all tables are directly copyable (i.e., without losing critical table metadata)
a useful free set of tools you might find helpful
Adobe Acrobat Reader 6: http://www.oldapps.com/adobe_reader.php
Adobe Acrobat 5 TAPS Plugin: http://www.pdfhacks.com/TAPS/
how TAPS can be helpful (an example)
copy the first four lines from the table at http://www.oecd.org/dataoecd/48/27/41498733.pdf
on your monitor the table will appear as:
but, if you paste it into notepad, it will appear as:
there is no indication that column #1's heading is missing
the heading for column #10 appears in column #1 on the 2nd and 3rd rows
there is no indication that column #2 is missing
but if you open it with Adobe 6 Reader with TAPS:
drag a rectangle around the data you want
right click on Text-Flow, click on Table, then copy
Table
now, if you paste it into notepad, it will appear as:
then, using the following settings:
filename clippy clipbrd;
filename revised temp;
%let hrows=3;
%let spaces=" ";
%let first_data_row=4;
%let var_renames=1~Country;
%let var_labels=;
%let var_prefix=2-10~Revenue_;
%let var_suffix=;
%let var_share=;
%let var_formats=2-10~percent8.1;
%let var_informats=2-10~best12.;
%let var_units=2-10~.01;
%let var_drop=;
%let var_upcase=;
%let var_missing=;
%let guessingrows=;
%let outfile=revenue;
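To see what the var_units and var_formats settings amount to for a single cell, consider the minimal sketch below. It assumes the units value is applied as a multiplication, as the generated variable=variable*units statement in the main data step suggests; the variable name revenue_1975 is hypothetical, and 36.7 is one of the values in the OECD table.

```sas
data _null_;
  revenue_1975 = 36.7;                /* value as it arrives from the clipboard */
  revenue_1975 = revenue_1975 * .01;  /* effect of var_units=2-10~.01 */
  put revenue_1975= percent8.1;       /* effect of var_formats=2-10~percent8.1 */
run;
```

Storing the proportion and displaying it with a percent format keeps the number usable in calculations while it still prints as a percentage.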
you will obtain the following SAS dataset:
sometimes a table will copy as a single column
e.g., copy the four column table at http://www.thelawyer.com/directory/uk-200-table-top-100/
you'll find that entire columns get highlighted as you drag your mouse from left to right
although it appears as a 101 row 4 column table, it actually copies as a 404 row 1 column table
adding the following macro variable assignments will transpose the data as it is being "pasted"
%let columns=4;
%let rows=100;
and it will paste correctly
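The reshaping that the columns setting asks for can be pictured with the sketch below, which folds a one-column stream back into rows of four cells. This only illustrates the idea and is not the flipfile macro itself: the data lines are invented placeholders, and the names wide, cell, k and col1-col4 are hypothetical.

```sas
data wide (keep=col1-col4);
  array col(4) $50;              /* 4 plays the role of %let columns=4 */
  retain col1-col4;
  infile datalines truncover;
  input cell $50.;
  k = mod(_n_ - 1, 4) + 1;       /* position of this cell within its row */
  col(k) = cell;
  if k = 4 then output;          /* a full row has been collected */
  datalines;
Rank
Firm
City
Revenue
1
Firm A
London
100
2
Firm B
Leeds
90
;
run;

proc print data=wide noobs;
run;
```

In this sketch the heading row simply becomes the first observation; the macro settings shown earlier (hrows, first_data_row) are what tell the real code how to treat it.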
importing non-tabular data
for example, if you wanted to copy the results of a search from: http://www.worldcat.org/
where each result contains the book title, author, type, language and publisher
our solution: one extra macro variable
%let hrows=1;
%let first_data_row=2;
%let data_form=YES;
%let var_renames=2~Author;
data_form=YES lets SAS know that the data appears as a form rather than a table
+ the same macro variables used previously
a datastep to describe the form's layout
data form_varnames;
  informat varname $50.;
  input varname &;
  cards;
1~Title~0~1~0
2~by~1~0~1
3~Type~0~0~0
4~Language:~1~0~0
5~Publisher:~1~0~1
;
each line holds five fields separated by ~, in the order the flipfile macro reads them:
Variable number
Variable name or field header
Whether the field has a header in the data (i.e., 0=no, 1=yes)
Number of rows to skip before reading data
Number of rows to skip after reading data
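A quick way to check a layout before running the import is to split each definition into its five parts and print them. The step below is such a sketch; the dataset name layout and the variable names (spec, varnum, field, has_header, skip_before, skip_after) are hypothetical, and the field order follows the way the flipfile macro scans the values.

```sas
data layout (drop=spec);
  length spec $50 field $32;
  infile datalines truncover;
  input spec $50.;
  varnum      = input(scan(spec, 1, "~"), 8.);  /* variable number */
  field       = scan(spec, 2, "~");             /* name or field header */
  has_header  = input(scan(spec, 3, "~"), 8.);  /* 0=no, 1=yes */
  skip_before = input(scan(spec, 4, "~"), 8.);  /* rows to skip before reading */
  skip_after  = input(scan(spec, 5, "~"), 8.);  /* rows to skip after reading */
  datalines;
1~Title~0~1~0
2~by~1~0~1
3~Type~0~0~0
4~Language:~1~0~0
5~Publisher:~1~0~1
;
run;

proc print data=layout noobs;
run;
```

Reading the printed rows against a copied search result makes it easier to spot a skip count that is off by one.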
and to complete the task
highlight the form
click copy
enter macro variable settings and field definitions
run the code
resulting in the following SAS dataset
and, for those who want a non-point-and-click solution
use DDE to automate the select/copy process
filename ddecmds dde "excel|system";
options noxwait noxsync;
x '"C:\Program Files\Microsoft Office\Office11\EXCEL.exe"';

data _null_;
  z=sleep(3);
run;

data _null_;
  file DDEcmds;
  put '[open("c:\yourfilename.xls")]';
  x=sleep(3);
run;
data _null_;
  file DDEcmds;
  put '[workbook.activate("Sheet1")]';
  put '[select("C1:C6")]';
  put '[copy()]';
run;

data _null_;
  file DDEcmds;
  put '[error(false)]';
  put '[quit()]';
run;
and to complete the task
enter macro variable settings
run the code
copy and paste almost anything
This presentation, the code and paper are available at:
http://www.sascommunity.org/wiki/Copy_and_Paste_Almost_Anything
benefits
should work on any operating system
has a number of useful import features
tables can be pasted from any source that you can copy to your system's clipboard
modifiable (new features can be added)
avoid extra datasteps
only requires base SAS
limitations
unsupported - can't complain to anyone if it doesn't work correctly
extremely software dependent
requires you to know your data
may require some additional software
Your comments and questions are valued and encouraged
Contact the Authors
Ben Powell, London
Randy Herbison, Rockville
John King, Mount Ida
Richard A. DeVenezia, Remsen
Arthur Tabachneck, Ph.D., Thornhill, ON
Nate Derby, Seattle