

Importing Data from Excel Spreadsheets
Dos, don'ts, and updated procedures for ArcGIS 10
By Mike Price, Entrada/San Juan, Inc.

This exercise models data from a well-known gold and base metals mining area in northern Nevada located near the town of Battle Mountain. Although the sample data is synthetic, it is true to the underlying geology of Battle Mountain, Nevada.

Many organizations keep valuable data in Microsoft Excel and comma-separated values (CSV) files. Learn a methodology for importing data kept in Excel and CSV files into ArcGIS that has been updated for ArcGIS 10 and Microsoft Office 2007/2010.

Excel spreadsheets have been used since the release of ArcGIS 8 to prepare and import tabular data into a GIS. Previous ArcUser articles described the benefits and limitations of spreadsheets in the version of ArcGIS current at that time. In early 2004, ArcUser editor Monica Pratt wrote "Working with Excel in ArcGIS" (esri.com/news/arcuser/0104/files/excel.pdf). In 2007, the author wrote another article on the same topic, "Mapping and Modeling Groundwater Geochemistry" (esri.com/news/arcuser/0207/files/groundwater.pdf).

Since these articles were published, Microsoft has released two new versions, Excel 2007 and Excel 2010. With each release, spreadsheet capabilities have improved and the processes for importing data into ArcGIS have changed. This article updates and refines rules and procedures for importing Excel 2003 files into ArcGIS 9.x.

This exercise reexamines the Excel spreadsheet as a data import tool, focusing on ArcGIS 10 and Excel 2007/2010. The tutorial uses spreadsheets to create and enhance geologic data. Field samples include Hydrogeochemical Stream Sediment Reconnaissance (HSSR) points plus custom soil and rock data. In this exercise, we will model a well-known gold and base metals mining area in northern Nevada, located near the town of Battle Mountain. The custom samples are typical of data that might come from the field, assayed by a modern analytic laboratory.

ArcUser Spring 2012 esri.com

Hands On

What You Will Need
ArcGIS 10 for Desktop
Microsoft Excel 2010/2007 or 2003 or the 2007 Office System Driver
Sample dataset from the ArcUser website

Best Practices When Using Excel Files with ArcGIS
How many times have you imported spreadsheet data into ArcGIS only to find some cells empty or formatted with an unwanted field type? To avoid these and other problems, follow these practices when creating and maintaining data in Excel that you will use in ArcGIS.

A Word about Microsoft Excel Versions
If you have installed Office 2007, you can read .xls and .xlsx files. If you have Office 2003 or 2010 installed, you can read .xls files, but you will need to install the 2007 Office System Driver to read .xlsx files. If you do not have Microsoft Excel installed, you must install the 2007 driver before you can use either .xls or .xlsx files. The 2007 Office System Driver can be downloaded from the Microsoft Download Center at links.esri.com/office_07_sysdriver. Carefully follow the installation instructions before you restart ArcGIS. Also, if you have previously specified on the File Types tab of the Customize > ArcCatalog Options dialog box that ArcCatalog show you .xls files, you'll need to remove this file type to be able to access Excel files directly. Before beginning to work the exercise, read the accompanying article, Best Practices When Using Excel Files with ArcGIS, for valuable tips on working with Excel data.

1. When creating spreadsheets, make sure fields are fewer than 255 characters.
ArcGIS reads the first 255 field characters. Fields with more than 255 characters are converted to BLOB fields and are not readable. Abbreviate, manually truncate, or split any fields longer than 255 characters.

2. Check the numeric field type before and after importing Excel data.
ArcGIS typically converts spreadsheet numeric fields to double precision (Double), which may not meet your needs. If necessary, create new fields of the desired type and calculate values into them.

3. Check the format for date fields.
ArcGIS 10 uses the Lotus date/time format. In this format, the calendar date is a whole number that counts the days since January 1, 1900, plus one day (due to a bug in Lotus 1-2-3 that was carried over to Excel). Time is represented as the decimal portion of a 24-hour day. If date/time data is important, format the input spreadsheet using a standard Excel date/time format.
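The serial date arithmetic described in practice 3 can be checked outside ArcGIS. The following is a minimal Python sketch, not part of the original exercise. The function name excel_serial_to_datetime is hypothetical, and the code assumes the workbook uses Excel's default 1900 date system, in which serial 60 stands for February 29, 1900, a day that never existed.

```python
from datetime import datetime, timedelta


def excel_serial_to_datetime(serial):
    """Convert an Excel 1900-system serial date/time to a Python datetime.

    Hypothetical helper for illustration; assumes the 1900 date system.
    """
    if serial < 1:
        raise ValueError("serial dates start at 1 (January 1, 1900)")
    if 60 <= serial < 61:
        # Serial 60 is February 29, 1900, which is not a real calendar day
        # (the leap-year bug inherited from Lotus 1-2-3).
        raise ValueError("serial 60 is the nonexistent February 29, 1900")
    # Before the phantom day, serial 1 maps directly to January 1, 1900.
    # After it, every serial is one day too high, so the epoch moves back.
    epoch = datetime(1899, 12, 31) if serial < 60 else datetime(1899, 12, 30)
    # The whole part counts days; the decimal part is the time of day.
    return epoch + timedelta(days=serial)


print(excel_serial_to_datetime(40909.5))  # 2012-01-01 12:00:00
```

The two epochs make the "plus one day" offset explicit: only serial values after the phantom leap day carry the extra day.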

Getting Started: Examining Files in ArcCatalog
To begin this exercise, go to the ArcUser website and download the training data, excelmagic.zip. Unzip the excelmagic.zip data into a project area on your local machine and start ArcCatalog. Navigate to the Battle_Mountain folder and locate the XLSFiles folder. When ArcCatalog displays an Excel file, it adds a dollar sign ($) to each worksheet name. Inside this folder, expand all files. Locate Sample_Locations.xlsx and preview Rock$. This Excel 2010 spreadsheet contains two worksheets named Rock$ and Soil$. Rock$ and Soil$ contain sample numbers, universal transverse Mercator (UTM) coordinates, and field information that allow this data to be posted on a map. Next, preview HSSR_LonLat83.xlsx and study its only worksheet, HSSR$. Next, locate and preview two CSV files, Rock_Data and Soil_Data.

Preview the sample data in ArcCatalog.

4. Follow ArcGIS field naming rules when creating Excel column names.
The first row of an Excel worksheet sets the name for each column. Column names become field names when an Excel worksheet is imported into ArcGIS. Always follow these naming rules:
Column/Field names must begin with a letter.
Column/Field names must contain only letters, numbers, and the underscore character.
Column/Field names must be no more than 64 characters. If a name is longer than 64 characters, ArcGIS retains the first 63 characters.
Column/Field names may not consist solely of reserved words. Do not use these words in field names. See support.microsoft.com/kb/286335 for the list of reserved words. ArcGIS typically adds a trailing underscore to reserved word field names added by copying and pasting from other sources.
Column/Field naming rules also apply to named ranges imported from Excel.
For more information, search ArcGIS 10.0 web help for Excel worksheet.
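The naming rules in practice 4 and the 255-character limit in practice 1 lend themselves to a quick check before import. The following Python sketch is an illustration only and is not part of the original exercise. The function names check_field_name and find_long_values are hypothetical, and SAMPLE_RESERVED_WORDS is a small placeholder set that should be replaced with the list published at support.microsoft.com/kb/286335.

```python
import re

# Placeholder words for illustration only; substitute the full reserved-word
# list published at support.microsoft.com/kb/286335.
SAMPLE_RESERVED_WORDS = {"DATE", "TABLE", "SELECT", "ORDER"}

MAX_NAME_LENGTH = 64    # practice 4: limit on column/field name length
MAX_VALUE_LENGTH = 255  # practice 1: longer text becomes an unreadable BLOB


def check_field_name(name, reserved=SAMPLE_RESERVED_WORDS):
    """Return a list of naming-rule problems for one column name."""
    problems = []
    if not re.match(r"[A-Za-z]", name):
        problems.append("must begin with a letter")
    if re.search(r"[^A-Za-z0-9_]", name):
        problems.append(
            "contains characters other than letters, numbers, and underscores"
        )
    if len(name) > MAX_NAME_LENGTH:
        problems.append("longer than 64 characters")
    if name.upper() in reserved:
        problems.append("is a reserved word")
    return problems


def find_long_values(rows, limit=MAX_VALUE_LENGTH):
    """Return (row_index, column_name) pairs whose text exceeds the limit."""
    return [
        (index, column)
        for index, row in enumerate(rows)
        for column, value in row.items()
        if isinstance(value, str) and len(value) > limit
    ]


# Example header row with one clean name and three problem names.
for name in ["SAMPLENO", "Claim Group", "2nd_Assay", "DATE"]:
    print(name, check_field_name(name))
```

Rows passed to find_long_values are assumed to be dictionaries keyed by column name, such as those produced by csv.DictReader in the Python standard library.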


Open Sample_Locations.xlsx in Excel, select the Soil worksheet, and inspect the data. This location table contains 20,096 soil sample points posted in UTM North American Datum 1983 (NAD83) Zone 11 meters. Coordinates are posted and displayed using a precision of 0.01 meters. Many samples are coded by Claim Group.

Save this spreadsheet as a new file so you can retain the original data as an archive. Name the new file Sample_Locations_Import.xls. Click the Soil worksheet and look at the first row of data. Many text strings in this row contain spaces. Change these spaces to underscores. (Hint: Select only the first row and use Find and Replace.) Next, clarify the coordinate system columns. Change Easting to UTM83Z11_E and Northing to UTM83Z11_N. Now define a named range. Move to cell

After field names have been corrected, create a named range in Excel called Rock_Locations_Import_R.

The Rock_Data and Soil_Data CSV files contain companion analytic data for the Rock$ and Soil$ worksheets. The [SAMPLENO] field in both CSV files will support a one-to-one tabular join with the same field in the Soil$ and Rock$ worksheets.

Closely inspect the alignment of data in Soil_Data columns. Notice that [SAMPLENO] and [SB_PPM] are aligned on the left side of the column while [AU_PPB], [AG_PPM], [AS_PPM], and [HG_PPB] are aligned on the right. Scroll down through the table and observe that many fields in the right-aligned columns are empty. In the source CSV file, many of the fields contain nonnumeric strings that do not display properly. Notice that [SB_PPM], a left-aligned field, contains many fields that begin with a less than (<) symbol, which marks a value below the detection limit. Values above the upper limit are marked with a greater than (>) symbol (e.g., >10,000 for gold). Fortunately, the over-limit samples in this dataset have already been resolved, so only the less than values need fixing.

Since it is statistically meaningful to recognize that some small amount of each element exists in all samples, it is not appropriate to change all < values to zero. Instead, change them to a smaller absolute value, typically 20 to 50 percent of the detection limit. Take a more conservative approach and use 20 percent. Table 1 lists the current value and smaller absolute value for elements below the minimum detection limit.

Element    Abbr.  Unit  Detection Limit
Antimony   Sb     PPM   5.0 ppm
Arsenic    As     PPM   5.0 ppm
Gold       Au     PPB   5.0 ppb
Mercury    Hg     PPB   10.0 ppb
Silver     Ag     PPM   0.5 ppm

Change From