A SUCCESSFUL SYSTEM FOR MANAGING WATER QUALITY AND BIOLOGICAL MONITORING DATA
USING MS ACCESS; CT’S EXPERIENCE
AKA: Database-Smatabase: We don’t need no stinkin’ database
Mike Beauchene
CT DEP
Shadow IT Division
Where this talk will go…
• Pros/cons of different data management systems
• The nuts and bolts of a relational database
• CT’s ambient water quality data management system
This presentation is not…
…an endorsement for any particular commercially available data management product.
…an infomercial for a product that can be purchased on your credit card at the poster session for 2 easy payments of $19.95.
This presentation is…
…to encourage you to develop a relational database so you can better organize, store, maintain, and use your water quality data.
By a show of hands….
• Who collects samples and then waits patiently for the lab to send the results ?
• Do you manage these results ?
• In an electronic format ?
• Do you enforce referential integrity?
• Do you “pivot tables” ?
A Basic Data Management System
Stuff Goes IN
Stuff Comes OUT
A Basic Data Management Slogan
GIGO
Examples of Basic Data Management Systems
Institutional Knowledge
“+”
• No IT support required
• Usually abundant
• Operates on coffee not oil
“-”
• Impossible to summarize
• Difficult to assimilate
• Easily lost due to
– Early Retirement
– Lotto
– Greener Pastures
HARD COPY
“+”
• No IT support required
• Final report may look nice
• Works well with “Institutional Knowledge”
“-”
• Impossible to summarize
• Difficult to assimilate
• Very Dusty
• The photocopiers are always broken
Electronic Files (spreadsheets, documents, etc.)
“+”
• Easy to use
• Easy to distribute
• Can make a report look good
“-”
• False sense of security
• Easy to shuffle your data
• Difficult to summarize
• Difficult to assimilate
• Impossible to ask “?”
Relational Database
“+”
• Easy to share data
• Stores lots of metadata
• Answers complicated “?”
• Keeps the data safe and secure
• Never loses or shuffles results
• Links to mapping software
• Allows you to sleep at night
• Helps you look really good
• Stays with the agency when staff does not
“-”
• There really aren’t any but...
• Moderate learning curve
• Get what you ask for
• Still need to know your data
The nuts and bolts of a Relational Database
WHAT ARE THE NUTS?
• Tables
– Place holders for information
– Organize the information by similarity
– Store the information
• Queries
– Make demands upon the tables
– Manipulate data into ratios, indices, calculations
– Add, update and delete records in a table(s)
WHAT ARE THE NUTS?
• Forms
– User friendly version of a table(s)
– Can be a more convenient way to enter data
  • Main form sub form
– Can use features to help data entry
  • Pick lists
• Reports
– User friendly version of a query
– Print out of data
– Make labels for sample containers
– Send data to the public
WHAT ARE THE BOLTS?
• Referential Integrity
– Rules that allow your tables to play nice together
• Primary Key
– A field(s) that makes each row unique
– USE A NON-INTELLIGENT CODE
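A minimal sketch of these two bolts, using Python's built-in sqlite3 module rather than MS Access (an assumption made so the example is self-contained). The table and column names and the lab number 'LAB-0001' are hypothetical; the site number and stream name are borrowed from the pivot table slide later in the talk. The primary keys are plain numbers with no meaning (non-intelligent codes), and referential integrity stops a sample from pointing at a site that does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when asked to.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE sites (
    site_key    INTEGER PRIMARY KEY,   -- non-intelligent key: an arbitrary number
    site_number TEXT NOT NULL UNIQUE,  -- human-readable code, free to change later
    stream_name TEXT NOT NULL
);
CREATE TABLE samples (
    sample_key INTEGER PRIMARY KEY,    -- non-intelligent key
    site_key   INTEGER NOT NULL REFERENCES sites(site_key),
    lab_number TEXT NOT NULL UNIQUE    -- hypothetical lab identifier
);
""")

conn.execute(
    "INSERT INTO sites (site_number, stream_name) VALUES (?, ?)",
    ("CT 01-08", "Sages Ravine Brook"),
)
site_key = conn.execute(
    "SELECT site_key FROM sites WHERE site_number = ?", ("CT 01-08",)
).fetchone()[0]
conn.execute(
    "INSERT INTO samples (site_key, lab_number) VALUES (?, ?)",
    (site_key, "LAB-0001"),
)

# A sample that refers to a site that is not in the sites table is refused.
try:
    conn.execute(
        "INSERT INTO samples (site_key, lab_number) VALUES (?, ?)",
        (999, "LAB-0002"),
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print("orphan sample rejected:", orphan_rejected)
```

Because the tables are joined on site_key and not on the site number itself, the site number can be corrected later without touching any sample rows.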
WHAT ARE THE BOLTS?
• Input Mask / Validation Rule
– Templates for data entry
  • Dates/times
  • Appropriate values (between 1.0-14.0)
• Cascading Updates & Deletes
– Global changes to a dataset
  • Change a name, sample number, station name
  • Remove an entire set of data for a sample
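A rough illustration of a validation rule and of cascading updates and deletes, again sketched with Python's sqlite3 module instead of Access. It assumes the 1.0-14.0 range on the slide refers to pH; the table names and the sample number 'S-100' are made up for the example. Here the sample number itself is the key, which is exactly the situation where a cascading update earns its keep.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for the cascades to fire

conn.executescript("""
CREATE TABLE samples (
    lab_number TEXT PRIMARY KEY            -- hypothetical sample number
);
CREATE TABLE ph_results (
    result_key INTEGER PRIMARY KEY,
    lab_number TEXT NOT NULL
        REFERENCES samples(lab_number)
        ON UPDATE CASCADE                  -- renaming a sample renames its results
        ON DELETE CASCADE,                 -- deleting a sample removes its results
    ph REAL NOT NULL
        CHECK (ph BETWEEN 1.0 AND 14.0)    -- validation rule (assumed to be pH)
);
""")

conn.execute("INSERT INTO samples VALUES ('S-100')")
conn.execute("INSERT INTO ph_results (lab_number, ph) VALUES ('S-100', 7.2)")

# Validation rule: a slipped decimal point (72.0 instead of 7.2) is refused.
try:
    conn.execute("INSERT INTO ph_results (lab_number, ph) VALUES ('S-100', 72.0)")
    bad_ph_rejected = False
except sqlite3.IntegrityError:
    bad_ph_rejected = True

# Cascading update: change the sample number once, the results follow.
conn.execute("UPDATE samples SET lab_number = 'S-100A' WHERE lab_number = 'S-100'")
renamed = conn.execute("SELECT lab_number FROM ph_results").fetchall()

# Cascading delete: remove the sample, its results go with it.
conn.execute("DELETE FROM samples WHERE lab_number = 'S-100A'")
remaining = conn.execute("SELECT COUNT(*) FROM ph_results").fetchone()[0]

print(bad_ph_rejected, renamed, remaining)
```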
HOW DO THE NUTS AND BOLTS GO TOGETHER?
• “Raw Data” = “Result”
– Dissolved oxygen = 8.5 ppm
– Pteronarcys spp. = 12 individuals
– Fragilaria leptostauron = 5 cells
– Instantaneous discharge = 152 cfs
• “Metadata” = “Attributes or info to describe a result”
MORE ON METADATA
You can never have too much!!!
– Provides info to a secondary data user
– Establishes data quality
– Used in queries
  • Manipulate data
  • Restrict or define data limits
  • Describe data
– Jogs your memory when someone asks:
  • Where
  • When
  • What
  • Why
“We do not care as much about the accuracy of a result contained within as we do about not having enough information about the result….
…the metadata allows the secondary user to make the appropriate decision as to whether or not the data will be meaningful for their application.”
- Bob King and Lee Manning, STORET architects and founders
Connecticut’s Data Management System
Then (Pre 1998)
• NO AGENCY IT SUPPORT OR VISION (only Ernie’s)
• Existed as
– Institutional knowledge
– Hard copy
– Lotus/SAS/WordPerfect format
• STORET as an option?
Between Then & Now
• NO AGENCY IT SUPPORT OR VISION (only Ernie’s)
• The relational ambient monitoring database started in July of 1998 using MS Access 2.0
• It was based upon the STORET model
• It would function as our day-to-day working database with periodic uploads to STORET
• Staff began to create innovative nicknames for the DBA (me).
NOW (2006)
• Our Agency IT calls us “SHADOW IT”
• We have a data management policy
– Reduce reliance on all other data management systems
• MS Access 2000
– Front end for staff
  • Data input forms
  • Generic buttons for query options
• STORET has…
– Monitoring stations
– Beach monitoring data
– Lots more to do
• The DBA (me) has been removed from staff Christmas card lists
CT’s Data Management System
Stuff Goes IN
• Trip Info
• Site Info
• Sample Info
• Results
Stuff Comes OUT
• Raw Data
• Overdue results
• WQS Exceedances
• Project $$
• Summary Calculations
• QA
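As one example of stuff coming OUT, an "overdue results" report can be sketched as a query for samples that were collected a while ago but still have no rows in the results table. The example uses Python's sqlite3 module, and everything in it is hypothetical: the table layout, the lab numbers, the dates, and the 30-day cutoff are assumptions for illustration, not CT's actual rules.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE samples (lab_number TEXT PRIMARY KEY, collected TEXT NOT NULL);
CREATE TABLE results (
    lab_number TEXT REFERENCES samples(lab_number),
    parameter  TEXT,
    value      REAL
);
-- Hypothetical data: A1 has a result back, A2 and A3 do not.
INSERT INTO samples VALUES ('A1', '2006-01-10'), ('A2', '2006-01-10'), ('A3', '2006-03-01');
INSERT INTO results VALUES ('A1', 'Chloride', 10.0);
""")

def overdue_samples(conn, today, max_days=30):
    """Return lab numbers collected more than max_days ago with no results yet."""
    cutoff = (today - timedelta(days=max_days)).isoformat()
    rows = conn.execute(
        """
        SELECT s.lab_number
        FROM samples AS s
        LEFT JOIN results AS r ON r.lab_number = s.lab_number
        WHERE r.lab_number IS NULL      -- no matching result row
          AND s.collected <= ?          -- collected on or before the cutoff date
        ORDER BY s.lab_number
        """,
        (cutoff,),
    )
    return [row[0] for row in rows]

overdue = overdue_samples(conn, today=date(2006, 3, 15))
print(overdue)
```

Dates are stored as ISO text (YYYY-MM-DD) so that plain string comparison sorts them correctly.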
CT’s Relationships
Trip Info
Site Info
Sample Info
Results
Stuff Going IN
CT’s Relational Database Is..
Just Like A Pizza!!!!
Hierarchical Relationships
Trip Info
Sample Info
Results
Site Info
Data Management In CT- Now
Our database has…
• Station information and lat. & long. (1800 sites)
• Physical/Chemical (175,000 data points)
• Macroinvertebrate (32,000 names & counts)
• Fish (267 samples, 12,000 records)
• HOBO water temp. (lots and lots)
• Lots of other stuff
Our system…
• Is an electronic log book of all samples collected
• Is linked to ADB for 305(b) assessment updates
• Is linked to ArcGIS/ArcView for mapping
• Can be linked to SIM for uploads to STORET
• Needs IT support to become a real data management system
A SERIES OF WORKSHEETS IN A SPREADSHEET OR A SERIES OF SPREADSHEETS ORGANIZED IN A FOLDER
IS NOT A RELATIONAL DATABASE
[Spreadsheet screenshot: one row per trip date (1998 to 2002) and one column per chemical parameter: Alkalinity; Aluminum, Dissolved; Aluminum, Total; Ammonia Nitrogen; BOD 5 day; Cadmium, Dissolved; Cadmium, Total; Chloride; Chlorine, Total; Chromium, Dissolved; Chromium, Total; Copper, Dissolved; Copper, Total; Detergent cmplx; Enterococci; Escherichia coli; Fecal Coliform; Fluoride; Hardness; Hexavalent Chromium, Total; Iron, Dissolved; Iron, Total; Lead, Dissolved]
On and on to column AAZZ
Down and down to row 63,999
StreamName/FacilityName: Sages Ravine Brook
sitenumber: CT 01-08
proximity: 500 feet
landmark/facility name: upstream route 41
Municipality: Salisbury

Max of value by tripdate
ChemParameter, unit: 5/23/2002, 8/8/2002, 10/21/2002, 4/10/2003, Grand Total (blank cells are not shown, so the last value on each line is the Grand Total)
Alkalinity ppm 10 10 17 5 17
Aluminum, Dissolved ppm 0.06 0.035 0.078 0.078
Aluminum, Total ppm 0.06 0.035 0.078 0.078
Ammonia Nitrogen ppm 0.1 0.1 0.1 0.002 0.1
BOD 5 day ppm 1 1 1 0.1 1
Bromide ppm 0.1 0.1
Cadmium, Dissolved ppm 0.001 0.001 0.001 0.001
Cadmium, Total ppm 0.001 0.001 0.001 0.001
Calcium ppm 3.33 3.33
Calcium, Total ppm 2.1 3.4 2.2 3.4
Chloride ppm 10 3.7 1 1.5 10
Chlorophyll-a Periphyton mg/m2 20.57 20.57
Chromium, Dissolved ppm 0.001 0.001 0.001 0.001
Chromium, Total ppm 0.001 0.001 0.001 0.001
Copper, Dissolved ppm 0.005 0.001 0.003 0.005
Copper, Total ppm 0.007 0.004 0.007 0.007
Enterococci MPN colonies per 100 mls 10 10 10 10 10
Escherichia coli MPN colonies per 100 mls 10 10 10 10 10
Fluoride ppm 0.2 0.2 0.2 0.1 0.2
Hardness ppm 13 12 12 13
Hardness, Total (as CaCO3) mg CaCO3/L 10.7 10.7
ppm 6.818012 6.818012
Iron, Dissolved ppm 0.025 0.032 0.055 0.055
Iron, Total ppm 0.03 0.032 0.058 0.058
Export to Microsoft Excel & use the Pivot Table tool
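For anyone who has not met a pivot table, here is a rough sketch in plain Python of what the "Max of value" pivot does: it takes long-format rows (one result per row, as a query would return them) and spreads them into one row per parameter with one column per trip date, plus a Grand Total holding the largest value. The function name pivot_max is made up for this example; the Alkalinity and Chloride values are the two complete rows from the Sages Ravine Brook slide.

```python
from collections import defaultdict

# Long format: (parameter, unit, trip date, value), one result per row.
rows = [
    ("Alkalinity", "ppm", "5/23/2002", 10),
    ("Alkalinity", "ppm", "8/8/2002", 10),
    ("Alkalinity", "ppm", "10/21/2002", 17),
    ("Alkalinity", "ppm", "4/10/2003", 5),
    ("Chloride", "ppm", "5/23/2002", 10),
    ("Chloride", "ppm", "8/8/2002", 3.7),
    ("Chloride", "ppm", "10/21/2002", 1),
    ("Chloride", "ppm", "4/10/2003", 1.5),
]

def pivot_max(rows):
    """Pivot long rows into {(parameter, unit): {trip date: max value}}."""
    table = defaultdict(dict)
    for parameter, unit, trip_date, value in rows:
        cells = table[(parameter, unit)]
        # Keep the largest value if a date has more than one result.
        cells[trip_date] = max(value, cells.get(trip_date, value))
    for cells in table.values():
        # Grand Total of a "Max of value" pivot is the max across all dates.
        cells["Grand Total"] = max(cells.values())
    return dict(table)

pivot = pivot_max(rows)
for (parameter, unit), cells in pivot.items():
    print(parameter, unit, cells)
```

The point of the slide still stands: the database stores the long format, and the wide layout is produced on demand for whoever wants to look at it.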
Take the plunge…..
START SIMPLE!!!
TABLE #
1. Trips (date, who, why, what)
2. Sites (id, location, drainage, lat & long)
3. Samples (lab number, field methods, gear)
4. Results (lab number, value, unit, method)
• Define your KEY FIELDS
– The combination of which will be unique for that record.
– USE NON-INTELLIGENT CODING AMAP!
• Develop strong RELATIONSHIPS
– Enforce REFERENTIAL INTEGRITY
– Encourage CASCADING UPDATES
• Use validation rules and input masks
– Restricts entry to appropriate values
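The four starter tables might be sketched as follows, using Python's sqlite3 module in place of Access. The exact columns, the way samples link to trips and sites, and the names 'crew A', 'LAB-0001' and 'M1' are assumptions for illustration. The key fields idea shows up as a UNIQUE constraint: the combination of sample, parameter and method may appear only once, so the same result cannot be entered twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE trips (trip_key INTEGER PRIMARY KEY, trip_date TEXT, who TEXT, why TEXT);
CREATE TABLE sites (site_key INTEGER PRIMARY KEY, site_id TEXT UNIQUE, location TEXT);
CREATE TABLE samples (
    sample_key INTEGER PRIMARY KEY,
    trip_key   INTEGER NOT NULL REFERENCES trips(trip_key),
    site_key   INTEGER NOT NULL REFERENCES sites(site_key),
    lab_number TEXT NOT NULL UNIQUE
);
CREATE TABLE results (
    result_key INTEGER PRIMARY KEY,
    sample_key INTEGER NOT NULL REFERENCES samples(sample_key),
    parameter  TEXT NOT NULL,
    method     TEXT NOT NULL,
    value      REAL,
    unit       TEXT,
    UNIQUE (sample_key, parameter, method)   -- key fields: unique in combination
);
""")

conn.execute("INSERT INTO trips (trip_date, who, why) VALUES ('2002-05-23', 'crew A', 'ambient monitoring')")
conn.execute("INSERT INTO sites (site_id, location) VALUES ('CT 01-08', 'Sages Ravine Brook')")
conn.execute("INSERT INTO samples (trip_key, site_key, lab_number) VALUES (1, 1, 'LAB-0001')")

insert_result = (
    "INSERT INTO results (sample_key, parameter, method, value, unit) "
    "VALUES (?, ?, ?, ?, ?)"
)
alkalinity = (1, "Alkalinity", "M1", 10, "ppm")
conn.execute(insert_result, alkalinity)

# Entering the same sample/parameter/method a second time is refused.
try:
    conn.execute(insert_result, alkalinity)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# One query walks the whole hierarchy: trip -> site -> sample -> result.
row = conn.execute("""
    SELECT t.trip_date, si.site_id, sa.lab_number, r.parameter, r.value, r.unit
    FROM results AS r
    JOIN samples AS sa ON sa.sample_key = r.sample_key
    JOIN trips   AS t  ON t.trip_key = sa.trip_key
    JOIN sites   AS si ON si.site_key = sa.site_key
""").fetchone()
print(duplicate_rejected, row)
```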
Build based on your needs !!!
• Lookup Tables (use in pick lists and queries)
– Staff info
– Method info
– Equipment specs
– Ecological attribute stuff
USE YOUR DATA!!!
• Queries and reports
– Quality control/Quality assurance (DQOs)
– Summary reports
– Water quality assessments
– Taxonomic distributions
– TMDL development and implementation
– Find where or where not to go fishing
– Share with others
– Budget review and/or planning
– Staff performance evaluation
[Map: median total nitrogen as N values at monitoring sites, shown by major basin]
Major Basin: Pawcatuck, SE Coastal, Thames, Connecticut, S. Central Coastal, Housatonic, SW Coastal, Hudson
Total Nitrogen as N (ppm): Data Distribution Legend
• 1.76 - 63.75: Value is above the 75th percentile
• 1.06 - 1.75: Value is between the median and the 75th percentile
• 0.66 - 1.05: Value is between the 25th percentile and the median
• 0.15 - 0.65: Value is below the 25th percentile
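The legend classes can be reproduced from query output with a few lines of code. This is only a sketch: the site medians below are invented, the function name quartile_class is hypothetical, and Python's statistics.quantiles (with the "inclusive" method) is one of several reasonable ways to compute quartiles, so the break points may differ slightly from whatever tool produced the map.

```python
from statistics import quantiles

def quartile_class(value, q1, median, q3):
    """Assign a value to one of the four legend classes."""
    if value > q3:
        return "above the 75th percentile"
    if value > median:
        return "between the median and the 75th percentile"
    if value >= q1:
        return "between the 25th percentile and the median"
    return "below the 25th percentile"

# Hypothetical median total nitrogen values (ppm), one per site.
site_medians = [0.2, 0.5, 0.7, 0.9, 1.1, 1.5, 2.0, 40.0]

# n=4 returns the three cut points: 25th percentile, median, 75th percentile.
q1, median, q3 = quantiles(site_medians, n=4, method="inclusive")

classes = {v: quartile_class(v, q1, median, q3) for v in site_medians}
for value, label in classes.items():
    print(value, label)
```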
Data Management Tools: Database vs. Spreadsheet vs. Hardcopy
• Designed to store information
– Database: Yes, and very efficiently
– Spreadsheet: No, better to use to present and analyze data
– Hardcopy: No, one page at a time
• Stores metadata
– Database: Yes, and very efficiently
– Spreadsheet: No, not really
– Hardcopy: No, one page at a time
• Prevents mistakes
– Database: Yes, if told to do so
– Spreadsheet: No, encourages mistakes
– Hardcopy: No, does not speak
• Query capability (can ask questions)
– Database: Yes, many, often complicated
– Spreadsheet: No; somewhat, but limited mostly to calculations
– Hardcopy: No, does not speak
• Shareable
– Database: Yes, everyone has the most recent data at their desktop
– Spreadsheet: Somewhat; must either set up a shared folder or email specific files based on request
– Hardcopy: No, waste of photocopy toner and paper. Not enough file cabinets. Can get lost.
• Makes our job easier
– Database: Yes, but can make more work because we use the data
– Spreadsheet: Yes, if used in conjunction with a database (pivot table, graphs, etc.)
– Hardcopy: No, waste of time and effort unless used to QA data in the database
TAKE BACK TO WORK MESSAGES
• TAKE THE LEAP!!!! IT IS EASIER THAN ONE WOULD THINK
• YOU WILL BE SURPRISED AT HOW MANY INCONSISTENCIES YOU ACTUALLY FIND
• YOU WON’T BE ABLE TO LIVE WITHOUT ONE
• EVERYTHING IN THE WORLD TURNS INTO EITHER A “1” OR A “0”
The Last Word
• STORET: http://www.epa.gov/storet/
• National Data Standards: http://wi.water.usgs.gov/methods/tools/wqde/index.htm