who am i and why am i here: lakes ... · semantic web/linked data users community ... –...

34
Safe and Sustainable Water Resources Research Program Safe and Sustainable Water Resources www.epa.gov/research Safe and Sustainable Water Resources Research Who am I and why am I here: Lakes, Linked Data, and R Jeffrey W. Hollister US EPA, Atlantic Ecology Division Semantic Web/Linked Data Users Community 15 August 2012

Upload: others

Post on 31-May-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

www.epa.gov/researchSafe and Sustainable Water Resources Research

Who am I and why am I here: Lakes, Linked

Data, and R

Jeffrey W. HollisterUS EPA, Atlantic Ecology Division

Semantic Web/Linked Data Users Community

15 August 2012

Page 2: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

Talk Outline

• Introduction

– Who am I?

– Why am I here?

• Lakes• Lakes

– Lake Volume

– Lake Depth

– Linked Data Plans

• R

– Overview

– How I use it

2

Page 3: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

Who am I?• Landscape Ecologist

– GIS and Statistics

– Research: Landscape Structure and Water

Quality

• Past Work Experience– J.W. Jones Ecological Research Center

– National Ecological Observatory Network

(NEON)

– Introduced to Ecoinformatics

• Current Position– Research Ecologist with US EPA

– Introduced to Reproducible Research 3

Page 4: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

Who am I?

• My Agency and Mission

– US Environmental Protection Agency

• Office of Research and Development• Office of Research and Development

– National Health and Environmental Effects Research Lab,

» Atlantic Ecology Division

• Monitoring and Assessment Branch

• Research Ecologist

4

Page 5: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

Why am I here?

• New Focus

– Informatics, Decision Support, Tool Development

5

Page 6: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

Northeast Lakes Projects• Mulitple Research Plans and Years

– 2007-Present

• Common Denominators

– Lakes– Lakes

– Nutrients

– Research Questions:

• How do changes in nutrients change

delivery of ecosystem services/risk

associated with Cyanobacteria?

– Project Goals:

• Data Sharing

• Reproducible Research

• Decision Support6

Page 7: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

Ecosystem Services in Lakes

• Swimming

• Fishing

• Drinking Water

• Drinking Water

• Property Values

• Existence Value

• Aesthetics

7

Page 8: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

�� Aesthetic AppealAesthetic Appeal

�� Disturbance Disturbance

�� Biotic IntegrityBiotic Integrity

�� Recreational ValueRecreational Value

How are lakes perceived in the National Lakes

Assessment?

Written Comments from Lakes in Highest Appeal Categories

�� Recreational ValueRecreational Value

�� SwimmabilitySwimmability

8

Page 9: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

Nutrients and Ecosystem Services

9

Page 10: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

1.) Start with field

data

2.) Combine with

landscape data in

function

3.) Predict Appeal for

~18,000 Lakes

10

Page 11: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

Lake Morphometry

• Ordered Logit Models

– Need Residence time

• Existing data• Existing data

– Limited resources

– ~18,000 Lakes

11

Page 12: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

Question #1

What is the best way to estimate lake volume

given, lake shoreline and maximum depth?

Citation: Hollister, J. W., W.B. Milstead (2010). Using GIS to

Estimate Lake Volume from Limited Data. Lake and

Reservoir Management. 26(3)194-199. Contribution no.

AED-10-018.

12

Page 13: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

Hig

h D

ata

Ne

ed

s

Frustrum

Depth = f(Distance)

Actual

Low Realism

Low

Da

ta N

ee

ds

High Realism

Cone

Depth = f(Distance)

13

Page 14: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

Methods

• Accuracy Assessment

– Bathymetry data

Partridge Lake

Bathymetry Data

– Bathymetry data

• NH DES for 132 lakes

– Created TIN for each lake

– Calculated volumes

• Cone v TIN

• GIS Method v TIN

14

Page 15: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

Results - Volume Error Analysis

MethodMethod RMSDRMSD MDMD MADMAD P(Better)P(Better)

GIS GIS –– All All LakesLakes

3,287,3603,287,360 86228622 200734200734 0.590.59

Cone Cone –– All All LakesLakes

6,975,7406,975,740 608967608967 225502225502 0.410.41

15

Page 16: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

Estimating Maximum Lake Depth:

Question #2• Method in Question #1 assumes a measurement of

maximum lake depth is available

• Is it possible to create a reasonable estimate of lake

depth from the topography surrounding a lake?

Citation: Hollister, J. W., W.B. Milstead, M.A. Urrutia (2011). Predicting

Maximum Lake Depth from Surrounding Topography. PLoS ONE

6(9): e25764. doi:10.1371/journal.pone.0025764. Contribution

no. AED-11-013

16

Page 17: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

Predicting Maximum Lake Depth

• Select surrounding

topography

Determine median slope• Determine median slope

• Determine maximum

distance in lake

• Depth

– Max.Dist * Median.Slope

17

Page 18: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

Assessing the method

• Compare to measured data– National Lakes Assessment Data

– Web reported depths

Over predicts• Over predicts

• Fit NLA model

• Use NLA model to correct

18

Page 19: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

Assessing the method

• Compare to measured data– National Lakes Assessment Data

– Web reported depths

Over predicts• Over predicts

• Fit NLA model

• Use NLA model to correct

19

Page 20: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

Web Depth Comparisons

20

Page 21: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

Linked Data and Lakes

• Starting a pilot– Convert Lake Morphometry data to Linked Data

– Working with Mike Pendelton and David Smith

• Why?• Why?

– Education (mine)

– Interest

– Potential for many more datasets

• NLA

• Wildlife

• Merganser

21

Page 22: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

R

• Overview

– Background

Extending R– Extending R

– Linked Data and Semantic Web

• How I use it?

22

Page 23: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

R: Overview

• What is R?• “Language and environment for statistical

computing and graphics”

• Similar to the S language• Similar to the S language

• Extensible

• Why use R?• FREE!

• Publication quality graphics

• Get only the results you want (i.e. no reams of

output)Examples borrowed from: http://addictedtor.free.fr/graphiques/

23

Page 24: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

R: Overview

• Why is it named R?

• First name of the original authors

• Ross Ihaka and Robert Gentleman

• Play on the S+ language• Play on the S+ language

• S- as a opposed to S+

• Who wrote/writes/contributes to R?

• R Core Team

• Anybody

• Is it reliable?

• Open source = Peer review24

Page 25: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

R: Overview• Expanding R:

– Packages• Contributed collections of analytical tools

• Extends the scope and utility of R

• Currently 3978 packages available• Currently 3978 packages available

– Task Views• Currently 30 (e.g. Environemetrics, HighPerformanceComputing,

Spatial, etc.)

• None (yet) related directly to Semantic Web and Linked Data…

– R, not just for stats anymore• GIS (sp, rgdal, raster, rgeos, …)

• Web and Related (Brew (+ rApache module), XML, rCurl, …)

• Linked Data Packages: (SPARQL , rrdf) 25

Page 26: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

More Information– R Install

• http://cran.r-project.org/doc/manuals/R-intro.pdf

– Websites

• R - http://www.r-project.org/

• CRAN - http://cran.r-project.org/

• simpleR - http://cran.r-project.org/doc/contrib/Verzani-SimpleR.pdf• simpleR - http://cran.r-project.org/doc/contrib/Verzani-SimpleR.pdf

– Listservs

• R-help - https://stat.ethz.ch/mailman/listinfo/r-help

• R-sig-ecology - https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

– Books

• Venables and Ripley, Modern Applied Statistics with S

• Dalgaard, Introductory Statistics with R

• Many, many Others!

– Journal

• Journal of Statistical Software - http://www.jstatsoft.org/

26

Page 27: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

R: How I use It?

• Basic Scripts and New Functions

• Demo

27

Page 28: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

R: How I use It?

• Managing scientific workflow

– From R

From Python– From Python

28

Page 29: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

R Script

29

Page 30: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

Python Script

30

Page 31: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

R: How I use It?• R for

Web Apps• Check out ggplot2

• http://yeroon.net/ggplot2• http://yeroon.net/ggplot2

31

Page 32: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

R: How I use It?• R for GIS

• Demo

32

Page 33: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

• NLA Field Crews, Collaborators, & Analysis Team

• Richard Moore, USGS, MRB1 SPARROW

Acknowledgements

Richard Moore, USGS, MRB1 SPARROW

• Hilary Snook, Toby Stover & Carol Elliot, EPA, NELP

• John Kiddon, Jane Copeland, & the AED Aquatic Ecosystem Services Research Group

• Mike Pendleton, David Smith and George Thomas

33

Page 34: Who am I and why am I here: Lakes ... · Semantic Web/Linked Data Users Community ... – Introduced to Ecoinformatics • Current Position – Research Ecologist with US EPA

Safe and Sustainable Water Resources Research ProgramSafe and Sustainable Water Resources

Questions?

34