Global Modeling of Biodiversity and Climate Change
Falk Huettmann et al., EWHALE lab, Biology and Wildlife Department, Institute of Arctic Biology, University of Alaska Fairbanks, Fairbanks, Alaska
Scientific Landmines ?!
Data Mining
Machine Learning
Predictions
Metadata
Data Sharing (online)
Economic Growth problem
Diseases (Influenza)
Spatial/Geographic Information Systems (GIS) and…
Sustainability Management
Central to our work: Predictions in Space and Time, e.g. done best with Machine Learning
- quantitative
- spatial
- statistical interactions included
- one formula
- one algorithm
- repeatable
- testable
- transparent
- open access
How GIS and machine learning connect… A Work Flow
ArcGIS 10.2
Salford, R, GME, Python, etc.
Tree/CART - Family
Binary recursive partitioning
Leo Breiman 1984, and others
[Figure: example tree with YES/NO branches; splits on Temp > 15, Precip < 100, Temp < 5]
PURITY METRIC OF NODES
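A single step of binary recursive partitioning can be sketched as follows. This is a minimal illustration only: it assumes Gini impurity as the purity metric (the slide does not name one), and the toy temperature data and the function names `gini` and `best_split` are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 for a pure node, 0.5 for an even two-class mix."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best 'value > threshold' split."""
    best = (None, gini(labels))  # start from the unsplit node
    candidates = sorted(set(values))
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2.0  # midpoint between neighbouring values
        yes = [y for x, y in zip(values, labels) if x > threshold]
        no = [y for x, y in zip(values, labels) if x <= threshold]
        score = (len(yes) * gini(yes) + len(no) * gini(no)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical toy data: temperature (deg C) and species presence (1) / absence (0).
temperature = [2, 4, 8, 12, 16, 18, 21, 24]
presence = [0, 0, 0, 0, 1, 1, 1, 1]

threshold, impurity = best_split(temperature, presence)
print(threshold, impurity)
```

A full CART applies the same search recursively to the YES and NO child nodes, over all predictors, until a stopping rule is met.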
TreeNet (~a sequence of CARTs), ‘boosting’
Variable    Score
LDUSE       100.00
TAIR_AUG     97.62
HYDRO        94.35
DEM          94.01
PREC_AUG     90.17
POP          82.54
HMFPT        81.46
+ + + +
Importance Value
ROC curves for accuracy tests
e.g. correctly predicted absence approx. 97%
e.g. correctly predicted presence approx. 92%
=>Apply to a dataset for predictions
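The two accuracy figures above read like specificity (correctly predicted absence) and sensitivity (correctly predicted presence) at one threshold, with the ROC curve summarising all thresholds; that reading is an interpretation, not something the slide states. A minimal sketch with hypothetical observations and scores:

```python
def sensitivity_specificity(observed, scores, threshold):
    """Fraction of presences (1) and absences (0) classified correctly at one threshold."""
    tp = sum(1 for o, s in zip(observed, scores) if o == 1 and s >= threshold)
    tn = sum(1 for o, s in zip(observed, scores) if o == 0 and s < threshold)
    n_presence = sum(observed)
    n_absence = len(observed) - n_presence
    return tp / n_presence, tn / n_absence

def auc(observed, scores):
    """Area under the ROC curve via pairwise ranking (ties count one half)."""
    pres = [s for o, s in zip(observed, scores) if o == 1]
    absn = [s for o, s in zip(observed, scores) if o == 0]
    wins = sum((1.0 if p > a else (0.5 if p == a else 0.0)) for p in pres for a in absn)
    return wins / (len(pres) * len(absn))

# Hypothetical assessment data: observed presence/absence and predicted index of occurrence.
observed = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

sens, spec = sensitivity_specificity(observed, scores, threshold=0.5)
roc_area = auc(observed, scores)
print(sens, spec, roc_area)
```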
The more nodes…the more detail…the slower
ROC
Each tree explains the remaining variance until the end…
Difficult to interpret but good graphs
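The idea of a sequence of small trees, each fitted to what the previous ones left unexplained, can be sketched as below. This is not TreeNet's implementation: it is a generic least-squares boosting sketch with one-split trees (stumps), and the toy data, the learning rate and all function names are assumptions for illustration.

```python
def fit_stump(x, residuals):
    """Best one-split regression tree: (threshold, left_mean, right_mean)."""
    best = None
    xs = sorted(set(x))
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2.0
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]

def boost(x, y, n_trees=50, learning_rate=0.1):
    """Fit stumps in sequence, each one to the residuals left by the previous ones."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps, errors = [], []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        t, lm, rm = fit_stump(x, residuals)
        stumps.append((t, lm, rm))
        # Add a damped version of the new stump to the running prediction.
        pred = [pi + learning_rate * (lm if xi <= t else rm) for xi, pi in zip(x, pred)]
        errors.append(sum((yi - pi) ** 2 for yi, pi in zip(y, pred)))
    return base, stumps, errors

# Hypothetical toy data: one predictor and one continuous response.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.0, 1.2, 0.9, 3.1, 3.0, 2.9, 5.2, 5.0]

base, stumps, errors = boost(x, y)
print(round(errors[0], 3), round(errors[-1], 3))
```

The squared error shrinks as trees are added, which also shows the trade-off noted above: more trees give more detail but take longer to fit.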
[Figure: data table with predictor columns DEM, Slope, Aspect, Climate, Land-cover and rows 1 to 5. Random set 1 and Random set 2 each combine a random set of rows (cases) with a random set of columns (predictors).]
RandomForest (Prasad et al. 2006, Furlanello et al. 2003, Breiman 2001)
Bagging: Optimization based on In-Bag, Out-of-Bag samples
In RF no pruning => Difficult to overfit (robust)
Boosting & Bagging algorithms
Difficult to interpret but good graphs
Handles ‘noise’, interactions and categorical data fine!
Average final tree from e.g. >2000 trees, done by VOTING
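Bagging, random column subsets, out-of-bag checking and voting can be sketched in plain Python as below. This is a deliberately small stand-in, not the RandomForest implementation: each "tree" is a one-split stump rather than a fully grown unpruned tree, only 200 trees are used, and the toy rows (temperature, precipitation, elevation) and all function names are hypothetical.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    """Most frequent class label."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(rows, labels, columns):
    """Best one-split tree over the candidate columns: (column, threshold, left, right)."""
    best = None
    for c in columns:
        values = sorted(set(r[c] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[c] <= t]
            right = [y for r, y in zip(rows, labels) if r[c] > t]
            score = len(left) * gini(left) + len(right) * gini(right)
            if best is None or score < best[0]:
                best = (score, c, t, majority(left), majority(right))
    if best is None:  # all in-bag rows identical: fall back to the majority class
        m = majority(labels)
        return (columns[0], float("inf"), m, m)
    return best[1:]

def predict_stump(stump, row):
    c, t, left_label, right_label = stump
    return left_label if row[c] <= t else right_label

def random_forest(rows, labels, n_trees=200, n_columns=2, seed=0):
    rng = random.Random(seed)
    n = len(rows)
    forest, oob_votes = [], [Counter() for _ in range(n)]
    for _ in range(n_trees):
        in_bag = [rng.randrange(n) for _ in range(n)]         # random set of rows (cases)
        columns = rng.sample(range(len(rows[0])), n_columns)  # random set of columns
        stump = fit_stump([rows[i] for i in in_bag], [labels[i] for i in in_bag], columns)
        forest.append(stump)
        for i in set(range(n)) - set(in_bag):                 # out-of-bag cases
            oob_votes[i][predict_stump(stump, rows[i])] += 1
    voted = [(v.most_common(1)[0][0], y) for v, y in zip(oob_votes, labels) if v]
    oob_accuracy = sum(p == y for p, y in voted) / len(voted)
    return forest, oob_accuracy

def predict(forest, row):
    """Final prediction by majority vote over all trees."""
    return majority([predict_stump(s, row) for s in forest])

# Hypothetical toy data: [temperature, precipitation, elevation] -> presence (1) / absence (0).
rows = [[2, 40, 900], [4, 55, 850], [5, 60, 700], [7, 90, 800], [8, 50, 300], [9, 65, 750],
        [13, 110, 200], [15, 120, 250], [16, 70, 150], [18, 130, 100], [20, 115, 650], [22, 140, 50]]
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]

forest, oob_accuracy = random_forest(rows, labels)
print(predict(forest, [25, 150, 20]), predict(forest, [0, 30, 1000]), oob_accuracy)
```

The out-of-bag accuracy comes for free from the cases each tree did not see, which is why no separate pruning or hold-out step appears in the sketch.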
Machine Learning example with GIS: Spoon-billed Sandpiper and Predictions (where are the wintering grounds of ca. 1000 highly endangered birds…?)
Engler et al. (in prep)
(winter)
(breeding, Kamchatka)
Data means Metadata and Data Management (specifically for GIS, for science projects, machine learning and for graduate students)
___________Field Season 1_________
Raw Dataset 1 Metadata
Raw Dataset 2 Metadata
Raw Dataset 3 Metadata etc.
Raw Dataset 4 Metadata
Raw Dataset 5 Metadata
http://mercury.ornl.gov/clearinghouse/
=> Digital Publications
___________Field Season 2 & 3_________
A. Baltensperger
Two books by the EWHALE lab re. Predictions and related Philosophies as presented here
Students & Projects of the EWHALE lab
Andy Baltensperger
Shana Losbaugh
Keiko Akasofu Herrick
Katherine Miller
Sue Hazlett
Tim Mullet
Betsy Young
Brian Young
Zach Meyers
Ben Best
Imme Rutzen
Michal Lindgren
Students & Projects of the EWHALE lab
Laszlo Koever (Uni Debrecen, Hungary)
Dmitry Korobitsyn (Uni Archangelsk, Russia)
Cynthia Resendiz (Mexico)
Moritz Schmid (Uni Goettingen, Germany)
Students & Projects of the EWHALE lab: Visitors
David Lieske (Mount Allison, Canada)
Some Examples of what the EWHALE lab does, internationally
(~how Falk spent his sabbatical and time)
Ocean View I: A Global Benthos Model…(RandomForest Predictions)
Wei et al. (2011). Global Patterns and Predictions of Seafloor Biomass using Random Forests. PLoS ONE 5(12): e15323.
Spatial Predictions of Arctic (Pelagic) Seabirds
What Data are used: Pelagic Seabird Data ?!
Relevance of Arctic Specimen Collections vs.
(Polarstern)
Public data
+ High Quality Content
+ Metadata ?!
What Environmental Data were Used (Listed in no order)
Spatial Predictions of Arctic (Pelagic) Seabirds
Public Sources & Availability
Huettmann & Hazlett (2009) for 50 layers
1. Distance to ice edge
2. Sea temperature at 10m depth
3. Sea temperature at 0m depth
4. Phosphate concentration at 10m depth
5. Silicate concentration at surface
6. Phosphate concentration at surface
7. Salinity at 20m depth
8. Distance to Settlements (!)
9. Salinity at surface
10. Silicate at 10m depth
11. Discharge from rivers
12. Distance to shelf edge
13. Sea ice thickness
14. Nitrate concentration at surface
15. DMS (Di-Methyl Sulfide) at surface (G. Humphries in prep.)
16. Nitrate concentration at 10m depth
17. Bathymetric slope
Spatial Predictions of Arctic (Pelagic) Seabirds
What it looks like: Training and Assessment Data
Presence (blue) vs. Random (red) (Pseudo-absence)
+ Env. Data
… Algorithm
=> Predictions
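The "presence vs. random (pseudo-absence) + environmental data" step can be sketched as a small table-building routine. Everything specific here is an assumption for illustration: the coordinates, the bounding box, the function names, and `toy_environment`, which stands in for a real lookup against GIS layers.

```python
import random

def sample_pseudo_absences(presences, bounds, n, seed=0):
    """Draw n random background points inside bounds = (min_lon, max_lon, min_lat, max_lat)."""
    rng = random.Random(seed)
    min_lon, max_lon, min_lat, max_lat = bounds
    occupied = set(presences)
    points = []
    while len(points) < n:
        p = (round(rng.uniform(min_lon, max_lon), 2), round(rng.uniform(min_lat, max_lat), 2))
        if p not in occupied:  # never reuse a presence location as a pseudo-absence
            points.append(p)
    return points

def build_training_table(presences, pseudo_absences, env_lookup):
    """Attach environmental values to every point; presence = 1, pseudo-absence = 0."""
    table = []
    for label, points in ((1, presences), (0, pseudo_absences)):
        for lon, lat in points:
            table.append({"lon": lon, "lat": lat, "presence": label, **env_lookup(lon, lat)})
    return table

def toy_environment(lon, lat):
    """Hypothetical stand-in for sampling GIS layers at a point."""
    return {
        "sea_temp_0m": round(10.0 - 0.2 * (lat - 60.0), 2),
        "dist_ice_edge_km": round(abs(80.0 - lat) * 111.0, 1),
    }

# Hypothetical presence records (lon, lat) and study-area bounding box.
presences = [(-150.0, 71.5), (-148.2, 72.1), (-140.5, 74.0)]
bounds = (-170.0, -130.0, 65.0, 80.0)

pseudo_absences = sample_pseudo_absences(presences, bounds, n=6)
table = build_training_table(presences, pseudo_absences, toy_environment)
print(len(table), table[0])
```

The resulting table (one row per point, one column per environmental layer, plus the 0/1 response) is the form a tree-based algorithm takes as input.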
Spatial Predictions of Arctic (Pelagic) Seabirds
What it looks like: Training and Assessment Data
Presence (blue) vs. Random (red) (Pseudo-absence)
Assessment (green; telemetry O. Gilg)
+ Env. Data
… Algorithm
=> Predictions
Spatial Predictions of Arctic (Pelagic) Seabirds
What it looks like: Predictions
Prediction Surface (draft)
Legend: Red/Yellow = Presence
Light blue: Weak presence
Dark blue: Pseudo-absence
Spatial Predictions of Arctic (Pelagic) Seabirds
What it looks like: Predictions and their data
Prediction Surface (draft)
Legend: Red/Yellow = Presence
Light blue: Weak presence
Dark blue: Pseudo-absence
Green: Assessment data (O. Gilg)
Circumpolar Arctic: 27 Seabird Open Access Predictions
Ivory Gull Ross’s Gull Black-legged Kittiwake
Tufted Puffin Horned Puffin Northern Fulmar
…add up all predictions…
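Adding up the per-species predictions amounts to a cell-by-cell sum over grids that share one extent and resolution. The sketch below assumes binary presence/absence grids; the slide does not say whether binary or continuous surfaces were summed, and the three tiny grids are made-up values, not model output.

```python
def stack_predictions(surfaces):
    """Cell-by-cell sum of equally sized prediction grids (each a list of rows)."""
    n_rows, n_cols = len(surfaces[0]), len(surfaces[0][0])
    for s in surfaces:
        if len(s) != n_rows or any(len(row) != n_cols for row in s):
            raise ValueError("all prediction surfaces must share one grid")
    return [[sum(s[r][c] for s in surfaces) for c in range(n_cols)] for r in range(n_rows)]

# Hypothetical 2 x 3 presence (1) / absence (0) grids for three species.
ivory_gull = [[1, 0, 0], [1, 1, 0]]
rosss_gull = [[0, 0, 1], [1, 0, 0]]
kittiwake = [[1, 1, 1], [1, 0, 0]]

richness = stack_predictions([ivory_gull, rosss_gull, kittiwake])
print(richness)
```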
Huettmann et al. (2011)
Circumpolar Arctic: Putting Models to Use
vs.
=>We are running out of space and time in the Arctic (and anywhere else)
Seabird
Circumpolar Arctic: Alaskan Crab Ensemble Model => Open Access (Raw Data + Model) in a highly commercial setting!
Compiled Raw Crab Data Predicted Crab Pres/Abs (and Abundance)
Snow Crab off Alaska (Hardy et al. 2011)
Circumpolar Arctic: Marine Protected Areas (MPA) and Biodiversity
Huettmann and Hazlett (2010)
MARXAN optimization based on over 60 GIS layers
=> Over 20 GIS data layers for each Pole (Arctic and Antarctic)
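The kind of problem MARXAN addresses (meet conservation targets for each feature layer at low total cost) can be illustrated with a small greedy heuristic. This is a stand-in only: Marxan itself searches with simulated annealing, and the planning units, costs, feature amounts, targets and the function name `greedy_reserve` below are all hypothetical.

```python
def greedy_reserve(units, targets):
    """Pick planning units until every feature target is met.

    units:   {name: {"cost": float, "features": {feature: amount}}}
    targets: {feature: required amount}
    """
    remaining = dict(targets)
    candidates = dict(units)
    selected = []

    def gain(name):
        # Still-needed feature amount this unit would cover, per unit of cost.
        feats = candidates[name]["features"]
        covered = sum(min(feats.get(f, 0), need) for f, need in remaining.items())
        return covered / candidates[name]["cost"]

    while any(need > 0 for need in remaining.values()):
        if not candidates:
            raise ValueError("targets cannot be met with the available planning units")
        best = max(sorted(candidates), key=gain)
        if gain(best) == 0:
            raise ValueError("targets cannot be met with the available planning units")
        for f, amount in candidates[best]["features"].items():
            if f in remaining:
                remaining[f] = max(0, remaining[f] - amount)
        selected.append(best)
        del candidates[best]
    return selected

# Hypothetical planning units with costs and the amount of each feature they hold.
units = {
    "A": {"cost": 4, "features": {"seabirds": 5, "sea_ice": 1}},
    "B": {"cost": 2, "features": {"seabirds": 2, "sea_ice": 4}},
    "C": {"cost": 3, "features": {"seabirds": 1, "sea_ice": 1}},
    "D": {"cost": 2, "features": {"seabirds": 1}},
}
targets = {"seabirds": 5, "sea_ice": 4}

selected = greedy_reserve(units, targets)
total_cost = sum(units[name]["cost"] for name in selected)
print(selected, total_cost)
```

A greedy pass is not guaranteed to find the cheapest solution, which is one reason tools of this kind run many randomized searches and report how often each unit is selected.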
Antarctica: MPA by WWF-Australia for the Scientific Committee on Antarctic Research (SCAR)
WWF-Australia, SCAR 2012
What is a Soundscape?
• Biological Sounds – Biophony
• Geophysical Sounds – Geophony
• Anthropogenic Sounds – Anthrophony
Mullet et al. (in prep)
Model-Predicting Sound (‘Soundscapes’)
Models based on:
- 7 permanent sound stations
- Stratified according to expected sounds
- Rotate 6 sound stations
– Input GPS coordinates and related sound data into TreeNet modeling software
– Include environmental and human-related covariates (e.g., vegetation, distance to roads, aspect)
– Extrapolate sound levels and sound source data to rest of Refuge
Mullet et al. (in prep)
2008
2099
Temperature (August and January) (SNAP UAF data)
Regionalized IPCC models, e.g. Alaska
Murphy et al. (2010)
Alaskan Caribou: Summer & winter ranges 2008 & 2099
[Figure panels: Summer Range and Winter Range, 2008 vs. 2099]
Murphy et al. (2010); model in RF with IPCC
RandomForest: Supervised and Unsupervised Classification
Supervised Classification:
- Multiple Regression (classification or continuous)
- Multiple Response, e.g. YAIMPUTE
Unsupervised Classification:
1. Proximity Matrix via Bagging/Voting (RF)
2. Similarity Matrix
3. e.g. Regular Clustering (mclust, PAM)
4. Visualize Result
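The proximity-matrix step of the unsupervised workflow can be sketched as follows: two cases are "close" when they end up in the same terminal node in many trees. The sketch assumes the leaf assignments from an already fitted forest are available as a plain table; `leaf_ids` and its values are hypothetical, and the clustering step (e.g. mclust or PAM) is not shown.

```python
def proximity_matrix(leaf_ids):
    """leaf_ids[t][i] is the terminal node that case i reaches in tree t.

    Proximity of cases i and j = fraction of trees in which they share a terminal node.
    """
    n_trees, n_cases = len(leaf_ids), len(leaf_ids[0])
    counts = [[0] * n_cases for _ in range(n_cases)]
    for leaves in leaf_ids:
        for i in range(n_cases):
            for j in range(n_cases):
                if leaves[i] == leaves[j]:
                    counts[i][j] += 1
    return [[c / n_trees for c in row] for row in counts]

def dissimilarity_matrix(prox):
    """1 - proximity, the usual input for a distance-based clustering method."""
    return [[1.0 - p for p in row] for row in prox]

# Hypothetical leaf assignments: 4 trees (rows) x 4 cases (columns).
leaf_ids = [
    [0, 0, 1, 1],
    [2, 2, 2, 3],
    [0, 1, 1, 1],
    [5, 5, 6, 6],
]

prox = proximity_matrix(leaf_ids)
diss = dissimilarity_matrix(prox)
print(prox[0][1], prox[0][3], diss[0][1])
```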
RandomForest
Now, a topical shift to the Circumpolar Arctic and Zooplankton Forecasting until 2100
Metridia longa showed the highest increase in the copepodite life stage from 2010 to 2100.
Credit: M. Schmid et al.
Calanus hyperboreus showed the highest change in the predicted relative index of depth from 2010 to 2100.
Credit: M. Schmid et al.
GMBA Case Study: Himalaya Uplands Plant Database
Bernhard Dickoré et al.
(red: sampling points)
+ FGDC NBII/ISO Metadata
A High Priority Ethnomedicinal Plant in Nepal
Dactylorhiza hatagirea (Marsh Orchids)
81 “points”
Ethnobotanical Use: Tubers are used as a nervine tonic and aphrodisiac. They are also used to treat cuts, wounds, cough and anemia.
Prediction of a High Priority Ethnomedicinal Plant in Nepal
Dactylorhiza hatagirea (Marsh Orchids)
Note: Terrestrial areas of Arctic & Antarctic are not included, yet
MARXAN Solution for the Three Poles: 50% Protection Scenario (birds, glaciers/ice and freezing temperatures)
Legend: Selection Frequency! (the darker, the more frequently selected)
Global AI model (Ecological Niche) based on K. Herrick-Akasofu, F. Huettmann, J. Runstadler et al. (unpublished; forthcoming thesis chapter)
Avian Influenza (AI) Prediction globally… (all based on Machine Learning!)
Acknowledgements
L. Strecker, all co-authors, all EWHALE lab students, NCEAS, University of Alaska Fairbanks, D. Steinberg (Salford Systems Ltd), COML, CAML, ArcOD, GMBA, IPY, A.W. Diamond, and many colleagues worldwide (a 20-year summary...) AND HUGE THANKS TO SALFORD SYSTEMS & Dan Steinberg’s team