two approaches to species distribution modeling and how ... barrio… · two approaches to species...

27
Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios November, 1st, 2017 National Commission for the Knowledge and Use of Biodiversity (CONABIO)

Upload: others

Post on 17-Jul-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

Two approaches to species distributionmodeling and how the climate change isincorporated

Juan M. BarriosNovember, 1st, 2017

National Commission for the Knowledge and Use of Biodiversity (CONABIO)

Page 2: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

Outline

About CONABIO

Species Distribution Modeling

Let’s talk about data

1

Page 3: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

About CONABIO

Page 4: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

About CONABIO

• CONABIO was founded on 1996. Since then it has beendevoted to organize the existing information of the biotaof Mexico.

• Most of that information, at a time, were recorded byacademic institutions and some private people. Therewere some missing opportunities to analize the biota databecause there was not a consolidate data facility.

• That primary goal give as a result the birth of the NationalBiodiversity Information System (SNIB).

2

Page 5: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

SNIB

• There are approximately8.8 M of curated datapoints

• Observations are as old as1579, those registries areinferred by literature.

• A big part of CONABIOpersonnel and currentprojects are responsible tointegrate and curate newinformation.

● ● ●● ● ● ●●●● ●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●

●●●●●●

●●●●●●●●

●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

0

2500000

5000000

7500000

1600 1700 1800 1900 2000

year

obse

rvat

ions

by year

Cummultaive species observation counts in SNIB

3

Page 6: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

SNIB is more than species registries

There are about 13,000 digital maps of various subjects:vegetation maps, species distribution, satellite imagery, soiltypes, burning risk maps, water temperature maps, and muchmore. Most of these can be currently access onhttp://www.conabio.gob.mx/informacion/gisLater this year we are going to release a updated tool toexplore and download all these information.

4

Page 7: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

There are many more projects on CONABIO

Some of them are...

• Genetic diversity project.• MADMEX-REDD+, project to monitor deforestation andforest degradation.

• Enviromental suitability protection.• Algal blooming and sea biodiversity programs.• Invasive species studies.• Mangrove monitoring, phenotypical data and chemicalvariables.

• Biodiversity Monitoring Network

A common goal:

Try to understand the biological processes with data

5

Page 8: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

There are many more projects on CONABIO

Some of them are...

• Genetic diversity project.• MADMEX-REDD+, project to monitor deforestation andforest degradation.

• Enviromental suitability protection.• Algal blooming and sea biodiversity programs.• Invasive species studies.• Mangrove monitoring, phenotypical data and chemicalvariables.

• Biodiversity Monitoring Network

A common goal:

Try to understand the biological processes with data

5

Page 9: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

Species Distribution Modeling

Page 10: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

What is the Species Distribution Modeling (SDM)?

• The goal is to determine where a given specie is present.• To achieve this, most studies look a set of meaningfulclimate and topographic variables to find suitabilyconditions based on current species observations(Ecological Niche Modeling).

• But the Species Distribution Modeling also shouldconsider if a suitable niche can really be a habitat for aparticular specie. [Peterson and Soberón, 2012]

6

Page 11: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

Classical approach: MaxEnt, problem statment

Given an area of interest D where some environmentalvariables are defined x1, x2, . . . , xm, and a set of sitesz1, z2, . . . , zn where individuals of a particular specie wereobserved. We intend to estimate the range of specie habitat.

7

Page 12: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

MaxEnt: How to do that?

Then it is assumed that the sites {zi} where selectedindependently from a unknwon probability measure p on D.

A principle of maximum entropy states that this probability isuniform over D. The uniform distribution is the ”most random”of all.

ConstraintsWe have to consider that the specie might prefer someenvironmental features.

Then one needs to find the probability distribution p̂ thatmaximize the entropy subject to: the expectation of thefeatures x(z) under p̂ matches the sample means of thosefeatures,

1n∑

x(zi) =∫Dx(z)p̂(z)d z = Ep̂x(z).

8

Page 13: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

MaxEnt: How to do that?

Then it is assumed that the sites {zi} where selectedindependently from a unknwon probability measure p on D.

A principle of maximum entropy states that this probability isuniform over D. The uniform distribution is the ”most random”of all.

ConstraintsWe have to consider that the specie might prefer someenvironmental features.

Then one needs to find the probability distribution p̂ thatmaximize the entropy subject to: the expectation of thefeatures x(z) under p̂ matches the sample means of thosefeatures,

1n∑

x(zi) =∫Dx(z)p̂(z)d z = Ep̂x(z). 8

Page 14: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

MaxEnt

This criteria is equivalent to maximizing the likelihood of theparametric model

p(z) = eλx(z)∫D eλx(u)du.

This likelihood is the same as a IPP with log-linear ratefunction.

MaxEnt is a software for modeling species distribution frompresence-only data1 using this approach.

1https://biodiversityinformatics.amnh.org/open_source/maxent/

9

Page 15: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

Practical application

• Clean data• Select the spatial region basedon biological meaningful way

• Fit the MaxEnt models• Given the fitted model as a map,work with some experts in thefield to produce moremeaningful output.

10

Page 16: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

What about the climate change?

• From the model point of view, we only have a new set ofvalues for the environmental variables. So just evaluatethat new values on the fitted model.

• It rise a new problem. What happen with values neverobserved?

• Just delete this regions.• Evaluate them and deal in some manner with thoseregions.

11

Page 17: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

A second approach

Now we try to estimate the probability of an observation givena set of features p(· | x). Instead, we consider the log odds,

S(c | x) = ln

(p(c | x)p(c̄ | x)

)Using the Bayes theorem and assuming conditionalindependence of features given specie c, p(x | c) =

∏i p(xi | c),

we haveS(c | x) =

∑iS(c | xi) + ln

(p(c)p(c̄)

).

12

Page 18: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

Marginal scores

Then we estimate the marginal log oddsS(c | xi) = ln

(p(xi | c)p(xi | c̄)

)as

p(xi | c) = |{c ∩ xi}|/|{c}|p(xi | c̄) = |{c̄ ∩ xi}|/(N− |{c}|)

in order to smooth these estimations, when |{c ∩ xi}| = 0 or|{c}| = N, we use a Laplace smoothing.

13

Page 19: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

Some discussion

• This approach only looks for co-occurrences on a gridedspace.

• We can consider a space-time grid to incorporate timeinto the model.

• Hence it is easy to incoporate some time-dependentclimate variability.

• We can also incorporate different data like other taxa.occurences.

14

Page 20: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

SPECIES: Plataform to explore ecological data

We developed a platformhttp://species.conabio.gob.mx/candidate/

• Fast prototyping• Create reproducible experiments,You can ask for a unique linkwith your setup

• Incorporate some performancemetrics and statistics

15

Page 21: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

Some technical advantages of SPECIES

• There is an API that leverage all the calculations of theapplication... So You can use it with Python and R.

• The API also have some endpoint that gives you cleandata to use on another workflow.

• Our database design was robust enough to perform crossvalidation in real time (2 min or less).

• We plan to release all the parts of the application as anOpen Source.

16

Page 22: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

Let’s talk about data

Page 23: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

Species data is messy

• Species can be misidentified.• We can have atypical datapoints... In Mexico we haveobservations of lions but there’sno lion population in Mexico.

• Species taxonomicalclassificaction is variable overthe time.

• There is bias: taxon groups biasand spatial sample bias.

0e+00

2e+06

4e+06

6e+06

1900 1925 1950 1975 2000

year

obse

rvat

ions

classAmphibia

Aves

Liliopsida

Magnoliopsida

Mammalia

Reptilia

by year

Some representative taxonomical classes

17

Page 24: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

Joint work with

SPECIES team(UNAM-CONABIO)

C. Stephens (C3-UNAM, Phys)C. González (UAM-Lerma, Bio)R. Sierra (CONABIO, Math)J.C. Salazar (CONABIO, Eng)E. Rovredo (CONABIO, Bio)

CONABIO ecosystemsevalutation team

A. Cuervo (UNAM-CONABIO, Bio)W. Tobon (CONABIO, Bio)D. Ramirez (CONABIO, Bio)J. Lopez (CONABIO, Bio)T. Urquiza (CONABIO, Bio)

18

Page 25: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

Questions?

18

Page 26: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

References i

References

W. Fithian and T. Hastie. Finite-sample equivalence instatistical models for presence-only data. The Annals ofApplied Statistics, 7(4):1917, 2013.

C. González-Salazar, C. R. Stephens, and P. A. Marquet.Comparing the relative contributions of biotic and abioticfactors as mediators of species’ distributions. EcologicalModelling, 248:57–70, 2013.

19

Page 27: Two approaches to species distribution modeling and how ... Barrio… · Two approaches to species distribution modeling and how the climate change is incorporated Juan M. Barrios

References ii

A. T. Peterson and J. Soberón. Species distribution modelingand ecological niche modeling: getting the concepts right.Natureza & Conservação, 10(2):102–107, 2012.

S. J. Phillips, M. Dudík, and R. E. Schapire. A maximum entropyapproach to species distribution modeling. In Proceedingsof the 21st international conference on Machine learning,page 83. ACM, 2004.

J. Sarukhán and R. Jiménez. Generating intelligence fordecision making and sustainable use of natural capital inMexico. Current Opinion in Environmental Sustainability, 19:153–159, 2016.

20