Maxent interface

Post on 18-Dec-2015


Page 1: Maxent interface. Maximum Entropy (Maxent) Deterministic Precise mathematical definition Continuous and categorical environmental data Continuous output

Maxent interface


Maximum Entropy (Maxent)

• Deterministic

• Precise mathematical definition

• Continuous and categorical environmental data

• Continuous output


Maxent can be downloaded at: http://www.cs.princeton.edu/~schapire/maxent/

Note: when downloading Maxent, make sure that maxent.jar is saved as is (as a .jar file), and not as a .zip file.


Input data:

1. Samples: a .csv file with three fields (species label, longitude, latitude) and a header as the first line. A single file can contain multiple species.

2. Environmental layers*: ASCII files (ESRI or DIVA-GIS formats) grouped in one folder. No mask file is needed.

* It is also possible to use the SWD (samples-with-data) format: a .csv file containing the values of the environmental variables at each occurrence point.
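As a rough illustration of the samples file described above, the following sketch writes a three-field .csv with a header line. The header names, the species labels and the coordinates are invented for the example, and the range checks assume coordinates in decimal degrees, which the slides do not state.

```python
import csv
import io


def write_samples_csv(records, handle):
    """Write (species, longitude, latitude) records with a header as first line."""
    writer = csv.writer(handle, lineterminator="\n")
    # Header names are an assumption; the slides only require that a header exists.
    writer.writerow(["species", "longitude", "latitude"])
    for species, lon, lat in records:
        # Assumed: geographic coordinates in decimal degrees.
        if not -180.0 <= lon <= 180.0:
            raise ValueError(f"longitude out of range: {lon}")
        if not -90.0 <= lat <= 90.0:
            raise ValueError(f"latitude out of range: {lat}")
        writer.writerow([species, lon, lat])


# Hypothetical records; several species can share one file.
records = [
    ("species_a", -65.4, -10.38),
    ("species_a", -63.67, -17.45),
    ("species_b", -74.3, 4.58),
]
buffer = io.StringIO()
write_samples_csv(records, buffer)
samples_text = buffer.getvalue()
```

Writing to an in-memory buffer keeps the sketch self-contained; passing an open file handle instead would produce the .csv on disk.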


Classes of features:

1. Linear* → variable itself

2. Quadratic → square of variable

3. Product → product of two variables

4. Threshold → binary transformation (0, 1) of a continuous variable using a threshold

5. Hinge → like a linear feature, but constant below a threshold

* Categorical data: Binary feature → variable itself
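The feature classes listed above can be sketched as simple functions of the environmental variables. This is a simplified illustration, not Maxent's own code: the function names are made up, the hinge is left unscaled, and whether a value exactly at the threshold counts as above it is an assumption.

```python
def linear(x):
    """Linear feature: the variable itself."""
    return x


def quadratic(x):
    """Quadratic feature: the square of the variable."""
    return x * x


def product(x, y):
    """Product feature: the product of two variables."""
    return x * y


def threshold(x, t):
    """Threshold feature: 1 above the threshold t, 0 otherwise (assumed strict)."""
    return 1.0 if x > t else 0.0


def hinge(x, t):
    """Hinge feature: constant (0) below t, then increasing linearly (unscaled)."""
    return max(0.0, x - t)


def category_indicator(value, category):
    """Binary feature for categorical data: 1 if the cell is in the category."""
    return 1.0 if value == category else 0.0
```

For example, with a threshold of 10, a cell with value 12 gives threshold(12, 10) = 1.0 and hinge(12, 10) = 2.0, while a cell with value 5 gives 0.0 for both.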


What are the features used for?

• To constrain the probability distribution of maximum entropy (the most spread out one), which gives the species’ predicted distribution (output prediction)

Constraints (what each feature class makes the model match in the occurrence data):

Linear* → mean

Quadratic → variance

Product → covariance

Threshold → fit an arbitrary response

Hinge → like linear (but constant below a threshold)

* Categorical data: Binary feature → proportion
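The constraints above can be written compactly. The notation is assumed for illustration and does not come from the slides: p is the predicted distribution over grid cells x, the f_j are the features, and x_1, ..., x_m are the presence points.

```latex
\max_{p}\; H(p) = -\sum_{x} p(x)\,\ln p(x)
\quad\text{subject to}\quad
\sum_{x} p(x)\, f_j(x) \;=\; \frac{1}{m}\sum_{i=1}^{m} f_j(x_i)
\quad\text{for every feature } f_j .
```

With a linear feature the constraint fixes the mean of the variable; adding the quadratic feature also fixes the mean of its square, and therefore the variance; a product feature fixes the mean of the product of two variables, and therefore their covariance. With regularization the equalities are only required to hold approximately.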


The Auto features setting selects the set of feature classes to use based on the number of presence records for the species:

• Linear features if <10 presence points available

• Linear + quadratic if 10-14 presence points available

• Linear + quadratic + hinge if 15-79 presence points available

• All features if 80 or more presence points available

To override this default setting, use the command line flags described in the Help menu. The beta regularization value then has to be adjusted too.
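The selection rule above can be sketched as a small function. The function name is hypothetical, and treating exactly 80 presence points as enough for all feature classes is an assumption about the boundary.

```python
def auto_feature_classes(n_presence):
    """Return the feature classes the auto setting would use (sketch)."""
    classes = ["linear"]                 # always available
    if n_presence >= 10:
        classes.append("quadratic")      # 10 or more presence points
    if n_presence >= 15:
        classes.append("hinge")          # 15 or more presence points
    if n_presence >= 80:
        # Assumed boundary: 80 or more points unlocks the remaining classes.
        classes += ["product", "threshold"]
    return classes
```

For instance, a species with 12 presence records would get linear and quadratic features only, while one with 100 records would get all five classes.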


Outputs:

• SpeciesName.html contains response curves, pictures of the predictions and, if selected, a jackknife test to measure variable importance

• Predictions can be saved as cumulative, logistic, or raw output

• The model can be projected onto different climatic datasets (a different geographic region or a different period of time)

• Output file types available: ASCII grid and DIVA-GIS grid (.mxe is not a grid output)


Examples of response curves (how each environmental variable affects the Maxent model)

Picture of Maxent prediction


Cumulative output (map with occurrence data as points; legend classes: 0, 0 - 0.001, 0.001 - 0.1, 0.1 - 1, 1 - 10, 10 - 50, 50 - 100)

• Used to be the default type

• Each value is the sum of the probabilities of the cells with values at or below that of the grid cell, times 100

Logistic output (map with occurrence data as points; legend classes: 0.000000001 - 0.047, 0.0471 - 0.141, 0.142 - 0.26, 0.261 - 0.401, 0.402 - 0.535, 0.536 - 0.662, 0.663 - 0.926)

• New default type

• Non-linear scale up of raw values

Raw output (map with occurrence data as points; legend classes: 0 - 0.000092498, 0.0001 - 0.00032374, 0.00032375 - 0.00060123, 0.00060124 - 0.00097122, 0.00097123 - 0.0015956, 0.0015957 - 0.0029368, 0.0029369 - 0.0059198)

• Raw values (very small)

• Sum over all cells used for training is 1
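To make the relationship between raw and cumulative values concrete, here is a minimal sketch on a toy grid of four cells. The raw values are invented, every cell is assumed to be a training cell, and "at or below" is used for the comparison; Maxent's own computation may differ in detail.

```python
def raw_to_cumulative(raw):
    """Convert raw values to cumulative values (0-100) on a toy grid."""
    total = sum(raw)
    # Normalize so the values sum to 1 over the cells, as raw output does.
    probs = [value / total for value in raw]
    # Each cell gets the summed probability of all cells at or below it, times 100.
    return [100.0 * sum(q for q in probs if q <= p) for p in probs]


raw = [1.0, 2.0, 3.0, 4.0]  # invented values; normalized inside the function
cumulative = raw_to_cumulative(raw)
```

In this toy grid the best cell gets a cumulative value of about 100 although its raw probability is only 0.4, which shows why a cumulative value should not be read as a probability of occurrence.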

General notes:

• Thresholding (binning) can change the look of the map significantly

• Take care when interpreting the thresholds (e.g. a cumulative value of 80% doesn’t mean that the probability of a species’ occurrence is 80%)

• Grids have floating point values, so they should be imported as floating point grids in GIS software in order to preserve the fine details when classifying cells as suitable
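The point about floating point grids can be illustrated by parsing a tiny ESRI-style ASCII grid by hand. This is a sketch only: the grid contents are invented, and a six-line header including NODATA_value is assumed.

```python
def read_ascii_grid(text):
    """Parse an ESRI-style ASCII grid into a header dict and rows of floats."""
    lines = text.strip().splitlines()
    header = {}
    # Assumed layout: six "key value" header lines, then the data rows.
    for line in lines[:6]:
        key, value = line.split()
        header[key.lower()] = float(value)
    rows = [[float(v) for v in line.split()] for line in lines[6:]]
    return header, rows


# Invented example grid with logistic-like values and one NODATA cell.
example = """ncols 3
nrows 2
xllcorner -80.0
yllcorner -20.0
cellsize 0.5
NODATA_value -9999
0.0312 0.4071 -9999
0.0008 0.9260 0.5350
"""
header, rows = read_ascii_grid(example)

# Importing the same grid as integers would collapse every prediction to 0.
truncated = [[int(v) for v in row] for row in rows]
```

Reading the values as floats keeps 0.0008 and 0.926 distinct, whereas the integer version can no longer separate unsuitable from suitable cells.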


Lastly... the (more) Settings button opens a new window with more settings:

• Random test/train partition of occurrence data for each run; same for background data

• % of occurrence data randomly set aside as test points (default is 0)

• Modifies the regularization value (higher value gives a more spread out distribution); works only if auto features option is off

• Occurrence data from a file (rather than a random sample of training data) is used to test AUC, omission, etc.

• Sampling is assumed to be biased according to a sampling distribution
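Setting aside a percentage of occurrence points for testing can be sketched as follows. This only illustrates the idea of a random test/train partition; the function name, the seed handling and the rounding rule are assumptions, not Maxent's implementation.

```python
import random


def split_occurrences(points, test_percentage=0, seed=0):
    """Randomly set aside a percentage of points as test data (sketch)."""
    rng = random.Random(seed)  # fixed seed gives a repeatable partition
    shuffled = list(points)
    rng.shuffle(shuffled)
    # Assumed rounding rule for the number of test points.
    n_test = round(len(shuffled) * test_percentage / 100.0)
    test = shuffled[:n_test]
    train = shuffled[n_test:]
    return train, test


# Hypothetical occurrence points as (longitude, latitude) pairs.
points = [(-70.0 + i, -10.0 - i) for i in range(20)]
train, test = split_occurrences(points, test_percentage=25, seed=1)
all_train, no_test = split_occurrences(points)  # default of 0 keeps every point
```

Using a different seed for each run gives a different partition, while reusing a seed reproduces the same one.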


To summarize in a few words...

To run Maxent:

• Occurrence data in a .csv file

• Training environmental dataset (no mask needed) containing ASCII grids

• Optional: environmental dataset for projecting models

Maxent outputs (predictions):

• ASCII grids, floating point (not integers)

• Can be raw, logistic, or cumulative predictions

• Additional files, including an .html summary file
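The inputs listed above map onto a command-line run roughly as sketched below. The function and the paths are hypothetical, and the flag names (samplesfile, environmentallayers, outputdirectory, projectionlayers, autorun) are given as remembered from the Maxent tutorial, so they should be checked against the Help menu of the installed version. The sketch only builds the command; it does not run it.

```python
def build_maxent_command(samples_csv, layers_dir, output_dir,
                         projection_dir=None, jar_path="maxent.jar"):
    """Build an argument list for a batch Maxent run (sketch, not executed)."""
    command = [
        "java", "-jar", jar_path,
        f"samplesfile={samples_csv}",          # occurrence data in a .csv file
        f"environmentallayers={layers_dir}",   # folder of training ASCII grids
        f"outputdirectory={output_dir}",       # where predictions are written
    ]
    if projection_dir is not None:
        # Optional: environmental dataset for projecting the model.
        command.append(f"projectionlayers={projection_dir}")
    command.append("autorun")                  # assumed flag: start without the GUI
    return command


# Hypothetical paths for illustration.
command = build_maxent_command("samples.csv", "layers", "outputs",
                               projection_dir="future_layers")
```

The resulting list could be passed to subprocess.run once the flags have been confirmed for the version in use.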