Biophysical Gradient Modeling

Management Needs

Decision Support Tools

– Baseline Information
  • Vegetation characteristics
  • Forest stand structure
  • Fuel loads

– Predictive Mapping
  • Vegetation maps
  • Fuels maps

Research Questions

• What are the different vegetation types in the Sky Island systems of the Chihuahuan Desert Borderlands?

• Are the local- and landscape-scale abundance and distribution patterns of vegetation related to variation in the biophysical environment and the spectral characteristics of the vegetation?

• Can those species-environment relationships be used in a predictive manner to map vegetation across the landscape?

Species-Environment Relationships in SW North America

Niering and Lowe (1984)

Sky Island Forests

• Sierra Madre Oriental and Occidental

• Post-Pleistocene refugia

• High vascular plant diversity and endemism

Integrated approach

Merge extensive field sampling with image classification of vegetation/fuel characteristics and biophysical gradient modeling.

Davis Mountain Alterna (Lampropeltis alterna)

Vegetation Sampling

• 600 permanent plots
  – Systematic sampling grid
  – Captured topographic variability
  – Circular, fixed-area plots

• Tree attributes
  – Species ID, DBH, height, live crown height, spatial location

Topographic Data

Derived from the Digital Elevation Model:

• elevation
• N aspect
• E aspect
• slope
• ISF
• DSF
• TRMI
• PRR
• topopos 150
• topopos 450
• topo configuration
• landform
• sediment transport
• wetness index
• network index
• flow direction
• flow accumulation

Analysis

Species data for each plot (Basal Area/Density)

Cluster Analysis

Species importance value (IV) = relative basal area (Rel BA) + relative density (Rel Dens)

Vegetation Types

CART

Species-Environment Relationships

Topographic Data For Each Plot

ENVI Decision Tree

Vegetation and Fuels Maps
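As a rough illustration of the importance value used in this workflow, the sketch below computes relative basal area and relative density per species for a single plot and sums them. The plot values, the units, and the function name `importance_values` are hypothetical, and expressing the relative terms as percentages is an assumption rather than something stated on the slides.

```python
def importance_values(basal_area, density):
    """Return species importance values (IV) for one plot.

    IV = relative basal area + relative density, both as percentages,
    so the IVs of all species in a plot sum to 200.
    """
    total_ba = sum(basal_area.values())
    total_dens = sum(density.values())
    iv = {}
    for species in basal_area:
        rel_ba = 100.0 * basal_area[species] / total_ba
        rel_dens = 100.0 * density[species] / total_dens
        iv[species] = rel_ba + rel_dens
    return iv


# Hypothetical plot: basal area in m^2/ha, density in stems/ha.
plot_ba = {"pinyon pine": 6.0, "gray oak": 3.0, "alligator juniper": 1.0}
plot_density = {"pinyon pine": 200, "gray oak": 250, "alligator juniper": 50}

plot_iv = importance_values(plot_ba, plot_density)
for species, value in sorted(plot_iv.items(), key=lambda kv: -kv[1]):
    print(f"{species}: IV = {value:.1f}")
```

A plot-by-species table of such values is the kind of input a cluster analysis could group into vegetation types.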

9 Dominant Forest Types

• Pinyon Pine Forest
• Oak-Pinyon-Juniper Forest
• Alligator Juniper Forest
• Gray Oak Forest
• Emory Oak Forest
• Cypress-Fir Forest
• Ponderosa-SW White Pine Forest
• Gallery Forest
• Graves Oak Forest

• Dry sites
• High solar radiation
• Upper topographic positions

• Mesic sites
• Low solar radiation
• Valley bottoms

Tolerant Species

Good Competitors

[Figure: two panels, each with an Elevation axis running from low to high]

CART Basics: How do you parse these data into homogeneous groups?

Classification

Given a collection of records, each record contains a set of attributes, and one of the attributes is the class.

Find a model for the class attribute as a function of the values of the other attributes.

Development of CART

• Leo Breiman helped develop the tree-based classification methods that later became part of machine learning, also known as data mining.

• Co-wrote Classification and Regression Trees (CART) with Jerome Friedman, Richard Olshen, and Charles Stone in 1984.

• Also developed Random Forests.

Classification and Regression Trees

• A supervised learning algorithm that recursively partitions heterogeneous data into successive homogeneous subsets using binary splits

• Non-parametric and non-linear

• Can handle numerical or categorical data

• Results are easy to interpret

• Output can be fed directly into ENVI Decision Tree to classify your image

Steps for Producing a CART Model

1. Determine the vegetation/fuel types using field-generated data or prior knowledge of the site.

2. Extract spectral and landform metric data from imagery and DEMs.

3. Inspect the training data and check for an extremely unbalanced dataset.

4. Grow the CART model to its full size and prune it using the 1-SE rule.

5. Use 10-fold cross-validation and bootstrapping to validate the model accuracy using misclassification % and the Kappa statistic.

6. Code the maps using ENVI Decision Tree and visually assess the “look” of the map.

7. Validate the maps in the field to produce misclassification % and the Kappa statistic.
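The two accuracy measures named in steps 5 and 7 can both be computed from a confusion matrix. The sketch below is a minimal illustration using Cohen's kappa; the matrix values and the function names are hypothetical and are not results from this study.

```python
def misclassification_rate(confusion):
    """Fraction of cases that fall off the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return 1.0 - correct / total


def cohens_kappa(confusion):
    """Agreement beyond chance: (observed - expected) / (1 - expected)."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(n)) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(row_totals[k] * col_totals[k] for k in range(n)) / total ** 2
    return (observed - expected) / (1.0 - expected)


# Hypothetical 3-class matrix: rows = field reference, columns = mapped class.
matrix = [
    [40, 5, 5],
    [10, 30, 10],
    [5, 5, 40],
]

error = misclassification_rate(matrix)
kappa = cohens_kappa(matrix)
print(f"Misclassification = {100 * error:.1f}%  Kappa = {kappa:.2f}")
```

Kappa discounts the agreement expected by chance, which is why it is usually reported alongside the raw misclassification percentage.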

Impurity of a Node

• Need a measure of impurity of a node to help decide on how to split a node, or which node to split

• The measure should be at a maximum when a node is equally divided amongst all classes

• The impurity should be zero if the node is all one class

Measures of Impurity

• Misclassification Rate

• Gini Index

In practice the first is not used for the following reasons:

• Situations can occur where no split improves the misclassification rate

• The misclassification rate can be equal when one option is clearly better for the next step
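The second reason can be illustrated with a small two-class example. In the sketch below, which uses hypothetical class counts, two candidate splits of a node with 400 cases per class have the same misclassification rate, yet the Gini index prefers the split that produces a pure child node.

```python
def gini(counts):
    """Gini index of a node given its per-class case counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def misclassification(counts):
    """Misclassification rate of a node that predicts its majority class."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - max(counts) / total


def weighted_impurity(children, measure):
    """Impurity of a split: child impurities weighted by child size."""
    n = sum(sum(child) for child in children)
    return sum(sum(child) / n * measure(child) for child in children)


# Hypothetical parent node: 400 cases of class 1 and 400 of class 2.
split_a = [(300, 100), (100, 300)]  # two mixed children
split_b = [(200, 400), (200, 0)]    # one mixed child, one pure child

for name, split in (("A", split_a), ("B", split_b)):
    mis = weighted_impurity(split, misclassification)
    gin = weighted_impurity(split, gini)
    print(f"Split {name}: misclassification = {mis:.3f}, Gini = {gin:.3f}")
```

Both splits misclassify a quarter of the cases, but split B has the lower Gini index because one of its children needs no further splitting.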

Visual Example

Selection of Splits

• We select the split that most decreases the Gini Index. This is done over all possible places for a split and all possible variables to split.

• We could keep splitting until the terminal nodes have very few cases or are all pure, but that is an unsatisfactory rule for when to stop growing the tree. The better approach is to grow a larger tree than required and then prune it.
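A single step of this split search can be sketched as follows. The example is a simplified illustration, not the CART software itself: it handles numeric variables only, tries the midpoints between consecutive observed values as thresholds, and uses hypothetical plot data (elevation in metres and slope in degrees) with a hypothetical function name `best_split`.

```python
def gini(labels):
    """Gini index of a node given the class labels of its cases."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def best_split(rows, labels):
    """Search every variable and threshold for the lowest weighted Gini.

    Returns (variable_index, threshold, weighted_gini); the index and
    threshold are None if no split improves on the unsplit node.
    """
    n = len(rows)
    best = (None, None, gini(labels))
    for j in range(len(rows[0])):
        values = sorted(set(row[j] for row in rows))
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2.0
            left = [labels[i] for i in range(n) if rows[i][j] <= threshold]
            right = [labels[i] for i in range(n) if rows[i][j] > threshold]
            score = len(left) / n * gini(left) + len(right) / n * gini(right)
            if score < best[2]:
                best = (j, threshold, score)
    return best


# Hypothetical plots: [elevation_m, slope_deg] with a vegetation class each.
rows = [[1600, 10], [1700, 25], [1750, 5], [2100, 20], [2200, 8], [2300, 30]]
labels = ["oak", "oak", "oak", "pine", "pine", "pine"]

variable, threshold, score = best_split(rows, labels)
print(f"Split on variable {variable} at {threshold}: weighted Gini = {score:.3f}")
```

Growing a full tree would apply the same search recursively to each child node until the stopping condition is met.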

Pruning the Tree I

• The best way to arrive at a suitable size for the tree is to grow an overly complex one and then prune it back. The pruning is based on the misclassification rate. However, the error rate on the training data will always drop (or at least not increase) with every split. This does not mean that the error rate on test data will improve.

Pruning the Tree II

• The solution to this problem is cross-validation. One version of the method carries out a 10-fold cross-validation: the data are divided at random into 10 subsets of equal size, the tree is grown with one subset left out, and its performance is assessed on the subset that was left out. This is repeated for each of the 10 subsets, and the average performance is then assessed.
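The fold logic described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `fit` and `predict` arguments stand in for growing and applying a tree, the majority-class model is only a placeholder, and the data and function names are hypothetical.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle case indices and deal them into k folds of near-equal size."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_error(rows, labels, fit, predict, k=10, seed=0):
    """Average misclassification rate over the k held-out folds."""
    errors = []
    for held_out in k_fold_indices(len(rows), k, seed):
        held = set(held_out)
        train_rows = [r for i, r in enumerate(rows) if i not in held]
        train_labels = [y for i, y in enumerate(labels) if i not in held]
        model = fit(train_rows, train_labels)
        wrong = sum(predict(model, rows[i]) != labels[i] for i in held_out)
        errors.append(wrong / len(held_out))
    return sum(errors) / len(errors)


# Placeholder model (an assumption, not CART): predict the majority class.
def fit_majority(train_rows, train_labels):
    return max(set(train_labels), key=train_labels.count)


def predict_majority(model, row):
    return model


# Hypothetical data: 50 plots, 35 labelled pine and 15 labelled oak.
rows = [[i] for i in range(50)]
labels = ["pine"] * 35 + ["oak"] * 15

cv_error = cross_validated_error(rows, labels, fit_majority, predict_majority)
print(f"10-fold cross-validated misclassification = {100 * cv_error:.1f}%")
```

To choose a tree size, the same loop would be run for each candidate pruned tree and the cross-validated errors compared.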

Advantages and Disadvantages

• Advantages
  – Handles data with any structure
  – Robust to outliers
  – Machine learning: little input from analyst
  – Final results can be summarized in logical if-then conditions

• Disadvantages
  – Knowing when to stop splitting
  – Does not use combinations of variables
  – Computations are complex in determining best split conditions

…back to Mapping fuels and Vegetation in the Chihuahuan Desert Borderlands

Misclassification = 29.1%

Kappa = 0.57

[Map: Vegetation Types. Legend: oak-pinyon-juniper, grey oak, alligator juniper, pinyon pine, mesic woodland. Scale bar in kilometers.]

Once the map is generated…

perform field validation