golder 2013 dsm_introduction_presentation_feb6_ram_version1

Introduction to Digital Soil Mapping (DSM) R. A. (Bob) MacMillan LandMapper Environmental Solutions Inc. Presented to Golder Associates: Feb 6, 2013

Upload: bob-macmillan

Post on 11-Jul-2015


TRANSCRIPT

Page 1: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Introduction to Digital Soil Mapping

(DSM)

R. A. (Bob) MacMillan

LandMapper Environmental Solutions Inc.

Presented to Golder Associates: Feb 6, 2013

Page 2: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Outline

• Unifying DSM Framework: Universal Model of Variation

– Z(s) = Z*(s) + ε(s) + ε

• Past: Early History of Development of DSM (pre 2003)

– Theory, Concepts, Models, Software, Inputs, Developments

– Examples of early methods and outputs

• Key Recent Developments in DSM (post 2003)

– Theory, Concepts, Models, Software, Inputs, Developments

– Examples of recent methods and outputs

• Future Trends: How do I See DSM Developing?

– Theory, Concepts, Inputs, Models, Software, Developments

– From Static Maps to Dynamic Real-Time Models

Page 3: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Introduction

Universal Model of Soil Variation

A Unifying Framework for DSM

Page 4: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Universal Model of Soil Variation• A Unifying Framework for Digital Soil Mapping

Z(s) = Z*(s) + ε(s) + ε

• Z(s): predicted soil type or soil property value

– Predicted spatial pattern of some soil property or class, including uncertainty of the estimate

• Z*(s): deterministic part of the predictive model

– Part of the variation that is predictable by means of some statistical or heuristic soil-landscape model

• ε(s): stochastic part of the predictive model

– Part of the variation that shows spatial structure and can be modelled with a variogram

• ε: pure noise part of the predictive model

– Part of the variation that can’t be predicted at the current scale with the available data and models

Source: Burrough, 1986 eq. 8.14
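As a hedged illustration of the three-part model above, the following Python sketch simulates a soil property along a hypothetical 1-D transect as the sum of a deterministic trend, a spatially correlated term and pure noise. The transect, the linear trend, the exponential covariance and every parameter value are assumptions chosen for illustration; none of them come from Burrough (1986).

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical transect: 200 locations s, 10 m apart (illustrative only).
s = np.arange(200) * 10.0

# Z*(s): deterministic part, here an assumed linear trend with position.
z_star = 20.0 + 0.01 * s

# eps(s): stochastic part with spatial structure, drawn from an assumed
# exponential covariance C(h) = sill * exp(-h / range_par).
sill, range_par = 4.0, 150.0
h = np.abs(s[:, None] - s[None, :])
cov = sill * np.exp(-h / range_par)
eps_s = rng.multivariate_normal(np.zeros(s.size), cov)

# eps: pure noise, uncorrelated from one location to the next.
eps = rng.normal(0.0, 1.0, size=s.size)

# Universal model of variation: Z(s) = Z*(s) + eps(s) + eps
z = z_star + eps_s + eps


def lag1_corr(x):
    """Correlation between values at neighbouring locations."""
    return np.corrcoef(x[:-1], x[1:])[0, 1]


print("lag-1 correlation of eps(s):", round(lag1_corr(eps_s), 2))
print("lag-1 correlation of eps:   ", round(lag1_corr(eps), 2))
```

In this sketch the structured term is strongly correlated between neighbouring locations while the pure noise term is not, which is the distinction a variogram is meant to expose.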

Page 5: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Deterministic Part of Prediction Model:

Z*(s)

• Conceptual Models

– Conceptual or mental soil-landscape models

– Produce area-class maps

• Statistical Models

– Scorpan – relate soils/soil properties to covariates

– Explain spatial distribution of soils in terms of known soil forming factors as represented by covariates

[Figure: conceptual soil-landscape cross-section showing the EOR, DYD, KLM, FMN and COR soil series, with marks at 15, 40 and 60]

[Figure: salinity hazard map on a 100 x 100 m grid. Individual salinity hazard ratings for each layer (landscape curvature, vegetation, rainfall, geology, soils) are combined over the land surface using layer weightings (1x to 3x) into a total salinity hazard rating]

Page 6: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Stochastic Part of Prediction Model:

ε(s)

• Geostatistical Estimation

– Predict soil properties

• Point or block kriging

– Predict soil classes

• Indicator kriging

– Predict error of estimate

• Correct Deterministic Part

– Error in deterministic part is computed (residuals)

– If structure exists in error then krige error & subtract

Page 7: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Pure Noise Part of Prediction Model:

ε

• Some Variation not Predictable

– Have to be honest about this

• Should quantify and report it

• Deterministic Prediction

– Mental and Statistical Models

• Not perfect: often lack suitable covariates to predict target variable

• Lack covariates at finer resolution

• Geostatistical Prediction

– Insufficient point input data

• Can’t predict at less than the smallest spacing of input point data

[Figure: semi-variogram of semi-variance against lag (distance), with lags d1 to d4 and the nugget, sill and range marked]

Page 8: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past

Early History of DSM Development

(pre 2003)

On Digital Soil Mapping

McBratney et al., 2003

Page 9: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Early History of Development of DSM

Deterministic

Soil Classes

Soil Properties

Stochastic

Soil Classes

Soil Properties

Page 10: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past Theory: Deterministic Component

Z*(s) Classed Conceptual Models

– Jenny (1941)

• CLORPT (Note no N=space)

– Simonson (1959)

• Process model of additions, removals, translocations, transformations

– Ruhe (1975)

• Erosional-depositional surfaces, open/closed basins

– Dalrymple et al. (1968)

• Nine unit hill slope model

– Milne (1936a, 1936b)

• Catena concept, toposequences

Page 11: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past Concepts: Deterministic Component

Z*(s) Classed Conceptual Models

[Figure: soil shown as a function of five soil-forming factors: climate, organisms, topography, parent material and time]

Soil = f (C, O, R, P, T, …)

Source: Lin, 2005 Frontiers in Soil Science

http://www7.nationalacademies.org/soilfrontiers/

Page 12: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

http://solim.geography.wisc.edu/index.htm

Page 13: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past Models: Deterministic Component

Z*(s) Classed Statistical Predictions

• Fuzzy Inference

– Zhu, 1997, Zhu et al., 1996

– MacMillan et al., 2000, 2005

• Neural Networks

– Zhu, 2000

• Expert Knowledge (Bayesian)

– Skidmore et al., 1991

– Cook et al., 1996, Corner et al., 1997

• Regression Trees

– Moran and Bui, 2002, Bui and Moran, 2003

[Figure: salinity hazard map on a 100 x 100 m grid. Individual salinity hazard ratings for each layer (landscape curvature, vegetation, rainfall, geology, soils) are combined over the land surface using layer weightings (1x to 3x) into a total salinity hazard rating]

Source: Jones et al., 2000

Page 14: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past Software: Deterministic Component

Z*(s) Classed Statistical Predictions

• Regression Trees

– CUBIST

• Rulequest Research, 2000

– CART

• Breiman et al., 1984

– C4.5 & See5

• Quinlan, 1992

– JMP (SAS)

• http://www.jmp.com/

– R

• http://www.r-project.org/

• Fuzzy Logic

– SoLIM

• Zhu et al., 1996, 1997

– LandMapR, FuzME

• Bayesian Logic

– Prospector

• Duda et al., 1978

– Expector

• Skidmore et al., 1991

– Netica

• Norsys.com/netica

Page 15: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past Inputs: Deterministic Component

Z*(s) Classed Statistical Predictions

• C = Climate

– Temp, Ppt, ET, Solar Rad

• Mean, min, max, variance

• Annual, monthly, indices

• O = Organisms

– Manual Maps

• Land Use

• Vegetation

– Remotely Sensed Imagery

• Classified RS imagery

• NDVI, EVI, other ratios

• R = Relief (topography)

– Primary Attributes

• Slope, aspect, curvatures

• Slope Position, roughness

– Secondary Attributes

• CTI, WI, SPI, STC

• P = Parent Material

– Published geology maps

– Gamma radiometrics

– Thermal IR, RS Ratios

• A = Age

Page 16: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past Inputs: Deterministic Component

Z*(s) Classed Statistical Predictions

• Common Topo Inputs

– Profile Curvature

– Plan (Contour) Curvature

– Slope Gradient (& Aspect)

– CTI or Wetness Index

• Sometimes, not always

• Less Common Topo Inputs

– Surface Roughness

– Relief within a window

– Relief relative to drainage

• Pit, peak, ridge, channel

[Figure panels: Profile Curvature, Plan Curvature, Slope Gradient, Wetness Index, Pit to Peak Relief, Divide to Channel]

Source: MacMillan, 2005

Page 17: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past Inputs: Non-DEM Airborne

Radiometrics

• Radiometrics for Subsurface

• Infer Parent Material

Source: Mayr, 2005

Page 18: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past Inputs: Non-DEM Satellite Imagery

Grassland Land Cover Types Alpine Land Cover Types

Page 19: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past Models: Deterministic Component

Z*(s)

Examples of Predictions of Soil Class

Maps

Page 20: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Approaches to Producing Predictive Area-

Class Maps

Page 21: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Knowledge-Based Classification In SoLIM

Source: Zhu, SoLIM Handbook

Page 22: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Knowledge-Based Classification Using

Boolean Decision Tree in USA

Gilpin

Pineville

Laidig

Guyandotte

Dekalb

Component Soils

Craigsville

Meckesville

Cateache

Shouns

Source: Thompson et al., 2010 WCSS

Page 23: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Knowledge-Based Classification In LandMapR

Source: Steen and Coupé, 1997

Source: MacMillan, 2005

Page 24: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Knowledge-Based Classification In LandMapR

Source: Global Forest Watch Canada, 2012

Page 25: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Knowledge-Based Classification In Utah,

Knowledge-Based PURC Approach

Note: Not simple slope elements but complex patterns

Source: Cole and Boettinger, 2004

Page 26: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Approaches to Producing Predictive Area-

Class Maps

Page 27: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Supervised Classification Using Regression Trees

Note similarity of supervised rules and classes to typical soil-landform conceptual classes

Note numeric estimate of likelihood of occurrence of classes

Source: Zhou et al., 2004, JZUS

Page 28: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Supervised Classification Using Bayesian

Analysis of Evidence/Classification Trees

Source: Zhou et al., 2004, JZUS

Page 29: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Predicting Area-Class Soil Maps Using

Discriminant Analysis

Source: Scull et al., 2005, Ecological Modelling

Page 30: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Predicting Area-Class Soil Maps Using Regression Trees

[Figure panels: Uncertainty of prediction; Extrapolation]

Source: Bui and Moran (2003), Geoderma 111:21-44

Page 31: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Supervised Classification Using Fuzzy Logic

• Shi et al., 2004

– Used multiple cases of reference sites

– Each site was used to establish fuzzy similarity of unclassified locations to reference sites

– Used Fuzzy-minimum function to compute fuzzy similarity

– Harden class using largest (Fuzzy-maximum) value

– Considered distance to each reference site in computing Fuzzy-similarity

Fuzzy likelihood of being a broad ridge

Source: Shi et al., 2004
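The fuzzy-minimum and fuzzy-maximum steps listed above can be sketched in Python as follows. This is a minimal illustration, not the implementation of Shi et al. (2004): the Gaussian-shaped membership curve, the two attributes, the reference values and the widths are all hypothetical, and the distance-to-reference-site weighting mentioned on the slide is left out.

```python
import numpy as np


def fuzzy_similarity(cell, site, widths):
    """Fuzzy-minimum similarity of one cell to one reference site.

    Each attribute gets a membership in (0, 1] from an assumed
    Gaussian-shaped curve centred on the reference value; the fuzzy
    minimum (the limiting attribute) combines them.
    """
    memberships = np.exp(-0.5 * ((cell - site) / widths) ** 2)
    return memberships.min()


# Hypothetical reference sites: (slope %, relative slope position 0-100).
reference_sites = {
    "broad ridge": np.array([[2.0, 95.0], [3.0, 90.0]]),
    "backslope": np.array([[15.0, 50.0]]),
    "toeslope": np.array([[3.0, 10.0]]),
}
widths = np.array([4.0, 15.0])  # assumed tolerance for each attribute


def classify(cell):
    # Similarity to a class = best match over that class's reference sites.
    sims = {
        name: max(fuzzy_similarity(cell, site, widths) for site in sites)
        for name, sites in reference_sites.items()
    }
    # Harden: assign the class with the largest (fuzzy-maximum) similarity.
    hardened = max(sims, key=sims.get)
    return hardened, sims


label, sims = classify(np.array([2.5, 92.0]))
print(label, {name: round(float(v), 3) for name, v in sims.items()})
```

Keeping the full `sims` dictionary alongside the hardened label preserves the fuzzy likelihood of each class, which is what the "fuzzy likelihood of being a broad ridge" map displays.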

Page 32: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Approaches to Producing Predictive Area-

Class Maps

Page 33: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Concept of Fuzzy K-means Clustering

Credit: J. Balkovič & G. Čemanová

Source: Sobocká et al., 2003
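The concept can be sketched with a minimal fuzzy k-means (also called fuzzy c-means) routine in Python. This is an illustrative sketch under stated assumptions, not the software used in the cited studies: the fuzziness exponent m = 2, the random initialisation and the synthetic two-attribute data are all hypothetical.

```python
import numpy as np


def fuzzy_c_means(X, k, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Minimal fuzzy k-means (c-means): returns (centroids, memberships)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships; each row sums to 1.
    U = rng.random((X.shape[0], k))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centroids are membership-weighted means of the data.
        centroids = (W.T @ X) / W.sum(axis=0)[:, None]
        # Distances from every point to every centroid.
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)  # avoid division by zero
        # Membership update: closer centroids get higher membership.
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centroids, U


rng = np.random.default_rng(1)
# Hypothetical terrain attributes per cell: (slope %, relative position %).
X = np.vstack([
    rng.normal([2.0, 80.0], [1.0, 5.0], size=(50, 2)),   # ridge-like cells
    rng.normal([20.0, 30.0], [3.0, 5.0], size=(50, 2)),  # backslope-like cells
])
centroids, U = fuzzy_c_means(X, k=2)
labels = U.argmax(axis=1)  # optional hardening of the fuzzy memberships
```

Unlike hard k-means, every cell keeps a membership in every class, so transitional locations show up as cells with no dominant membership.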

Page 34: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Example of Application of Fuzzy K-means

Unsupervised Classification

From: Burrough et al., 2001, Landscape Ecology

Note similarity of unsupervised

classes to conceptual classes

Page 35: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Example of Application of Disaggregation of

a Soil Map by Clustering into Components

Source: Faine, 2001

Page 36: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Developments: Deterministic Component

Z*(s) Classed Predictive Maps in Past

• Characteristics of Models

– Models largely ignored ε

• Seldom estimate error

• Rarely correct for error

– Mainly use DEM inputs

• Initially 3x3 windows

• Slope, aspect, curvatures

• Maybe wetness index

• Later improvements were measures of slope position

– Rarely use ancillary data

• Exceptions like Bui, Scull

– Operate at single scale

• Characteristics of Models (continued)

– Many use expert knowledge

• Data mining is the exception

• Training data seldom used

– Specialty software prevails

• Software for DEM analysis

– SoLIM, TAPES-G, TOPAZ, TOPOG, TAS, SAGA, ESRI, IDRISI, LandMapR

• Software for extracting rules

– Expector, Netica, CART, See5, Cubist, Prospector

• Software for applying rules

– ESRI, SoLIM, SIE, SAGA

Page 37: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past Models: Deterministic Component

Z*(s) for Continuous Soil Properties

Approaches Aimed at Predicting

Continuous Soil Properties

Page 38: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past Concepts: Deterministic Component

Z*(s) Continuous Soil Properties

• Same Theory-Concepts as for Classed Maps

– Except theory applied to individual soil properties

– Initially referred to as environmental correlation

– Soil properties related to

• Landscape attributes

• Climate variables

• Geology, lithology, soil parent material

• Key Papers

– Moore et al., 1993

• Linear regression

– McSweeney et al., 1994

– McKenzie & Austin, 1993

– Gessler et al., 1995

• GLMs in S-Plus

– McKenzie & Ryan, 1999

• Regression Trees

Soil = f (C, O, R, P, T, …)

Page 39: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past Models: Deterministic Component

Z*(s) Continuous Soil Properties

• Regression Trees

– McKenzie & Ryan, 1998, Odeh et al., 1994

• Fuzzy Logic-Neural Networks

– Zhu, 1997

• Bayesian Expert Knowledge

– Skidmore et al., 1996

– Cook et al., 1996, Corner et al., 1997

• GLMs – Generalized Linear Models

– McKenzie & Austin, 1993

– Gessler et al., 1995

Source: McKenzie and Ryan, 1998

Page 40: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past Inputs: Deterministic Component

Z*(s) for Continuous Soil Properties

• Similar to Classed Maps But:

– Many innovations originated with continuous modelers

• Increased use of non-DEM attributes

– climate, radiometrics, imagery

• Improved DEM derivatives

– Wetness Index & CTI

– Upslope means for slope, etc.

– Inverted DEMs to compute

» Down slope dispersal

» Down slope means

» New slope position data

Source: McKenzie and Ryan, 1998

Page 41: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past Models: Deterministic Component

Z*(s) for Continuous Soil Properties

Examples of Predictions of Soil

Property Maps

Page 42: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past Models: Deterministic Component

Z*(s) Continuous Maps

• Aandahl, 1948 (Note Date!)

– Regression model

• Predicted

– Average Nitrogen (3-24 inch)

– Total Nitrogen by depth

– Total Organic Carbon by depth interval

– Depth of profile to loess

• Predictor (covariate)

– Slope position as expressed by length of slope from shoulder

– Lost in the depths of time

Source: Aandahl, 1948

Page 43: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past Models: Deterministic Component

Z*(s) for Continuous Soil Properties

• Moore et al., 1993

– Seminal paper

– Focus on topography

• Small sites

• Other covariates were assumed constant

– Got people thinking

• About quantifying environmental correlation, especially soil-topography relationships

Source: Moore et al, 1993

Page 44: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past Models: Deterministic Component

Z*(s) for Continuous Soil Properties

• McKenzie & Ryan, 1998

– Regression Tree: Soil Depth

Source: McKenzie and Ryan, 1998

Page 45: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past Models: Deterministic Component

Z*(s) for Continuous Soil Properties

• Gessler et al., 1995

– GLMs

– Largely based on topography

• CTI

– Other covariates held steady

Source: McKenzie and Ryan, 1998

Source: Gessler, 2005

Page 46: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past Models: Deterministic Component

Z*(s) for Continuous Soil Properties

[Figure: regression tree with splits on texture class (C versus S, LS, L, CL, LiC), bulk density (thresholds at 1.43 and 1.42) and clay content (threshold at 46.5), with a predicted value at each node]

Source: Minasny and McBratney

Page 47: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Developments: Deterministic Component

Z*(s) Predictive Maps up to 2003

• Main Developments

– Better DEM derivatives

• More and better measures of

landform position or

context

• Some recognition of scale

and resolution effects

– Different window sizes

– Different grid resolutions

– More non-DEM inputs

• Increased use of imagery

• New surrogates for PM

• Main Developments

– Integration of single models

into multi-purpose software

• ArcGIS, ArcSIE, ArcView

• SAGA, Whitebox, IDRISI

– Improved processing ability

• Bigger files, faster processing

– Emergence of 2 main scales

• Hillslope elements (series)

– Quite similar across models

• Landscape patterns (domains)

– Similar to associations

Page 48: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Early History of Development of DSM

Deterministic

Soil Classes

Soil Properties

Stochastic

Soil Classes

Soil Properties

Page 49: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past Theory: Stochastic Component

ε(s)

– Waldo Tobler (1970)

• First law of geography

– Everything is related to everything else, but near things are more related than distant things

– Matheron (1971)

• Theory of regionalized variables

– Webster and Cuanalo (1975)

• Clay, silt, pH, CaCO3, colour value, and stoniness on transect

– Burgess and Webster (1980a, b)

• Soil property maps by kriging

• Universal kriging (drift) of EC

Page 50: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past Models: Stochastic Component

ε(s)

– Universal Model of Variation

• Matheron (1971)

• Burgess and Webster (1980a, b)

• Webster and Burrough (1980)

• Burrough (1986)

• Webster and McBratney (1987)

• Oliver (1989)

Source: Oliver, 1989

Page 51: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past Models: Stochastic Component

ε(s) Optimal Interpolation by Kriging

1. Collect point sample observations

– Irregular spatial distribution (of observed point values)

2. Compute semi-variance at different lag distances

3. Fit semi-variogram to lag data

4. Estimate values and error at fixed grid locations

[Figure: irregularly located point observations in x, y (values from 5 to 8) interpolated to a regular 4 x 4 grid of estimated values]
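The "compute semi-variance at different lag distances" step can be sketched in Python as below. The function name, the lag binning and the synthetic data are assumptions made for illustration; fitting a model (nugget, sill, range) to the resulting points is a separate step that is not shown.

```python
import numpy as np


def empirical_semivariogram(xy, z, lag_width, n_lags):
    """Semi-variance per lag bin: mean of 0.5 * (z_i - z_j)^2 over pairs."""
    i, j = np.triu_indices(len(z), k=1)        # every pair of points once
    h = np.linalg.norm(xy[i] - xy[j], axis=1)  # separation distance
    half_sq_diff = 0.5 * (z[i] - z[j]) ** 2
    bins = (h // lag_width).astype(int)
    lags, gamma, counts = [], [], []
    for b in range(n_lags):
        mask = bins == b
        if mask.any():
            lags.append(h[mask].mean())        # mean distance in the bin
            gamma.append(half_sq_diff[mask].mean())
            counts.append(int(mask.sum()))
    return np.array(lags), np.array(gamma), np.array(counts)


rng = np.random.default_rng(0)
n = 150
# Hypothetical sample sites scattered irregularly over a 1 km x 1 km area.
xy = rng.uniform(0, 1000, size=(n, 2))
# Synthetic spatially correlated values (assumed exponential covariance
# with a small nugget), standing in for a measured soil property.
dist = np.linalg.norm(xy[:, None] - xy[None, :], axis=2)
cov = 1.0 * np.exp(-dist / 100.0) + 0.1 * np.eye(n)
z = 6.0 + rng.multivariate_normal(np.zeros(n), cov)

lags, gamma, counts = empirical_semivariogram(xy, z, lag_width=50.0, n_lags=10)
for lag, g, c in zip(lags, gamma, counts):
    print(f"lag {lag:6.1f} m  semi-variance {g:5.2f}  pairs {c}")
```

Reporting the pair count for each lag is a design choice: bins with few pairs give unreliable semi-variance estimates and are usually down-weighted when a model is fitted.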

Page 52: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past Software: Stochastic Component

ε(s)

• Earlier Stand Alone

– Pc-Geostat (PC-Raster)

• Early version of GSTAT

– VESPER

• Variogram estimation and spatial prediction with error

• Minasny et al., 2005

• http://sydney.edu.au/agriculture/pal/software/vesper

– GEO-EAS (DOS, 1991)

• http://www.epa.gov/ada/csmos/models/geoeas.html

• Later More Integrated

– GSTAT

• Pebesma and Wesseling, 1998

• Incorporated into IDRISI

• Now incorporated into R and S-Plus packages

– Pebesma, 2004

• http://www.gstat.org/index.html

– ArcGIS

• Geostatistical Analyst

– SGeMS (Stanford Univ)

• http://sgems.sourceforge.net/

Page 53: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past Inputs: Stochastic Component ε(s)

• Essentially Just x,y,z Values at Point Locations

1. Start with a set of soil property values irregularly distributed in x,y Cartesian space

2. Locate the regularly spaced grid nodes where predicted soil property values are to be calculated

3. Locate the n soil property data points within a search window around the current grid cell for which a value is to be calculated

4. Compute a new value for each location as the weighted average of the n neighbouring values, with weights established by the semi-variogram
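The four steps above can be sketched as a small ordinary kriging routine in Python. This is a minimal sketch under stated assumptions: the exponential semi-variogram model, its parameter values (which would normally be fitted to the data), the nearest-neighbour search window and the synthetic observations are all hypothetical.

```python
import numpy as np


def exp_variogram(h, nugget, sill, rng_par):
    """Assumed exponential semi-variogram model; gamma(0) = 0."""
    h = np.asarray(h, dtype=float)
    g = nugget + (sill - nugget) * (1.0 - np.exp(-h / rng_par))
    return np.where(h > 0, g, 0.0)


def ordinary_krige(xy_obs, z_obs, xy_new, n_neighbors=8, **vg):
    """Ordinary kriging estimate, kriging variance and weights at one point."""
    d0 = np.linalg.norm(xy_obs - xy_new, axis=1)
    idx = np.argsort(d0)[:n_neighbors]           # step 3: search window
    pts, vals, d0 = xy_obs[idx], z_obs[idx], d0[idx]
    n = len(idx)
    # Kriging system: semi-variances between neighbours, plus a Lagrange
    # row and column that force the weights to sum to 1.
    A = np.ones((n + 1, n + 1))
    pair_dist = np.linalg.norm(pts[:, None] - pts[None, :], axis=2)
    A[:n, :n] = exp_variogram(pair_dist, **vg)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = exp_variogram(d0, **vg)
    sol = np.linalg.solve(A, b)
    w, mu = sol[:n], sol[n]
    estimate = w @ vals                          # step 4: weighted average
    variance = w @ b[:n] + mu                    # error of the estimate
    return estimate, variance, w


rng = np.random.default_rng(3)
# Step 1: hypothetical, irregularly located observations.
xy_obs = rng.uniform(0, 100, size=(25, 2))
z_obs = 6.0 + rng.normal(0, 0.7, size=25)
params = dict(nugget=0.05, sill=0.5, rng_par=30.0)  # assumed, not fitted

# Step 2: regularly spaced grid nodes.
gx, gy = np.meshgrid(np.linspace(0, 100, 5), np.linspace(0, 100, 5))
grid = np.column_stack([gx.ravel(), gy.ravel()])
pred = np.array([ordinary_krige(xy_obs, z_obs, g, **params)[:2] for g in grid])
```

Each row of `pred` holds the estimate and its kriging variance for one grid node, so the error map comes out of the same calculation as the prediction map.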

Page 54: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Past Models: Stochastic Component ε(s)

for Continuous Soil Properties

Examples of Predictions of Soil

Property Maps by Kriging

Page 55: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Continuous Soil Property Maps by

Kriging

• Very Early Alberta Example

– Lacombe Research Station

• Sampled soils on a 50 m grid

– Sand, Silt, Clay,

– pH, OC, EC, others

– 3 depths (0-15, 15-50, 50-100)

• Used custom written software

– Compute variograms

– Interpolate using the variograms

• Only visualised as contour maps

– Only got 3D drapes in 1988

– Used PC-Raster to drape

– Saw strong soil-landscape pattern

[Figure: semi-variogram for A-horizon % sand, Lacombe site (1985). X-axis: lag, 1 to 19 (1 lag = 30 m); y-axis: semi-variance, 0 to 160]

Source: MacMillan, 1985 unpublished

Page 56: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Continuous Soil Property Maps by

Kriging

Source: http://sydney.edu.au/agriculture/pal/software/vesper.shtml

Page 57: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Continuous Soil Property Maps by

Kriging

• Yasrebi et al., 2009

– Simple ordinary kriging (OK) of soil properties

• No co-kriging

• No regression prediction

– Relies on presence of

• Sufficient point samples

• Spatial structure over distances longer than the smallest sampling interval

Source: Yasrebi et al., 2009

Page 58: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Continuous Soil Property Maps by

Kriging

• Shi, 2009

– Comparison of pH by

four different methods

• a) HASM

• b) Kriging

• c) IDW

• d) Splines

Source: Yasrebi et al., 2009

Page 59: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Developments: Stochastic Component

ε(s) Predictive Maps up to 2003

• Main Developments

– Theory

• Becomes better understood

and accepted

– Concepts

• Regression-kriging evolves

to include a separate part for

regression prediction

– Models

• Understanding and use of

universal model grows

• Directional, local variograms

• Main Developments

– Software

• From stand alone and single

purpose to integrated software

• Improvements in

– Visualization

– Capacity to process large

data sets

– Automated variogram fitting

– Ease of use

– Inputs

• Developments in sampling

designs and sampling theory

Page 60: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Present and Recent Past

Key Developments in DSM Since 2003

(2003-2012)

On Digital Soil Mapping

McBratney et al., 2003

Page 61: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Developments in DSM Since 2003

Deterministic

Soil Classes

Soil Properties

Stochastic

Soil Classes

Soil Properties

Increasing Convergence and Interplay

Scorpan (McBratney et al., 2003) elaborates and popularizes universal model of variation

Page 62: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Theory: Key Developments Since 2003

• Deterministic Part

– Pretty much unchanged

• Still based on attempting to

elucidate quantitative

relationships between soils

& environmental covariates

– But

• Scorpan elaboration

highlights importance of

the spatial component (n)

and of spatially correlated

error ε(s)

• Stochastic Part

– Same underlying theory

• Still based on theory of

regionalized variables

– But

• Increasing realization that

the structural part of

variation (non-stationary

mean or drift) can be better

modelled by a deterministic

function than by purely

spatial calculations

Page 63: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Concepts: Key Developments Since 2003

• Deterministic Part

– Scorpan Model

• Explicitly recognizes soil data

(s) as a potential input to

predict other soil data

– Soil inputs can include soil

maps, point observations,

even expert knowledge

• Explicitly recognizes space

(n) or location as a factor in

predicting soil data

– Space as in x,y location

– Space as in context, kriging

• Factors as predictors

– Factors explicitly seen as

quantitative predictors in

prediction function

Scorpan (McBratney et al., 2003) elaborates and popularizes universal model of variation

Page 64: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Concepts: Key Developments Since 2003

• Stochastic Part

– Emergence of Regression

Kriging (RK)

• Key difference to ordinary

kriging is that it is no longer

assumed that the mean of a

variable is constant

• Local variation or drift can

be modelled by some

deterministic function

– Local regression lowers

error, improves predictions

– Local regression function can even be a soil map

Source: Heuvelink, personal communication
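A minimal Python sketch of regression kriging under stated assumptions: a single hypothetical covariate, an ordinary least-squares regression for the deterministic part, and ordinary kriging of the residuals with an assumed (not fitted) exponential semi-variogram. Residuals are defined here as observed minus predicted, so the kriged residual is added back to the regression prediction.

```python
import numpy as np


def variogram(h, nugget=0.05, sill=1.0, range_par=30.0):
    """Assumed exponential semi-variogram for the residuals (not fitted)."""
    g = nugget + (sill - nugget) * (1.0 - np.exp(-h / range_par))
    return np.where(h > 0, g, 0.0)


def krige(xy, values, xy0):
    """Ordinary kriging of `values` at one location xy0, using all points."""
    n = len(values)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(np.linalg.norm(xy[:, None] - xy[None, :], axis=2))
    A[n, n] = 0.0
    b = np.append(variogram(np.linalg.norm(xy - xy0, axis=1)), 1.0)
    w = np.linalg.solve(A, b)[:n]
    return w @ values


rng = np.random.default_rng(1)
# Hypothetical data: 40 sample points and one covariate (for example a
# wetness index) that is known at every location.
xy = rng.uniform(0, 100, size=(40, 2))
covariate = 5 + 0.05 * xy[:, 0] + rng.normal(0, 0.5, 40)
soil = 2.0 + 0.8 * covariate + rng.normal(0, 0.3, 40)  # synthetic target

# 1. Deterministic part Z*(s): least-squares regression on the covariate.
X = np.column_stack([np.ones(40), covariate])
beta, *_ = np.linalg.lstsq(X, soil, rcond=None)
residuals = soil - X @ beta


# 2. Stochastic part eps(s): krige the regression residuals.
# 3. Prediction = regression prediction + kriged residual.
def predict(xy0, covariate0):
    return beta[0] + beta[1] * covariate0 + krige(xy, residuals, xy0)


print(predict(np.array([50.0, 50.0]), 7.5))
```

In practice the regression step could be any deterministic model (a regression tree, or a value read from an existing soil map); only the residual kriging step stays the same.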

Page 65: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Models: Key Developments Since 2003

• Deterministic Part

– Improvements in Data

Mining and Knowledge

Extraction

• Supervised Classification

– Training data obtained

from both points and maps

» Sample maps at points

– Ensemble or multiple

realization models (100 x)

» Boosting, bagging

» Random Forests

» ANN, Regression tree

• Deterministic Part

– Improvements in Data

Mining and Knowledge

Extraction

• Expert Knowledge Extraction

– Bayesian Analysis of Evidence

– Prototype Category Theory

– Fuzzy Neural Networks

– Tools for Manual Extraction

of Fuzzy Expert Knowledge

» ArcSIE, SoLIM

• Unsupervised classification

– Fuzzy k-means, c-means

Page 66: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Models: Key Developments Since 2003

• Stochastic Part

– Regression Kriging

• Recognized as equivalent to

universal kriging or kriging

with external drift

• Use of external knowledge

and maps made easier

– Incorporation of soft data

• Made more accessible

through implementation in

commercial (ESRI) and

open source software (R)

• Stochastic Part

– Regression Kriging

• Odeh et al., 1995

• McBratney et al., 2003

• Hengl et al., 2003, 2004, 2007

• Heuvelink, 2006

• Hengl how-to books

– http://spatial-analyst.net/book/

– http://www.itc.nl/library/Papers_2003/misca/hengl_comparison.pdf

Page 67: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Comparison of Soil Property Maps by

Kriging & RK

• Hengl et al., 2012

– Comparison of ordinary

kriging and regression

kriging

• Evidence supports RK as

explaining more of the

variation than OK alone

– Greater spatial detail

– Fewer extrapolation

areas

– Better fit to data

Source: Hengl et al., 2012

Page 68: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Software: Key Developments Since 2003

• Commercial Software

– JMP (SAS) (McBratney)

• http://www.jmp.com/

– S-Plus, Matlab

• Used by soil researchers

– See5, CUBIST, CART

• Regression Trees

– Netica (Bayesian)

• Norsys.com/netica

– Improvements

• Better visualization

• Better interfaces

• Non-commercial Software

– Fuzzy Logic

• SoLIM Zhu et al., 1996, 1997

• ArcSIE Shi, FuzME

– Bayesian Logic

– Full Range of Options

• R

– http://www.r-project.org

– Regression Kriging

– Random Forests

– Regression Trees

– GLMs

• GSTAT (in R)

Page 69: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Inputs: Key Developments Since 2003

• Terrain Attributes

– More and better measures

• Primarily contextual and

related to landform position

– Real advances related to

• Multi-scale analysis

– varying window size and

grid resolution

• Window-based and flow-

based hill slope context

• Systematic examination of

relationships of properties

and processes to scale

Source: Smith et al., 2006

Source: Schmidt and Andrew., 2005

Page 70: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Inputs: Key Developments Since 2003

• Terrain Attributes

– Multi-scale analysis

• Varying window size and

grid resolution

• Identifies that some

variables are more useful

when computed over larger

windows or coarser grids

– Finer resolution grids not

always needed or better

– Drop off in predictive

power of DEMs after

about 30-50 m grid

resolution

Source: Deng et al., 2007

Page 71: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Inputs: Key Developments Since 2003

• ConMAP: Hyper-scale Contextual Analysis of Topographic Parameters

Source: Behrens et al., in press

– Neighborhood example

• Diameter

– 21 km

• Predictors

– 775

Page 72: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Inputs: Key Developments Since 2003

• ConSTAT: Hyper-scale Contextual Analysis of Topographic Parameters

Source: Behrens et al., in press

ConStat (ConMap): neighborhood reduction

a) Full neighborhood

b) Reduction of radii

c) Reduction on radii

d) Combination of b and c

Page 73: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Inputs: Key Developments Since 2003

• ConSTAT: Hyper-scale Contextual Analysis of Topographic Parameters

Source: Behrens et al., in press

Page 74: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Inputs: Key Developments Since 2003

• Hyper-scale Terrain

Analysis in ConSTAT

– Systematic analysis of relative importance of terrain measures at different scales

• Compute statistics of terrain

measures at different scales

– Use data mining (Random

Forests) to identify

importance of different

statistics at different scales

and at each different location

Source: Behrens et al., in press

Page 75: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

MrVBF: Multi-scale DEM Analysis

[Figure: the DEM is smoothed and subsampled from the original 25 m grid to generalised 75 m and 675 m grids; panels at each scale are labelled Flatness, Bottomness and Valley Bottom Flatness]

Source: Gallant, 2012

Page 76: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Multiple Resolution Landform Position

MrVBF Example Outputs

Source: Gallant, 2012

Broader Scale 9” DEM

MRVBF for 25 m DEM

Page 77: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Developments: Improved Measures of

Landform Position

• SAGA-RHSP: relative

hydrologic slope position

• SAGA-ABC: altitude

above channel

Source: C. Bulmer, unpublished

Calculation based on: MacMillan, 2005

Source: C. Bulmer, unpublished

Page 78: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Developments: Improved Measures of

Landform Position

• TOPHAT – Schmidt

and Hewitt (2004)

• Slope Position – Hatfield

(1996)

Source: Hatfield (1996)

Source: Schmidt & Hewitt (2004)

Page 79: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Developments: Improved Measures of

Landform Position - Scilands

Source: Rüdiger Köthe , 2012

Page 80: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Measures of Relative Slope Length (L)

Computed by LandMapR

• Percent L Pit to Peak: measure of local context

• Percent L Channel to Divide: measure of regional context

Image Data Copyright the Province of British Columbia, 2003

Source: MacMillan, 2005

Page 81: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Image Data Copyright the Province of British Columbia, 2003

Measures of Relative Slope Position

Computed by LandMapR

• Percent Diffuse Upslope Area

• Percent Z Channel to Divide

[Panel captions: relative to main stream channels; sensitive to hollows & draws]

Source: MacMillan, 2005

Page 82: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Developments: Improved Classification of

Landform Patterns Iwahashi & Pike (2006)

• Iwahashi landform underlying 1:650k soil map

Source: Reuter, H.I. (unpublished)

[Figure legend: 16 terrain classes (1 to 16) arranged in terrain series from steep to gentle, grouped as fine texture/high convexity, fine texture/low convexity, coarse texture/high convexity and coarse texture/low convexity]

Page 83: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Inputs: Key Developments Since 2003

• Non-Terrain Attributes

– Systematic analysis of environmental covariates

• Detect distances and scales over which each covariate exhibits a strong relationship with a soil class or property to be predicted, or just with itself

– Vary window sizes and grid resolutions and compute regressions on derivatives

– Analyse range of variation inherent to each covariate

» Functional relationships are dependent on scale

Source: Park, 2004

Page 84: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Inputs: Key Developments Since 2003

• Non-Terrain Attributes

– Systematic analysis of scale of environmental covariates

• Select and use input covariates at the most appropriate scale

– Explicitly recognize the hierarchical nature of environmental controls on soils

– Select variables at the scales, resolutions or window sizes with the strongest predictive power for each property or class to be predicted.

Source: Park, 2004
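The idea of varying window sizes and keeping the scale with the strongest predictive power can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the covariate is sampled along a 1-D transect rather than a grid, smoothing is a plain moving average, and "predictive power" is taken to be the absolute Pearson correlation. The function names are hypothetical and do not come from Park (2004).

```python
import math

def moving_average(values, window):
    """Centred moving average; the window is truncated at the transect ends.
    Use odd window sizes so the window is symmetric around each cell."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def best_window(covariate, soil_property, windows):
    """Score each window size by |r| between the smoothed covariate and the
    soil property; return the best window and all scores."""
    scores = {
        w: abs(pearson(moving_average(covariate, w), soil_property))
        for w in windows
    }
    return max(scores, key=scores.get), scores
```

Calling `best_window(covariate, soil_property, [1, 3, 5, 7])` returns the window size whose smoothed covariate correlates most strongly with the property, together with the score for every window tried.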

Page 85: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Inputs: Key Developments Since 2003

Source: David Jacquier, 2010

Harmonization of soil profile depth data through spline fitting

Page 86: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Inputs: Key Developments Since 2003

From discrete soil classes to continuous soil properties

[Figure: workflow from a ‘modal’ profile, to a fitted mass-preserving spline, to spline averages at specified depth ranges. Averages are estimated from the fitted spline at standardised depth ranges, e.g., the globalsoilmap depth ranges]

Clearfield soil series, Wapello County, Iowa

Mukey: 411784; Musym: 230C

Source: Sun et al. (2010)

Harmonization of soil profile

data through spline fitting
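As a rough illustration of depth harmonization, the sketch below resamples horizon data to the six GlobalSoilMap standard depth intervals. Note the deliberate simplification: it uses thickness-weighted averages of the horizon values (a step function), not the mass-preserving spline shown on the slide, which additionally smooths the profile between horizons. The function name and the `(top_cm, bottom_cm, value)` input layout are assumptions made for this example.

```python
# GlobalSoilMap standard depth intervals, in cm
GSM_DEPTHS = [(0, 5), (5, 15), (15, 30), (30, 60), (60, 100), (100, 200)]

def depth_weighted_means(horizons, intervals=GSM_DEPTHS):
    """Resample horizon values to standard depth intervals.

    horizons: list of (top_cm, bottom_cm, value) tuples for one profile.
    Returns one thickness-weighted mean per interval, or None where the
    profile has no data inside the interval.
    """
    means = []
    for top, bottom in intervals:
        total = 0.0
        thickness = 0.0
        for h_top, h_bottom, value in horizons:
            # Thickness of the horizon that falls inside this interval
            overlap = min(bottom, h_bottom) - max(top, h_top)
            if overlap > 0:
                total += value * overlap
                thickness += overlap
        means.append(total / thickness if thickness > 0 else None)
    return means
```

A spline-based version would replace the inner loop with the integral of the fitted spline over each interval; the output format (one value per standard depth range) would stay the same.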

Page 87: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Outputs: Key Developments Since 2003

• From Classes to Properties

– Non-disaggregated soil maps

• Weighted averages by polygon, by soil property and depth

– Calling version 0.5

– Disaggregated Soil Class Maps

• Estimate soil property values at every grid cell location & depth

– Based on weighted likelihood value of occurrence of each of n soils times property value for that soil at that depth

– Likelihood value can come from various methods

Source: Sun et al., 2010

Source: Hempel et al., 2011
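The weighted-likelihood estimate for a single grid cell and depth can be sketched as follows. This is a minimal illustration, assuming the likelihoods are non-negative and are normalised by their sum so that they act as weights; the function name and dictionary layout are hypothetical.

```python
def weighted_property(likelihoods, property_values):
    """Likelihood-weighted property value for one grid cell at one depth.

    likelihoods: soil name -> likelihood of that soil occurring in the cell.
    property_values: soil name -> property value for that soil at this depth.
    """
    total = sum(likelihoods.values())
    if total <= 0:
        raise ValueError("likelihoods must sum to a positive value")
    # Sum of (likelihood x property value), normalised by total likelihood
    weighted_sum = sum(
        likelihoods[soil] * property_values[soil] for soil in likelihoods
    )
    return weighted_sum / total
```

Applying the function to every grid cell and every standard depth turns a set of soil-class likelihood surfaces into continuous property maps.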

Page 88: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Outputs: Key Developments Since 2003

• From Classes to Properties

– Disaggregated Soil Class Maps

• Estimate soil property values at

every grid cell location

Source: Zhu et al., 1997

Page 89: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Recent Models

Recent Examples of Predictions of

Soil Class Maps

Page 90: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Predicting Area-Class Soil Maps

Source: Grinand et al., 2008

Clovis Grinand, Dominique Arrouays,

Bertrand Laroche, and Manuel Pascal Martin.

Extrapolating regional soil landscapes from an

existing soil map: Sampling intensity,

validation procedures, and integration of

spatial context. Geoderma 143, 180-190

Page 91: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Recent Knowledge-Based Classification In

Africa, Multi-scale, Hierarchical Landforms

Source: Park et al, 2004

Elevation + Slope + UPA + Catena

(2 km support)

SOTER Soil and landforms

(1:1 million – 1.5 million)

Page 92: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Digital Soil Mapping in England & Wales using Legacy Data

[Workflow diagram with three stages: TRAINING DATA, MODELLING (NETICA), OUTPUTS. Labelled elements: DEM; TOPAZ, LandMapR and TAPES-G; covariates; point data; detailed soil maps; expert knowledge; predicted soil series; accuracy assessment]

Source: Mayr, 2010

Page 93: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Source: Sun et al., 2010

Predicting Area-Class Soil Maps Using

Multiple Regression Trees (100 x)

Prepare a database and tables of mapping units & soil series, and covariates

Select 1/n of the points systematically (n=100)

Sample soil series randomly from the multinomial distribution of mapping unit composites

Construct decision tree

Predict soil series at all pixels

Calculate the soil series statistics based on the n predictions for each pixel

Calculate the probability for each soil series

Generate soil series maps

Repeat n times

Used See5 (RuleQuest Research, 2009)
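Two parts of this procedure lend themselves to a short Python sketch: drawing a soil series from a mapping unit's composition, and turning the n per-pixel predictions into a most likely series with its probability. The decision-tree step itself (See5 in the original work) is left out; any classifier that yields one predicted series per pixel per run could feed `aggregate_predictions`. Function names and data layouts are assumptions for illustration only.

```python
import random
from collections import Counter

def sample_series(composition, rng):
    """Draw one soil series from a mapping unit's composition.

    composition: soil series name -> component percentage in the map unit.
    rng: a random.Random instance, so that runs can be reproduced.
    """
    names = list(composition)
    weights = [composition[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

def aggregate_predictions(runs):
    """Combine n prediction runs into a per-pixel result.

    runs: list of n lists; each inner list holds one predicted series
    per pixel, in the same pixel order.
    Returns a list of (most_likely_series, probability) per pixel, where
    probability is the share of runs that predicted that series.
    """
    n = len(runs)
    result = []
    for pixel_predictions in zip(*runs):
        counts = Counter(pixel_predictions)
        series, votes = counts.most_common(1)[0]
        result.append((series, votes / n))
    return result
```

With n = 100 runs, the probability for a pixel is simply the number of trees that predicted the winning series divided by 100.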

Page 94: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Source: Sun et al., 2010

Predicting Area-Class Soil Maps Using

Multiple Regression Trees (100 x)

A closer look at the junction point in the middle of 4 combined maps,

(a) the original map units, and

(b) the most likely soil series map and its associated probability.

The length of the image is approximately 14 km.

[Figure legend: monr_comppct, values from 7 (low) to 100 (high); panels (a) and (b)]

Page 95: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Recent Models

Recent Examples of Predictions of

Continuous Soil Property Maps

Page 96: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Continuous Soil Property Maps by

Kriging & RK

• Hengl et al., 2004

– Comparison of topsoil

thickness by four

different methods

• a) Point locations

• b) Soil Map only

• c) Ordinary Kriging

• d) Plain Regression

• e) Regression-kriging

– Evidence supports RK

Source: Hengl et al., 2004

Page 97: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

300 soil point data

Assemble

field data

Source: Minasny et al., 2010

Recent Example: Regression-Kriging

(scorpan + ε)

Page 98: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Assemble covariates for

the predictive model

Recent Example: Regression-Kriging

(scorpan + ε)

Source: Minasny et al., 2010

Page 99: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Linear Model: OC = f(x) + e

Predictors: Elevation, Aspect, Landsat band 6, NDVI, Land-use, Soil-Landscape Unit

Perform regression to

build a predictive model

Recent Example: Regression-Kriging

(scorpan + ε)

Source: Minasny et al., 2010

Page 100: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Predict both

property value

and standard

error over the

entire area

Recent Example: Regression-Kriging

(scorpan + ε)

Source: Minasny et al., 2010

Page 101: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Fit a variogram to the

residuals

Recent Example: Regression-Kriging

(scorpan + ε)

Source: Minasny et al., 2010

Page 102: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Krige the residuals

Recent Example: Regression-Kriging

(scorpan + ε)

Source: Minasny et al., 2010

Page 103: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Linear Model + Residuals = Final Prediction

Add interpolated

residuals to the

prediction from

regression

Recent Example: Regression-Kriging

(scorpan + ε)

Source: Minasny et al., 2010
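The regression-kriging sequence (fit a regression, krige the residuals, add the interpolated residual back to the regression prediction) can be sketched in pure Python. This is a minimal illustration, not the workflow of Minasny et al.: it assumes a single covariate, ordinary kriging of the residuals, and an exponential variogram with zero nugget whose `sill` and `rng` (range) are supplied by the user rather than fitted. All function names are hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x with one covariate."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b

def exp_variogram(h, sill, rng):
    """Exponential variogram with zero nugget (parameters assumed given)."""
    return sill * (1.0 - math.exp(-3.0 * h / rng))

def solve(a, b):
    """Solve a.w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (m[r][n] - tail) / m[r][r]
    return w

def krige(points, values, target, sill, rng):
    """Ordinary kriging at `target`; returns (estimate, kriging variance)."""
    n = len(points)

    def gamma(p, q):
        return exp_variogram(math.dist(p, q), sill, rng)

    # Kriging system in variogram form, with a Lagrange multiplier row
    a = [[gamma(points[i], points[j]) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [gamma(p, target) for p in points] + [1.0]
    w = solve(a, b)
    estimate = sum(wi * vi for wi, vi in zip(w[:n], values))
    variance = sum(wi * bi for wi, bi in zip(w, b))
    return estimate, variance

def regression_kriging(points, covariate, observed, target, target_cov,
                       sill, rng):
    """Regression prediction plus kriged residual at one target location."""
    a, b = fit_line(covariate, observed)
    residuals = [o - (a + b * c) for o, c in zip(observed, covariate)]
    res_hat, res_var = krige(points, residuals, target, sill, rng)
    return a + b * target_cov + res_hat, res_var
```

Because the nugget is zero, the prediction at a sampled location reproduces the observed value and the kriging variance there is zero. The variance returned covers only the kriged residual, not the uncertainty of the regression itself.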

Page 104: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

(Std. err. of regression)² + (Std. err. of kriging)² = Total variance

Total std. err. = (Total variance)^(1/2)

Add regression variance

and kriging variance to

get total variance

Recent Example: Regression-Kriging

(scorpan + ε)

Source: Minasny et al., 2010
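The variance addition on this slide amounts to a one-line calculation per grid cell. The sketch below assumes the regression and kriging errors are independent, which is what allows their variances to be summed; the function name is hypothetical.

```python
import math

def total_standard_error(se_regression, se_kriging):
    """Combine the two standard errors by adding their variances."""
    total_variance = se_regression ** 2 + se_kriging ** 2
    return math.sqrt(total_variance)
```

For example, a regression standard error of 3 and a kriging standard error of 4 combine to a total standard error of 5.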

Page 105: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Final C map (Mg C/ha; legend scale 15 to 95)

Mean 64.0 | Min 27.0 | Max 87.9 | CV% 18.4 | RMSE 9.8 | RI (%) 19.7

Regression model: C=100-1.2EC-5.2REF-0.6REF2-2.1EL

[Workflow diagram: C predicted for sampled locations; C predicted for all grid locations; Residuals; Kriging]

Recent Example: Regression-Kriging

Page 106: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Continuous Soil Property Maps by

Hybrid Bayesian Analysis

Source: Mayr et al., 2010

Page 107: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

Future Trends

Personal View of Likely Future DSM

Development

(Post 2012)

Page 108: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

The Future: Let's Go Back and Talk About

the Universal Model of Variation Again

Source: Heuvelink et al., 2004

Deterministic part of

the predictive model

Stochastic part of the

predictive model

Lots of things qualify

as regression!

Regression just

means minimizing

variance

What is all this talk

about optimization?

Z(s) = Z*(s) + ε(s) + ε

Page 109: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

The Future: A Conceptual Framework for

GSIF – A Global Soil Information Facility

Source: Hengl et al., 2011

Collaborative and open modelling on an interactive, web-based, server-side platform

Collaborative and

open production,

assembly and sharing

of covariate data

(World Grids)

Collaborative and

open collection,

input and sharing of

geo-registered field

evidence

(Open Soil Profiles)

Maps we can all contribute to, access, use, modify and

update, continuously and transparently

Everything is

accessible,

transparent and

repeatable

Page 110: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

The Future: Functionality for GSIF – A

Global Soil Information Facility

Source: Hengl et al., 2011

Possibility of making

use of existing

legacy soil maps

(even new soil maps)

needed for soil

prediction anywhere

Possibility of

rescuing, sharing,

harmonizing and

archiving soil

profile point data

needed for soil

prediction anywhere

Possibility to

develop and use

global models (even

for local mapping)

Possibility to

develop and use

multi-scale and

multi-resolution

hierarchical models

Possibility to assess

error and correct for

it everywhere

Page 111: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

The Future: Conceptual Framework for

GSIF – World Soil Profiles

Source: Hengl et al., 2011

Page 112: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

The Future: Implemented Framework for

GSIF – World Soil Profiles

Source: www.worldsoilprofiles.org

Page 113: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

The Future: Implemented Framework for

GSIF – World Soil Profiles

Source: www.worldsoilprofiles.org

Page 114: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

The Future: Conceptual Framework for

GSIF – World Grids

Source: Hengl et al., 2011

Page 115: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

The Future: Implemented Framework for

GSIF – World Grids

Source: www.worldgrids.org

Page 116: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

The Future: Implemented Framework for

GSIF – World Grids

Source: www.worldgrids.org

Page 117: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

The Future: Implemented Framework for

GSIF

Page 118: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

The Future: Collaborative Global, Multi-

Scale Mapping through GSIF

Source: Hengl et al., 2011

Possibility for combining

Top-Down and Bottom-up

mapping through weighted

averaging of 2 or more sets

of predictions


Possibility to

develop and use

global models (even

for local mapping)
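Combining top-down and bottom-up predictions by weighted averaging could look like the sketch below. The slide does not say how the weights are chosen; inverse-variance weighting is assumed here purely for illustration, as is the independence of the prediction errors that justifies the merged variance. The function name is hypothetical.

```python
def merge_predictions(predictions):
    """Inverse-variance weighted average of two or more predictions.

    predictions: list of (value, variance) pairs for the same grid cell,
    e.g. one from a global model and one from a local model.
    Returns (merged_value, merged_variance).
    """
    weights = [1.0 / variance for _, variance in predictions]
    total = sum(weights)
    merged = sum(w * value for w, (value, _) in zip(weights, predictions)) / total
    # Valid only if the prediction errors are independent (assumption)
    return merged, 1.0 / total
```

Under this weighting, the more certain prediction dominates: a coarse global estimate contributes most where local data are sparse and least where a well-supported local model exists.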

Page 119: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

The Future: Global, Multi-Scale Modeling

of Soil Properties through GSIF

Source: Hengl et al., 2011

Possibility to

develop and use

global models (even

for local mapping)

Possibility to

develop and use

multi-scale and

multi-resolution

hierarchical models

Page 120: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

The Future: Global, Multi-Scale Modeling

of Soil Properties through GSIF

Global Models

inform and

improve local

mapping

• Global DSM Models

– Make use of ALL data

• From everywhere in

the world

– Provide initial coarse

local predictions

• That can be refined

and improved with:

– More & finer local data

– Local model runs

Source: Hengl personal communication, 2013

Page 121: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

The Future: Global, Multi-Scale Modeling

of Soil Properties through GSIF

Source: Hengl et al., 2011

Global Models

inform and

improve local

mapping

Page 122: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

The Future: Functionality for GSIF – A

Global Soil Information Facility

Source: Hengl et al., 2011

Anyone can

access and

display the

maps

Page 123: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

The Future: Functionality for GSIF – A

Global Soil Information Facility

Source: Hengl et al., 2011

Slide credit: Tom Hengl,

2011

With Google

Earth everyone

has a GIS to

view free soil

maps and data

Page 124: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

The Future: Collaborative Global, Multi-

Scale Mapping through GSIF

Source: Hengl et al., 2011

A Global

Collaboratory!

Working together

we can map the

world one tile at a

time!

The next generation

of soil surveyors is

everyone!

Page 125: Golder 2013 dsm_introduction_presentation_feb6_ram_version1

The Future: From Mapping to

Continuously Updated Modelling

Possibility to move from single

snapshot mapping of static soil

properties to continuous update and

improvement of maps of both static and

dynamic properties within a structured

and consistent framework.