spatial interpolation comparison - evaluation of spatial ...spatial- interpolation comparison...

Download Spatial Interpolation Comparison - Evaluation of spatial ...spatial-  Interpolation Comparison Evaluation of spatial prediction methods ... Lessons learned (from SIC) Geostatistics course, 25 ... 9.5 10.0 10.5 11.0 11.5 12.0 12

Post on 19-May-2018

212 views

Category:

Documents

0 download

Embed Size (px)

TRANSCRIPT

  • Spatial Interpolation ComparisonEvaluation of spatial prediction methods

    Tomislav Hengl

    ISRIC World Soil Information, Wageningen University

    Geostatistics course, 2529 October 2010, Wageningen

  • Topic

    I Geostatistics = a toolbox to generate maps from point datai.e.to interpolate;

    I There are many possibilities;

    I An inexperienced user will often be challenged by the amountof techniques to run spatial interpolation;

    I . . .which method should we use?

    Geostatistics course, 2529 October 2010, Wageningen

  • Topic

    I Geostatistics = a toolbox to generate maps from point datai.e.to interpolate;

    I There are many possibilities;

    I An inexperienced user will often be challenged by the amountof techniques to run spatial interpolation;

    I . . .which method should we use?

    Geostatistics course, 2529 October 2010, Wageningen

  • Topic

    I Geostatistics = a toolbox to generate maps from point datai.e.to interpolate;

    I There are many possibilities;

    I An inexperienced user will often be challenged by the amountof techniques to run spatial interpolation;

    I . . .which method should we use?

    Geostatistics course, 2529 October 2010, Wageningen

  • Topic

    I Geostatistics = a toolbox to generate maps from point datai.e.to interpolate;

    I There are many possibilities;

    I An inexperienced user will often be challenged by the amountof techniques to run spatial interpolation;

    I . . .which method should we use?

    Geostatistics course, 2529 October 2010, Wageningen

  • Have you heard of SIC?

    Geostatistics course, 2529 October 2010, Wageningen

    http://bookshop.europa.eu/uri?target=EUB:NOTICE:LBNA21595:EN:HTML

  • The spatial prediction game

    Participants were invited to estimate values located at 1000locations (right, crosses), using 200 observations (left, circles).

    Geostatistics course, 2529 October 2010, Wageningen

  • Lessons learned (from SIC)

    Geostatistics course, 2529 October 2010, Wageningen

  • Li and Heap (2008)

    Geostatistics course, 2529 October 2010, Wageningen

    http://www.ga.gov.au/image_cache/GA12526.pdf

  • How many techniques are there?

    Li and Heap (2008) list over 40 unique techniques.

    1. Are all these equally valid?

    2. How to objectively compare various methods (which criteriato use)?

    3. Which method to pick for your own case study?

    Geostatistics course, 2529 October 2010, Wageningen

    http://www.ga.gov.au/image_cache/GA12526.pdf

  • There are not as many

    There are roughly four main clusters of techniques:

    1. splines (deterministic);

    2. kriging-based (plain geostatistics);

    3. regression-based;

    4. expert systems / machine learning;

    Geostatistics course, 2529 October 2010, Wageningen

  • There are not as many

    There are roughly four main clusters of techniques:

    1. splines (deterministic);

    2. kriging-based (plain geostatistics);

    3. regression-based;

    4. expert systems / machine learning;

    Geostatistics course, 2529 October 2010, Wageningen

  • There are not as many

    There are roughly four main clusters of techniques:

    1. splines (deterministic);

    2. kriging-based (plain geostatistics);

    3. regression-based;

    4. expert systems / machine learning;

    Geostatistics course, 2529 October 2010, Wageningen

  • There are not as many

    There are roughly four main clusters of techniques:

    1. splines (deterministic);

    2. kriging-based (plain geostatistics);

    3. regression-based;

    4. expert systems / machine learning;

    Geostatistics course, 2529 October 2010, Wageningen

  • The 5 criteria

    1. the overall mapping accuracy, e.g.standardized RMSE atcontrol points the amount of variation explained by thepredictor expressed in %;

    2. the bias, e.g.mean error the accuracy of estimating thecentral population parameters;

    3. the model robustness, also known as model sensitivity inhow many situations would the algorithm completely fail /how much artifacts does it produces?;

    4. the model reliability how good is the model in estimatingthe prediction error (how accurate is the prediction varianceconsidering the true mapping accuracy)?;

    5. the computational burden the time needed to completepredictions;

    Geostatistics course, 2529 October 2010, Wageningen

  • The 5 criteria

    1. the overall mapping accuracy, e.g.standardized RMSE atcontrol points the amount of variation explained by thepredictor expressed in %;

    2. the bias, e.g.mean error the accuracy of estimating thecentral population parameters;

    3. the model robustness, also known as model sensitivity inhow many situations would the algorithm completely fail /how much artifacts does it produces?;

    4. the model reliability how good is the model in estimatingthe prediction error (how accurate is the prediction varianceconsidering the true mapping accuracy)?;

    5. the computational burden the time needed to completepredictions;

    Geostatistics course, 2529 October 2010, Wageningen

  • The 5 criteria

    1. the overall mapping accuracy, e.g.standardized RMSE atcontrol points the amount of variation explained by thepredictor expressed in %;

    2. the bias, e.g.mean error the accuracy of estimating thecentral population parameters;

    3. the model robustness, also known as model sensitivity inhow many situations would the algorithm completely fail /how much artifacts does it produces?;

    4. the model reliability how good is the model in estimatingthe prediction error (how accurate is the prediction varianceconsidering the true mapping accuracy)?;

    5. the computational burden the time needed to completepredictions;

    Geostatistics course, 2529 October 2010, Wageningen

  • The 5 criteria

    1. the overall mapping accuracy, e.g.standardized RMSE atcontrol points the amount of variation explained by thepredictor expressed in %;

    2. the bias, e.g.mean error the accuracy of estimating thecentral population parameters;

    3. the model robustness, also known as model sensitivity inhow many situations would the algorithm completely fail /how much artifacts does it produces?;

    4. the model reliability how good is the model in estimatingthe prediction error (how accurate is the prediction varianceconsidering the true mapping accuracy)?;

    5. the computational burden the time needed to completepredictions;

    Geostatistics course, 2529 October 2010, Wageningen

  • The 5 criteria

    1. the overall mapping accuracy, e.g.standardized RMSE atcontrol points the amount of variation explained by thepredictor expressed in %;

    2. the bias, e.g.mean error the accuracy of estimating thecentral population parameters;

    3. the model robustness, also known as model sensitivity inhow many situations would the algorithm completely fail /how much artifacts does it produces?;

    4. the model reliability how good is the model in estimatingthe prediction error (how accurate is the prediction varianceconsidering the true mapping accuracy)?;

    5. the computational burden the time needed to completepredictions;

    Geostatistics course, 2529 October 2010, Wageningen

  • Can we simplify this?

    1. In theory, we could derive a single composite measure thatwould then allow you to select the optimal predictor for anygiven data set (but this is not trivial!)

    2. But how to assign weights to different criteria?

    3. In many cases we simply finish using some nave predictor that is predictor that we know has a statistically more optimalalternative, but this alternative is not feasible.

    Geostatistics course, 2529 October 2010, Wageningen

  • Automated mapping

    In the intamap package1 decides which method to pick for you:

    > meuse$value output

  • Hypothesis

    We need a single criteria to compare various prediction methods.

    Geostatistics course, 2529 October 2010, Wageningen

  • Mapping accuracy and survey costs

    The cost of a soil survey is also a function of mapping scale,roughly:

    log(X) = b0 + b1 log(SN) (1)

    We can fit a linear model to the empirical table data frome.g.Legros (2006; p.75), and hence we get:

    X = exp (19.0825 1.6232 log(SN)) (2)

    where X is the minimum cost/ha in Euros (based on estimates in2002). To map 1 ha of soil at 1:100,000 scale, for example, oneneeds (at least) 1.5 Euros.

    Geostatistics course, 2529 October 2010, Wageningen

  • Survey costs and mapping scale

    9.5 10.0 10.5 11.0 11.5 12.0 12.5

    1

    01

    23

    Scale number (logscale)

    Min

    imum

    sur

    vey

    cost

    s in

    EU

    R /

    ha (

    log

    scal

    e)

    Geostatistics course, 2529 October 2010, Wageningen

  • Survey costs and mapping scale

    Total costs of a soil survey can be estimated by using the size ofarea and number of samples.The effective scale number (SN) is:

    SN =

    4 A

    N 102 . . . SN =

    A

    N 102 (3)

    where A is the surface of the study area in m2 and N is the totalnumber of observations.

    Geostatistics course, 2529 October 2010, Wageningen

  • Converges to:

    X = exp

    (19.0825 1.6232 log

    [0.0791

    A

    N 102

    ])(4)

    Geostatistics course, 2529 October 2010, Wageningen

  • Output map, from info perspective

    The resulting (predictions) map is a sum of two signals:

    Z (s) = Z (s) + (s) (5)

    where Z (s) is the true variation, and (s) is the error component.Th

Recommended

View more >