Spatial Data Uncertainty in Geostatistics and Machine Learning: A Case Study MINERAL RESOURCES Franky Fouedjio and Jens Klump | Science Leader Earth Science Informatics 30 May 2018


Spatial Data Uncertainty in Geostatistics and Machine Learning: A Case Study

MINERAL RESOURCES

Franky Fouedjio and Jens Klump | Science Leader Earth Science Informatics
30 May 2018


Introduction

• Machine learning can be a great tool, but how does it compare with established, well-understood tools?
• Specific questions:
  • Reporting and interpretation of uncertainty
  • Spatially correlated variables
• Methods:
  • Real-world case of geochemical and auxiliary data
  • Simulation of geochemical and auxiliary data

Spatial Data Uncertainty | Fouedjio & Klump2 |


Case Study: Soil Geochemistry

• Data: soil geochemistry of southwest England (source: C. Kirkwood, BGS G-BASE)
• Elements used in this study: Al, Ba, Br, Ca, Ce, Co, Cr, Cs, Fe, Ga, Ge, Hf, K, La, Mg, Mn, Mo, Na, Nb, Nd, Ni, P, Rb, Sc, Se, Si, Sm, Sr, Ta, Th, Ti, U, V, Y, Zr
• Other elements were excluded because of their hydrothermal mobility or because their concentrations were below detection limits.
• Auxiliary data: gravity, geomorphology, radiometrics, IR (LANDSAT)
• Geographically sparse data
• Aim: geochemical exploration (outliers)


Study Area


Auxiliary Variables


Prediction – Kriging vs. Random Forest


[Figure: maps of prediction and uncertainty for kriging and random forest]


KED vs. QRF in real-world data

• Prediction uncertainty maps provided by Kriging with External Drift (KED) vary less across the study area than those provided by Quantile Regression Random Forest (QRF).
• The largest prediction uncertainties given by KED are concentrated in areas that were not surveyed or where the sampling was too sparse.
• Prediction uncertainty provided by KED depends mainly on the data configuration, that is, on the spatial arrangement of the sampling locations.
• Prediction uncertainty maps provided by QRF show spatial patterns that are related not to the density of sampling locations but to the distribution of particular auxiliary variables.
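
The dependence of kriging uncertainty on the data configuration can be illustrated with a minimal sketch. It uses simple kriging in one dimension with an assumed exponential covariance model, not the KED set-up of the study; the function names, sample locations, and covariance parameters are hypothetical. The kriging variance is computed from the sample locations alone and never touches the measured values, which is why it grows where sampling is sparse.

```python
import math


def exp_cov(h, sill=1.0, corr_range=10.0):
    """Exponential covariance model (an assumption made for this sketch)."""
    return sill * math.exp(-abs(h) / corr_range)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def sk_variance(sample_x, target_x):
    """Simple kriging variance at target_x.

    Only sample locations enter the calculation; sample values do not.
    """
    k = [[exp_cov(xi - xj) for xj in sample_x] for xi in sample_x]
    k0 = [exp_cov(xi - target_x) for xi in sample_x]
    weights = solve(k, k0)
    return exp_cov(0.0) - sum(w * c for w, c in zip(weights, k0))


# Hypothetical transect: dense sampling near 0-6, one isolated sample at 30.
samples = [0.0, 2.0, 4.0, 6.0, 30.0]
var_dense = sk_variance(samples, 3.0)    # inside the densely sampled part
var_sparse = sk_variance(samples, 18.0)  # in the gap between 6 and 30
print(f"dense: {var_dense:.3f}  sparse: {var_sparse:.3f}")
```

A quantile-based forest has no such geometric term: its interval widths come from the spread of training responses that share leaves with the target, so they follow the auxiliary variables rather than the sampling density.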


Real-world vs. simulated data

• Geological data are notoriously sparse and incomplete because they are not easily obtained.
• To overcome the shortcomings of real-world geological data, we decided to test KED and QRF on synthetic data sets.


Simulated Data


Simulated Data

• Case 1: linear relationship between Y and X, where Y has a weak spatial correlation.
• Case 2: linear relationship between Y and X, where Y has a strong spatial correlation.
• Case 3: complex nonlinear relationship between Y and X, where Y has a weak spatial correlation.
• Case 4: complex nonlinear relationship between Y and X, where Y has a strong spatial correlation.
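
The four cases can be mimicked with a small sketch on a one-dimensional transect. Everything specific here is an assumption for illustration: the exponential covariance, the correlation ranges used for "weak" and "strong", the linear coefficient, and the particular nonlinear function. The slides do not state the functions or parameters actually used. The target Y is built as a function of a simulated auxiliary variable X plus a spatially correlated Gaussian residual.

```python
import math
import random


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def gaussian_field(coords, corr_range, rng, nugget=1e-8):
    """Draw a zero-mean Gaussian random field with exponential covariance."""
    n = len(coords)
    cov = [
        [
            math.exp(-abs(coords[i] - coords[j]) / corr_range)
            + (nugget if i == j else 0.0)  # tiny nugget for numerical stability
            for j in range(n)
        ]
        for i in range(n)
    ]
    low = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n)]


def simulate_case(case, n=60, seed=0):
    """Return (coords, x, y) for case 1-4; all parameters are hypothetical."""
    rng = random.Random(seed)
    coords = [float(i) for i in range(n)]
    x = gaussian_field(coords, corr_range=10.0, rng=rng)  # auxiliary variable
    linear = case in (1, 2)
    strong = case in (2, 4)
    if linear:
        trend = [2.0 * xi for xi in x]
    else:
        trend = [math.sin(3.0 * xi) + xi ** 2 for xi in x]
    # Short range gives weak spatial correlation, long range gives strong.
    resid = gaussian_field(coords, corr_range=20.0 if strong else 1.0, rng=rng)
    y = [t + r for t, r in zip(trend, resid)]
    return coords, x, y


coords, x, y = simulate_case(4)
print(len(coords), len(x), len(y))
```

Because every case reuses the same seed, all four share one realisation of X, so differences between cases come only from the response function and the residual correlation range.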


Simulations of the auxiliary variables


An example of simulated auxiliary variables using Gaussian Random Fields.


Simulations of the target variables


An example of the simulated target variables under different cases.


Case 2: Linear response, spatial correlation

• As expected, Kriging with External Drift (KED) outperforms Quantile Regression Random Forest (QRF).


Case 3: non-linear, no spatial correlation

• As expected, the Root Mean Square Error (RMSE) shows that QRF performs better than KED.
• The goodness of fit, accuracy and probability interval width show that KED has a better prediction uncertainty than QRF.
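
The distinction between prediction error and uncertainty quality can be made concrete with a short sketch. The slides do not give the formulas behind their goodness, accuracy and interval-width measures, so the functions below are simple stand-ins chosen for illustration: RMSE for point predictions, and empirical coverage and mean width for prediction intervals. The data values are invented.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error of the point predictions."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def interval_coverage(y_true, lower, upper):
    """Fraction of true values that fall inside their prediction interval."""
    hits = sum(1 for t, lo, hi in zip(y_true, lower, upper) if lo <= t <= hi)
    return hits / len(y_true)


def mean_interval_width(lower, upper):
    """Average width of the prediction intervals (narrower is sharper)."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)


# Invented validation values, predictions and interval bounds.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 5.0]
lower = [0.5, 1.0, 3.2, 3.0]
upper = [2.0, 3.0, 3.5, 6.0]

print(rmse(y_true, y_pred))
print(interval_coverage(y_true, lower, upper))
print(mean_interval_width(lower, upper))
```

A method can score well on RMSE while its intervals are too narrow or too wide for their nominal probability, which is the situation the two bullets above describe.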


Conclusions

• We compared the predictive capacity and uncertainty of KED and QRF using both real-world and synthetic data.
• In a direct comparison of KED and QRF, both methods produced similar predictive maps.
• As expected, KED is able to exploit linear spatial dependencies while QRF is better at handling non-linear dependencies.
• Surprisingly, KED always outperformed QRF with respect to measures of uncertainty.
• Applications must weigh the benefits of better uncertainty in KED against better handling of non-linearity in QRF.


Mineral Resources
Jens Klump
Science Leader
t +61 8 6436 8828
e [email protected]
people.csiro.au/Jens-Klump

Stanford University
Francky Fouedjio
t +61 2 9123 4567
e [email protected]
www.csiro.au/lorem

Thank you

MINERAL RESOURCES