Investigating soil data with R
Pierre Roudier (1) and Dylan E. Beaudette (2)
(1) Landcare Research – Manaaki Whenua, New Zealand
(2) Natural Resources Conservation Service, USDA, USA
Acknowledgement
Talk Outline
1 The soil resource
  Soil matters!
  Soil science
  Soil data and its analysis
2 The aqp package
  Visualisation
  Classification
  Harmonisation
  Analysis and modelling
3 Recent and future developments
  Introduction of S4 classes
  Design
4 Conclusions and further work
Soil matters!
Soil
is not dirt.
a very thin layer between parent rock and atmosphere,
a very complex body – physical, chemical and biological interactions,
a support for almost all terrestrial ecosystems – and thus food production
A threatened resource
How to feed 7 billion+ people?
Growing tensions on arable land
Urbanisation
Erosion
Etc.
→ It is important to provide soil information to a wide range of decision makers.
“Man has only a thin layer of soil between himself and starvation.” –Bard of Cincinnati
Soils today
Long regarded only as a producer of crops
But soils are back on the global agenda
New challenges through global projects
GlobalSoilMap.net – a leading project
International network of soil scientists
Generate and provide a ≈ 100 m grid of 9 key soil attributes globally
R has been identified as a key platform for the various stages involved
Data courtesy of Tomislav Hengl.
→ The way to do soil science is changing significantly (“pedometrics”)
The soil profile
Soil data:
Highly multi-dimensional
Point support
Soft and hard data
Importance of legacy data
Most of the time associated with environmental covariates
The aqp package
aqp – Algorithms for Quantitative Pedology
Soil profile sketches
An example of profile sketches manually created from soil profile observations collected as part of the Pinnacles National Monument soil survey. Horizon designations, sequences, boundaries, and soil texture classes are usually sufficient for describing complex soil-landscape relationships.
(Digital) Soil profile sketches
[Figure: digital sketches of soil profiles 01 to 15, each drawn with its horizon designations (A, AB, Bt, Bw, Cr, R, ...) on a depth axis from 0 to 180 cm. Profiles are ordered by relative topographic position (50 meter radius), from 98 down to −97, along a summit, backslope, swale sequence.]
Numerical soil classification
Profile j compared with profiles j+1, j+2, ..., n
→ evaluate dissimilarity metric for each depth increment
→ dissimilarity matrices for all depth increments
→ apply depth-weighting and sum dissimilarities
→ final between-profile dissimilarity matrix
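The workflow above can be sketched in a few lines of base R. This is only an illustration, not the aqp implementation: the clay values are made up, profiles are assumed to be already sampled on common depth increments, and the exponential depth weighting (with its `decay` coefficient) is an assumption chosen for the example.

```r
# Synthetic example: 3 profiles, clay (%) on four 10 cm depth increments.
# Rows are profiles, columns are depth increments (all values are made up).
clay <- rbind(
  P1 = c(10, 12, 20, 25),
  P2 = c(11, 13, 22, 24),
  P3 = c(30, 32, 35, 40)
)
depth_mid <- c(5, 15, 25, 35)   # mid-point of each increment (cm)

# Assumed depth weighting: exponential decay, so shallow layers count more.
decay <- 0.01
w <- exp(-decay * depth_mid)

# One between-profile dissimilarity matrix per depth increment.
d_by_depth <- lapply(seq_along(depth_mid), function(i) {
  as.matrix(dist(clay[, i, drop = FALSE], method = "euclidean"))
})

# Apply the depth weights and sum the matrices over all increments.
d_total <- Reduce(`+`, Map(function(d, wi) d * wi, d_by_depth, w))

# Cluster the final between-profile dissimilarity matrix into two groups.
groups <- cutree(hclust(as.dist(d_total), method = "average"), k = 2)
print(groups)
```

With real data, several properties would normally be combined at each depth increment, and the weighting function would be chosen to reflect how much the deeper horizons should matter.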
Numerical soil classification
[Figure: dendrogram (dissimilarity axis 0 to 30) built from the between-profile dissimilarity matrix, with a profile sketch under each leaf on a depth axis from 0 to 200 cm. Leaves are the sampled soils (profiles 01 to 15) and the soil survey legend profiles (16 Vista, 17 Ahwahn, 18 Chualr, 19 Tollhs), assigned to Groups 1 to 4.]
Soil data harmonisation
Harmonisation of heterogeneous data sets
Legacy data
Diversity of measurement methods
Etc.
Harmonisation of the depth support
[Figure: organic carbon (%), 0.0 to 1.5, plotted against depth (cm), 0 to 120; three panels shown in sequence.]
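One simple way to put horizon data on a common depth support is a thickness-weighted average over target intervals. The sketch below uses that approach, which may well be simpler than the method behind the figure; the horizon values, the target intervals, and the helper name `resample_interval` are all assumptions made for illustration.

```r
# Horizon data for one hypothetical profile (made-up values).
hz <- data.frame(
  top    = c(0, 12, 30, 55),
  bottom = c(12, 30, 55, 100),
  oc     = c(1.4, 0.9, 0.4, 0.2)   # organic carbon (%)
)

# Target depth support (assumed intervals, in cm).
target <- data.frame(
  top    = c(0, 5, 15, 30, 60),
  bottom = c(5, 15, 30, 60, 100)
)

# Thickness-weighted mean of one property over a target interval.
resample_interval <- function(hz, top, bottom, var) {
  # Overlap (cm) between each horizon and the target interval.
  overlap <- pmax(0, pmin(hz$bottom, bottom) - pmax(hz$top, top))
  if (sum(overlap) == 0) return(NA_real_)   # interval not covered by any horizon
  sum(hz[[var]] * overlap) / sum(overlap)
}

target$oc <- mapply(
  function(top, bottom) resample_interval(hz, top, bottom, "oc"),
  target$top, target$bottom
)
print(target)
```

Once every profile is expressed on the same intervals, values from different surveys can be compared or mapped depth by depth.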
Aggregation of soil properties
collect profiles
→ segment each profile
→ compute summary statistics by segment
→ representative depth function and characteristic variability
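The aggregation steps above can be sketched with base R alone. The collection below is synthetic, the 1 cm segment size is an assumption, and the code is an illustration of the idea rather than the aqp implementation.

```r
set.seed(1)

# Collect profiles: 5 synthetic profiles, 3 horizons each, with a clay value.
profiles <- do.call(rbind, lapply(1:5, function(i) {
  data.frame(
    id     = paste0("P", i),
    top    = c(0, 20, 50),
    bottom = c(20, 50, 100),
    clay   = c(15, 25, 35) + rnorm(3, sd = 3)
  )
}))

# Segment each profile into 1 cm slices: one row per profile and slice.
slices <- do.call(rbind, lapply(seq_len(nrow(profiles)), function(r) {
  h <- profiles[r, ]
  data.frame(id = h$id, depth = seq(h$top, h$bottom - 1), clay = h$clay)
}))

# Compute summary statistics by segment, across all profiles.
q <- aggregate(clay ~ depth, data = slices,
               FUN = function(x) quantile(x, c(0.25, 0.5, 0.75)))

# Representative depth function (median) with its variability (quartiles).
depth_function <- data.frame(depth = q$depth, q$clay)
names(depth_function) <- c("depth", "q25", "median", "q75")
head(depth_function)
```

Plotting `median` against `depth`, with `q25` and `q75` as an envelope, gives the kind of depth function summarised by the 25th percentile, median and 75th percentile.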
Aggregation of soil properties
[Figure: aggregated depth functions (25th percentile, median, 75th percentile) over 0 to 100 cm depth for two groups, Meta-basalt and Granodiorite. Panels: Sand (%), Clay (%), Total Carbon (%), CIE Redness, pH, CEC (cmol [+] / kg), Fe Crystallinity, Mn Crystallinity, Total Ca (%), Total Fe (%), Total Zr (ppm), Total Si (%).]
Pedotransfer functions
“Pedotransfer functions” are models that predict a soil attribute from one or more other soil attributes and environmental covariates.
From simple approaches such as multivariate linear models...
... to state-of-the-art machine learning algorithms (ANN, random forests, etc.)
→ R is of course a natural platform for these tasks.
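At the simple end of that range, a pedotransfer function is just a linear model. The sketch below fits one with `lm()` on synthetic data; the variables (CEC predicted from clay and organic carbon) and the coefficients used to simulate them are assumptions made for the example, not results from the talk.

```r
set.seed(42)

# Synthetic soil data set (all values simulated).
n <- 200
soil <- data.frame(
  clay = runif(n, 5, 60),    # clay content (%)
  oc   = runif(n, 0.2, 5)    # organic carbon (%)
)
# Made-up relationship used to simulate CEC (cmol(+)/kg).
soil$cec <- 2 + 0.5 * soil$clay + 3 * soil$oc + rnorm(n, sd = 2)

# Split into calibration and validation sets.
train <- soil[1:150, ]
test  <- soil[151:200, ]

# The pedotransfer function: a multivariate linear model.
ptf <- lm(cec ~ clay + oc, data = train)

# Predict on unseen samples and measure the error.
pred <- predict(ptf, newdata = test)
rmse <- sqrt(mean((test$cec - pred)^2))
print(coef(ptf))
print(rmse)
```

Swapping `lm()` for a machine learning model changes only the fitting line; the calibrate, predict and validate pattern stays the same.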
Structuring soil data
Site data (scale: 100 meters): x, y, z, slope, erosion, drainage, geology, ...; species, abundance
Horizon morphology data (scale: 50 - 200 cm): name, depths, roots, color, structure, texture, ...; feature, abundance
Laboratory data: available {Na, K, Mg, Ca}, total {Si, Al, Fe}, CEC, pH, PSD, EC, SAR, ...
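These nested levels map naturally onto two linked tables, one row per site and one row per horizon, joined by a profile identifier. The sketch below uses plain data frames with made-up values and column names to show that structure; it is an illustration, not the aqp data model.

```r
# Site-level data: one row per profile (values are made up).
site <- data.frame(
  id    = c("P001", "P002"),
  x     = c(1000, 1100),
  y     = c(2000, 2050),
  slope = c(5, 12)
)

# Horizon-level data (morphology and laboratory): one row per horizon.
horizons <- data.frame(
  id     = c("P001", "P001", "P001", "P002", "P002"),
  name   = c("A", "Bt", "Cr", "A", "R"),
  top    = c(0, 15, 60, 0, 20),
  bottom = c(15, 60, 100, 20, 50),
  clay   = c(12, 30, 18, 10, NA)
)

# Basic consistency checks between the two levels.
stopifnot(all(horizons$id %in% site$id))
stopifnot(all(horizons$top < horizons$bottom))

# One data frame of horizons per profile.
by_profile <- split(horizons, horizons$id)

# Flat view: horizon rows with their site attributes attached.
flat <- merge(horizons, site, by = "id")
print(flat)
```

Keeping the levels separate avoids repeating site attributes on every horizon row, while the shared `id` column lets them be joined on demand.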
Spatial analysis
Spatial analysis is an example where we need bindings to other packages.
Slice the soil collection on any depth interval
Cast it as a sp object (SpatialPointsDataFrame)
Apply sp/raster methods for spatial data analysis
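The slice, cast, analyse sequence above can be sketched as follows. The data and the helper `slice_at` are hypothetical; the cast uses `sp::SpatialPointsDataFrame()` and only runs if the sp package is installed.

```r
# Site coordinates and horizon data for three hypothetical profiles.
site <- data.frame(id = c("P1", "P2", "P3"),
                   x = c(0, 100, 200), y = c(0, 50, 100))
horizons <- data.frame(
  id     = rep(c("P1", "P2", "P3"), each = 2),
  top    = rep(c(0, 25), 3),
  bottom = rep(c(25, 80), 3),
  clay   = c(12, 30, 15, 28, 20, 35)
)

# Slice the collection: keep the horizon that contains the requested depth.
slice_at <- function(horizons, depth) {
  horizons[horizons$top <= depth & depth < horizons$bottom, ]
}

# Values at 30 cm depth, joined to the site coordinates.
at_30 <- merge(slice_at(horizons, 30), site, by = "id")

# Cast to an sp object, if sp is available.
if (requireNamespace("sp", quietly = TRUE)) {
  pts <- sp::SpatialPointsDataFrame(
    coords   = as.matrix(at_30[, c("x", "y")]),
    data     = at_30,
    match.ID = FALSE
  )
  print(class(pts))
}
```

From there, the resulting points can be passed to spatial methods from other packages, for example to fit one variogram per depth slice.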
[Figure: Si:Al variogram models; semivariance (0.00 to 0.30) against distance (50 to 250), one model per depth slice at 10, 20, 30, 40, 50 and 60 cm.]
S4 Classes and Methods for Soil Information
Formal class 'SoilProfileCollection' [package ".GlobalEnv"] with 4 slots
  ..@ id      : Named chr "P001"
  .. ..- attr(*, "names")= chr "id"
  ..@ depths  : int [1:6, 1:2] 0 2 14 49 57 89 2 14 49 57 ...
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : chr [1:6] "1" "2" "3" "4" ...
  .. .. ..$ : chr [1:2] "top" "bottom"
  ..@ units   : chr "cm"
  ..@ horizons:'data.frame': 6 obs. of 4 variables:
  .. ..$ texture   : Factor w/ 14 levels "C","CBVSCL","FSL",..: 8 8 8 8 9 NA
  .. ..$ stickiness: Factor w/ 4 levels "MS","SO","SS",..: NA NA NA NA NA NA
  .. ..$ plasticity: Factor w/ 4 levels "MP","PO","SP",..: NA NA NA NA NA NA
  .. ..$ field_ph  : num [1:6] 7.9 7.7 8 7.9 7.4 NA
  ..@ site    :'data.frame': 6 obs. of 2 variables:
  .. ..$ bound_topography: Factor w/ 3 levels "B","S","W": 2 2 2 2 2 NA
  .. ..$ elevation       : int [1:6] 13 7 9 14 21 NA
  ..@ spatial :'SpatialPoints'
  ..@ metadata:'data.frame'
S4 Classes and Methods for Soil Information
# Initialization
depths(spc) <- id ~ top + bottom
# Adding site data
site(spc) <- ~ slope + aspect + curvature + x + y + z
# Adding spatial coordinates (sp)
coordinates(spc) <- ~ x + y
# Accessors
units(spc)        # the units for depths
depths(spc)       # get depth matrix
depthsnames(spc)  # get names of depth columns
profile_id(spc)   # get profile IDs
horizons(spc)     # get horizon data as dataframe
# Overloads
min(spc)     # min depth within collection
max(spc)     # max depth within collection
length(spc)  # number of profiles
# Coercion
as.data.frame(spc)  # convert back to original dataframe
# dataframe-like interface
spc$property                        # read property
spc$property <- runif(length(spc))  # write property
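To give an idea of how such an interface can be built, here is a much-reduced S4 sketch. The class `MiniProfileCollection` and the constructor `mini_depths` are hypothetical names invented for this example; the real SoilProfileCollection class has more slots and methods than shown here.

```r
library(methods)

# Hypothetical, much-reduced class: horizon table plus bookkeeping slots.
setClass("MiniProfileCollection",
  representation(horizons = "data.frame", idcol = "character",
                 depthcols = "character", units = "character"),
  validity = function(object) {
    needed <- c(object@idcol, object@depthcols)
    if (!all(needed %in% names(object@horizons)))
      return("id or depth columns are missing from the horizon data")
    TRUE
  }
)

# Formula-based constructor, mimicking the `id ~ top + bottom` idiom.
mini_depths <- function(data, formula, units = "cm") {
  vars <- all.vars(formula)   # c(id, top, bottom)
  new("MiniProfileCollection", horizons = data,
      idcol = vars[1], depthcols = vars[2:3], units = units)
}

# Overloads: number of profiles and maximum depth in the collection.
setMethod("length", "MiniProfileCollection", function(x) {
  length(unique(x@horizons[[x@idcol]]))
})
setMethod("max", "MiniProfileCollection", function(x, ..., na.rm = FALSE) {
  max(x@horizons[[x@depthcols[2]]], na.rm = na.rm)
})

# Data-frame-like read access to horizon properties.
setMethod("$", "MiniProfileCollection", function(x, name) x@horizons[[name]])

df <- data.frame(id = c("P001", "P001", "P002"),
                 top = c(0, 10, 0), bottom = c(10, 40, 25),
                 ph = c(7.9, 7.7, 8.0))
spc <- mini_depths(df, id ~ top + bottom)
length(spc)   # number of profiles
max(spc)      # deepest horizon bottom
spc$ph        # horizon property
```

The validity function is what makes the formal class worthwhile: a collection whose id or depth columns are missing is rejected at construction time instead of failing later in an analysis.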
Conclusions
Soil science is changing, and needs new tools
aqp aims to provide soil scientists with an R-based pedometrics platform
R is a great platform for soil science:
The ultimate Excel-killer!
Data and covariates in the same object
Provides advanced visualisations of soil data
Huge source of regression/classification methods...
... but also lets you test, extend and create new ones
Spatial-literate environment
Supports big data (parallelisation backends)