Deep learning seismic tomography
Mauricio Araya-Polo (Shell), Joseph Jennings* (Shell, Stanford) and Stuart Farris (Stanford)
March 30, 2018
Velocity model building (tomography)
• Estimate subsurface wave speed from seismic data
• Arguably the most important and difficult task in exploration geophysics
• Common approaches:
  1. Refraction/reflection tomography
  2. Full waveform inversion
Formulation of tomography problem
$$J(m) = \frac{1}{2}\,\|f(m) - d_{obs}\|_2^2$$
• $m$ - velocity model
• $d_{obs}$ - recorded data
• $f(m)$ - predicted data
• $f(m)$ models synthetic data via the wave/Eikonal equation
• Solved using gradient-based optimization techniques
• Computationally demanding
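As a rough illustration of minimizing $J(m)$ with a gradient-based method, the sketch below swaps the wave/Eikonal modeling for a toy linear forward operator, $f(m) = Gm$. The matrix `G`, the values in `m_true`, and the function names are assumptions made for this example only; a real tomography problem is nonlinear and far more expensive.

```python
import numpy as np

def misfit(G, m, d_obs):
    """J(m) = 0.5 * ||f(m) - d_obs||_2^2 with the toy forward model f(m) = G @ m."""
    r = G @ m - d_obs
    return 0.5 * float(r @ r)

def gradient_descent(G, d_obs, m0, n_iter=2000):
    """Minimize J(m) by steepest descent; the gradient is G^T (G m - d_obs)."""
    # Step size 1/L, where L = ||G||_2^2 bounds the curvature of J.
    step = 1.0 / np.linalg.norm(G, 2) ** 2
    m = m0.copy()
    for _ in range(n_iter):
        m -= step * (G.T @ (G @ m - d_obs))
    return m

rng = np.random.default_rng(0)
G = rng.uniform(0.0, 1.0, size=(40, 5))          # hypothetical ray-path lengths (40 rays, 5 cells)
m_true = np.array([0.5, 0.4, 0.33, 0.25, 0.2])   # hypothetical slowness values
d_obs = G @ m_true                               # noise-free "recorded" data
m0 = np.full(5, 0.3)                             # starting model
m_est = gradient_descent(G, d_obs, m0)
print(misfit(G, m0, d_obs), misfit(G, m_est, d_obs))
```

Each iteration here costs two small matrix products. In practice every evaluation of $f(m)$ is a wave-equation or Eikonal solve, which is where the computational demand comes from.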
Observed data ($d_{obs}$)
Velocity model ($m$) from tomography (Almomin, 2016)
Another perspective on tomography
$f(m)$ is known deterministically
Statistical learning
$$f : X \to Y$$
• $X$ - random input vector (observed data)
• $Y$ - random output variable (velocity models)
• $L(Y, f(X))$ - loss function; $f$ is chosen as $\arg\min_f L(Y, f(X))$
• Squared error loss: $L(Y, f(X)) = (Y - f(X))^2 \Rightarrow f(x) = E(Y \mid X = x)$
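The link between squared error loss and the conditional expectation can be checked numerically. The sketch below is a toy experiment, not part of the talk: it assumes a discrete input `x`, a noisy output with $E(Y \mid X = x) = 2x$, and a grid search over constant predictions. For each value of `x`, the constant that minimizes the average squared error should land on the sample mean of `y` for that `x`, up to the grid spacing.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.integers(0, 3, size=20000)                 # discrete input with values 0, 1, 2
y = 2.0 * x + rng.normal(0.0, 1.0, size=x.size)    # noisy output, E(Y | X = x) = 2x

candidates = np.linspace(-2.0, 6.0, 161)           # constant predictions, spacing 0.05
best_constants, cond_means = [], []
for value in range(3):
    y_given_x = y[x == value]
    # Average squared error of each candidate prediction c for this value of x
    risk = np.array([np.mean((y_given_x - c) ** 2) for c in candidates])
    best_constants.append(float(candidates[np.argmin(risk)]))
    cond_means.append(float(y_given_x.mean()))

for value, (b, c) in enumerate(zip(best_constants, cond_means)):
    print(f"x={value}: best constant {b:.2f}, sample conditional mean {c:.3f}")
```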
A different perspective on tomography
Estimated via statistical learning methods
Deep learning as a seismic imaging tool
• 2D fault prediction from shot gathers – (Zhang, C. et al., 2014)
• 3D fault prediction from shot gathers – (Araya-Polo, M. et al., 2017)
Why deep learning for velocity estimation?
• Potential for less computational burden
• Automated (no human-curated analysis of gathers)
Deep learning training workflow
[Figure: training workflow relating features ($X$) and labels ($Y$) through $\arg\min_f L(Y, f(X))$]
Features for deep-learning tomography
Choosing the feature
• Suspected poor performance on raw data and dimensionality issues
• Many seismic attributes from which to choose
• Cheap to compute
Semblance (velocity spectrum)
• A basic velocity analysis tool (Taner and Koehler, 1969)
• Contains “apparent velocity” information
• Cheap to compute
Earth Model
Seismic Experiment
Common midpoint gather (CMP)
Synthetic CMP (muted direct arrival)
• Data redundancy
Synthetic CMP (NMO hyperbola)
NMO-corrected CMP
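To make the NMO correction concrete, here is a minimal sketch under stated assumptions: a single reflection with hyperbolic moveout $t(x) = \sqrt{t_0^2 + x^2 / v^2}$, a Gaussian pulse standing in for the wavelet, and linear interpolation between samples. The function names, the offsets, and the 2000 m/s velocity are illustrative choices, not values from the talk.

```python
import numpy as np

def synthetic_cmp(offsets, dt, n_t, t0_event, velocity, sigma=0.02):
    """One reflection with hyperbolic moveout; a Gaussian pulse stands in for the wavelet."""
    t = np.arange(n_t) * dt
    t_event = np.sqrt(t0_event ** 2 + (offsets / velocity) ** 2)
    return np.exp(-((t[:, None] - t_event[None, :]) / sigma) ** 2)

def nmo_correct(cmp_gather, offsets, dt, velocity):
    """Map each trace from recorded time t(x) back to zero-offset time t0."""
    n_t = cmp_gather.shape[0]
    t0 = np.arange(n_t) * dt
    corrected = np.zeros_like(cmp_gather)
    for k, x in enumerate(offsets):
        t_x = np.sqrt(t0 ** 2 + (x / velocity) ** 2)   # travel time at offset x
        # Linear interpolation; samples beyond the recording length are set to zero
        corrected[:, k] = np.interp(t_x, t0, cmp_gather[:, k], left=0.0, right=0.0)
    return corrected

dt, n_t = 0.004, 500
offsets = np.arange(0.0, 2001.0, 100.0)                      # 21 offsets in metres
gather = synthetic_cmp(offsets, dt, n_t, t0_event=0.8, velocity=2000.0)
flat = nmo_correct(gather, offsets, dt, velocity=2000.0)      # trial velocity matches: event flattens
residual = nmo_correct(gather, offsets, dt, velocity=3000.0)  # trial velocity too high: moveout remains
print(np.argmax(flat, axis=0))      # peak time index per offset
print(np.argmax(residual, axis=0))
```

With the matching velocity the event peaks at the same time sample on every trace. With the wrong velocity the peak drifts with offset, and that drift is the information a velocity scan exploits.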
Semblance
• $q$ - NMO-corrected image for a particular velocity, with samples $q[j, k]$
• $j$ - time index
• $k$ - offset index
• Output index
• Length of smoothing window
• Computation: stack over offset, smooth along time, then normalize
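The three steps (stack over offset, smooth along time, normalize) can be sketched as follows. The slide formula did not survive extraction, so this uses the commonly quoted form $s[i] = \sum_{j=i-M}^{i+M} \big(\sum_k q[j,k]\big)^2 \big/ \big(N \sum_{j=i-M}^{i+M} \sum_k q[j,k]^2\big)$, where the output index $i$, the half-window $M$ (`half_window`), and the number of offsets $N$ are assumed names. The synthetic gather and the velocity range are also assumptions for illustration.

```python
import numpy as np

def nmo_correct(gather, offsets, dt, velocity):
    """NMO-correct a CMP gather for one trial velocity using linear interpolation."""
    t0 = np.arange(gather.shape[0]) * dt
    out = np.zeros_like(gather)
    for k, x in enumerate(offsets):
        t_x = np.sqrt(t0 ** 2 + (x / velocity) ** 2)
        out[:, k] = np.interp(t_x, t0, gather[:, k], left=0.0, right=0.0)
    return out

def semblance(q, half_window):
    """Semblance trace for one NMO-corrected image q[j, k] (j: time, k: offset)."""
    n_offsets = q.shape[1]
    box = np.ones(2 * half_window + 1)
    stack_energy = np.sum(q, axis=1) ** 2                # stack over offset, then square
    total_energy = np.sum(q ** 2, axis=1)                # energy of the unstacked traces
    num = np.convolve(stack_energy, box, mode="same")    # smooth along time
    den = n_offsets * np.convolve(total_energy, box, mode="same")
    return num / (den + 1e-12)                           # normalize; epsilon avoids 0/0

def velocity_spectrum(gather, offsets, dt, velocities, half_window=5):
    """One semblance column per trial velocity; output shape is (n_t, n_velocities)."""
    columns = [semblance(nmo_correct(gather, offsets, dt, v), half_window)
               for v in velocities]
    return np.stack(columns, axis=1)

# Synthetic CMP: one reflection at t0 = 0.8 s with a true velocity of 2000 m/s
dt, n_t = 0.004, 500
offsets = np.arange(0.0, 2001.0, 100.0)
t = np.arange(n_t) * dt
t_event = np.sqrt(0.8 ** 2 + (offsets / 2000.0) ** 2)
gather = np.exp(-((t[:, None] - t_event[None, :]) / 0.02) ** 2)

velocities = np.arange(1500.0, 3001.0, 100.0)
spectrum = velocity_spectrum(gather, offsets, dt, velocities)
print(velocities[np.argmax(spectrum[200])])   # best trial velocity at t0 = 0.8 s
```

By the Cauchy-Schwarz inequality the ratio stays between 0 and 1, reaching 1 only when all traces agree inside the window. That is why the panel peaks where the trial velocity flattens the event.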
Semblance (velocity spectrum)
Training data
• $X$ (feature)
• $Y$ (label)
Deep neural network training
Training the network
• 10,000 synthetic pseudorandom velocity models
  - 80% training, 20% testing
• Squared error loss: $L(Y, f(X)) = (Y - f(X))^2$
• Metrics: structural similarity index (SSIM) and $R^2$ score
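The data split and the loss can be sketched in a few lines. This is only an assumed bookkeeping scheme: the talk does not say how the split was drawn, so the random permutation, the seed, and the averaging of the squared error over samples are all illustrative choices, and the function names are hypothetical.

```python
import numpy as np

def train_test_split_indices(n_samples, train_fraction=0.8, seed=0):
    """Shuffle sample indices and split them into training and testing sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(train_fraction * n_samples))
    return order[:n_train], order[n_train:]

def squared_error_loss(y_true, y_pred):
    """Squared error (Y - f(X))^2, averaged over all samples (averaging is an assumption)."""
    return float(np.mean((y_true - y_pred) ** 2))

train_idx, test_idx = train_test_split_indices(10_000)
print(len(train_idx), len(test_idx))
```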
SSIM
• Compares mean and standard deviation within window
• Mean SSIM (MSSIM): average of SSIM for all windows
• MSSIM $\le 1$
• Example: $x$ compared with $y_1$ gives MSSIM = 0.14
• Example: $x$ compared with $y_2$ gives MSSIM = 0.84
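A simplified MSSIM can be written directly from the window statistics. The sketch below makes several assumptions that may differ from what was used in the talk: non-overlapping square windows instead of a sliding Gaussian window, the conventional stabilizing constants $C_1 = (0.01 R)^2$ and $C_2 = (0.03 R)^2$ for data range $R$, and a made-up layered velocity model as the test image.

```python
import numpy as np

def ssim_window(x, y, data_range):
    """SSIM of two equally sized windows from their means, variances and covariance."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = np.mean((x - mu_x) * (y - mu_y))
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

def mssim(x, y, data_range, window=8):
    """Mean SSIM over non-overlapping windows (a simplification of the sliding window)."""
    scores = []
    for i in range(0, x.shape[0] - window + 1, window):
        for j in range(0, x.shape[1] - window + 1, window):
            scores.append(ssim_window(x[i:i + window, j:j + window],
                                      y[i:i + window, j:j + window], data_range))
    return float(np.mean(scores))

rng = np.random.default_rng(0)
# Hypothetical layered velocity model (m/s) and a noisy copy of it
x = np.repeat(np.linspace(1500.0, 4500.0, 8), 8)[:, None] * np.ones((1, 64))
y_noisy = x + rng.normal(0.0, 100.0, size=x.shape)
print(mssim(x, x, data_range=3000.0), mssim(x, y_noisy, data_range=3000.0))
```

An image compared with itself scores 1, and any mismatch in local mean, spread, or structure pulls the score down.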
Training results
Deep learning testing workflow
Test set (20% of the total data)
Deep learning testing results
• MSSIM = 0.66170
• MSSIM = 0.72079
• MSSIM = 0.78219
• MSSIM = 0.76253
FWI results
• MSSIM = 0.49666
Multiscale FWI results
• MSSIM = 0.84702
DNN vs multiscale FWI results
• DNN: MSSIM = 0.66170
• Multiscale FWI: MSSIM = 0.84702
Conclusions/future work
• We estimated a tomography operator via deep learning
• Semblance was the input feature
• Extend the method to 3D data
• Test on real data
Questions?
Coefficient of determination ($R^2$)
$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$$
where
• $SS_{res} = \sum_{i=1}^{m} (y_i - f(x_i))^2$ - error in prediction
• $SS_{tot} = \sum_{i=1}^{m} (y_i - \mu_y)^2$ - variance in the labels
• $R^2 = 1 \Rightarrow$ zero error (we fit the data perfectly)
• $R^2 = 0 \Rightarrow$ no learning occurred. We predict $\mu_y$
• $R^2 < 0 \Rightarrow$ we perform worse than just predicting $\mu_y$
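The three cases can be reproduced with a short function. The label values below are invented for illustration; only the formula $R^2 = 1 - SS_{res} / SS_{tot}$ comes from the slides.

```python
import numpy as np

def r2_score(y, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_pred) ** 2)        # error in prediction
    ss_tot = np.sum((y - np.mean(y)) ** 2)    # variance in the labels (unnormalized)
    return float(1.0 - ss_res / ss_tot)

y = np.array([1.5, 2.0, 2.5, 3.0, 4.0])               # made-up labels
perfect = r2_score(y, y)                              # prediction equals the labels
mean_only = r2_score(y, np.full_like(y, y.mean()))    # always predict the label mean
worse = r2_score(y, np.full_like(y, 10.0))            # a constant far from the labels
print(perfect, mean_only, worse)
```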
Deep learning testing results