Deep learning seismic tomography

Mauricio Araya-Polo (Shell), Joseph Jennings* (Shell, Stanford) and Stuart Farris (Stanford)

sesaai.stanford.edu

March 30, 2018

Velocity model building (tomography)

• Estimate subsurface wave speed from seismic data

• Arguably the most important and difficult task in exploration geophysics

• Common approaches:
  1. Refraction/reflection tomography
  2. Full waveform inversion

Formulation of tomography problem

$$J(m) = \frac{1}{2}\,\|f(m) - d_{obs}\|_2^2$$

• $m$ - velocity model

• $d_{obs}$ - recorded data

• $f(m)$ - predicted data

• $f(m)$ models synthetic data via the wave/Eikonal equation

• Solved using gradient-based optimization techniques

• Computationally demanding
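As a rough illustration of the objective $J(m)$ and its gradient-based minimization, here is a minimal sketch. The linear operator `G` is a hypothetical stand-in for the wave/Eikonal forward model (which is nonlinear and far more expensive), and the sizes, step length and iteration count are assumptions chosen only to keep the example small.

```python
import numpy as np

def misfit(m, forward, d_obs):
    """J(m) = 0.5 * ||f(m) - d_obs||_2^2."""
    residual = forward(m) - d_obs
    return 0.5 * float(residual @ residual)

# Hypothetical toy problem: f(m) = G m replaces a real wave/Eikonal solver.
rng = np.random.default_rng(0)
G = rng.normal(size=(20, 5))
m_true = rng.normal(size=5)
d_obs = G @ m_true                      # "recorded" data from the true model

def forward(m):
    return G @ m                        # predicted data f(m)

def gradient(m):
    # For a linear forward operator, grad J(m) = G^T (G m - d_obs).
    return G.T @ (forward(m) - d_obs)

m = np.zeros(5)                         # starting model
step = 1.0 / np.linalg.norm(G, 2) ** 2  # 1 / largest eigenvalue of G^T G
history = [misfit(m, forward, d_obs)]
for _ in range(500):
    m = m - step * gradient(m)          # plain gradient descent update
    history.append(misfit(m, forward, d_obs))
```

In a real tomography or FWI code each call to `forward` and `gradient` requires wave-equation or ray-tracing solves, which is where the computational cost comes from.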

Observed data ($d_{obs}$)

Velocity model ($m$) from tomography (Almomin, 2016)

Another perspective on tomography

$f(m)$ is known deterministically

Statistical learning

$f : X \to Y$

• $X$ - random input vector (observed data)

• $Y$ - random output variable (velocity models)

$$\arg\min_f L(Y, f(X))$$

• $L(Y, f(X))$ - loss function

• $L(Y, f(X)) = (Y - f(X))^2$ (squared error loss)

$$\Rightarrow f(x) = E(Y \mid X = x)$$
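The claim that squared error loss leads to the conditional mean can be checked numerically. The sketch below uses made-up toy data (a discrete `x` and a noisy `y`, both assumptions for illustration only) and compares the average loss of the empirical conditional mean with that of a slightly shifted predictor.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical toy data: X takes the values 0, 1, 2 and Y = X^2 plus noise.
x = rng.integers(0, 3, size=30_000)
y = x.astype(float) ** 2 + rng.normal(scale=0.5, size=x.size)

# Empirical conditional mean E(Y | X = v) for each value v of X.
cond_mean = np.array([y[x == v].mean() for v in range(3)])

def mean_squared_error(table):
    """Average squared error loss of the predictor f(x) = table[x]."""
    return float(np.mean((y - table[x]) ** 2))

best = mean_squared_error(cond_mean)           # loss of the conditional mean
shifted = mean_squared_error(cond_mean + 0.1)  # loss of a perturbed predictor
```

Here `shifted` exceeds `best`, and the same holds for any other perturbation of the lookup table on this data set.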

A different perspective on tomography

Estimated via statistical learning methods

Deep learning as a seismic imaging tool

• 2D fault prediction from shot gathers – (Zhang, C. et al., 2014)

• 3D fault prediction from shot gathers – (Araya-Polo, M. et al., 2017)

Why deep learning for velocity estimation?

• Potential for less computational burden

• Automated (no human-curated analysis of gathers)

Deep learning training workflow

[Figure: training workflow with labels ($Y$), features ($X$) and the fit $\arg\min_f L(Y, f(X))$]

Features for deep-learning tomography

Choosing the feature

• Suspected poor performance on raw data and dimensionality issues

• Many seismic attributes from which to choose

• Cheap to compute


Semblance (velocity spectrum)

• A basic velocity analysis tool (Taner and Koehler, 1969)

• Contains “apparent velocity” information

• Cheap to compute


Earth Model

Seismic Experiment

Common midpoint gather (CMP)

Synthetic CMP (muted direct arrival)

• Data redundancy

Synthetic CMP (NMO hyperbola)

NMO-corrected CMP
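A minimal sketch of the NMO correction may help here. It assumes the standard hyperbolic moveout $t(x) = \sqrt{t_0^2 + x^2/v^2}$ for a single flat reflector. The function name `nmo_correct`, the synthetic gather and all numerical values (sampling, offsets, velocity, wavelet width) are hypothetical choices for illustration, not taken from the slides.

```python
import numpy as np

def nmo_correct(gather, offsets, dt, velocity):
    """Flatten hyperbolic events: q[j, k] = gather(sqrt(t0_j^2 + x_k^2 / v^2), x_k)."""
    n_t = gather.shape[0]
    t0 = np.arange(n_t) * dt                      # zero-offset times
    q = np.zeros_like(gather)
    for k, x in enumerate(offsets):
        t = np.sqrt(t0 ** 2 + (x / velocity) ** 2)  # moveout time at offset x
        # Linear interpolation of trace k at the moveout times; zero outside the record.
        q[:, k] = np.interp(t, t0, gather[:, k], left=0.0, right=0.0)
    return q

# Hypothetical synthetic CMP: one reflection with t0 = 0.8 s in a 2000 m/s medium.
dt, n_t = 0.004, 500
offsets = np.linspace(0.0, 2000.0, 21)
t = np.arange(n_t) * dt
t_event = np.sqrt(0.8 ** 2 + (offsets / 2000.0) ** 2)
gather = np.exp(-(((t[:, None] - t_event[None, :]) / 0.02) ** 2))

q = nmo_correct(gather, offsets, dt, velocity=2000.0)
stack = q.sum(axis=1)                             # stack over offset
```

With the correct velocity the event in `q` is flat near 0.8 s at every offset, so the stack adds up coherently. A wrong trial velocity leaves residual curvature and a weaker stack.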


[Figure sequence: the NMO-corrected gather $q[j, k]$ is stacked over offset, then smoothed along time]

Semblance

• $q[j, k]$ - NMO-corrected image for a particular velocity

• $j$ - time index

• $k$ - offset index

• output index

• length of smoothing window

The semblance formula has three parts: stack over offset, smooth in time, and normalization.

Semblance (velocity spectrum)
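The three steps named above (stack over offset, smooth in time, normalization) can be sketched as follows. This uses the common textbook form of semblance, the smoothed squared stack divided by the number of traces times the smoothed trace energy. The exact formula on the slide did not survive, so the boxcar window, the `half_window` parameter and the small `eps` are assumptions.

```python
import numpy as np

def semblance(q, half_window, eps=1e-12):
    """Semblance trace of an NMO-corrected gather q[j, k] (time j, offset k)."""
    n_offsets = q.shape[1]
    stack_energy = q.sum(axis=1) ** 2        # stack over offset, then square
    total_energy = (q ** 2).sum(axis=1)      # energy summed over offset
    box = np.ones(2 * half_window + 1)       # boxcar smoothing window in time
    numerator = np.convolve(stack_energy, box, mode="same")
    denominator = n_offsets * np.convolve(total_energy, box, mode="same")
    return numerator / (denominator + eps)   # normalization to the range [0, 1]

rng = np.random.default_rng(0)
n_t, n_x = 200, 16
flat = np.zeros((n_t, n_x))
flat[100, :] = 1.0                           # perfectly flattened event
noise = rng.normal(size=(n_t, n_x))          # incoherent gather

s_flat = semblance(flat, half_window=5)
s_noise = semblance(noise, half_window=5)
```

Repeating this for a range of trial NMO velocities and placing the resulting traces side by side gives the velocity spectrum used as the input feature.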

Training data

[Figure: a training pair, $X$ (feature) and $Y$ (label)]

Deep neural network training

Training the network

• 10,000 synthetic pseudorandom velocity models
  - 80% training, 20% testing

• Squared error loss: $L(Y, f(X)) = (Y - f(X))^2$

• Metrics: structural similarity index (SSIM) and $R^2$ score
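For concreteness, the 80/20 split and the squared error loss could be set up along these lines. The random shuffling, the fixed seed and the averaging of the loss over samples are assumptions. The slides state only the model count, the split ratio and the loss.

```python
import numpy as np

n_models = 10_000                        # number of velocity models, from the slide
rng = np.random.default_rng(0)           # fixed seed is an assumption, for repeatability
order = rng.permutation(n_models)        # shuffle model indices before splitting
n_train = n_models * 8 // 10             # 80% for training
train_idx, test_idx = order[:n_train], order[n_train:]

def squared_error_loss(y, y_pred):
    """Squared error loss (Y - f(X))^2, averaged over samples (averaging is an assumption)."""
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y - y_pred) ** 2))
```

Keeping `test_idx` untouched during training is what makes the later test-set MSSIM values a fair measure of generalization.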

SSIM

• Compares mean and standard deviation within window

• Mean SSIM (MSSIM): average of SSIM for all windows

• MSSIM $\le 1$

[Figure: $x$ compared with $y_1$, MSSIM = 0.14]

[Figure: $x$ compared with $y_2$, MSSIM = 0.84]
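A minimal sketch of MSSIM, assuming the widely used SSIM definition built from window means, variances and covariance with the usual stabilizing constants. The non-overlapping square windows, the window size and the `data_range` value are simplifying assumptions. The implementation behind the numbers on these slides may differ (for example, sliding Gaussian windows).

```python
import numpy as np

def ssim_window(x, y, data_range, k1=0.01, k2=0.03):
    """SSIM of two equally sized windows from their means, variances and covariance."""
    c1, c2 = (k1 * data_range) ** 2, (k2 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

def mssim(x, y, window=8, data_range=1.0):
    """Mean SSIM over non-overlapping square windows (window layout is an assumption)."""
    scores = []
    for i in range(0, x.shape[0] - window + 1, window):
        for j in range(0, x.shape[1] - window + 1, window):
            scores.append(ssim_window(x[i:i + window, j:j + window],
                                      y[i:i + window, j:j + window], data_range))
    return float(np.mean(scores))

rng = np.random.default_rng(0)
x = rng.random((64, 64))                                           # reference image
y_close = np.clip(x + 0.01 * rng.normal(size=x.shape), 0.0, 1.0)   # small perturbation
y_far = rng.random((64, 64))                                       # unrelated image
```

On this toy data `mssim(x, x)` is 1 up to rounding, `mssim(x, y_close)` stays close to 1, and `mssim(x, y_far)` is much lower, which mirrors the contrast between the two comparisons on the slides.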

Training results

Deep learning testing workflow

Test set (20% of the total data)

Deep learning testing results

MSSIM = 0.66170

MSSIM = 0.72079

FWI results

MSSIM = 0.49666

MSSIM = 0.66170

Multiscale FWI results

MSSIM = 0.84702

DNN vs multiscale FWI results

DNN: MSSIM = 0.66170; multiscale FWI: MSSIM = 0.84702

Conclusions/future work

• We estimated a tomography operator via deep learning

• Semblance was the input feature

• Extend the method to 3D data

• Test on real data


Questions?

Coefficient of determination ($R^2$)

$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$$

where

$$SS_{tot} = \sum_{i=1}^{m} (y_i - \mu_y)^2 \quad \text{(variance in the labels)}$$

$$SS_{res} = \sum_{i=1}^{m} (y_i - f(x_i))^2 \quad \text{(error in prediction)}$$

• $R^2 = 1 \Rightarrow$ zero error (we fit the data perfectly)

• $R^2 = 0 \Rightarrow$ no learning occurred. We predict $\mu_y$

• $R^2 < 0 \Rightarrow$ we perform worse than just predicting $\mu_y$
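The three cases can be reproduced with a few lines. This is a minimal sketch of the $R^2$ definition above. The function name `r2_score` and the small label vector are illustrative choices.

```python
import numpy as np

def r2_score(y, y_pred):
    """Coefficient of determination R^2 = 1 - SS_res / SS_tot."""
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y - y_pred) ** 2)       # error in prediction
    ss_tot = np.sum((y - y.mean()) ** 2)     # spread of the labels about their mean
    return float(1.0 - ss_res / ss_tot)

y = np.array([1.0, 2.0, 3.0, 4.0])
perfect = r2_score(y, y)                            # zero error gives R^2 = 1
mean_only = r2_score(y, np.full_like(y, y.mean()))  # predicting mu_y gives R^2 = 0
worse = r2_score(y, y[::-1])                        # worse than mu_y gives R^2 < 0
```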

