
arXiv:1906.02684v1 [eess.SP] 6 Jun 2019

Citation A. Mustafa, M. Alfarraj, and G. AlRegib, "Estimation of Acoustic Impedance from Seismic Data using Temporal Convolutional Network," Expanded Abstracts of the SEG Annual Meeting, San Antonio, TX, Sep. 15-20, 2019.

Review Date of presentation: 18 Sep 2019

Data and Code [Github Link]

Bib @incollection{amustafa2019AI,
  title={Estimation of Acoustic Impedance from Seismic Data using Temporal Convolutional Network},
  author={Mustafa, Ahmad and AlRegib, Ghassan},
  booktitle={SEG Technical Program Expanded Abstracts 2019},
  year={2019},
  publisher={Society of Exploration Geophysicists}}

Contact [email protected] OR [email protected], http://ghassanalregib.com/



Estimation of Acoustic Impedance from Seismic Data using Temporal Convolutional Network

Ahmad Mustafa*, Motaz Alfarraj, and Ghassan AlRegib

Center for Energy and Geo Processing (CeGP), Georgia Institute of Technology

SUMMARY

In exploration seismology, seismic inversion refers to the process of inferring physical properties of the subsurface from seismic data. Knowledge of physical properties can prove helpful in identifying key structures in the subsurface for hydrocarbon exploration. In this work, we propose a workflow for predicting acoustic impedance (AI) from seismic data using a network architecture based on the Temporal Convolutional Network, by posing the problem as one of sequence modeling. The proposed workflow overcomes some of the problems that other network architectures usually face, such as vanishing gradients in Recurrent Neural Networks and overfitting in Convolutional Neural Networks. The proposed workflow was used to predict AI on the Marmousi 2 dataset with an average r² coefficient of 91% on a hold-out validation set.

INTRODUCTION

The reservoir characterization workflow involves the estimation of physical properties of the subsurface, like acoustic impedance (AI), from seismic data by incorporating knowledge from well logs. However, this is an extremely challenging task in most seismic surveys due to the non-linearity of the mapping from seismic data to rock properties. Attempts have been made to estimate physical properties from seismic data using supervised machine learning algorithms, where a network is trained on pairs of seismic traces and their corresponding physical property traces from well logs. The trained network is then used to obtain a map of physical properties for the entire seismic volume.

Recently, there has been a lot of work integrating machine learning algorithms into the seismic domain (AlRegib et al., 2018). The literature shows successful applications of supervised machine learning algorithms to estimate petrophysical properties. For example, Calderón-Macías et al. (1999) used Artificial Neural Networks to predict velocity from prestack seismic gathers, Al-Anazi and Gates (2012) used Support Vector Regression to predict porosity and permeability from core and well logs, and Chaki et al. (2015) proposed novel preprocessing schemes based on algorithms like Fourier Transforms and Wavelet Decomposition before using seismic attribute data to predict well-log properties. More recently, Lipari et al. (2018) used Generative Adversarial Networks (GANs) to map migrated seismic sections to their corresponding reflectivity sections, Biswas et al. (2018) used Recurrent Neural Networks to predict stacking velocity from seismic offset gathers, Alfarraj and AlRegib (2018) used Recurrent Neural Networks to invert seismic data for petrophysical properties by modeling seismic traces and well logs as sequences, and Das et al. (2018) used Convolutional Neural Networks (CNNs) to predict P-impedance from normal-incidence seismic data.

One challenge in all supervised learning schemes is to use a network that can train well on a limited amount of training data and can also generalize beyond it. Recurrent Neural Networks (RNNs) can mitigate this problem by sharing their parameters across all time steps and by using their hidden state to capture long-term dependencies. However, they can be difficult to train because of the exploding/vanishing gradient problem. CNNs have great utility in capturing local trends in sequences, but in order to capture long-term dependencies, they need more layers (i.e., deeper networks), which in turn increases the number of learnable parameters. A network with a large number of parameters cannot be trained well on limited training examples.

In this work, we use Temporal Convolutional Networks (TCNs) to model seismic traces as sequential data. The proposed network is trained in a supervised learning scheme on seismic data and the corresponding rock property traces (from well logs). The proposed workflow encapsulates the best features of both RNNs and CNNs, as it captures long-term trends in the data without requiring a large number of learnable parameters.

TEMPORAL CONVOLUTIONAL NETWORKS

One kind of sequence modeling task is to map a given sequence of inputs {x(0), ..., x(T−1)} to a sequence of outputs {y(0), ..., y(T−1)} of the same length, where T is the total number of time steps. The core idea is that this kind of mapping, described by Equation 1, can be represented by a neural network parameterized by Θ (i.e., F_Θ).

$$y(t) = F_{\Theta}\left(x(0), \ldots, x(t)\right) \quad \forall\, t \in [0, T-1] \qquad (1)$$

Convolutional Neural Networks (CNNs) have been used extensively for sequence modeling tasks like document classification (Johnson and Zhang, 2015), machine translation (Kalchbrenner et al., 2016), audio synthesis (van den Oord et al., 2016), and language modeling (Dauphin et al., 2016). More recently, Bai et al. (2018) performed a thorough comparison of canonical RNN architectures with their simple CNN architecture, which they call the Temporal Convolutional Network (TCN), and showed that the TCN was able to convincingly outperform RNNs on various sequence modeling tasks.

The TCN is based on a series of dilated 1-D convolutions organized into temporal blocks. Each temporal block has the same basic structure: two convolution layers interspersed with weight normalization, dropout, and non-linearity layers. Figure 1 shows the organization of the various layers inside a temporal block.

The weight normalization layers reparameterize the weights of the network: each weight parameter is split into two parameters, one specifying its magnitude, and the other its direction.
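As a concrete illustration, the snippet below is a minimal sketch of this reparameterization using PyTorch's built-in weight normalization wrapper (the paper does not name a framework, so the use of PyTorch here is an assumption):

```python
import torch
import torch.nn as nn

# Weight normalization rewrites each weight tensor w as w = g * v / ||v||,
# so the magnitude g and the direction v are learned as separate parameters
# (Salimans and Kingma, 2016).
conv = nn.utils.weight_norm(nn.Conv1d(in_channels=1, out_channels=3,
                                      kernel_size=5, padding=2))

x = torch.randn(8, 1, 128)  # (batch, channels, time steps)
y = conv(x)
print(y.shape)              # torch.Size([8, 3, 128]); padding preserves length
```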


Figure 1: The structure of a Temporal Block. [Diagram: input → dilated convolution → weight norm → ReLU → dropout → dilated convolution → weight norm → ReLU → dropout; a 1×1 convolution carries the input to a skip connection that is added to the block output.]

This kind of reparameterization, as Salimans and Kingma (2016) show, helps improve convergence. The dropout layers randomly zero out layer outputs, which helps prevent overfitting. The ReLU non-linearity layers allow the network to learn more powerful representations. Each convolution layer adds padding to the input so that the output is the same size as the input. There is also a skip connection from the input to the output of each temporal block. A distinguishing feature of TCNs is their use of dilated convolutions, which allows the network to have a large receptive field, i.e., the number of samples of the input that contribute to each output sample. The dilation factor increases exponentially at each temporal block. With regular convolution layers, one would have to use a very deep network to ensure a large receptive field. Using sequential dilated convolutions, on the other hand, allows the network to look at large parts of the input without needing many layers, which enables TCNs to capture long-term trends better than RNNs. Skip connections in the TCN architecture help stabilize training for deeper networks. The concept of the receptive field sits at the core of TCNs. Smaller convolution kernels with fewer layers give the network a smaller receptive field, which allows it to capture local variations in sequential data well but fails to capture long-term trends. Larger kernels with more layers give the network a large receptive field that makes it good at capturing long-term trends, but not as good at preserving local variations, mainly because the large number of successive convolutions dilutes this information. This is also why adding skip connections to each residual block helps overcome this drawback.
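To make the block structure concrete, here is a minimal PyTorch sketch of one temporal block, assuming symmetric (rather than causal) padding to keep the output the same length as the input, and with the dilation factor supplied by the caller; the paper's exact implementation details may differ:

```python
import torch
import torch.nn as nn

class TemporalBlock(nn.Module):
    """One temporal block as in Figure 1: two weight-normalized dilated
    convolutions, each followed by ReLU and dropout, plus a 1x1-convolution
    skip connection from input to output."""
    def __init__(self, c_in, c_out, kernel_size, dilation, dropout=0.2):
        super().__init__()
        # Symmetric padding keeps the output the same length as the input.
        pad = dilation * (kernel_size - 1) // 2
        self.net = nn.Sequential(
            nn.utils.weight_norm(nn.Conv1d(c_in, c_out, kernel_size,
                                           padding=pad, dilation=dilation)),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.utils.weight_norm(nn.Conv1d(c_out, c_out, kernel_size,
                                           padding=pad, dilation=dilation)),
            nn.ReLU(),
            nn.Dropout(dropout),
        )
        # 1x1 convolution matches channel counts on the skip connection.
        self.skip = nn.Conv1d(c_in, c_out, kernel_size=1)

    def forward(self, x):
        # Residual addition of the transformed path and the skip path.
        return self.net(x) + self.skip(x)
```

With kernel size k and a dilation that doubles from block to block, the receptive field grows roughly exponentially with depth, which is the property the preceding paragraph describes.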

PROBLEM FORMULATION

Let X = [x₁, x₂, ..., x_N] be a set of post-stack seismic traces, where xᵢ is the i-th trace, and let Y = [y₁, y₂, ..., y_N] be the corresponding AI traces. A subset of X is input to the TCN in the forward propagation step, and the network predicts the corresponding AI traces. The predicted AI traces are then compared to the true traces in the training dataset. The error between them is computed and used to obtain the gradients, which in turn update the weights of the TCN in a step known as back-propagation. Repeated applications of forward propagation followed by back-propagation change the weights of the network so as to minimize the loss between the actual and predicted AI traces. We hypothesized that by treating both the stacked seismic trace x_n and the corresponding AI trace y_n as sequential data, we would be able to use the TCN architecture to learn the mapping F_Θ from seismic to AI. The training of the network can then be written mathematically as the following optimization problem:

$$\Theta^{*} = \arg\min_{\Theta} \frac{1}{N} \sum_{n=1}^{N} L\left(y_{n}, F_{\Theta}(x_{n})\right) \qquad (2)$$

where L is a distance function between the actual and predicted AI traces, F_Θ represents the forward propagation of the TCN on the input seismic trace to generate the corresponding predicted AI trace, and Θ represents the network weights.
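With L chosen as the mean squared error (the loss adopted in the next section), the objective in Equation 2 reduces to a few lines of PyTorch. This is a hedged sketch: `model` stands in for F_Θ, and the tensor names are illustrative rather than from the paper.

```python
import torch.nn.functional as F

# seismic, ai: tensors of shape (N, 1, T) holding the N training trace pairs.
# mse_loss averages over traces and time samples, matching the (1/N)-scaled
# sum in Equation 2 up to a constant factor of 1/T.
def empirical_loss(model, seismic, ai):
    pred = model(seismic)        # F_Theta(x_n) for all n in one batch
    return F.mse_loss(pred, ai)
```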

METHODOLOGY

The network architecture used is shown in Figure 2. The seismic traces are passed through a series of temporal blocks, and the output of the TCN is concatenated with the input seismic and then mapped to the predicted AI using a linear layer. As discussed earlier, with a larger kernel size and more layers, the network captures the low-frequency trend but not the high-frequency fluctuations; with a smaller kernel size and fewer layers, the network captures the high frequencies but fails to capture the smoother trend. This is also why we concatenated the original seismic directly with the output of the TCN, so that any loss of high-frequency information due to successive convolutions in the temporal blocks might be compensated for. We found that this slightly improved the quality of our results. We experimented with different kernel sizes and numbers of layers, and found that the numbers reported in Figure 2 worked best in terms of capturing both high- and low-frequency content.
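A hedged sketch of the full architecture follows, reusing the TemporalBlock sketched earlier and the channel sizes read off Figure 2; the exponentially increasing dilation (2^i at block i) is an assumption consistent with the TCN description, and the final "linear layer" is implemented as a 1×1 convolution applied per time step:

```python
import torch
import torch.nn as nn

class ImpedanceTCN(nn.Module):
    """Sketch of the Figure 2 architecture: six temporal blocks whose output
    is concatenated with the input seismic and mapped to predicted AI."""
    def __init__(self, kernel_size=5, dropout=0.2):
        super().__init__()
        channels = [1, 3, 5, 5, 5, 5, 6]  # (in, out) pairs from Figure 2
        self.tcn = nn.Sequential(*[
            TemporalBlock(channels[i], channels[i + 1], kernel_size,
                          dilation=2 ** i,  # assumed dilation schedule
                          dropout=dropout)
            for i in range(6)
        ])
        # The "linear layer (7, 1)" acts per time step on the 6 TCN channels
        # plus the 1 seismic channel, so a 1x1 convolution implements it.
        self.out = nn.Conv1d(7, 1, kernel_size=1)

    def forward(self, seismic):                     # (batch, 1, T)
        feats = self.tcn(seismic)                   # (batch, 6, T)
        feats = torch.cat([feats, seismic], dim=1)  # (batch, 7, T)
        return self.out(feats)                      # predicted AI: (batch, 1, T)
```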

Training the network

There are a total of 2721 seismic traces and corresponding AI traces from the Marmousi model, over a total length of 17000 m. We sampled both the seismic section and the model at intervals of 937 m to obtain a total of 19 training traces (≤ 1% of the total number of traces). We chose Mean Squared Error (MSE) as the loss function. Adam was used as the optimizer, with a learning rate of 0.001 and a weight decay of 0.0001. We used a dropout of 0.2, a kernel size of 5, and 6 temporal blocks. The TCN internally also uses weight normalization to improve training and speed up convergence. We trained the network for 2941 epochs, which took about 5 minutes on an NVIDIA GTX 1050 GPU. Once the network had been trained, inference on the whole seismic section was fast, taking only a fraction of a second.
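Putting the pieces together, a minimal training-loop sketch with the hyperparameters reported above (MSE loss, Adam with learning rate 0.001 and weight decay 0.0001, 2941 epochs); `train_seismic` and `train_ai` are assumed to be the 19 training trace pairs already loaded as tensors:

```python
import torch

# train_seismic, train_ai: assumed preloaded tensors of shape (19, 1, T)
# holding the training trace pairs sampled from the Marmousi model.
model = ImpedanceTCN(kernel_size=5, dropout=0.2)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=0.0001)
loss_fn = torch.nn.MSELoss()

for epoch in range(2941):
    optimizer.zero_grad()
    pred = model(train_seismic)     # forward propagation
    loss = loss_fn(pred, train_ai)  # MSE between predicted and true AI
    loss.backward()                 # back-propagation of gradients
    optimizer.step()                # weight update
```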


Figure 2: TCN architecture for predicting AI. [Diagram: the input seismic passes through a series of 6 temporal blocks, with the (input, output) channels of each specified in parentheses: (1,3), (3,5), (5,5), (5,5), (5,5), and (5,6); the TCN output is concatenated with the input seismic and mapped to the predicted AI by a linear layer (7,1).]

RESULTS AND DISCUSSION

Figure 3 shows the predicted and actual AI, along with the absolute difference between the two. The predicted and actual AI sections show a high degree of visual similarity, and the TCN is able to delineate most of the major structures. The difference image also shows that most of the discrepancy lies at the edge boundaries, because of sudden transitions in AI that the network is not able to predict accurately.

Figure 3: Comparison of the predicted and true acoustic impedance sections of the Marmousi 2 model, along with the absolute difference: (a) predicted AI, (b) true AI, (c) absolute difference.

We also show traces at 3400 m, 6800 m, 10200 m, and 13600 m in Figure 4. As can be seen, the true and estimated AI traces at each location agree with each other to a large extent.

Figure 5 shows a scatter plot of the true and estimated AI. The scatter plot shows that there is a strong linear correlation between the true and estimated AI sections.

Figure 4: Comparison of the predicted and true acoustic impedance traces at selected locations along the horizontal axis.

For a quantitative evaluation of the results, we computed the Pearson's correlation coefficient (PCC) and the coefficient of determination between the estimated and true AI traces. PCC is a measure of the overall linear correlation between two traces. The coefficient of determination (r²) is a measure of the goodness of fit between two traces. The averaged values are shown in Table 1 for the training dataset and for the entire section (labeled as validation data). As can be seen, both the training and validation traces report high values for the PCC and r² coefficients, which confirms that the network was able to learn to predict AI from seismic traces and to generalize beyond the training data.
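For reference, the two metrics can be computed per trace and averaged as in the sketch below, using scipy and scikit-learn; this is a plausible but assumed implementation, and `true_ai` and `pred_ai` are illustrative names for arrays of shape (num_traces, num_samples):

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import r2_score

def average_metrics(true_ai, pred_ai):
    """Average PCC and r^2 over corresponding pairs of AI traces."""
    pcc = np.mean([pearsonr(t, p)[0] for t, p in zip(true_ai, pred_ai)])
    r2 = np.mean([r2_score(t, p) for t, p in zip(true_ai, pred_ai)])
    return pcc, r2
```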

Figure 5: Scatter plot of the true and estimated AI.

Metric   Training   Validation
PCC      0.96       0.96
r²       0.91       0.91

Table 1: Performance metrics for both the training and validation datasets.

CONCLUSION

In this work, we proposed a novel scheme for predicting acoustic impedance from seismic data using a Temporal Convolutional Network. The results were demonstrated on the Marmousi 2 model. The proposed workflow was trained on 19 training traces, and was then used to predict acoustic impedance for the entire Marmousi model. Quantitative evaluation of the predicted AI (PCC ≈ 0.96 and r² ≈ 0.91) shows great promise of the proposed workflow for acoustic impedance prediction. Even though the proposed workflow has been used for AI estimation in this paper, it can be used to predict other properties as well. Indeed, Temporal Convolutional Networks can be adapted to any problem that requires mapping one sequence to another.

ACKNOWLEDGEMENTS

This work is supported by the Center for Energy and Geo Processing (CeGP) at Georgia Institute of Technology and King Fahd University of Petroleum and Minerals (KFUPM).


REFERENCES

Al-Anazi, A., and I. Gates, 2012, Support vector regression to predict porosity and permeability: Effect of sample size: Computers & Geosciences, 39, 64–76.

Alfarraj, M., and G. AlRegib, 2018, Petrophysical property estimation from seismic data using recurrent neural networks, in SEG Technical Program Expanded Abstracts 2018: Society of Exploration Geophysicists, 2141–2146.

AlRegib, G., M. Deriche, Z. Long, H. Di, Z. Wang, Y. Alaudah, M. A. Shafiq, and M. Alfarraj, 2018, Subsurface structure analysis using computational interpretation and learning: A visual signal processing perspective: IEEE Signal Processing Magazine, 35, 82–98.

Bai, S., J. Z. Kolter, and V. Koltun, 2018, An empirical evaluation of generic convolutional and recurrent networks for sequence modeling: CoRR, abs/1803.01271.

Biswas, R., A. Vassiliou, R. Stromberg, and M. Sen, 2018, Stacking velocity estimation using recurrent neural network, in SEG Technical Program Expanded Abstracts 2018: Society of Exploration Geophysicists, 2241–2245.

Calderón-Macías, C., M. Sen, and P. Stoffa, 1999, Automatic NMO correction and velocity estimation by a feedforward neural network: Geophysics, 63.

Chaki, S., A. Routray, and W. K. Mohanty, 2015, A novel preprocessing scheme to improve the prediction of sand fraction from seismic attributes using neural networks: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 8, 1808–1820.

Das, V., A. Pollack, U. Wollner, and T. Mukerji, 2018, Convolutional neural network for seismic impedance inversion, in SEG Technical Program Expanded Abstracts 2018: Society of Exploration Geophysicists, 2071–2075.

Dauphin, Y. N., A. Fan, M. Auli, and D. Grangier, 2016, Language modeling with gated convolutional networks: CoRR, abs/1612.08083.

Johnson, R., and T. Zhang, 2015, Semi-supervised convolutional neural networks for text categorization via region embedding, in Advances in Neural Information Processing Systems 28: Curran Associates, Inc., 919–927.

Kalchbrenner, N., L. Espeholt, K. Simonyan, A. van den Oord, A. Graves, and K. Kavukcuoglu, 2016, Neural machine translation in linear time: CoRR, abs/1610.10099.

Lipari, V., F. Picetti, P. Bestagini, and S. Tubaro, 2018, A generative adversarial network for seismic imaging applications: Presented at the .

Salimans, T., and D. P. Kingma, 2016, Weight normalization: A simple reparameterization to accelerate training of deep neural networks: CoRR, abs/1602.07868.

van den Oord, A., S. Dieleman, H. Zen, K. Simonyan, O. Vinyals, A. Graves, N. Kalchbrenner, A. W. Senior, and K. Kavukcuoglu, 2016, WaveNet: A generative model for raw audio: CoRR, abs/1609.03499.