
Large Data in MATLAB: A Seismic Data Processing Case Study

U. M. Sundar
Senior Application Engineer

© 2013 MathWorks, Inc.


Problem Statement: Scaling Up Seismic Analysis

Challenge:
– Developing a seismic analysis algorithm that can scale up to large data

Solution:
– Analyze seismic data files larger than available system memory
– Use either Kirchhoff migration or Reverse Time migration approach
– Use parallel computing or GPUs for faster processing


Challenges in working with large data

More data than available memory
– From multiple files
– From large files
– From databases
– Generated during analysis/simulations

MATLAB and toolboxes work only on arrays that are "in memory"

Multiple versions of code required for
– In-memory vs. virtual arrays
– Parallel computations
– GPUs


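How do I scale my algorithm for large data? Kirchhoff Migration

[Diagram: the Travel Time Field is derived from the Velocity Model; the Travel Time Field and the Shot Records (Field Data) feed Migrate(Shot, TravelTime), which produces the Reconstructed Image. Data sizes labeled on the diagram: 20 GB and 8 GB.]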


How do I scale my algorithm for large data? Reverse Time Migration

[Diagram: the Velocity Model and Shot Records (Field Data) feed ForwardTime(Shot(1:t), V) and ReverseTime(Shot(t:-1:1), V); Correlation(FT, RT) produces the Reconstructed Image.]

Wavefield arrays labeled on the diagram:
– FT(nz,nx,nt,nShots): 40 TB
– RT(nz,nx,nt,nShots): 40 TB
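To see how a four-dimensional wavefield such as FT(nz,nx,nt,nShots) can reach tens of terabytes, the arithmetic can be sketched as follows. The slides do not state the survey dimensions or the numeric type, so every value below is a hypothetical assumption chosen only to illustrate the order of magnitude.

```matlab
% Hypothetical survey dimensions (NOT taken from the slides).
nz     = 1000;   % depth samples
nx     = 1000;   % lateral samples
nt     = 5000;   % time steps
nShots = 1000;   % number of shots

bytesPerElement = 8;   % assumes double precision storage

% Total storage for one wavefield array such as FT or RT.
bytesPerField = nz * nx * nt * nShots * bytesPerElement;
terabytes     = bytesPerField / 1e12;   % decimal terabytes

fprintf('One wavefield array needs about %.0f TB\n', terabytes);
```

Under these assumptions a single array is far larger than the memory of any one machine, which is why the data has to be accessed and processed in parts.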


Solving Large Data Access Challenges

Create custom data import classes
– Helps with reading files in parts

Use virtual arrays to store data
– Makes hard disk space available as memory
– Allows storing more data than fits in process memory

[Workflow stage: Access (data from multiple sources)]
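The slides do not show the custom import class or the virtual-array implementation. As a stand-in, the sketch below uses the built-in memmapfile to treat a binary file on disk as an array and to process it a few traces at a time. The file layout (single-precision samples, one column per trace), the sizes, and all variable names are assumptions made for illustration.

```matlab
% Create a small binary file to stand in for a large seismic data file.
nSamples = 500;                       % samples per trace (hypothetical)
nTraces  = 64;                        % number of traces (hypothetical)
data     = single(randn(nSamples, nTraces));
fname    = [tempname '.bin'];
fid      = fopen(fname, 'w');
fwrite(fid, data, 'single');          % written column by column
fclose(fid);

% Map the file as one nSamples-by-nTraces single-precision array.
m = memmapfile(fname, 'Format', {'single', [nSamples nTraces], 'traces'});

% Work through the file in blocks of traces instead of loading it whole.
chunk       = 16;
traceEnergy = zeros(1, nTraces);
for first = 1:chunk:nTraces
    last  = min(first + chunk - 1, nTraces);
    block = double(m.Data.traces(:, first:last));   % copy only this block
    traceEnergy(first:last) = sum(block.^2, 1);
end

clear m            % release the mapping before deleting the file
delete(fname);
```

Only the indexed block is copied into the workspace on each pass, so the working set stays at the chunk size regardless of how large the file is.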


Running Algorithms on Large Data

[Diagram: the Travel Time Field and the Shot Records (Field Data) feed Migrate(Shot, TravelTime), which produces the Reconstructed Image. Data sizes labeled on the diagram: 20 GB and 8 GB.]

– Derive the travel time field using a ray tracing algorithm
– Use virtual memory to calculate the travel time section by section
– Take the shot data and the travel time and solve using finite differences
– Use parallel computing to speed up calculations
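The steps above can be organized as a loop over shots in which each iteration contributes to a shared image. The sketch below shows only that loop structure, using a parfor reduction. The function migrateShot is a hypothetical placeholder and is not a Kirchhoff kernel; the array sizes and random inputs are likewise assumptions.

```matlab
% Stand-in inputs (hypothetical sizes and random values).
nz = 50; nx = 40; nShots = 8;
rng(0);
shots      = rand(nz, nx, nShots);   % one nz-by-nx record per shot
travelTime = rand(nz, nx);           % travel time field

% Placeholder for Migrate(Shot, TravelTime). A real implementation would
% go here; this elementwise product only keeps the sketch runnable.
migrateShot = @(shot, tt) shot .* tt;

% Each shot is independent, so iterations can run on separate workers.
% "img" is a reduction variable: parfor sums the per-shot contributions.
img = zeros(nz, nx);
parfor s = 1:nShots
    img = img + migrateShot(shots(:, :, s), travelTime);
end
```

Without Parallel Computing Toolbox the parfor loop runs serially, so the same code works on a desktop and scales when workers are available.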


Scaling MATLAB Applications

[Diagram: toolboxes and blocksets alongside a pool of eight workers.]


Parallel Computing Tools Address…

Long computations (task-parallel)
– Multiple independent iterations

    parfor i = 1 : n
        % do something with i
    end

– Series of tasks: Task 1, Task 2, Task 3, Task 4

Large data problems (data-parallel)

[Figure: a large numeric array with entries running from 11 to 52.]


Parallel and Distributed Computing Products

– Desktop Computer: Parallel Computing Toolbox
– Computer Cluster: MATLAB Distributed Computing Server, Scheduler


What is a Graphics Processing Unit (GPU)?

Originally for graphics acceleration, now also used for scientific calculations

Massively parallel array of integer and floating point processors
– Typically hundreds of processors per card
– GPU cores complement CPU cores

Dedicated high-speed memory

* Parallel Computing Toolbox requires NVIDIA GPUs with Compute Capability 1.3 or greater, including NVIDIA Tesla 10-series and 20-series products. See http://www.nvidia.com/object/cuda_gpus.html for a complete listing.


Summary of Options for Targeting GPUs

1) Use GPU array interface with MATLAB built-in functions
2) Execute custom functions on elements of the GPU array
3) Invoke your CUDA kernels directly from MATLAB

[Side labels: Ease of Use is greatest at option 1; Greater Control is available toward option 3.]
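The first two options can be sketched in a few lines. This is a minimal illustration that assumes Parallel Computing Toolbox and a supported NVIDIA GPU for the GPU path; it falls back to ordinary CPU arrays when neither is available. The input vector and the custom scalar function are hypothetical.

```matlab
% Detect a usable GPU; fall back to the CPU so the sketch still runs.
try
    useGPU = gpuDeviceCount > 0;   % needs Parallel Computing Toolbox
catch
    useGPU = false;
end

x = linspace(0, 1, 1000);          % hypothetical input data
if useGPU
    xd = gpuArray(x);              % copy the data to GPU memory
else
    xd = x;
end

% Option 1: built-in functions operate directly on the array.
y1 = sqrt(xd) + cos(xd);

% Option 2: apply a custom scalar function to every element.
% (Hypothetical function, chosen only for illustration.)
y2 = arrayfun(@(v) v * v + 1, xd);

% Option 3 (not shown) would load a compiled kernel through
% parallel.gpu.CUDAKernel and launch it with feval.

if useGPU
    y1 = gather(y1);               % copy results back to host memory
    y2 = gather(y2);
end
```

The same expressions run unchanged on CPU and GPU arrays, which is what makes the first option the easiest to adopt; the later options trade that convenience for control over what runs on the device.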


Workflow: Scaling Up Seismic Analysis

Access (data from multiple sources)
– Get the velocity data
– Get the shot data
– Use virtual arrays to accommodate large data

Explore and Create
– Derive travel time of shock from velocity data
– Solve the shot data using finite differences
– Parallel computing with GPU and CPU

Share (report, application)
– Automatic publish
– Share MATLAB files

Automate


MathWorks India – Services and Offerings

Local website: www.mathworks.in
Technical Support India: www.mathworks.in/myservicerequests
Customer Service for non-technical questions: [email protected]
Application Engineering
Product Training: www.mathworks.in/training
Consulting


Training Services
Exploit the full potential of MathWorks products

Flexible delivery options:
– Public training available in several cities
– Onsite training with standard or customized courses
– Web-based training with live, interactive instructor-led courses

More than 30 course offerings:
– Introductory and intermediate training on MATLAB, Simulink, Stateflow, code generation, and Polyspace products
– Specialized courses in control design, signal processing, parallel computing, code generation, communications, financial analysis, and other areas

www.mathworks.in/training


MATLAB Central

Community for MATLAB and Simulink users
– Over 1 million visits per month

File Exchange
– Upload/download access to free files including MATLAB code, Simulink models, and documents
– Ability to rate files, comment, and ask questions
– More than 12,500 contributed files, 300 submissions per month, 50,000 downloads per month

Newsgroup
– Web forum for technical discussions about MathWorks products
– More than 300 posts per day

Blogs
– Commentary from engineers who design, build, and support MathWorks products
– Open conversation at blogs.mathworks.com

Based on February 2011 data


MathWorks India Contact Details

URL: http://www.mathworks.in
E-mail: [email protected]
Technical Support: www.mathworks.in/myservicerequests
Tel: +91-80-6632 6000
Fax: +91-80-6632 6010

MathWorks India Private Limited
Salarpuria Windsor Building, Third Floor, No.3 Ulsoor Road
Bangalore - 560042, Karnataka, India

Thank You for Attending
Talk to Us – We are Happy to Support You


Questions?