Managing Oilfield Data Management
Steve Balough, Vastar Resources, Houston, Texas, USA
Peter Betts, Shell Internationale Petroleum Maatschappij BV, The Hague, The Netherlands
Jack Breig, Bruno Karcher, Petrotechnical Open Software Corporation, Houston, Texas
André Erlich, Corte Madera, California, USA
Jim Green, Unocal, Sugar Land, Texas
Paul Haines, Ken Landgren, Rich Marsden, Dwight Smith, Houston, Texas
John Pohlman, Pohlman & Associates, Inc., Houston, Texas
Wade Shields, Mobil Exploration and Producing Technical Center, Dallas, Texas
Laramie Winczewski, Fourth Wave Group, Inc., Houston, Texas
As oil companies push for greater efficiency, many are revising how they manage exploration and production (E&P) data. On the way out is proprietary software that ran the corporate data base on a mainframe, and on the rise is third-party software operating on a networked computing system. Computing standards are making practical the sharing of data by diverse platforms. Here is a look at how these changes affect the geoscientist’s work.
Today’s exploration and production geoscientist works in a world of untidy data. Maps, logs, seismic sections and well test reports may be scattered on isolated computers. They may exist only as tapes or paper copies filed in a library in Dubai or stuffed in cardboard boxes in Anchorage (next page). Often, most of the battle is simply locating pertinent data.1 Once data are found, the user may need to be multilingual, fluent in UNIX, VMS, DB2, SQL, DLIS and several dialects of SEG-Y.2 Then, even when data are loaded onto workstations, two interpretation packages may be unable to share information if they understand data differently—one may define depth as true vertical depth, and another as measured depth. The main concern for the geoscientist is not only: Where and in what form are the data? but also: Once I find what I need, how much time must I spend working on the data before I can work with the data?
Despite this picture of disarray, E&P data are easier to work on now than they were just a few years ago, freeing more time to work with them. Better tools are in place to organize, access and share data. Data organization is improving through more flexible database management software—a kind of librarian that finds and sometimes retrieves relevant data, regardless of their form and location. Access to data is widening through linkage of data bases with networks that unite offices, regions and opposite sides of the globe. And sharing of data by software of different disciplines and vendors, while not effortless, has become practical. It will become easier as the geoscience community inches closer to standards for how data are defined, formatted, stored and viewed.
Data management—the tools and organization for orderly control of data, from acquisition and input through validation, processing, interpretation and storage—is going through changes far-reaching and painful. This article highlights two kinds of changes affecting data management: physical, the tools used for managing data; and conceptual, ideas about how data, data users and tools should be organized.
To understand where E&P data management is going, look at where it has come from. Today as much as 75% of geoscience data are still stored as paper.3 Yet, the direction of data management is determined by the 25% of data that are computerized.
Oilfield Review
July 1994
The incredible shrinking data store. Log data from a single well have been transferred from tapes and paper copies in a cardboard box to a 4-mm digital audio tape (DAT), being passed from Karel Grubb, seated, to Mike Wille, both of GeoQuest. This cardboard data store was among 2000 for a project in Alaska, USA. Logs from all 2000 wells were loaded onto 18 DATs for off-site storage provided by the LogSAFE data archive service (page 40). One DAT can hold 1.8 gigabytes, equivalent to conventional logs from about 100 wells.
1. Taylor BW: “Cataloging: The First Step to Data Management,” paper SPE 24440, presented at the Seventh SPE Petroleum Computer Conference, Houston, Texas, USA, July 19-22, 1992.
2. UNIX and VMS are operating systems; DB2 is a database management system for IBM mainframes; SQL is a language for querying relational data bases; and DLIS and SEG-Y are data formats.
3. Landgren K: “Data Management Eases Integrated E&P,” Euroil (June 1993): 31-32.
In this article Charisma, Finder, GeoFrame, IES (Integrated Exploration System), LogDB, LOGNET, LogSAFE and SmartMap are marks of Schlumberger; IBM and AIX are marks of International Business Machines Corp.; Lotus 1-2-3 is a mark of Lotus Development Corp.; Macintosh is a mark of Apple Computer Inc.; VAX and VMS are marks of Digital Equipment Corp.; MOTIF is a mark of the Open Software Foundation, Inc.; PetroWorks, SeisWorks, StratWorks, SyntheSeis, and Z-MAP Plus are marks of Landmark Graphics Corp.; GeoGraphix is a mark of GeoGraphix, Inc.; ORACLE is a mark of Oracle Corporation; POSC is a mark of the Petrotechnical Open Software Corporation; Stratlog is a mark of Sierra Geophysics, Inc.; Sun and Sun/OS are marks of Sun Microsystems, Inc.; UNIX is a mark of AT&T; and X Window System is a mark of the Massachusetts Institute of Technology.
For help in preparation of this article, thanks to Najib Abusalbi, Jim Barnett, Scott Guthery and Bill Quinlivan, Schlumberger Austin Systems Center, Austin, Texas, USA; Bill Allaway, GeoQuest, Corte Madera, California, USA; Richard Ameen, Bobbie Ireland, Jan-Erik Johansson, Frank Marrone, Jon Miller, Howard Neal, Mario Rosso, Terry Sheehy, Mike Wille and Beth Wilson, GeoQuest, Houston, Texas; Bob Born, Unocal, Sugar Land, Texas, USA; Ian Bryant, Susan Herron, Karyn Muller and Raghu Ramamoorthy, Schlumberger-Doll Research, Ridgefield, Connecticut, USA; Alain Citerne and Jean Marc Soler, Schlumberger Wireline & Testing, Montrouge, France; Peter Day, Unocal Energy Resources Division, Brea, California; Mel Huszti, Public Petroleum Data Model Association, Calgary, Alberta, Canada; Craig Hall, Steve Koinm, Kola Olayiwola and Keith Strandtman, Vastar Resources, Inc., Houston, Texas; Stewart McAdoo, Geoshare User Group, Houston, Texas; Bill Prins, Schlumberger Wireline & Testing, The Hague, The Netherlands; Mike Rosenmayer, GeoQuest, Sedalia, Colorado, USA.
Evolution in petroleum computing parallels evolution of computing in general. [Figure: hardware evolves from proprietary mainframes and microcomputers to networked computing using client/server architecture based on standards; applications evolve from in-house proprietary software, leading to islands of computing and some proprietary multidisciplinary packages, to common third-party packages with emphasis on sharing of data and interpretations between disciplines and between products from different vendors; organization evolves from centralized data control by a team unlinked or loosely linked with regions to a centralized master data base linked with regional master and project data bases, giving geoscientists access to any data from any region.]
“Data management” can increasingly be called “computer” or “digital” data management, and its development therefore parallels that of information technology (right). In the last 30 years, information technology has passed through four stages of evolution.4 Viewed through a geoscience lens, they are:
First generation: Back-room computing. Mainframes in specially built rooms ran batch jobs overseen by computer specialists. Response time was a few hours or overnight. Data processing was highly bureaucratic, overseen by computer scientists using data supplied by geoscientists. The highly secure central repository of data, called the corporate data base, resided here. Evolution concentrated on hardware. Data were often organized hierarchically, like branches of a tree that must be ascended and descended each time an item was retrieved.
Second generation: Shared interactive computing. Minicomputers were accessed by geoscientists or technicians working at remote keyboards. Response time was in minutes or seconds. Members of a user group could communicate with each other, provided they had the same hardware and software. Bureaucracy declined; democracy increased. Control was still centralized and often exercised through in-house database management software.
Third generation: Isolated one-on-one computing. This was the short reign of the rugged individualist. Minicomputers and personal computers (PCs) freed the user from the need to share resources or work through an organization controlling a centralized system. Computing power was limited, but response time rapid. There were many copies of data, multiple formats and no formally approved data. The pace of software evolution accelerated and data began to be organized in relational data bases, which function like a series of file drawers from which the user can assemble pieces of information.5

Fourth generation: Networked computing. Also called distributed computing, this too is one-on-one, but allows memory, programs and data to be distributed wherever they are needed, typically via a client/server architecture.6 Data are no longer just alphanumeric but also images and will eventually include audio and video formats. Interaction is generally rapid and is graphic as well as alphanumeric, with parts of the data base viewed through a map. Linkage is now possible between heterogeneous systems, such as a VAX computer and an IBM PC. Control of data is at the populist level, with server and database support from computing specialists who tend to have geoscience backgrounds. Interpretation and data management software evolves rapidly and tends to be from vendors rather than home grown. Sometimes the network connects to the mainframes, which are reserved for data archiving or for a master data base (more about this later). Mainframes are also reemerging in the new role of “superservers,” controlling increasingly complex network communication.

To one degree or another, all four generations often still coexist in oil companies, but most companies are moving rapidly toward networked computing.7 This shift to new technology—and, recently, to technology that is bought rather than built in-house—is changing the physical and conceptual shapes of data management. Since about 1990 two key trends have emerged:
• The monolithic data store is being replaced, or augmented, with three levels of flexible data storage. A single corporate data store8 used to feed isolated islands of

Three Elements of a Successful E&P Data Model

Jack Breig, POSC
Comprehensive information coverage

A successful data model captures all kinds of information used in the E&P business, as well as sufficient semantics about the meaning of data, to ensure that it can be used correctly by diverse applications and users, in both planned and unplanned ways. In other words, a semantically rich data model optimizes the sharing of data across different disciplines.

Useful abstractions

The complexity and volume of information managed by the E&P industry are overwhelming. This is a serious impediment to information exchange, especially where information is generated by professionals in dissimilar business functions. The successful data model reveals to the user the places of similarity in information semantics. This capability provides business opportunities both for users, in accessing data from unexpected sources, and for application developers to target new, but semantically similar, business markets.

Implementable with today’s and tomorrow’s technology

A data model is successful today if it has a structure, rules and semantics that enable it to be efficiently implemented using today’s database management tools. A data model will be successful for the long term if it can accommodate newer database management products, such as object-oriented, hybrid and extended relational database management systems.
Structure and properties of the three-level scheme for data management. Data shown at the bottom may be kept outside the database system for better performance. [Figure: application data stores, with log, seismic and archive data files, feed project data bases of shared data, which in turn feed a master data base of approved data with an inventory catalog and external database references; data are distributed downward and integrated upward; data quality, data approval, data volume, areal extent and administrative control increase toward the master level, while data volatility, access frequency and number of data versions decrease.]
computing, each limited to a single discipline. For example, a reservoir engineer in Jakarta might occasionally dip into the corporate data store, but would have no access to reservoir engineering knowledge—or any knowledge—from Aberdeen. Oil companies today are moving toward sharing of data across the organization and across disciplines. Nonelectronic data, such as cores, films and scout tickets,9 may also be cataloged on line. A new way of managing data uses storage at three levels: the master, the project and the application.
• E&P computing standards are approaching reality. The three data base levels cannot freely communicate if one speaks Spanish, one Arabic and the other Russian. Efforts in computing standards address this problem on two fronts. One is the invention of a kind of common language, a so-called data model, which is a way of defining and organizing data that allows diverse software packages to reconcile different understandings of data—such as, is a “well” a wellhead or a borehole?—and therefore exchange them (see “Three Elements of a Successful E&P Data Model,” previous page). Data models have been used in other industries for at least 20 years, but only in the last few has a broadly embraced E&P data model begun to emerge. On the other front is development of an interim means for data exchange, a kind of Esperanto that allows different platforms to communicate. In this category is the Geoshare standard, now in its sixth version.10
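The wellhead-versus-borehole ambiguity can be made concrete. The sketch below, in Python, is purely illustrative (its class and field names are invented for this article, not taken from any actual E&P data model), but it shows how an explicit data model removes the ambiguity by giving each meaning of “well” its own, related type:

```python
from dataclasses import dataclass, field

# In a shared data model, "well" is not one vague entity: a surface
# wellhead and the boreholes drilled from it are distinct, related types,
# so two applications cannot silently mean different things by "well."
@dataclass
class Borehole:
    name: str
    measured_depth_m: float        # depth along the hole
    true_vertical_depth_m: float   # both depth senses carried explicitly

@dataclass
class Wellhead:
    name: str
    latitude: float
    longitude: float
    boreholes: list = field(default_factory=list)

head = Wellhead("Block 12 #1", 58.4, 0.9)
head.boreholes.append(Borehole("Block 12 #1-ST1", 4120.0, 3950.0))

# An application asking for "well depth" must now say which depth it means.
print(head.boreholes[0].true_vertical_depth_m)  # → 3950.0
```

Because both depth senses are explicit fields, the true-vertical versus measured-depth confusion described earlier cannot arise silently.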
We look first at the data storage troika, and then at what is happening with standards.
4. This evolutionary scheme was described by Bernard Lacroute in “Computing: The Fourth Generation,” Sun Technology (Spring 1988): 9. Also see Barry JA: Technobabble. Cambridge, Massachusetts, USA: MIT Press, 1992. For a general review of information technology: Haeckel SH and Nolan RL: “Managing by Wire,” Harvard Business Review 71, no. 5 (September-October 1993): 122-132.
5. A relational data base appears to users to store information in tables, with rows and columns, that allow construction of “relations” between different kinds of information. One table may store well names, another formation names and another porosity values. A query can then draw information from all three tables to select wells in which a formation has a porosity greater than 20%, for example.
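The three-table example of note 5 can be tried directly with Python’s built-in sqlite3 module; the table and column names below are invented for illustration:

```python
import sqlite3

# Build the three example tables from note 5 in an in-memory data base.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE well      (well_id INTEGER PRIMARY KEY, well_name TEXT);
    CREATE TABLE formation (formation_id INTEGER PRIMARY KEY, formation_name TEXT);
    CREATE TABLE porosity  (well_id INTEGER, formation_id INTEGER, porosity REAL);
""")
db.executemany("INSERT INTO well VALUES (?, ?)",
               [(1, "A-1"), (2, "A-2"), (3, "B-7")])
db.executemany("INSERT INTO formation VALUES (?, ?)",
               [(10, "Frio"), (11, "Wilcox")])
db.executemany("INSERT INTO porosity VALUES (?, ?, ?)",
               [(1, 10, 0.24), (2, 10, 0.12), (3, 11, 0.31)])

# The query "relates" all three tables: wells where a formation's
# porosity exceeds 20%.
rows = db.execute("""
    SELECT w.well_name, f.formation_name, p.porosity
    FROM porosity p
    JOIN well w      ON w.well_id = p.well_id
    JOIN formation f ON f.formation_id = p.formation_id
    WHERE p.porosity > 0.20
    ORDER BY w.well_name
""").fetchall()
print(rows)  # → [('A-1', 'Frio', 0.24), ('B-7', 'Wilcox', 0.31)]
```

The “file drawer” image in the text corresponds to the three tables; the JOIN clauses are the user assembling pieces of information from them.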
A New Data Storage Troika

In the days of punch cards, corporate data were typically kept on a couple of computerized data bases, operated by mainframes. Flash forward 30 years to networked computing, and the story gets complicated. Instead of a couple, there are now tens or hundreds of machines of several types, linked by one or more networks. And instead of one or two centralized data bases, there tend to be many instances of
6. Client/server architecture, the heart of distributed computing, provides decentralized processing with some level of central control. The client is the workstation in the geoscientist’s office. When the geoscientist needs a service—data or a program—the workstation software requests it from another machine called a server. Server software fulfills the request and returns control to the client.
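The request-and-return cycle of note 6 can be sketched in a few lines of Python; the socket dialogue and the “depth” request below are invented stand-ins for real client and server software:

```python
import socket
import threading

def server(sock):
    # Server software fulfills one request, then returns control to the client.
    conn, _ = sock.accept()
    request = conn.recv(1024).decode()          # e.g. "GET depth WELL-12"
    conn.sendall(b"3865.77" if "depth" in request else b"unknown")
    conn.close()

listener = socket.socket()
listener.bind(("127.0.0.1", 0))                 # any free local port
listener.listen(1)
threading.Thread(target=server, args=(listener,)).start()

# The client, the workstation in the geoscientist's office, asks for a service.
client = socket.socket()
client.connect(("127.0.0.1", listener.getsockname()[1]))
client.sendall(b"GET depth WELL-12")
reply = client.recv(1024).decode()
client.close()
listener.close()
print(reply)  # → 3865.77
```

The essential point is the division of labor: the client only asks and waits, while the server holds the data and does the fulfilling.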
7. For an older paper describing a diversity of computer systems in Mobil: Howard ST, Mangham HD and Skruch BR: “Cooperative Processing: A Team Approach,” paper SPE 20339, presented at the Fifth SPE Petroleum Computer Conference, Denver, Colorado, USA, June 25-28, 1990.
8. This article distinguishes between a data store and a data base. A data store is a collection of data that may not be organized for browsing and retrieval and is probably not checked for quality. A data base is a data store that has been organized for browsing and retrieval and has undergone some validation and quality checking. A log data store, for example, would be a collection of tapes, whereas a log data base would have well names checked and validated to ensure that all header information is correct and consistent.
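The validation that promotes a log data store to a log data base can be as simple as checking each tape header’s well name against a master list. A toy sketch in Python, with invented well and tape names:

```python
# Invented master list of validated well names, and raw tape headers
# as they might arrive in an unchecked data store.
master_wells = {"WEN LIU 1", "WEN LIU 2", "BLOCK 12 #1"}

tape_headers = [
    {"tape": "T-001", "well": "WEN LIU 1"},
    {"tape": "T-002", "well": "WENLIU-1"},    # inconsistent spelling
    {"tape": "T-003", "well": "BLOCK 12 #1"},
]

# Promotion from data store to data base: only headers whose well name
# matches the master list pass; the rest go to the data administrator.
validated = [h for h in tape_headers if h["well"] in master_wells]
flagged   = [h for h in tape_headers if h["well"] not in master_wells]

print([h["tape"] for h in validated])  # → ['T-001', 'T-003']
print([h["tape"] for h in flagged])    # → ['T-002']
```

Real systems do far more (format scanning, header extraction), but the organizing idea is this same pass-or-flag check.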
data storage: master data base, project data base and application data stores (above). Sometimes the old corporate data base has been given a new job and a new name. Instead of holding all the data for the corporation, it now holds only the definitive instances, and so is renamed the “master.” There may be one centralized master, but typically there is one master per operating region or business unit.
9. A scout ticket is usually a report in a ring binder that includes all drilling-related data about a well, from the time it is permitted through to completion.
10. For a description of the evolution of the Geoshare system: Darden S, Gillespie J, Geist L, King G, Guthery S, Landgren K, Pohlman J, Pool S, Simonson D, Tarantolo P and Turner D: “Taming the Geoscience Data Dragon,” Oilfield Review 4, no. 1 (January 1992): 48-49.
In the master data base, data quality is high and the rate of change—what information scientists call volatility—is low. There are not multiple copies of data. Although the master data base can be accessed or browsed by anyone who needs data, changes in content are controlled by a data administrator—one or more persons who decide which data are worthy of residing in the master data base. The master data base also contains catalogs of other data bases that may or may not be on-line. The master is often the largest data base, in the gigabyte range (1 billion bytes), although it may be overshadowed by a project data base that has accumulated multiple versions of interpretations. The master may be distributed over several workstations or microcomputers, or still reside on a mainframe.
From the master data base, the user withdraws relevant information, such as “all wells in block 12 with sonic logs between 3000 and 4000 meters,” and transfers it to one or more project data bases. Key characteristics of a project data base are that it handles some aspects of team interpretation, is accessed using software programs from different vendors and contains multidisciplinary data. The project may contain data from almost any number of wells, from 15
Mobil Exploration & Producing Technical Center

A sign on the wall in one of Mobil’s Dallas offices sums up the most significant aspect of how Mobil is thinking about data management: “Any Mobil geoscientist should have unrestricted and ready worldwide access to quality data.” The sign describes not where the company is today, but where it is working to be in the near future.

The emphasis on data management is driven by Mobil’s reengineering of its business. Data management is affected by fundamental changes in the current business environment and in how data are used within this environment.

Traditionally, exploration funds were allocated among the exploration affiliates, largely in proportion to the affiliate size. Within a changing business environment requiring a global perspective,
to 150,000. Regardless of size, data at the project level are volatile and the project data base may contain multiple versions of the same data. Updates are made only by the team working on the project. Interpretations are stored here, and when the team agrees on an interpretation, it may be authorized for transfer to the master data base.
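A master-to-project transfer such as the “block 12” example above amounts to a filtered copy that leaves the master untouched; here is a minimal sketch in Python, with invented record fields:

```python
# Invented in-memory stand-in for a master data base catalog.
master = [
    {"well": "12/3-1", "block": 12, "log": "sonic",   "top_m": 3100, "base_m": 3900},
    {"well": "12/3-2", "block": 12, "log": "density", "top_m": 3000, "base_m": 4000},
    {"well": "14/7-1", "block": 14, "log": "sonic",   "top_m": 3200, "base_m": 3800},
]

def build_project(catalog, block, log_type, top, base):
    """Copy matching entries into a project data base; the master stays
    unchanged, so the project copy can be edited and versioned by the team."""
    return [dict(rec) for rec in catalog
            if rec["block"] == block and rec["log"] == log_type
            and rec["top_m"] >= top and rec["base_m"] <= base]

project = build_project(master, block=12, log_type="sonic", top=3000, base=4000)
print([rec["well"] for rec in project])  # → ['12/3-1']
```

Because the project holds copies, interpretations accumulate there without touching the approved master; only an agreed result is promoted back.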
A third way of storing data is within the application itself. The only data stored this way are those used by a specific vendor’s application program or class of applications, such as programs for log analysis. Application data stores are often not data base management systems; they may simply be a file for applications to read from and write to. Applications contain data from the project data base, as well as interpretations performed using the application. Consequently, the data may change after each work session, making them the most volatile. Because applications were historically developed by vendors to optimize processing by those specific applications, applications from one vendor cannot share data with applications from other vendors, unless special software is written. Vendors have begun to produce multidisciplinary applications, and many applications support the management of interpretation projects. In addition, the proprietary nature of application data stores will change as vendors adopt industry standards (see “Razing the Tower of Babel—Toward Computing Standards,” page 45). As a result of these changes, data management systems of the future will have just two levels, master and project.
The Master Level

The master data base is often divided into several data bases, one for each major data type (next page). A map-type program is usually used to give a common geographic catalog for each data base. An example of this kind of program is the MobilView geographical interface (see “Case Studies,” below).
Because the master data base is sometimes cobbled together from separate, independent data bases, the same data may be stored in more than one place in the master. BP, for instance, today stores well header information in three data bases, although it is merging those into a single instance. An ideal master data base, however, has a catalog that tells the user what the master contains and what is stored outside the master. An example is the inventory and location of physical objects, such as floppy disks, films, cores or fluid samples. This is sometimes called virtual storage. The master data base, therefore, can be thought of as not a single,
Case Studies
funds should be allocated to take advantage of the best opportunities regardless of location. The organization as a whole is also leaner, placing more demands on a smaller pool of talent. A petrophysicist in Calgary may be called to advise on a project in Indonesia. All geoscientists, therefore, need to understand data the same way. When an interpretation and analysis are performed with the same rules, right down to naming conventions, then not only can data be shared globally, but any geoscientist can work on any project anywhere.

A second motivation for restructuring data management is a change in who is using data. Mobil was traditionally structured in three businesses: Exploration, Production, and Marketing and Refining. Exploration found oil for its client, Production, which produced oil for its client, Marketing and Refining. Now the picture is not so simple. Exploration’s client may also be a government or national production company. Production may also work as a consultant, advising a partner or a national company on the development of a field. Now, with a broader definition of who is using data, there is a greater need for data that can be widely understood and easily shared.

For the past three years, the company has been moving to meet this need. In 1991, Mobil established 11 data principles as part of a new “enterprise architecture.” The company is using the principles as the basis for policies, procedures and standards. Mobil is now drafting its standards, which include support of POSC efforts, down to detail as fine as “in level five of Intergraph design, a file will have geopolitical data.”

Mobil is reengineering data management on two fronts, technology and organization. Technology changes affect the tools and techniques for doing things with data: how they are validated, loaded, stored, retrieved, interpreted and updated. Organizational changes concern attitudes about who uses what tools for what ends. Both sides carry equal weight. This involves a realization that technology alone doesn’t solve problems. It also involves the “people side,” most importantly, the attitude of geoscientists toward data management.

An example of a new way to manage the people side is the formation of a series of groups coordinated by the Information Technology (IT) department called Data Access Teams, which expedite data management for the Mobil affiliates. A Data Access Team comprises Dallas staff and contractors who validate, input, search and deliver data to an affiliate site. Each site, in turn, collaborates with the team by providing a coordinator familiar with the local business needs and computer systems. Another means of addressing the people side of data management is making data management part of a geoscientist’s performance appraisal. This becomes increasingly important as more data go on-line.

In the past, geoscientists would rifle closets and ask colleagues for data. Any bad data would get tossed or, worse, ignored. Now, if a geoscientist
finds bad data, it is that individual’s responsibility to alert the data administrator and have the problem corrected. The new thinking is that data quality isn’t someone else’s job; it has to become part of the culture.

A fundamental component of Mobil’s IT strategy is a common storage and access model. Typically, each of the affiliates maintains its own, stand-alone E&P database system, along with links into the central data bases in Dallas. The local systems have evolved independently because local needs can vary. Frontier areas, for example, work only with paper data, whereas Nigeria is nearly all electronic. Despite these differences, a common data storage and access model was endorsed by all affiliates in the early 1990s.

This model incorporates access to existing data sources as well as the concept of master and project data stores (right). The master data base maintains the definitive version of the data while the project data base is maintained as a more ephemeral working environment interfacing with the applications. The project builder serves as a
BP’s master data base: present, near future and long term. Currently, the master is divided into discrete data bases and associated applications that have been built to address the needs of specific disciplines. The advantage of this approach is that data ownership is clearly defined. Disadvantages include duplicate data sets, difficulty ensuring integrity of data shared by more than one discipline, and increased data management workload. The near-term solution increases the permeability of walls separating disciplines, and the ultimate solution dissolves these walls, making use of standards established by the Petrotechnical Open Software Corporation (POSC).
[Figure: at present, discipline-specific data stores (geology & production, petrophysics, geophysics, cartography, 3D models, drilling); in the short-term future, these stores plus nonwell-related data and vendor-supplied data bases linked by a data bus, Geoshare and common well data; in the ultimate solution, POSC-compliant data bases connected by a data bus.]
Overview of Mobil’s data management model. This model, developed in the early 1990s, is part of a strategy for migration to POSC compliance. [Figure: a project builder searches the master data store and downloads data to a project data store serving the applications; legacy data management systems migrate toward POSC compliance.]
Activity screen in PAGODA. The control panel at left allows the user to obtain details about selected fields in the main screen and navigate the system. The central screen shows curve summary information, and the mud data screen gives details of mud properties.
geographical search, retrieval and archival mechanism. The master and project data stores are based on the POSC Epicentre model and implemented in the Open System/UNIX environment. The model calls for gradual replacement of most of the legacy data bases, based on business drivers. The company will maintain selected legacy systems for the foreseeable future, but in a way that minimizes resources and does not involve formal development.

Being competitive in the current business environment requires a change in business processes. This change from a local focus to a global perspective is particularly important in IT. Within Mobil’s E&P division, the major component of this initiative is a paradigmatic shift in the way data are handled at all levels and the development of a POSC-based data storage and access model. In the context of the corporate IT mission, this strategy was developed from the ground up, with input from all Mobil affiliates.
Unocal Energy Resources Division

Unocal is an integrated energy resources company that in 1993 produced an equivalent of 578,000 barrels of oil per day and held proven reserves equivalent to 2.2 billion barrels of oil.1 By assets, Unocal ranked thirteenth worldwide in 1993, compared to Mobil in second place.2 Exploration and production is carried out in about ten countries, including operations in Southeast Asia, North America, the former Soviet Union and the Middle East.

Unocal began a reorganization in 1992 that will eventually consolidate nearly all exploration activities at a facility in Sugar Land, Texas, USA. Concurrent with this restructuring is a change, started in 1990, in the physical database system—from IBM mainframes running in-house database, mapping and interpretation software, to third-party software on
1. Unocal 1993 Annual Report: 8.
2. “Oil Company Rankings by Marketers with Assets of $1 Billion or More,” National Petroleum News (June 1993): 16.
Courtesy of Shell Internationale Petroleum Maatschappij BV.
a client/server system. The group responsible for overseeing this shift is Technical Services, based in Sugar Land (next page). Here, 13 staff scientists and five on-site contractors manage 200 gigabytes of data on about two million wells. This is the largest portion of the company’s one terabyte of E&P data, and is accessed by about 275 users in both exploration and business units.

The Technical Services group performs two main tasks: it supports client/server-based geoscience applications, and it maintains and distributes application and database systems. There are 23 software systems under its wing: 11 geophysical applications, five mapping and database applications, five geologic applications and two hard-copy management systems, which are catalogs of physical inventories such as seismic sections and scout tickets. Altogether, the 23 systems represent products from at least 20 vendors,
(continued on page 40)
[Figure: Unocal Energy Resources organization. Technical Services, comprising Geoscience Computing, Technology (application configuration and database systems) and Operations, supports Worldwide Exploration, Business Development and Energy Resources Technology (Research); exploration groups (CIS, Gulf/North America, Far East, Middle East/Latin America); large business units with asset teams for field development and local information systems (IS) support (US Gulf Coast, Louisiana, Thailand, Indonesia); and small business units with local information systems (Alaska, Central US, Calgary, The Netherlands, Aberdeen, Syria).]
comprehensive library, but as a large digital filing cabinet and a catalog directing the user to other filing cabinets that may be adjacent, across the hall or across the ocean, and may contain data in any form: digital, paper, cores and so on.

One component of storage at the master level is the archive. The name implies long-term storage with infrequent retrieval, but the meaning of archive is shifting with new ways of organizing and accessing data. An archive may be an off-site vault with seismic tapes in original format, or an on-site optical disk server with seismic data already formatted for the interpretation workstation. There are countless variations, with differing degrees of "liveness" of the data. Two examples of archives are the PAGODA/LogDB and LogSAFE systems.

The PAGODA/LogDB system is a joint development of Shell Internationale Petroleum Maatschappij (SIPM) and Schlumberger. PAGODA designates the product in Shell Operating Companies and Affiliates; the LogDB name is used for the Schlumberger release. The objective of the joint development was to produce a safe storage and retrieval system for information gathered during logging and subsequent data evaluation.

The PAGODA/LogDB system comprises software and hardware for loading, storing and retrieving data acquired at wells, such as wireline logs, data obtained while drilling, borehole geophysical surveys and mud logs. Original format data are scanned during loading to ascertain header information and the integrity of the data format. All popular data formats are supported. Once scanning is complete, well identifiers are validated against a master list of well names. Extracted header information is written to ORACLE tables (a type of relational database table) and bulk data are transferred to a storage device, such as an on-line jukebox of optical disks, magnetic disks or other archival media. Once data are validated and loaded, they can be quickly viewed, searched and retrieved for export to other applications (previous page).
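The loading sequence described above can be sketched in a few lines: scan a file's header, validate the well identifier against a master list, keep the header fields for relational tables and route the bulk data to archival storage. All names here (`load_log_file`, `MASTER_WELLS`) are hypothetical, invented for illustration, not part of the actual product.

```python
# Hypothetical master list of valid well identifiers.
MASTER_WELLS = {"30-045-12345", "30-045-67890"}

def load_log_file(header: dict, bulk_data: bytes, header_table: list, archive: dict):
    """Validate and load one log file; raises ValueError on an unknown well."""
    well_id = header.get("well_id")
    if well_id not in MASTER_WELLS:          # reject unknown well identifiers
        raise ValueError(f"unknown well: {well_id}")
    header_table.append(dict(header))        # header rows go to relational tables
    archive[well_id] = bulk_data             # bulk data goes to archival storage
    return True

header_table, archive = [], {}
load_log_file({"well_id": "30-045-12345", "format": "LIS"}, b"\x00\x01",
              header_table, archive)
```

The point of the split is the same as in the text: small, searchable header values stay in the data base, while large original-format records go to cheaper archival media.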
The PAGODA/LogDB system fulfills two roles. As an off-line archival data base, the system is capable of storing terabytes of data that are safely held on a durable medium and can be retrieved and written to disk or tape. Exported data can be provided in either original format or via a Geoshare linkage (page 46). As an on-line data base, the system makes all header information available for browsing and searching. The systems are supported on VAX/VMS, Sun/OS and IBM/AIX operating systems.

The PAGODA and LogDB systems are essentially the same. To date, PAGODA has been installed in most of Shell's larger Operating Companies, where log data held on
A data management organization chart for Unocal's Energy Resources division. The company's Technical Services group, based in Sugar Land, Texas, contributes to data management for the four exploration groups, which are also based in Sugar Land, and large and small business units. The business units are moving their regional data models toward a single standard.
[Figure: Architecture of Unocal's interpretation applications and data stores. UNIX platform: data index (to be determined), regional data base (Finder), Landmark data base and integrated Landmark interpretation applications, fed by a commercial data source. PC platform: GeoGraphix data base and GeoGraphix mapping applications. Notes: (1) catalog of digital and hard-copy data; (2) best Unocal data from past projects and commercial sources; (3) managed and converted by Geoscience Computing; (4) SeisWorks, StratWorks, Z-MAP Plus, SyntheSeis, PetroWorks, etc.; (5) application for interpretations.]
earlier systems or in tape storage are being migrated to PAGODA. The PAGODA system also caters to the storage of Shell proprietary data formats and calculated results. The LogDB system has been installed in several oil companies and government agencies and is being made available as a loading and archive service at Schlumberger computing centers worldwide. The LogDB system can be coupled to the Finder data management system, in which well surface coordinates can be stored together with limited log sections for use during studies.

The LogSAFE service performs some of the same functions, but not as a tightly coupled data base. It is an off-line log archive that costs about half as much as an internal database system, yet can deliver logs over phone or network lines in a few hours. For oil companies that prefer a vendor to manage the master log data base, the LogSAFE system functions as an off-site data base, accessed a few times a week. For large companies (those with more than about 15 users of the data base) it functions as an archive, accessed once or twice a month.

The LogSAFE system places logs from any vendor in a relational data base, validating data and storing them in their original format. Any tape format can be read and paper copies digitized. Before it is entered into the data base, the entire file is inspected to make sure it is readable, and the data owner is informed of any problems in data quality. Validated data are then archived on DATs (left). By the end of 1994, data will also be stored for faster, automated retrieval on a jukebox of optical disks. Clients have access only to their own data, and receive a catalog of their data and quarterly summaries of their holdings. The LogSAFE system is based in Sedalia, Colorado, USA, at the same facility that houses the hub of the Schlumberger Information Network. This permits automatic archiving of Schlumberger logs that are transmitted from a North American wellsite via the LOGNET service, which passes all logs through the Sedalia hub.

LogSAFE data can be accessed several ways, giving it the flexibility and "live" feel of an in-house data base. Data are either pulled down by the user, or
Jim Green, who manages Unocal's exploration computing services, lists two essential ingredients for successful management of E&P data: training for users of new data handling systems and automated tools for moving data. By 1995, Unocal will move all computerized E&P data from flat-file, hierarchical systems to ORACLE-based relational systems.3 A company-wide program is bringing geotechnicians up to speed on data loading and quality checking.

The diversity of Unocal's software underscores the importance of managing multiple versions of data. The company is encouraging vendors to develop a family of automated software tools: data movers, editors, loaders, unloaders and comparators. Of particular importance are comparators, which scan two project data bases, or a project and a master data base, and produce a list of differences between them. Quick identification of differences between versions of data is essential in deciding how to update projects with vendor data or how to update the master data base with geoscientists' interpretations from a project or regional data base. Today this comparison is done manually. "The client/server environment cannot be considered mature until we have robust, automated data management tools," Green says.

A debate within Unocal concerns finding the best way to organize the work of geotechnicians,
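The comparison that is done manually today is easy to state in code. A minimal sketch of a comparator, assuming each data base can be read as a mapping of well identifier to a record of attributes; the function and field names are illustrative, not Unocal's or any vendor's actual tool.

```python
def compare(project_db: dict, master_db: dict) -> dict:
    """List wells found in only one data base, and wells whose records differ."""
    only_project = sorted(project_db.keys() - master_db.keys())
    only_master = sorted(master_db.keys() - project_db.keys())
    changed = sorted(w for w in project_db.keys() & master_db.keys()
                     if project_db[w] != master_db[w])
    return {"only_project": only_project,
            "only_master": only_master,
            "changed": changed}

# A project data base holding an interpreter's updated total depth for W1:
project = {"W1": {"td": 3200}, "W2": {"td": 2100}}
master = {"W1": {"td": 3150}, "W3": {"td": 1800}}
diff = compare(project, master)
```

The output is exactly the list of differences a data manager needs before deciding which version to promote to the master data base.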
Architecture of Unocal's interpretation applications and data stores. Pair-wise links connect all platforms. Long-term evolution is toward the UNIX platform.
40 Oilfield Review
3. A flat-file data base contains information in isolated tables, such as a table for porosity data and another for paleontology data. The tables cannot be linked. A relational data base allows linkage of information in many tables to construct queries that may be useful, such as "show all wells with porosity above 20% that have coccolithus." Flat-file data bases are simple and inexpensive. Relational data bases are complex, expensive and can be used to do much more work.
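The footnote's example query is precisely a relational join. A sketch using SQLite in place of a commercial relational system; the table and column names are invented for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE porosity (well TEXT, porosity REAL);
    CREATE TABLE paleontology (well TEXT, fossil TEXT);
    INSERT INTO porosity VALUES ('W1', 0.25), ('W2', 0.15), ('W3', 0.22);
    INSERT INTO paleontology VALUES ('W1', 'coccolithus'), ('W2', 'coccolithus');
""")
# The JOIN links the two tables -- the step a flat-file system cannot do.
rows = db.execute("""
    SELECT p.well FROM porosity p
    JOIN paleontology f ON f.well = p.well
    WHERE p.porosity > 0.20 AND f.fossil = 'coccolithus'
""").fetchall()
```

Only W1 satisfies both conditions; W2 has the fossil but not the porosity, and W3 the porosity but no fossil record.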
Bob Lewallen, senior log data technician with Schlumberger, returning 90 gigabytes of log data to the LogSAFE vault in Sedalia, Colorado, USA. Original logs are copied to two sets of DATs stored in the vault. A third backup is kept in a fireproof safe off-site. To ensure data integrity, data are validated and rearchived every five years.
(Jill Orr photo)
who load data and contribute to quality assurance
(next page). The main issue is whether geotechni-
cians are most effective if they work by geographic
area, as they do now, or by discipline. The advan-
tage of focusing geotechnicians by area is that
they learn the nuances of a region, and develop
[Figure: E&P data acquisition. Data gathered by Unocal: well information, UKOOA, geopolitical. Data purchased: Tobin, Petroleum Information, Petroconsultants, seismic surveys.]
pushed from Sedalia at client request. Data are transferred either over commercial or dedicated phone lines, or more commonly, through the Internet. Internet connections are generally faster than phone lines, and allow transmission of 500 feet of a borehole imaging log in 5 to 10 minutes.

In operation since 1991, the LogSAFE service now serves 70 clients and holds data from 13 countries. As oil companies continue to look to vendors for data management, the LogSAFE service is expected to expand from its core of North American clients. Today it handles mostly logs and a handful of seismic files, but is expected to expand to archive many data types.
The Project and Application Levels

To the geoscientist, the project data base is like a desktop where much of the action takes place. Relevant resources from the master data base are placed in the project data base. To work on project data, users move it from the project level into the application software. The process of interpretation, therefore, involves frequent exchange
Unocal's data flow and division of labor. Vendor data are initially kept separate from Unocal data, mainly for quality assurance.
[Figure: Data management flow. The Local Information Systems Group (load, periodic updates) maintains the Vendor Data Store; validated data flow to the Regional Data Base, organized geographically. Geotechnicians (quality check, load) feed the Archive Data Base, covering exploration areas and business units. Project data are loaded into a project data base, which supplies the application data bases. Roles shown: geoscientists and management, database administrators, geoscientist quality check, and database administrator and local information systems support.]
cohesion with other team members and a sense of
ownership of the data. The advantage of organiz-
ing geotechnicians by discipline—seismic, pale-
ontology, logs and so on—is a deeper understand-
ing of each data type and better consistency in
naming conventions and data structure. The opti-
mal choice is not so obvious. Complicating the
picture is the changing profile of the typical
geotechnician. Ten or 15 years ago, a geotechni-
cian held only a high school diploma and often
preferred to develop an expertise in one data type.
Today, many have master’s degrees and prefer a
job with greater diversity that may move them
toward a professional geoscience position.
Whichever way Unocal moves, Green says, the
company is addressing a change in data culture.
Geotechnicians today are increasingly involved in
broader matters of data management, including
the operation of networks and interapplication
links. “We need both technicians and geoscien-
tists who understand what to do with a relational
data base,” Green says, “who understand the
potential of interleaving different kinds of data.
The future belongs to relational thinkers and that’s
what we need to cultivate.”
between the project and application levels, and when a team of geoscientists arrives at an interpretation and wants to hold that thought, it is held in the project data base.

The distinction between the levels is not rigid. For example, software that works at the project level may also overlap into the master and application levels. What sets apart project database systems is where they lie on the continuum of master-project-application. Two examples are the Finder system, which manages data at the project and master levels, and the GeoFrame system, which manages project data for interpretation applications.

The Finder system wears two hats. It functions as a data base, holding the main types of E&P data, and as a data access system, providing a set of programs to access and
view data (below).11 The Finder system consists of three main components: the data, interfaces to the data, and the underlying data model, which organizes the data.

Structure of the Finder system. The system lets the geoscientist work with five kinds of bread-and-butter data: well, log, seismic, lease and cultural. [Figure: applications (Cross Section, SmartMap), the Finder and Geoshare tool kits, loaders and utilities, database access via SQL and embedded SQL, tables and N-lists holding well, log, seismic, lease and cultural data (including bulk data), an X Window System user interface (MOTIF), Graphical Kernel System graphics, and a relational database management server reached over the network via TCP/IP and SQL*NET. Supported operating systems: UNIX, VMS, Sun/OS and AIX.]

The data bases reside in two places. Inside the relational data base management system are parametric data, which are single pieces of data, such as those in a log header. Usually, relational data bases like the one Finder uses (ORACLE) cannot efficiently handle vector data, also called bulk data. These are large chunks of data composed of smaller pieces retrieved in a specific order, such as a sequence of values on a porosity curve or a seismic trace. For efficiency and resource minimization, vector data are stored outside the relational structure, and references to them are contained in the data base management system. Data bases are
typically set up for the most common types of data (well, log, seismic, lease and cultural, such as political boundaries and coastlines) but other data types and applications can be accessed. The Finder system also includes tools for database administrators to create new projects, data types and data base instances, and to determine security. Users can move data between the master and project levels.
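The parametric/bulk split described above can be sketched directly: single header values live in relational tables, while a bulk curve is written outside the data base and only its reference is stored. SQLite and the file layout here are stand-ins, not Finder's actual storage scheme.

```python
import os
import sqlite3
import struct
import tempfile

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE log_header (well TEXT, curve TEXT, bulk_path TEXT)")

# Bulk data: an ordered vector of curve values, stored outside the RDBMS.
porosity_curve = [0.21, 0.19, 0.24, 0.18]
path = os.path.join(tempfile.mkdtemp(), "W1_porosity.bin")
with open(path, "wb") as f:
    f.write(struct.pack(f"{len(porosity_curve)}d", *porosity_curve))

# Parametric data: the reference to the bulk file goes into the table.
db.execute("INSERT INTO log_header VALUES (?, ?, ?)", ("W1", "porosity", path))

# Retrieval: look up the reference, then read the vector back in order.
bulk_path, = db.execute(
    "SELECT bulk_path FROM log_header WHERE well = 'W1'").fetchone()
with open(bulk_path, "rb") as f:
    data = f.read()
restored = list(struct.unpack(f"{len(data) // 8}d", data))
```

The design choice mirrors the text: the relational engine handles what it is good at (small, queryable values) and large ordered vectors bypass it.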
The component most visible to the user is the interface with the data base and its content. There are several types of interfaces. The most common is an interactive map or cross section (next page). Another common interface is the ORACLE form. A typical form looks like a screen version of the paper form used to order data from a conventional library (page 44, top). In the future, forms will also be multimedia, merging text, graphics and maps. The user also interacts with the data base through unloaders (utilities that convert data from an internal representation to external, standard formats) and through external applications, like a Lotus 1-2-3 spreadsheet.

A more powerful interface is Standard Query Language (SQL, often spoken "sequel"), which is easier to use than a programming language but more difficult than spreadsheet or word processing software.12 The Finder system includes at least 40 common SQL queries, but others that will be used often can be written and added to a menu for quick access, much like a macro in PC programming. Often, a database administrator is the SQL expert and works with the geoscientist to write frequently used SQL queries. A point-and-click interactive tool for building SQL queries is also available for users with little knowledge of SQL.

The less visible part of the Finder system, but its very foundation, is the data model. Today, the Finder data model is based on the second and third versions of a model called the Public Petroleum Data Model (PPDM), which was developed by a nonprofit association of the same name.13 To satisfy new functionalities, the Finder model has evolved beyond the PPDM in some areas, notably in handling seismic and stratigraphic data. The evolutionary path is moving the Finder model toward the Epicentre/PPDM model, which is a merger of the PPDM model with Epicentre, a model developed by another industry consortium, the Petrotechnical Open Software Corporation (POSC).

In the last two years, the Finder system has gone through several cycles of reengineering, making it more robust, more flexible and easier to use. Updates are released twice a year. Since 1993, for example, about 50 new functions have been added. Four changes most notable to the user are:
• easier user interface. Full compliance with MOTIF, a windowing system, speeds access to data, and makes cutting and pasting easier.

The five kinds of data in the Finder system:
1. Well data are an encyclopedic description of the well. The 50-plus general parameters include well name, location, owner, casing and perforation points, well test results and production history.
2. Logs are represented either as curves or in tabular form. Multiple windows can be displayed for crosswell correlations. Log data are used to select tops of formations and are then combined with seismic "picks" to map horizons, one of the main goals in mapping.
3. Seismic includes the names of survey lines, who shot them, who processed the data, how it was shot (influencing the "fold") and seismic navigation data. Also listed are "picks," the time in milliseconds to a horizon, used to draw a subsurface contour map through picks of equal value.
4. Lease data include lease expiration, lessee and lessor, and boundaries.
5. Cultural data include coastlines, rivers, lakes and political boundaries.

Two interfaces of the Finder system. The most popular interfaces are graphical. Two examples are the SmartMap software (top) and the Cross Section module (bottom). SmartMap software positions information from the data base on a map, and allows selection of information from the data base by location or other attributes. The user can tap into the data base by point-and-click selection of one or more mapped objects, such as a seismic line or a well. The Cross Section module is used to post in a cross-section format depth-related data, such as lithology type, perforations and, shown here, formation tops. Wells are selected graphically or by SQL query, and can be picked from the base map.
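Stored SQL queries that are added to a menu behave much like macros: a database administrator registers a parameterized query once, and users invoke it by name. A minimal sketch of the idea, using SQLite in place of the commercial data base; the table, query names and `register`/`run` functions are invented, not Finder's actual interface.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE well (name TEXT, td_m REAL, basin TEXT);
    INSERT INTO well VALUES ('W1', 3200, 'North Sea'), ('W2', 1500, 'Gulf Coast');
""")

saved_queries = {}  # menu name -> parameterized SQL written once by the DBA

def register(name: str, sql: str):
    saved_queries[name] = sql

def run(name: str, *params):
    """A user runs a saved query by name, supplying only the parameters."""
    return db.execute(saved_queries[name], params).fetchall()

register("deep_wells_in_basin",
         "SELECT name FROM well WHERE td_m > ? AND basin = ? ORDER BY name")
result = run("deep_wells_in_basin", 2000, "North Sea")
```

The geoscientist never sees the SQL, only the menu entry and its parameters, which is the division of labor the article describes.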
11. For an accessible review of data bases and database management systems: Korth HF and Silberschatz A: Database System Concepts. New York, New York, USA: McGraw-Hill, Inc., 1986.
12. This language, developed by IBM in the 1970s, is for searching and managing data in a relational data base. It is the database programming language that is the most like English. It was formerly called "Structured Query Language," but the name change reflects its status as a de facto standard.
13. PPDM Association: "PPDM A Basis for Integration," paper SPE 24423, presented at the Seventh SPE Petroleum Computer Conference, Houston, Texas, USA, July 19-22, 1992.
An ORACLE form for a scout ticket. The 80 types of forms preprogrammed into the Finder system do more than display data. They include small programs that perform various functions, such as query a data base and check that the query is recognizable, that is, that the query is for something contained in the data base. The programs also check that the query is plausible, for instance, that a formation top isn't at 90,000 feet [27,400 m]. Forms may perform some calculation. The well form, for example, contains four commonly used algorithms for the automated calculation of measured and true vertical depth. Forms can be accessed by typing a query or clicking on a screen icon. A form showing all parameters available for a well, for example, can be accessed through the base map by clicking on a symbol for a well.
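The plausibility checks built into the forms amount to range validation on each field. A sketch of the idea; the limits and field names are illustrative assumptions, not the system's actual rules.

```python
# Hypothetical plausible ranges for form fields.
PLAUSIBLE_RANGES = {
    "formation_top_ft": (0.0, 50000.0),  # so a top at 90,000 ft is rejected
    "porosity": (0.0, 0.60),
}

def validate_field(field: str, value: float) -> bool:
    """Return True if the value lies inside the plausible range for the field."""
    low, high = PLAUSIBLE_RANGES[field]
    return low <= value <= high

# A real top from the scout ticket passes; an implausible one does not.
assert validate_field("formation_top_ft", 1052.0)
assert not validate_field("formation_top_ft", 90000.0)
```

Catching such errors at entry time is far cheaper than finding them later in a master data base.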
The GeoFrame reservoir characterization system. The GeoFrame system is a suite of modular applications that are tightly linked via a common data base and a package of software that handles intertask communication. This software automatically makes any change in the data base visible to each application. It also allows the user to communicate directly between applications. For example, a cursor in a cross-section view will highlight the well in the map view.
[Screen capture: a Finder scout-ticket form for the well HUSKY ET AL CECIL 16-20-84-8, showing operator, location, license and status dates, depths and elevations, and tabulated formation tops, logs, cores, completions, casings, production zones and drillstem tests.]
[Figure: GeoFrame modules linked by intertask communication and project data management: petrophysical interpretation, log editing and marker interpretation, geological interpretation, geophysical interpretation, 3D visualization and interpretation, engineering model building and mapping. Some modules are available by year end; others arrive in the next major release.]
Pair-wise versus data-bus linkage of applications. Pair-wise links: six applications require 15 links, and an additional application means six new links. If vendor A's program is modified, five links may need modification, and it is unclear who maintains each link: vendor A, vendor B or the customer. Geoshare data bus: six applications require only six links, and an additional application requires only one new link. If vendor A's program is modified, only A's half-links need modification, and each vendor maintains its own half-links.
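The arithmetic behind the comparison generalizes: pair-wise integration needs one link per pair of applications, while a bus needs one connection per application. A short sketch of the counts:

```python
def pairwise_links(n: int) -> int:
    """Every pair of n applications needs its own link: n*(n-1)/2."""
    return n * (n - 1) // 2

def bus_links(n: int) -> int:
    """With a data bus, each application needs only one connection."""
    return n

# Six applications: 15 pair-wise links but only six bus connections.
# A seventh application adds six new pair-wise links, but only one bus link.
```

The quadratic growth of pair-wise links, against linear growth for the bus, is why link maintenance "becomes unwieldy and expensive" as platforms multiply.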
• data model expansion. The data model now accommodates stratigraphic and production data and improves handling of seismic data.
• mapping and cross-section enhancements. As the data model expands, graphical capabilities expand to view new kinds of data. Enhancements have been made to cross-section viewing, and additions include 3D seismic overlay and bubble mapping. A bubble is a circle on a map that can graphically display any data in the data base. A circle over a well, for example, can be a pie chart representing recoveries from a drillstem test.
• enhanced data entry. All data loaders have the same look and feel. Loaders have been added that handle data in Log Interpretation Standard (LIS), generalized American Standard Code for Information Interchange (ASCII) and other formats.
Unlike the Finder system, which works mainly on the master or project level, the GeoFrame reservoir characterization system works partly on the project level but mainly on the application level. The GeoFrame system comprises control, interpretation and analysis software for a range of E&P disciplines. The system is based on the public domain Schlumberger data model, which is similar to the first version of Epicentre and will evolve to become compliant with the second version. Applications today cover single-well analysis for a range of topics: petrophysics, log editing, data management, well engineering and geologic interpretation (previous page, bottom). Additions expected by the end of the year include mapping with grids and contours, model building and 3D visualization of geologic data.

GeoFrame applications are tightly integrated with a single data base, meaning, most importantly, that there is only one instance of data. From the user's perspective, tight integration means an update made in one application can become automatically available in all applications. (Loose integration means data may reside in several places. The user must therefore manually ensure that each instance is changed during an update.) Project data management is performed within GeoFrame applications, but it may also be performed by the Finder system, which is linked today to the GeoFrame system via the Geoshare standard (see "Petrophysical Interpretation," page 13).
Razing the Tower of Babel—Toward Computing Standards

Moving data around the master, project and application levels is complicated by proprietary software. Proprietary systems often provide tight integration and therefore fast, efficient data handling on a single vendor's platform. But they impede sharing of data by applications from different vendors, and movement of data between applications and data bases. Nevertheless, many proprietary systems are entrenched for the near term and users need to prolong the usefulness of these investments by finding a way to move data between them.

A common solution is called pair-wise integration: the writing of a software link between two platforms that share data. This solution is satisfactory as long as platform software remains relatively unchanged and the number of linked platforms remains small. Unocal, for example, maintains five such links at its Energy Resources Division in Sugar Land, Texas (see "Case Studies," page 38). A revision to one of the platforms, however, often requires a revision in the linking software. With more software updates and more platforms, maintenance of linking programs becomes unwieldy and expensive.

A solution that sidesteps this problem is called a data bus, which allows connection of disparate systems without reformatting of data, or writing and maintaining many linking programs (above). A data bus that has
received widespread attention is the Geoshare standard, which was developed jointly by GeoQuest and Schlumberger and is now owned by an independent industry consortium of more than 50 organizations, including oil companies, software vendors and service companies. The Geoshare system consists of standards for data content and encoding, that is, the kind of information it can hold and the names used for that information. It also includes programs called half-links for moving data to and from the bus (right). For data exchange, which defines what data are called and the sequence in which they are stored, Geoshare uses an industry standard: API RP66 (American Petroleum Institute Recommended Practice 66). The Geoshare data model is evolving to become compatible with the Epicentre data model and its data exchange format, which is a "muscled" version of the Geoshare standard.

Half-links are programs that translate data into and out of the Geoshare exchange format. There are sending and receiving half-links. The sending half-link translates data from the application format into a format defined by the Geoshare data model, and the receiving half-link translates data from the Geoshare format into a format readable by the receiving application. Schlumberger has written half-links for all its products, and many half-links are now being developed by other vendors and users. As of July this year, 20 half-links were commercially available. This number may double by year end.

There are significant differences between sending and receiving half-links, and in the effort required to write them. The sending half-link usually has to just map the application data to the Geoshare data model. This is because the Geoshare data model is
How Geoshare linkage works. Loosely integrated applications (right) are linked with each other, with tightly linked applications (left), and with a project or master data base through sending and receiving half-links on the Geoshare data bus (data exchange standard and data model). GeoFrame applications are tightly integrated, like the system on the top left. Major Geoshare data model categories include seismic survey, wavelet, velocity data, fault trace set, lithostratigraphy code list, map, land net list, zone list, field, surface set and drilling facility.
What is Object-Oriented—Really?
The term “object-oriented” has become increas-
ingly misunderstood as its usage expands to
cover a broad range of computer capabilities. The
meaning of object-oriented changes with the
noun it precedes—object-oriented interface,
graphics and programming. What makes them all
object-oriented is that they deal with
“objects”—usually a cluster of data and code,
treated as a discrete entity. The lumping together
of data is called encapsulation.
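Encapsulation is easy to show in miniature: one object carries both its data and the code that operates on that data. The class and its fields below are invented for this example, not drawn from any product described in the article.

```python
class DrillBit:
    """An object: data (diameter, hours run) plus behavior (worn)."""

    def __init__(self, diameter_in: float, hours_run: float):
        self.diameter_in = diameter_in
        self.hours_run = hours_run

    def worn(self, limit_hours: float = 100.0) -> bool:
        # The behavior travels with the data it operates on.
        return self.hours_run > limit_hours

bit = DrillBit(diameter_in=8.5, hours_run=120.0)
```

A caller asks the object a question (`bit.worn()`) rather than reaching into its data, which is the discrete-entity treatment the sidebar describes.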
An object-oriented interface is the simplest
example. It is a user interface in which elements
of the system are represented by screen icons,
such as on the Macintosh desktop. Applications,
documents and disk drives are represented by
and manipulated through visual metaphors for
the thing itself. Files are contained in folders that
look like paper folders with tabs. Files to be
deleted are put in the trash, which is itself an
object—a trash can or dust bin. Object-oriented
displays do not necessarily mean there is object-
oriented programming behind them.
The object-oriented interface to data, at a
higher level, involves the concept of retrieving
information according to one or more classes of
things to which it belongs. For example, informa-
tion about a drill bit could be accessed by query-
ing “drill bit” or “well equipment” or “parts
inventory.” The user doesn’t need to know where
“drill bit” information is stored to retrieve it. This
is a breakthrough over relational database
queries, where one must know where data are
stored to access them.
When software developers talk to geoscientists
about the power of an “object-oriented program,”
they are often referring to object-oriented graph-
ics, which concerns the construction and manipu-
lation of graphic elements commonly used in map-
ping, such as points, lines, curves and surfaces.
Object-oriented graphics describes an image com-
ponent by a mathematical formula. This contrasts
with bit-mapped graphics, in which image compo-
nents are mapped to the screen dot by dot, not as
a single unit. In object-oriented graphics, every
graphic element can be defined and stored as a
separate object, defined by a series of end points.
Because a formula defines the limits of the object,
Suppose you want to send a seismic interpretation from a Charisma workstation to the Finder system. Both the Finder and Charisma systems recognize an entity called “well location.” In the Charisma data model, well location is defined by X and Y coordinates and by depth in one of several units, such as decimeters. The Finder data model also defines well location by X and Y coordinates, but it does not yet recognize decimeters. The Finder model understands depth as time (seconds or milliseconds), inches, feet or meters (above). The Geoshare data model can understand all units of depth, so the Charisma sending half-link has only to send decimeters to the Geoshare bus. The Finder receiving half-link, however, must recognize that depth can be quantified six ways: feet, inches, meters, decimeters, seconds and milliseconds. In this case, the Finder receiving half-link translates the decimeters into a unit the Finder system understands.
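The receiving-side unit translation amounts to a lookup. The sketch below is a hypothetical illustration, not Finder code; time units are rejected rather than converted, since turning seconds into a length would require a velocity model.

```python
# Toy receiving-side depth translation (illustrative, not the Finder half-link):
# any length unit in the Geoshare list is converted to meters; time-domain
# depths are flagged, because converting them needs a velocity model.

TO_METERS = {"meters": 1.0, "decimeters": 0.1, "feet": 0.3048, "inches": 0.0254}
TIME_UNITS = {"seconds", "milliseconds"}

def receive_depth(value, unit):
    if unit in TIME_UNITS:
        raise ValueError("time-domain depth requires a velocity model")
    if unit not in TO_METERS:
        raise ValueError(f"not a Geoshare depth unit: {unit}")
    return value * TO_METERS[unit]

# A Charisma depth sent in decimeters arrives in meters.
print(receive_depth(12340.0, "decimeters"))
```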
Writing a receiving half-link typically takes six to nine months, an order of magnitude longer than writing a sending half-link. The main difficulty is that a receiving half-link acts like a large funnel: it translates data from Geoshare’s broad data model into the application’s focused data model, so it must consider the many possible ways data can be represented in the Geoshare model. The ideal receiving half-link would look at every way a fact could be represented, and translate each possibility into the variant used by the receiving application. For example, many interpretation systems use a variant of a grid structure to represent surfaces. Such surfaces may represent seismic reflectors or the thickness of zones of interest. Geoshare provides three major representations for such surfaces: grids, contours and X, Y and Z coordinates. If the receiver looks only at incoming grids, the other two kinds of surface information may be lost. The policy taken by a receiver should be to look for any of these possibilities, and translate each into the internal representation used by the application.
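The funnel policy can be sketched as a dispatcher that accepts any of the three Geoshare surface representations and normalizes each to the one form a hypothetical application stores internally. The dictionary layouts here are invented for illustration, not Geoshare's actual encoding.

```python
# "Funnel" sketch: accept a surface as a grid, as contours, or as scattered
# x/y/z points, and normalize every case to a list of (x, y, z) triples.
# The structures are illustrative stand-ins for the Geoshare representations.

def to_points(surface):
    kind = surface["kind"]
    if kind == "points":               # already x, y, z triples
        return list(surface["data"])
    if kind == "contours":             # each contour line carries one z value
        return [(x, y, z) for z, line in surface["data"] for x, y in line]
    if kind == "grid":                 # regular grid: origin plus spacing
        x0, y0, dx, dy = surface["origin_spacing"]
        return [(x0 + j * dx, y0 + i * dy, z)
                for i, row in enumerate(surface["data"])
                for j, z in enumerate(row)]
    raise ValueError(f"unknown surface representation: {kind}")

grid = {"kind": "grid", "origin_spacing": (0.0, 0.0, 50.0, 50.0),
        "data": [[1200.0, 1210.0], [1195.0, 1205.0]]}
print(len(to_points(grid)))  # 4 points from a 2 x 2 grid
```

A receiver built this way loses none of the three kinds of incoming surface information, which is the policy the text recommends.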
July 1994
[Figure: The Problem and The Solution. The Charisma system describes a well location by x, y and depth in decimeters; the Geoshare model carries x, y and depth in decimeters, meters, feet, inches, seconds or milliseconds; the Finder system understands x, y, meters, feet, seconds and milliseconds. The sending half-link passes the Charisma decimeters to Geoshare, and the Finder receiving half-link converts them to meters.]
it can be manipulated—moved, reduced, enlarged,
rotated—without distortion or loss of resolution.
These are powerful capabilities in analyzing repre-
sentations of the earth and mapped interpretations.
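The contrast with bit maps can be shown directly: a line stored as end points is transformed by formula, so rotating it and rotating it back recovers the original to floating-point precision. This is a minimal illustration, not any mapping package's API.

```python
# Object-oriented graphics in miniature: a line is just two end points, and a
# rotation is applied by formula to those points alone. No pixels are
# resampled, so the shape survives repeated transformation undistorted.

import math

def rotate(points, angle_deg):
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [(x * c - y * s, x * s + y * c) for x, y in points]

line = [(0.0, 0.0), (10.0, 0.0)]
turned = rotate(line, 37.0)        # still just two end points
restored = rotate(turned, -37.0)   # undo the rotation
# restored matches the original to within floating-point rounding
print(all(abs(a - b) < 1e-9
          for p, q in zip(line, restored) for a, b in zip(p, q)))
```

A bit-mapped line rotated by 37 degrees and back would have been resampled twice and would no longer match its original pixels.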
“Object-oriented” to a programmer or software
developer tends to mean object-oriented program-
ming (OOP). It does not necessarily imply new func-
tionalities or a magic bullet. Its most significant
contribution is faster software development, since
large chunks of existing code can often be reused
with little modification. Conventional sets of
instructions that take eight weeks to develop may
take only six weeks when done in OOP. It is analo-
gous to modular construction in housing, in which
stairs, dormers and roofs are prefabricated, cut-
ting construction time in half.
The next wave in data bases themselves is a
shift toward object-oriented programming, in
which the user can define classes of related ele-
ments to be accessed as units. This saves having
to assemble the pieces from separate relational
tables each time, which can cut access time.
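The difference can be sketched with two toy stores; the tables and field names below are illustrative, not a real schema.

```python
# Relational vs. object-style access, in miniature: the relational store must
# assemble a "well" from separate tables on every access, while the object
# store keeps the related pieces together as one unit.

wells = {1: {"name": "A-1"}}                      # relational: two tables
logs = [{"well_id": 1, "curve": "sonic"},
        {"well_id": 1, "curve": "resistivity"}]

def relational_fetch(well_id):
    """Assemble the pieces with a join-like scan on each access."""
    well = dict(wells[well_id])
    well["logs"] = [l["curve"] for l in logs if l["well_id"] == well_id]
    return well

object_store = {1: {"name": "A-1", "logs": ["sonic", "resistivity"]}}

def object_fetch(well_id):
    """The whole unit is stored together; no assembly step."""
    return object_store[well_id]

print(relational_fetch(1) == object_fetch(1))  # True: same result, no join
```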
POSC’s Epicentre model uses essentially rela-
tional technology but adds three object-oriented
concepts:
• unique identifiers for real-world objects
• complex data types to reflect business usage
• class hierarchy for the drill bit example above.
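The three concepts can be illustrated together in plain Python rather than a database schema language; the classes and the unit-carrying `Measurement` type are hypothetical examples, not Epicentre definitions.

```python
# The three object-oriented concepts the list names, in miniature:
# a unique identifier per real-world object, a complex data type that keeps
# value and unit together, and a small class hierarchy for the drill bit.

import uuid
from dataclasses import dataclass, field

@dataclass
class Measurement:          # complex data type: value and unit travel together
    value: float
    unit: str

@dataclass
class WellEquipment:        # root of the class hierarchy
    uid: str = field(default_factory=lambda: uuid.uuid4().hex)

@dataclass
class DrillBit(WellEquipment):
    diameter: Measurement = field(default_factory=lambda: Measurement(8.5, "in"))

bit = DrillBit()
print(bit.uid != DrillBit().uid)      # True: each object gets its own identifier
print(isinstance(bit, WellEquipment)) # True: the bit belongs to the hierarchy
```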
Further reading:
Tryhall S, Greenlee JN and Martin D: “Object-Oriented Data Management,” paper SPE 24282, presented at the SPE European Petroleum Computer Conference, Stavanger, Norway, May 25-27, 1992.
Zdonik SB and Maier D (eds): “Fundamentals of Object-Oriented Databases,” in Readings in Object-Oriented Database Systems. San Mateo, California, USA: Morgan Kaufmann Publishers, Inc., 1990.
Nierstrasz O: “A Survey of Object-Oriented Concepts,” in Kim W and Lochovsky FH (eds): Object-Oriented Concepts, Databases, and Applications. New York, New York, USA: ACM Press, 1989.
Williams R and Cummings S: Jargon: An Informal Dictionary of Computer Terms. Berkeley, California, USA: Peachpit Press, 1993.
Aronstam PS, Middleton JP and Theriot JC: “Use of an Active Object-Oriented Process to Isolate Workstation Applications from Data Models and Underlying Databases,” paper SPE 24428, presented at the Seventh SPE Petroleum Computer Conference, Houston, Texas, USA, July 19-22, 1992.
Sending and receiving Geoshare half-links. Conversion of data (here, units of depth) to a form understandable by the receiving application is the job of the receiving half-link between Geoshare and the Finder system. A well location can be a point with reference to one of several features, such as the kelly bushing, a formation top or a seismic surface.
The Geoshare standard can link two programs if they can both talk about the same data, even if not in the same vocabulary, such as the Finder and Charisma systems talking about “well location.” But two programs have a harder time communicating if they don’t understand the same concept. By analogy, a Bantu of equatorial and southern Africa could not talk with a Lapp about snow, since the Bantu have no word for snow. In the E&P world, this problem of a missing concept happens about a third of the time when linking applications through the Geoshare standard. But there is a way around it. Look again at an example.
The Finder system and the IES Integrated Exploration System, an interpretation system, both have conventions for designating wells. For the IES system, a well has a number, which can be determined in various ways, and a nonunique name. For the Finder system, a well is known by a specific kind of number, its API number. The Geoshare data model accepts all conventions. When a formation top is picked in the IES system, the top is labeled by well name and number. If it were shipped that way to the Finder system through Geoshare, the Finder system would not recognize the well because it is looking for an API number. The solution is to add a slot in the IES data model for the API number. The IES system itself does not use this slot, but when it sends data to the Finder system, the IES sending half-link adds the API number to the message, enabling the Finder system to recognize the well.
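The workaround can be sketched as a sending half-link that fills the extra slot before the message crosses the Geoshare bus. Every name, number and lookup table here is invented for illustration.

```python
# Sketch of the extra-slot workaround (all identifiers hypothetical): the
# sending half-link looks up an API number the sender itself never uses and
# adds it to the message, so a receiver that keys wells on API number matches.

ies_to_api = {("SMITH-1", 7): "42-501-20130"}  # illustrative lookup table

def sending_half_link(well_name, well_number, payload):
    message = {"name": well_name, "number": well_number, **payload}
    api = ies_to_api.get((well_name, well_number))
    if api is not None:
        message["api_number"] = api    # slot unused by the sender itself
    return message

msg = sending_half_link("SMITH-1", 7, {"formation_top_m": 2950.0})
print(msg["api_number"])
```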
While the Geoshare standard is designed to function as a universal translator, the E&P community is working toward a common data model that will eventually eliminate the translator, or at least reduce the need for one. Recognition of the need for a common data model and related standards arose in the 1980s as technical and economic forces changed geoscience computing.14 The rise of distributed computing made interdisciplinary integration possible, and flat oil prices shrank the talent pool within oil companies, resulting in smaller, interdisciplinary work groups focused on E&P issues rather than software development.
When work on information standards accelerated in the late 1980s, much of the industry support went to a nonprofit consortium, the Houston-based Petrotechnical Open Software Corp. (POSC). An independent effort in Calgary, Alberta, Canada had been established by another nonprofit consortium, the Public Petroleum Data Model Association. Although the PPDM effort was on a smaller scale, in 1990 it was first to release a usable E&P database schema, which has since been widely adopted in the E&P community.15
At its inception, the PPDM was generally regarded outside Canada as a local, Calgary solution for migrating from a mainframe to a client/server architecture. The PPDM is a physical implementation, meaning it defines exactly the structure of tables in a relational data base. The effort at POSC is to develop a higher level logical data model, concentrating less on specifics of table structure and more on defining rules for standard interfacing of applications.16 By analogy, the PPDM is used as blueprints for a specific house, whereas POSC attempts to provide a set of general building codes for erecting any kind of building.
Since 1991, widespread use and maturation of the PPDM (the data model is now in version 3) has increased its acceptance. This has changed POSC’s strategy. Today, POSC is preparing two expressions of its Epicentre high-level logical data model, one that is technology-independent, implementable in relational or object-oriented databases (see “What is Object-Oriented—Really?” page 46), and another to address immediate usage in today’s relational database technology. The logical model is called the Epicentre hybrid because it adds some object-oriented concepts to a relational structure, although it
Oilfield Review
[Timeline: Milestones in POSC efforts, 1990 through 1995. POSC established as a consortium funded by 35 companies; request for technology (RFT) for a data model; request for comments (RFC) for data access; publication of Base Computer Standards (BCS) version 1.0; RFT for a data access protocol; published complete Epicentre “root model” snapshot; published a series of snapshots of Epicentre, Data Access & Exchange and Exchange Format; published Conformance Statement Template snapshot; RFC for the User Interface Style Guide; published complete series of final specifications version 1.0; published Epicentre, Data Access & Exchange, Exchange Format and User Interface Style Guide (Prentice-Hall); published BCS version 2.0 snapshot; obtained PPDM agreement to proceed with merger; started Internet information server and made project proposals and reviews more democratic; member organizations number 85; helped start more than 20 migrations to BCS; ShareFest, an exhibition of POSC-compliant products sharing data from oil companies; will publish BCS version 2.0; will complete the PPDM merger and publish Epicentre version 2.0; will publish Data Access & Exchange version 1.1. (A snapshot is an interim version, like an alpha or beta test in engineering.)]
Data Model Expressions of the Logical Model

Epicentre Logical Model and Data Access and Exchange Format: an information model based on data requirements and data flow characteristics of the E&P industry. Two expressions of the logical model:

Epicentre Hybrid
• Implemented with a relational or a partially or fully object-oriented database management system (DBMS)
• Accessed via a POSC-specified application programming interface
• User accesses a physical data base that is not visibly different from the logical model
• A single data model for applications that have not yet been able to effectively use a relational DBMS, such as reservoir simulators and geographic information systems
• Opens a market for vendors of object-oriented DBMS and an evolutionary path for relational DBMS

Epicentre/PPDM
• Implemented with a relational DBMS and operating system or a DBMS bulk storage mechanism for large data sets
• Accessed via a vendor-supported interface and a POSC-specified application programming interface for complex data types
• User accesses a physical data base that is equivalent to the logical model or visibly different from it
• Provides a set of highly similar relational data models that can be implemented with current database management system software
• Provides a market for vendor products that meet the specifications of the logical model
leans heavily to the relational side. The first version of the hybrid was released in July 1993. The relational solution, to be released this summer, is POSC’s merger of the hybrid and PPDM models, and is called informally the Epicentre/PPDM model. This model allows the few object-oriented features of the hybrid model to be used by relational software (above). The merged model is being tested by a task force comprising PPDM, POSC and several oil companies.
Where is POSC Going?
After two years of developing standards, last summer POSC released version 1.0 of its Epicentre data model. Since then, Epicentre has undergone review, validation, convergence with the PPDM, and use by developers in pilot studies and in commercial products. Release of Epicentre 2.0, which POSC considers an industrial-strength data model, is due this autumn (see the milestones timeline above). Today the Epicentre hybrid model as a whole resides in a handful of database and commercial application products. Vendors have written parts of the specifications into prototype software, which is increasing in number, ambition and stability.
POSC has learned hard lessons about building a data model. “There are a lot of very good local solutions,” says Bruno Karcher, director of operations for POSC, “that when screwed together make one big, bad general solution. What we have learned is how to take those local solutions, which are flavored by local needs and specific business disciplines, and find a middle ground. We know that the Epicentre logical model is not a one-shot magic bullet. It is guaranteed to evolve.”
A key issue POSC is grappling with is the definition of “POSC-compliance,” which today remains unspecified. From the user’s perspective, “POSC-compliant” equipment should have interoperability: seamless interaction between software from different vendors and data stores. The simplest defini-
14. For a later paper summarizing the industry needs: Schwager RE: “Petroleum Computing in the 1990’s: The Case for Industry Standards,” paper SPE 20357, presented at the Fifth SPE Petroleum Computer Conference, Denver, Colorado, USA, June 25-28, 1990.
15. A schema is a description of a data base used by the database management system. It provides information about the form and location of attributes, which are components of a record. For example, sonic, resistivity and dipmeter might be attributes of each record in a log data base.
tion of compliance means “applications can access data from POSC data stores,” but the E&P community is still a long way from consensus on how this can be achieved practically. POSC has started building a migration path with grades of compliance. In these early stages, compliance itself may mean commitment to move toward POSC specifications over a given period.
A consequence of having all data fit the POSC model is disappearance of the application data store as it exists today: a proprietary data store, accessible only by the host application. For this to happen means rethinking the three levels (master, project and application) and blurring or removing the boundaries between them. Ultimately, there may be a new kind of master data
16. Chalon P, Karcher B and Allen CN: “An Innovative Data Modeling Methodology Applied to Petroleum Engineering Problems,” paper SPE 26236, presented at the Eighth SPE Petroleum Computer Conference, New Orleans, Louisiana, USA, July 11-14, 1993.
Karcher B and Aydelotte SR: “A Standard Integrated Data Model for the Petroleum Industry,” paper SPE 24421, presented at the Seventh SPE Petroleum Computer Conference, Houston, Texas, USA, July 19-22, 1992.
Karcher B: “POSC: The 1993 Milestone,” Houston, Texas, USA: POSC Document TR-2 (September 1993).
Karcher B: “Effort to Develop Standards for E&P Software Advances,” American Oil & Gas Reporter 36, no. 7 (July 15, 1993): 42-46.
base that will allow the user to zoom in and out of the data, between what is now the master and application levels. There will be fewer sets of data, therefore less redundancy and less physical isolation. At the conceptual level, applications that manipulate data will no longer affect the data structure. Applications and data will be decoupled.
“The problems of multiple data versions and updating may disappear,” says Karcher, “but we will be looking forward to new problems, and as-yet unknown issues of data administration, organization and security. I think even when standards are implemented in most commercial products, POSC will continue to play a role in bringing together a body of experience to solve those new problems.” —JMK