Managing Oilfield Data Management
Steve Balough, Vastar Resources, Houston, Texas, USA
Peter Betts, Shell Internationale Petroleum Maatschappij BV, The Hague, The Netherlands
Jack Breig, Bruno Karcher, Petrotechnical Open Software Corporation, Houston, Texas
André Erlich, Corte Madera, California, USA
Jim Green, Unocal, Sugar Land, Texas
Paul Haines, Ken Landgren, Rich Marsden, Dwight Smith, Houston, Texas
John Pohlman, Pohlman & Associates, Inc., Houston, Texas
Wade Shields, Mobil Exploration and Producing Technical Center, Dallas, Texas
Laramie Winczewski, Fourth Wave Group, Inc., Houston, Texas
As oil companies push for greater efficiency, many are revising how they manage exploration and production (E&P) data. On the way out is proprietary software that ran the corporate data base on a mainframe, and on the rise is third-party software operating on a networked computing system. Computing standards are making practical the sharing of data by diverse platforms. Here is a look at how these changes affect the geoscientist’s work.
Today’s exploration and production geoscientist works in a world of untidy data. Maps, logs, seismic sections and well test reports may be scattered on isolated computers. They may exist only as tapes or paper copies filed in a library in Dubai or stuffed in cardboard boxes in Anchorage (next page). Often, most of the battle is simply locating pertinent data.1 Once data are found, the user may need to be multilingual, fluent in UNIX, VMS, DB2, SQL, DLIS and several dialects of SEG-Y.2 Then, even when data are loaded onto workstations, two interpretation packages may be unable to share information if they understand data differently—one may define depth as true vertical depth, and another as measured depth. The main concern for the geoscientist is not only: Where and in what form are the data? but also: Once I find what I need, how much time must I spend working on the data before I can work with the data?
Despite this picture of disarray, E&P data are easier to work on now than they were just a few years ago, freeing more time to work with them. Better tools are in place to organize, access and share data. Data organization is improving through more flexible database management software—a kind of librarian that finds and sometimes retrieves relevant data, regardless of their form and location. Access to data is widening through linkage of data bases with networks that unite offices, regions and opposite sides of the globe. And sharing of data by software of different disciplines and vendors, while not effortless, has become practical. It will become easier as the geoscience community inches closer to standards for how data are defined, formatted, stored and viewed.
Data management—the tools and organization for orderly control of data, from acquisition and input through validation, processing, interpretation and storage—is going through changes far-reaching and painful. This article highlights two kinds of changes affecting data management: physical, the tools used for managing data; and conceptual, ideas about how data, data users and tools should be organized.
To understand where E&P data management is going, look at where it has come from. Today as much as 75% of geoscience data are still stored as paper.3 Yet, the direction of data management is determined by the 25% of data that are computerized.
Oilfield Review
July 1994
The incredible shrinking data store. Log data from a single well have been transferred from tapes and paper copies in a cardboard box to a 4-mm digital audio tape (DAT), being passed from Karel Grubb, seated, to Mike Wille, both of GeoQuest. This cardboard data store was among 2000 for a project in Alaska, USA. Logs from all 2000 wells were loaded onto 18 DATs for off-site storage provided by the LogSAFE data archive service (page 40). One DAT can hold 1.8 gigabytes, equivalent to conventional logs from about 100 wells.
1. Taylor BW: “Cataloging: The First Step to Data Management,” paper SPE 24440, presented at the Seventh SPE Petroleum Computer Conference, Houston, Texas, USA, July 19-22, 1992.
2. UNIX and VMS are operating systems; DB2 is a database management system for IBM mainframes; SQL is a language for querying relational data bases; and DLIS and SEG-Y are data formats.
3. Landgren K: “Data Management Eases Integrated E&P,” Euroil (June 1993): 31-32.
In this article Charisma, Finder, GeoFrame, IES (Integrated Exploration System), LogDB, LOGNET, LogSAFE and SmartMap are marks of Schlumberger; IBM and AIX are marks of International Business Machines Corp.; Lotus 1-2-3 is a mark of Lotus Development Corp.; Macintosh is a mark of Apple Computer Inc.; VAX and VMS are marks of Digital Equipment Corp.; MOTIF is a mark of the Open Software Foundation, Inc.; PetroWorks, SeisWorks, StratWorks, SyntheSeis, and Z-MAP Plus are marks of Landmark Graphics Corp.; GeoGraphix is a mark of GeoGraphix, Inc.; ORACLE is a mark of Oracle Corporation; POSC is a mark of the Petrotechnical Open Software Corporation; Stratlog is a mark of Sierra Geophysics, Inc.; Sun and Sun/OS are marks of Sun Microsystems, Inc.; UNIX is a mark of AT&T; and X Window System is a mark of the Massachusetts Institute of Technology.
For help in preparation of this article, thanks to Najib Abusalbi, Jim Barnett, Scott Guthery and Bill Quinlivan, Schlumberger Austin Systems Center, Austin, Texas, USA; Bill Allaway, GeoQuest, Corte Madera, California, USA; Richard Ameen, Bobbie Ireland, Jan-Erik Johansson, Frank Marrone, Jon Miller, Howard Neal, Mario Rosso, Terry Sheehy, Mike Wille and Beth Wilson, GeoQuest, Houston, Texas; Bob Born, Unocal, Sugar Land, Texas, USA; Ian Bryant, Susan Herron, Karyn Muller and Raghu Ramamoorthy, Schlumberger-Doll Research, Ridgefield, Connecticut, USA; Alain Citerne and Jean Marc Soler, Schlumberger Wireline & Testing, Montrouge, France; Peter Day, Unocal Energy Resources Division, Brea, California; Mel Huszti, Public Petroleum Data Model Association, Calgary, Alberta, Canada; Craig Hall, Steve Koinm, Kola Olayiwola and Keith Strandtman, Vastar Resources, Inc., Houston, Texas; Stewart McAdoo, Geoshare User Group, Houston, Texas; Bill Prins, Schlumberger Wireline & Testing, The Hague, The Netherlands; Mike Rosenmayer, GeoQuest, Sedalia, Colorado, USA.
Evolution in petroleum computing parallels evolution of computing in general. [Figure: hardware evolves from proprietary mainframes and microcomputers to networked computing using client/server architecture based on standards; applications evolve from in-house proprietary software, leading to islands of computing and some proprietary multidisciplinary packages, to common third-party packages with emphasis on sharing of data and interpretations between disciplines and between products from different vendors; organization evolves from centralized data control by a team unlinked or loosely linked with regions to a centralized master data base linked with regional master and project data bases, giving geoscientists access to any data from any region.]
“Data management” can increasingly be called “computer” or “digital” data management, and its development therefore parallels that of information technology (right). In the last 30 years, information technology has passed through four stages of evolution.4 Viewed through a geoscience lens, they are:
First generation: Back-room computing. Mainframes in specially built rooms ran batch jobs overseen by computer specialists. Response time was a few hours or overnight. Data processing was highly bureaucratic, overseen by computer scientists using data supplied by geoscientists. The highly secure central repository of data, called the corporate data base, resided here. Evolution concentrated on hardware. Data were often organized hierarchically, like branches of a tree that must be ascended and descended each time an item was retrieved.
Second generation: Shared interactive computing. Minicomputers were accessed by geoscientists or technicians working at remote keyboards. Response time was in minutes or seconds. Members of a user group could communicate with each other, provided they had the same hardware and software. Bureaucracy declined; democracy increased. Control was still centralized and often exercised through in-house database management software.
Third generation: Isolated one-on-one computing. This was the short reign of the rugged individualist. Minicomputers and personal computers (PCs) freed the user from the need to share resources or work through an organization controlling a centralized system. Computing power was limited, but response time rapid. There were many copies of data, multiple formats and no formally approved data. The pace of software evolution accelerated and data began to be organized in relational data bases, which function like a series of file drawers from which the user can assemble pieces of information.5

Fourth generation: Networked computing. Also called distributed computing, this too is one-on-one, but allows memory, programs and data to be distributed wherever they are needed, typically via a client/server architecture.6 Data are no longer just alphanumeric but also images and will eventually include audio and video formats. Interaction is generally rapid and is graphic as well as alphanumeric, with parts of the data base viewed through a map. Linkage is now possible between heterogeneous systems, such as a VAX computer and an IBM PC. Control of data is at the populist level, with server and database support from computing specialists who tend to have geoscience backgrounds. Interpretation and data management software evolves rapidly and tends to be from vendors rather than home grown. Sometimes the network connects to the mainframes, which are reserved for data archiving or for a master data base (more about this later). Mainframes are also reemerging in the new role of “superservers,” controlling increasingly complex network communication.

To one degree or another, all four generations often still coexist in oil companies, but most companies are moving rapidly toward networked computing.7 This shift to new technology—and, recently, to technology that is bought rather than built in-house—is changing the physical and conceptual shapes of data management. Since about 1990 two key trends have emerged:
• The monolithic data store is being replaced, or augmented, with three levels of flexible data storage. A single corporate data store8 used to feed isolated islands of

Three Elements of a Successful E&P Data Model

Jack Breig, POSC
Comprehensive information coverage

A successful data model captures all kinds of information used in the E&P business, as well as sufficient semantics about the meaning of data, to ensure that it can be used correctly by diverse applications and users, in both planned and unplanned ways. In other words, a semantically rich data model optimizes the sharing of data across different disciplines.

Useful abstractions

The complexity and volume of information managed by the E&P industry are overwhelming. This is a serious impediment to information exchange, especially where information is generated by professionals in dissimilar business functions. The successful data model reveals to the user the places of similarity in information semantics. This capability provides business opportunities both for users, in accessing data from unexpected sources, and for application developers to target new, but semantically similar, business markets.

Implementable with today’s and tomorrow’s technology

A data model is successful today if it has a structure, rules and semantics that enable it to be efficiently implemented using today’s database management tools. A data model will be successful for the long term if it can accommodate newer database management products, such as object-oriented, hybrid and extended relational database management systems.
Structure and properties of the three-level scheme for data management. Data shown at the bottom may be kept outside the database system for better performance. [Figure: application data stores, with log, seismic and archive data files, feed project data bases of shared data, which in turn feed a master data base of approved data with an inventory catalog and external database references; data are distributed downward and integrated upward; data quality, data approval, data volume, areal extent and administrative control increase toward the master level, while data volatility, access frequency and number of data versions decrease.]
computing, each limited to a single discipline. For example, a reservoir engineer in Jakarta might occasionally dip into the corporate data store, but would have no access to reservoir engineering knowledge—or any knowledge—from Aberdeen. Oil companies today are moving toward sharing of data across the organization and across disciplines. Nonelectronic data, such as cores, films and scout tickets,9 may also be cataloged on line. A new way of managing data uses storage at three levels: the master, the project and the application.
• E&P computing standards are approaching reality. The three data base levels cannot freely communicate if one speaks Spanish, one Arabic and the other Russian. Efforts in computing standards address this problem on two fronts. One is the invention of a kind of common language, a so-called data model, which is a way of defining and organizing data that allows diverse software packages to reconcile different understandings of data—such as, is a “well” a wellhead or a borehole?—and therefore exchange them (see “Three Elements of a Successful E&P Data Model,” previous page). Data models have been used in other industries for at least 20 years, but only in the last few has a broadly embraced E&P data model begun to emerge. On the other front is development of an interim means for data exchange, a kind of Esperanto that allows different platforms to communicate. In this category is the Geoshare standard, now in its sixth version.10
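The wellhead-versus-borehole ambiguity can be made concrete. The sketch below, in Python, is purely illustrative (its class and field names are invented for this article, not taken from any actual E&P data model), but it shows how an explicit data model removes the ambiguity by giving each meaning of “well” its own, related type:

```python
from dataclasses import dataclass, field

# In a shared data model, "well" is not one vague entity: a surface
# wellhead and the boreholes drilled from it are distinct, related types,
# so two applications cannot silently mean different things by "well."
@dataclass
class Borehole:
    name: str
    measured_depth_m: float        # depth along the hole
    true_vertical_depth_m: float   # both depth senses carried explicitly

@dataclass
class Wellhead:
    name: str
    latitude: float
    longitude: float
    boreholes: list = field(default_factory=list)

head = Wellhead("Block 12 #1", 58.4, 0.9)
head.boreholes.append(Borehole("Block 12 #1-ST1", 4120.0, 3950.0))

# An application asking for "well depth" must now say which depth it means.
print(head.boreholes[0].true_vertical_depth_m)  # → 3950.0
```

Because both depth senses are explicit fields, the true-vertical versus measured-depth confusion described earlier cannot arise silently.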
We look first at the data storage troika, and then at what is happening with standards.
4. This evolutionary scheme was described by Bernard Lacroute in “Computing: The Fourth Generation,” Sun Technology (Spring 1988): 9. Also see Barry JA: Technobabble. Cambridge, Massachusetts, USA: MIT Press, 1992. For a general review of information technology: Haeckel SH and Nolan RL: “Managing by Wire,” Harvard Business Review 71, no. 5 (September-October 1993): 122-132.
5. A relational data base appears to users to store information in tables, with rows and columns, that allow construction of “relations” between different kinds of information. One table may store well names, another formation names and another porosity values. A query can then draw information from all three tables to select wells in which a formation has a porosity greater than 20%, for example.
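The three-table example of note 5 can be tried directly with Python’s built-in sqlite3 module; the table and column names below are invented for illustration:

```python
import sqlite3

# Build the three example tables from note 5 in an in-memory data base.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE well      (well_id INTEGER PRIMARY KEY, well_name TEXT);
    CREATE TABLE formation (formation_id INTEGER PRIMARY KEY, formation_name TEXT);
    CREATE TABLE porosity  (well_id INTEGER, formation_id INTEGER, porosity REAL);
""")
db.executemany("INSERT INTO well VALUES (?, ?)",
               [(1, "A-1"), (2, "A-2"), (3, "B-7")])
db.executemany("INSERT INTO formation VALUES (?, ?)",
               [(10, "Frio"), (11, "Wilcox")])
db.executemany("INSERT INTO porosity VALUES (?, ?, ?)",
               [(1, 10, 0.24), (2, 10, 0.12), (3, 11, 0.31)])

# The query "relates" all three tables: wells where a formation's
# porosity exceeds 20%.
rows = db.execute("""
    SELECT w.well_name, f.formation_name, p.porosity
    FROM porosity p
    JOIN well w      ON w.well_id = p.well_id
    JOIN formation f ON f.formation_id = p.formation_id
    WHERE p.porosity > 0.20
    ORDER BY w.well_name
""").fetchall()
print(rows)  # → [('A-1', 'Frio', 0.24), ('B-7', 'Wilcox', 0.31)]
```

The “file drawer” image in the text corresponds to the three tables; the JOIN clauses are the user assembling pieces of information from them.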
A New Data Storage Troika

In the days of punch cards, corporate data were typically kept on a couple of computerized data bases, operated by mainframes. Flash forward 30 years to networked computing, and the story gets complicated. Instead of a couple, there are now tens or hundreds of machines of several types, linked by one or more networks. And instead of one or two centralized data bases, there tend to be many instances of
6. Client/server architecture, the heart of distributed computing, provides decentralized processing with some level of central control. The client is the workstation in the geoscientist’s office. When the geoscientist needs a service—data or a program—the workstation software requests it from another machine called a server. Server software fulfills the request and returns control to the client.
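The request-and-return cycle of note 6 can be sketched in a few lines of Python; the socket dialogue and the “depth” request below are invented stand-ins for real client and server software:

```python
import socket
import threading

def server(sock):
    # Server software fulfills one request, then returns control to the client.
    conn, _ = sock.accept()
    request = conn.recv(1024).decode()          # e.g. "GET depth WELL-12"
    conn.sendall(b"3865.77" if "depth" in request else b"unknown")
    conn.close()

listener = socket.socket()
listener.bind(("127.0.0.1", 0))                 # any free local port
listener.listen(1)
threading.Thread(target=server, args=(listener,)).start()

# The client, the workstation in the geoscientist's office, asks for a service.
client = socket.socket()
client.connect(("127.0.0.1", listener.getsockname()[1]))
client.sendall(b"GET depth WELL-12")
reply = client.recv(1024).decode()
client.close()
listener.close()
print(reply)  # → 3865.77
```

The essential point is the division of labor: the client only asks and waits, while the server holds the data and does the fulfilling.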
7. For an older paper describing a diversity of computer systems in Mobil: Howard ST, Mangham HD and Skruch BR: “Cooperative Processing: A Team Approach,” paper SPE 20339, presented at the Fifth SPE Petroleum Computer Conference, Denver, Colorado, USA, June 25-28, 1990.
8. This article distinguishes between a data store and a data base. A data store is a collection of data that may not be organized for browsing and retrieval and is probably not checked for quality. A data base is a data store that has been organized for browsing and retrieval and has undergone some validation and quality checking. A log data store, for example, would be a collection of tapes, whereas a log data base would have well names checked and validated to ensure that all header information is correct and consistent.
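The validation that promotes a log data store to a log data base can be as simple as checking each tape header’s well name against a master list. A toy sketch in Python, with invented well and tape names:

```python
# Invented master list of validated well names, and raw tape headers
# as they might arrive in an unchecked data store.
master_wells = {"WEN LIU 1", "WEN LIU 2", "BLOCK 12 #1"}

tape_headers = [
    {"tape": "T-001", "well": "WEN LIU 1"},
    {"tape": "T-002", "well": "WENLIU-1"},    # inconsistent spelling
    {"tape": "T-003", "well": "BLOCK 12 #1"},
]

# Promotion from data store to data base: only headers whose well name
# matches the master list pass; the rest go to the data administrator.
validated = [h for h in tape_headers if h["well"] in master_wells]
flagged   = [h for h in tape_headers if h["well"] not in master_wells]

print([h["tape"] for h in validated])  # → ['T-001', 'T-003']
print([h["tape"] for h in flagged])    # → ['T-002']
```

Real systems do far more (format scanning, header extraction), but the organizing idea is this same pass-or-flag check.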
data storage: master data base, project data base and application data stores (above). Sometimes the old corporate data base has been given a new job and a new name. Instead of holding all the data for the corporation, it now holds only the definitive instances, and so is renamed the “master.” There may be one centralized master, but typically there is one master per operating region or business unit.
9. A scout ticket is usually a report in a ring binder that includes all drilling-related data about a well, from the time it is permitted through to completion.
10. For a description of the evolution of the Geoshare system: Darden S, Gillespie J, Geist L, King G, Guthery S, Landgren K, Pohlman J, Pool S, Simonson D, Tarantolo P and Turner D: “Taming the Geoscience Data Dragon,” Oilfield Review 4, no. 1 (January 1992): 48-49.
In the master data base, data quality is high and the rate of change—what information scientists call volatility—is low. There are not multiple copies of data. Although the master data base can be accessed or browsed by anyone who needs data, changes in content are controlled by a data administrator—one or more persons who decide which data are worthy of residing in the master data base. The master data base also contains catalogs of other data bases that may or may not be on-line. The master is often the largest data base, in the gigabyte range (1 billion bytes), although it may be overshadowed by a project data base that has accumulated multiple versions of interpretations. The master may be distributed over several workstations or microcomputers, or still reside on a mainframe.
From the master data base, the user withdraws relevant information, such as “all wells in block 12 with sonic logs between 3000 and 4000 meters,” and transfers it to one or more project data bases. Key characteristics of a project data base are that it handles some aspects of team interpretation, is accessed using software programs from different vendors and contains multidisciplinary data. The project may contain data from almost any number of wells, from 15
Mobil Exploration & Producing Technical Center

A sign on the wall in one of Mobil’s Dallas offices sums up the most significant aspect of how Mobil is thinking about data management: “Any Mobil geoscientist should have unrestricted and ready worldwide access to quality data.” The sign describes not where the company is today, but where it is working to be in the near future.

The emphasis on data management is driven by Mobil’s reengineering of its business. Data management is affected by fundamental changes in the current business environment and in how data are used within this environment.

Traditionally, exploration funds were allocated among the exploration affiliates, largely in proportion to the affiliate size. Within a changing business environment requiring a global perspective,
to 150,000. Regardless of size, data at the project level are volatile and the project data base may contain multiple versions of the same data. Updates are made only by the team working on the project. Interpretations are stored here, and when the team agrees on an interpretation, it may be authorized for transfer to the master data base.
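A master-to-project transfer such as the “block 12” example above amounts to a filtered copy that leaves the master untouched; here is a minimal sketch in Python, with invented record fields:

```python
# Invented in-memory stand-in for a master data base catalog.
master = [
    {"well": "12/3-1", "block": 12, "log": "sonic",   "top_m": 3100, "base_m": 3900},
    {"well": "12/3-2", "block": 12, "log": "density", "top_m": 3000, "base_m": 4000},
    {"well": "14/7-1", "block": 14, "log": "sonic",   "top_m": 3200, "base_m": 3800},
]

def build_project(catalog, block, log_type, top, base):
    """Copy matching entries into a project data base; the master stays
    unchanged, so the project copy can be edited and versioned by the team."""
    return [dict(rec) for rec in catalog
            if rec["block"] == block and rec["log"] == log_type
            and rec["top_m"] >= top and rec["base_m"] <= base]

project = build_project(master, block=12, log_type="sonic", top=3000, base=4000)
print([rec["well"] for rec in project])  # → ['12/3-1']
```

Because the project holds copies, interpretations accumulate there without touching the approved master; only an agreed result is promoted back.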
A third way of storing data is within the application itself. The only data stored this way are those used by a specific vendor’s application program or class of applications, such as programs for log analysis. Application data stores are often not data base management systems; they may simply be a file for applications to read from and write to. Applications contain data from the project data base, as well as interpretations performed using the application. Consequently, the data may change after each work session, making them the most volatile. Because applications were historically developed by vendors to optimize processing by those specific applications, applications from one vendor cannot share data with applications from other vendors, unless special software is written. Vendors have begun to produce multidisciplinary applications, and many applications support the management of interpretation projects. In addition, the proprietary nature of application data stores will change as vendors adopt industry standards (see “Razing the Tower of Babel—Toward Computing Standards,” page 45). As a result of these changes, data management systems of the future will have just two levels, master and project.
The Master Level

The master data base is often divided into several data bases, one for each major data type (next page). A map-type program is usually used to give a common geographic catalog for each data base. An example of this kind of program is the MobilView geographical interface (see “Case Studies,” below).
Because the master data base is sometimes cobbled together from separate, independent data bases, the same data may be stored in more than one place in the master. BP, for instance, today stores well header information in three data bases, although it is merging those into a single instance. An ideal master data base, however, has a catalog that tells the user what the master contains and what is stored outside the master. An example is the inventory and location of physical objects, such as floppy disks, films, cores or fluid samples. This is sometimes called virtual storage. The master data base, therefore, can be thought of as not a single,
Case Studies
funds should be allocated to take advantage of the best opportunities regardless of location. The organization as a whole is also leaner, placing more demands on a smaller pool of talent. A petrophysicist in Calgary may be called to advise on a project in Indonesia. All geoscientists, therefore, need to understand data the same way. When an interpretation and analysis are performed with the same rules, right down to naming conventions, then not only can data be shared globally, but any geoscientist can work on any project anywhere.

A second motivation for restructuring data management is a change in who is using data. Mobil was traditionally structured in three businesses: Exploration, Production, and Marketing and Refining. Exploration found oil for its client, Production, which produced oil for its client, Marketing and Refining. Now the picture is not so simple. Exploration’s client may also be a government or national production company. Production may also work as a consultant, advising a partner or a national company on the development of a field. Now, with a broader definition of who is using data, there is a greater need for data that can be widely understood and easily shared.

For the past three years, the company has been moving to meet this need. In 1991, Mobil established 11 data principles as part of a new “enterprise architecture.” The company is using the principles as the basis for policies, procedures and standards. Mobil is now drafting its standards, which include support of POSC efforts, down to detail as fine as “in level five of Intergraph design, a file will have geopolitical data.”

Mobil is reengineering data management on two fronts, technology and organization. Technology changes affect the tools and techniques for doing things with data: how they are validated, loaded, stored, retrieved, interpreted and updated. Organizational changes concern attitudes about who uses what tools for what ends. Both sides carry equal weight. This involves a realization that technology alone doesn’t solve problems. It also involves the “people side,” most importantly, the attitude of geoscientists toward data management.

An example of a new way to manage the people side is the formation of a series of groups coordinated by the Information Technology (IT) department called Data Access Teams, which expedite data management for the Mobil affiliates. A Data Access Team comprises Dallas staff and contractors who validate, input, search and deliver data to an affiliate site. Each site, in turn, collaborates with the team by providing a coordinator familiar with the local business needs and computer systems. Another means of addressing the people side of data management is making data management part of a geoscientist’s performance appraisal. This becomes increasingly important as more data go on-line.

In the past, geoscientists would rifle closets and ask colleagues for data. Any bad data would get tossed or, worse, ignored. Now, if a geoscientist
finds bad data, it is that individual’s responsibility to alert the data administrator and have the problem corrected. The new thinking is that data quality isn’t someone else’s job; it has to become part of the culture.

A fundamental component of Mobil’s IT strategy is a common storage and access model. Typically, each of the affiliates maintains its own, stand-alone E&P database system, along with links into the central data bases in Dallas. The local systems have evolved independently because local needs can vary. Frontier areas, for example, work only with paper data, whereas Nigeria is nearly all electronic. Despite these differences, a common data storage and access model was endorsed by all affiliates in the early 1990s.

This model incorporates access to existing data sources as well as the concept of master and project data stores (right). The master data base maintains the definitive version of the data while the project data base is maintained as a more ephemeral working environment interfacing with the applications. The project builder serves as a
BP’s master data base: present, near future and long term. Currently, the master is divided into discrete data bases and associated applications that have been built to address the needs of specific disciplines. The advantage of this approach is that data ownership is clearly defined. Disadvantages include duplicate data sets, difficulty ensuring integrity of data shared by more than one discipline, and increased data management workload. The near-term solution increases the permeability of walls separating disciplines, and the ultimate solution dissolves these walls, making use of standards established by the Petrotechnical Open Software Corporation (POSC).
[Figure: at present, discipline-specific data stores (geology & production, petrophysics, geophysics, cartography, 3D models, drilling); in the short-term future, these stores plus nonwell-related data and vendor-supplied data bases linked by a data bus, Geoshare and common well data; in the ultimate solution, POSC-compliant data bases connected by a data bus.]
Overview of Mobil’s data management model. This model, developed in the early 1990s, is part of a strategy for migration to POSC compliance. [Figure: a project builder searches the master data store and downloads data to a project data store serving the applications; legacy data management systems migrate toward POSC compliance.]
Activity screen in PAGODA. The control panel at left allows the user to obtain details about selected fields in the main screen and navigate the system. The central screen shows curve summary information, and the mud data screen gives details of mud properties.
geographical search, retrieval and archival mechanism. The master and project data stores are based on the POSC Epicentre model and implemented in the Open System/UNIX environment. The model calls for gradual replacement of most of the legacy data bases, based on business drivers. The company will maintain selected legacy systems for the foreseeable future, but in a way that minimizes resources and does not involve formal development.

Being competitive in the current business environment requires a change in business processes. This change from a local focus to a global perspective is particularly important in IT. Within Mobil’s E&P division, the major component of this initiative is a paradigmatic shift in the way data are handled at all levels and the development of a POSC-based data storage and access model. In the context of the corporate IT mission, this strategy was developed from the ground up, with input from all Mobil affiliates.
Unocal Energy Resources Division

Unocal is an integrated energy resources company that in 1993 produced an equivalent of 578,000 barrels of oil per day and held proven reserves equivalent to 2.2 billion barrels of oil.1 By assets, Unocal ranked thirteenth worldwide in 1993, compared to Mobil in second place.2 Exploration and production is carried out in about ten countries, including operations in Southeast Asia, North America, the former Soviet Union and the Middle East.

Unocal began a reorganization in 1992 that will eventually consolidate nearly all exploration activities at a facility in Sugar Land, Texas, USA. Concurrent with this restructuring is a change, started in 1990, in the physical database system—from IBM mainframes running in-house database, mapping and interpretation software, to third-party software on
1. Unocal 1993 Annual Report: 8.
2. “Oil Company Rankings by Marketers with Assets of $1 Billion or More,” National Petroleum News (June 1993): 16.
Courtesy of Shell Internationale Petroleum Maatschappij BV.
a client/server system. The group responsible for overseeing this shift is Technical Services, based in Sugar Land (next page). Here, 13 staff scientists and five on-site contractors manage 200 gigabytes of data on about two million wells. This is the largest portion of the company’s one terabyte of E&P data, and is accessed by about 275 users in both exploration and business units.

The Technical Services group performs two main tasks: it supports client/server-based geoscience applications, and it maintains and distributes application and database systems. There are 23 software systems under its wing: 11 geophysical applications, five mapping and database applications, five geologic applications and two hard-copy management systems, which are catalogs of physical inventories such as seismic sections and scout tickets. Altogether, the 23 systems represent products from at least 20 vendors,
(continued on page 40)
[Figure: Unocal Energy Resources organization. Technical Services, comprising Geoscience Computing, Technology (application configuration and database systems) and Operations, supports Worldwide Exploration, Business Development and Energy Resources Technology (Research); exploration groups (CIS, Gulf/North America, Far East, Middle East/Latin America); large business units with asset teams for field development and local information systems (IS) support (US Gulf Coast, Louisiana, Thailand, Indonesia); and small business units with local information systems (Alaska, Central US, Calgary, The Netherlands, Aberdeen, Syria).]
comprehensive library, but as a large digital filing cabinet and a catalog directing the user to other filing cabinets that may be adjacent, across the hall or across the ocean, and may contain data in any form: digital, paper, cores and so on.

One component of storage at the master level is the archive. The name implies long-term storage with infrequent retrieval, but the meaning of archive is shifting with new ways of organizing and accessing data. An archive may be an off-site vault with seismic tapes in original format, or an on-site optical disk server with seismic data already formatted for the interpretation workstation. There are countless variations, with differing degrees of "liveness" of the data. Two examples of archives are the PAGODA/LogDB and LogSAFE systems.

The PAGODA/LogDB system is a joint development of Shell Internationale Petroleum Maatschappij (SIPM) and Schlumberger. PAGODA designates the product in Shell Operating Companies and Affiliates; the LogDB name is used for the Schlumberger release. The objective of the joint development was to produce a safe storage and retrieval system for information gathered during logging and subsequent data evaluation.

The PAGODA/LogDB system comprises software and hardware for loading, storing and retrieving data acquired at wells, such as wireline logs, data obtained while drilling, borehole geophysical surveys and mud logs. Original format data are scanned during loading to ascertain header information and the integrity of the data format. All popular data formats are supported. Once scanning is complete, well identifiers are validated against a master list of well names. Extracted header information is written to ORACLE tables (a type of relational database table) and bulk data are transferred to a storage device, such as an on-line jukebox of optical disks, magnetic disks or other archival media. Once data are validated and loaded, they can be quickly viewed, searched and retrieved for export to other applications (previous page).
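The loading sequence described above can be sketched in a few lines: scan a file's header, validate the well identifier against a master list, keep the header fields for relational tables and route the bulk data to archival storage. All names here (`load_log_file`, `MASTER_WELLS`) are hypothetical, invented for illustration, not part of the actual product.

```python
# Hypothetical master list of valid well identifiers.
MASTER_WELLS = {"30-045-12345", "30-045-67890"}

def load_log_file(header: dict, bulk_data: bytes, header_table: list, archive: dict):
    """Validate and load one log file; raises ValueError on an unknown well."""
    well_id = header.get("well_id")
    if well_id not in MASTER_WELLS:          # reject unknown well identifiers
        raise ValueError(f"unknown well: {well_id}")
    header_table.append(dict(header))        # header rows go to relational tables
    archive[well_id] = bulk_data             # bulk data goes to archival storage
    return True

header_table, archive = [], {}
load_log_file({"well_id": "30-045-12345", "format": "LIS"}, b"\x00\x01",
              header_table, archive)
```

The point of the split is the same as in the text: small, searchable header values stay in the data base, while large original-format records go to cheaper archival media.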
The PAGODA/LogDB system fulfills two roles. As an off-line archival data base, the system is capable of storing terabytes of data that are safely held on a durable medium and can be retrieved and written to disk or tape. Exported data can be provided in either original format or via a Geoshare linkage (page 46). As an on-line data base, the system makes all header information available for browsing and searching. The systems are supported on VAX/VMS, Sun/OS and IBM/AIX operating systems.

The PAGODA and LogDB systems are essentially the same. To date, PAGODA has been installed in most of Shell's larger Operating Companies, where log data held on
A data management organization chart for Unocal's Energy Resources division. The company's Technical Services group, based in Sugar Land, Texas, contributes to data management for the four exploration groups, which are also based in Sugar Land, and large and small business units. The business units are moving their regional data models toward a single standard.
[Figure: Architecture of Unocal's interpretation applications and data stores. UNIX platform: data index (to be determined), regional data base (Finder), Landmark data base and integrated Landmark interpretation applications, fed by a commercial data source. PC platform: GeoGraphix data base and GeoGraphix mapping applications. Notes: (1) catalog of digital and hard-copy data; (2) best Unocal data from past projects and commercial sources; (3) managed and converted by Geoscience Computing; (4) SeisWorks, StratWorks, Z-MAP Plus, SyntheSeis, PetroWorks, etc.; (5) application for interpretations.]
earlier systems or in tape storage are being migrated to PAGODA. The PAGODA system also caters to the storage of Shell proprietary data formats and calculated results. The LogDB system has been installed in several oil companies and government agencies and is being made available as a loading and archive service at Schlumberger computing centers worldwide. The LogDB system can be coupled to the Finder data management system, in which well surface coordinates can be stored together with limited log sections for use during studies.

The LogSAFE service performs some of the same functions, but not as a tightly coupled data base. It is an off-line log archive that costs about half as much as an internal database system, yet can deliver logs over phone or network lines in a few hours. For oil companies that prefer a vendor to manage the master log data base, the LogSAFE system functions as an off-site data base, accessed a few times a week. For large companies (those with more than about 15 users of the data base) it functions as an archive, accessed once or twice a month.

The LogSAFE system places logs from any vendor in a relational data base, validating data and storing them in their original format. Any tape format can be read and paper copies digitized. Before it is entered into the data base, the entire file is inspected to make sure it is readable, and the data owner is informed of any problems in data quality. Validated data are then archived on DATs (left). By the end of 1994, data will also be stored for faster, automated retrieval on a jukebox of optical disks. Clients have access only to their own data, and receive a catalog of their data and quarterly summaries of their holdings. The LogSAFE system is based in Sedalia, Colorado, USA, at the same facility that houses the hub of the Schlumberger Information Network. This permits automatic archiving of Schlumberger logs that are transmitted from a North American wellsite via the LOGNET service, which passes all logs through the Sedalia hub.

LogSAFE data can be accessed several ways, giving it the flexibility and "live" feel of an in-house data base. Data are either pulled down by the user, or
Jim Green, who manages Unocal's exploration computing services, lists two essential ingredients for successful management of E&P data: training for users of new data handling systems and automated tools for moving data. By 1995, Unocal will move all computerized E&P data from flat-file, hierarchical systems to ORACLE-based relational systems.3 A company-wide program is bringing geotechnicians up to speed on data loading and quality checking.

The diversity of Unocal's software underscores the importance of managing multiple versions of data. The company is encouraging vendors to develop a family of automated software tools: data movers, editors, loaders, unloaders and comparators. Of particular importance are comparators, which scan two project data bases, or a project and a master data base, and produce a list of differences between them. Quick identification of differences between versions of data is essential in deciding how to update projects with vendor data or how to update the master data base with geoscientists' interpretations from a project or regional data base. Today this comparison is done manually. "The client/server environment cannot be considered mature until we have robust, automated data management tools," Green says.

A debate within Unocal concerns finding the best way to organize the work of geotechnicians,
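The comparison that is done manually today is easy to state in code. A minimal sketch of a comparator, assuming each data base can be read as a mapping of well identifier to a record of attributes; the function and field names are illustrative, not Unocal's or any vendor's actual tool.

```python
def compare(project_db: dict, master_db: dict) -> dict:
    """List wells found in only one data base, and wells whose records differ."""
    only_project = sorted(project_db.keys() - master_db.keys())
    only_master = sorted(master_db.keys() - project_db.keys())
    changed = sorted(w for w in project_db.keys() & master_db.keys()
                     if project_db[w] != master_db[w])
    return {"only_project": only_project,
            "only_master": only_master,
            "changed": changed}

# A project data base holding an interpreter's updated total depth for W1:
project = {"W1": {"td": 3200}, "W2": {"td": 2100}}
master = {"W1": {"td": 3150}, "W3": {"td": 1800}}
diff = compare(project, master)
```

The output is exactly the list of differences a data manager needs before deciding which version to promote to the master data base.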
Architecture of Unocal's interpretation applications and data stores. Pair-wise links connect all platforms. Long-term evolution is toward the UNIX platform.
40 Oilfield Review
3. A flat-file data base contains information in isolated tables, such as a table for porosity data and another for paleontology data. The tables cannot be linked. A relational data base allows linkage of information in many tables to construct queries that may be useful, such as "show all wells with porosity above 20% that have coccolithus." Flat-file data bases are simple and inexpensive. Relational data bases are complex, expensive and can be used to do much more work.
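The footnote's example query is precisely a relational join. A sketch using SQLite in place of a commercial relational system; the table and column names are invented for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE porosity (well TEXT, porosity REAL);
    CREATE TABLE paleontology (well TEXT, fossil TEXT);
    INSERT INTO porosity VALUES ('W1', 0.25), ('W2', 0.15), ('W3', 0.22);
    INSERT INTO paleontology VALUES ('W1', 'coccolithus'), ('W2', 'coccolithus');
""")
# The JOIN links the two tables -- the step a flat-file system cannot do.
rows = db.execute("""
    SELECT p.well FROM porosity p
    JOIN paleontology f ON f.well = p.well
    WHERE p.porosity > 0.20 AND f.fossil = 'coccolithus'
""").fetchall()
```

Only W1 satisfies both conditions; W2 has the fossil but not the porosity, and W3 the porosity but no fossil record.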
Bob Lewallen, senior log data technician with Schlumberger, returning 90 gigabytes of log data to the LogSAFE vault in Sedalia, Colorado, USA. Original logs are copied to two sets of DATs stored in the vault. A third backup is kept in a fireproof safe off-site. To ensure data integrity, data are validated and rearchived every five years.
(Jill Orr photo)
who load data and contribute to quality assurance
(next page). The main issue is whether geotechni-
cians are most effective if they work by geographic
area, as they do now, or by discipline. The advan-
tage of focusing geotechnicians by area is that
they learn the nuances of a region, and develop
[Figure: E&P data acquisition. Data gathered by Unocal: well information, UKOOA, geopolitical. Data purchased: Tobin, Petroleum Information, Petroconsultants, seismic surveys.]
pushed from Sedalia at client request. Data are transferred either over commercial or dedicated phone lines, or more commonly, through the Internet. Internet connections are generally faster than phone lines, and allow transmission of 500 feet of a borehole imaging log in 5 to 10 minutes.

In operation since 1991, the LogSAFE service now serves 70 clients and holds data from 13 countries. As oil companies continue to look to vendors for data management, the LogSAFE service is expected to expand from its core of North American clients. Today it handles mostly logs and a handful of seismic files, but is expected to expand to archive many data types.
The Project and Application Levels

To the geoscientist, the project data base is like a desktop where much of the action takes place. Relevant resources from the master data base are placed in the project data base. To work on project data, users move it from the project level into the application software. The process of interpretation, therefore, involves frequent exchange
Unocal's data flow and division of labor. Vendor data are initially kept separate from Unocal data, mainly for quality assurance.
[Figure: Data management flow. The Local Information Systems Group (load, periodic updates) maintains the Vendor Data Store; validated data flow to the Regional Data Base, organized geographically. Geotechnicians (quality check, load) feed the Archive Data Base, covering exploration areas and business units. Project data are loaded into a project data base, which supplies the application data bases. Roles shown: geoscientists and management, database administrators, geoscientist quality check, and database administrator and local information systems support.]
cohesion with other team members and a sense of
ownership of the data. The advantage of organiz-
ing geotechnicians by discipline—seismic, pale-
ontology, logs and so on—is a deeper understand-
ing of each data type and better consistency in
naming conventions and data structure. The opti-
mal choice is not so obvious. Complicating the
picture is the changing profile of the typical
geotechnician. Ten or 15 years ago, a geotechni-
cian held only a high school diploma and often
preferred to develop an expertise in one data type.
Today, many have master’s degrees and prefer a
job with greater diversity that may move them
toward a professional geoscience position.
Whichever way Unocal moves, Green says, the
company is addressing a change in data culture.
Geotechnicians today are increasingly involved in
broader matters of data management, including
the operation of networks and interapplication
links. “We need both technicians and geoscien-
tists who understand what to do with a relational
data base,” Green says, “who understand the
potential of interleaving different kinds of data.
The future belongs to relational thinkers and that’s
what we need to cultivate.”
between the project and application levels, and when a team of geoscientists arrives at an interpretation and wants to hold that thought, it is held in the project data base.

The distinction between the levels is not rigid. For example, software that works at the project level may also overlap into the master and application levels. What sets apart project database systems is where they lie on the continuum of master-project-application. Two examples are the Finder system, which manages data at the project and master levels, and the GeoFrame system, which manages project data for interpretation applications.

The Finder system wears two hats. It functions as a data base, holding the main types of E&P data, and as a data access system, providing a set of programs to access and
view data (below).11 The Finder system consists of three main components: the data, interfaces to the data, and the underlying data model, which organizes the data.

Structure of the Finder system. The system lets the geoscientist work with five kinds of bread-and-butter data: well, log, seismic, lease and cultural. [Figure: applications (Cross Section, SmartMap), the Finder and Geoshare tool kits, loaders and utilities, database access via SQL and embedded SQL, tables and N-lists holding well, log, seismic, lease and cultural data (including bulk data), an X Window System user interface (MOTIF), Graphical Kernel System graphics, and a relational database management server reached over the network via TCP/IP and SQL*NET. Supported operating systems: UNIX, VMS, Sun/OS and AIX.]

The data bases reside in two places. Inside the relational data base management system are parametric data, which are single pieces of data, such as those in a log header. Usually, relational data bases like the one Finder uses (ORACLE) cannot efficiently handle vector data, also called bulk data. These are large chunks of data composed of smaller pieces retrieved in a specific order, such as a sequence of values on a porosity curve or a seismic trace. For efficiency and resource minimization, vector data are stored outside the relational structure, and references to them are contained in the data base management system. Data bases are
typically set up for the most common types of data (well, log, seismic, lease and cultural, such as political boundaries and coastlines) but other data types and applications can be accessed. The Finder system also includes tools for database administrators to create new projects, data types and data base instances, and to determine security. Users can move data between the master and project levels.
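The parametric/bulk split described above can be sketched directly: single header values live in relational tables, while a bulk curve is written outside the data base and only its reference is stored. SQLite and the file layout here are stand-ins, not Finder's actual storage scheme.

```python
import os
import sqlite3
import struct
import tempfile

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE log_header (well TEXT, curve TEXT, bulk_path TEXT)")

# Bulk data: an ordered vector of curve values, stored outside the RDBMS.
porosity_curve = [0.21, 0.19, 0.24, 0.18]
path = os.path.join(tempfile.mkdtemp(), "W1_porosity.bin")
with open(path, "wb") as f:
    f.write(struct.pack(f"{len(porosity_curve)}d", *porosity_curve))

# Parametric data: the reference to the bulk file goes into the table.
db.execute("INSERT INTO log_header VALUES (?, ?, ?)", ("W1", "porosity", path))

# Retrieval: look up the reference, then read the vector back in order.
bulk_path, = db.execute(
    "SELECT bulk_path FROM log_header WHERE well = 'W1'").fetchone()
with open(bulk_path, "rb") as f:
    data = f.read()
restored = list(struct.unpack(f"{len(data) // 8}d", data))
```

The design choice mirrors the text: the relational engine handles what it is good at (small, queryable values) and large ordered vectors bypass it.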
The component most visible to the user is the interface with the data base and its content. There are several types of interfaces. The most common is an interactive map or cross section (next page). Another common interface is the ORACLE form. A typical form looks like a screen version of the paper form used to order data from a conventional library (page 44, top). In the future, forms will also be multimedia, merging text, graphics and maps. The user also interacts with the data base through unloaders (utilities that convert data from an internal representation to external, standard formats) and through external applications, like a Lotus 1-2-3 spreadsheet.

A more powerful interface is Standard Query Language (SQL, often spoken "sequel"), which is easier to use than a programming language but more difficult than spreadsheet or word processing software.12 The Finder system includes at least 40 common SQL queries, but others that will be used often can be written and added to a menu for quick access, much like a macro in PC programming. Often, a database administrator is the SQL expert and works with the geoscientist to write frequently used SQL queries. A point-and-click interactive tool for building SQL queries is also available for users with little knowledge of SQL.

The less visible part of the Finder system, but its very foundation, is the data model. Today, the Finder data model is based on the second and third versions of a model called the Public Petroleum Data Model (PPDM), which was developed by a nonprofit association of the same name.13 To satisfy new functionalities, the Finder model has evolved beyond the PPDM in some areas, notably in handling seismic and stratigraphic data. The evolutionary path is moving the Finder model toward the Epicentre/PPDM model, which is a merger of the PPDM model with Epicentre, a model developed by another industry consortium, the Petrotechnical Open Software Corporation (POSC).

In the last two years, the Finder system has gone through several cycles of reengineering, making it more robust, more flexible and easier to use. Updates are released twice a year. Since 1993, for example, about 50 new functions have been added. Four changes most notable to the user are:
• easier user interface. Full compliance with MOTIF, a windowing system, speeds access to data, and makes cutting and pasting easier.

The five kinds of data in the Finder system:
1. Well data are an encyclopedic description of the well. The 50-plus general parameters include well name, location, owner, casing and perforation points, well test results and production history.
2. Logs are represented either as curves or in tabular form. Multiple windows can be displayed for crosswell correlations. Log data are used to select tops of formations and are then combined with seismic "picks" to map horizons, one of the main goals in mapping.
3. Seismic includes the names of survey lines, who shot them, who processed the data, how it was shot (influencing the "fold") and seismic navigation data. Also listed are "picks," the time in milliseconds to a horizon, used to draw a subsurface contour map through picks of equal value.
4. Lease data include lease expiration, lessee and lessor, and boundaries.
5. Cultural data include coastlines, rivers, lakes and political boundaries.

Two interfaces of the Finder system. The most popular interfaces are graphical. Two examples are the SmartMap software (top) and the Cross Section module (bottom). SmartMap software positions information from the data base on a map, and allows selection of information from the data base by location or other attributes. The user can tap into the data base by point-and-click selection of one or more mapped objects, such as a seismic line or a well. The Cross Section module is used to post in a cross-section format depth-related data, such as lithology type, perforations and, shown here, formation tops. Wells are selected graphically or by SQL query, and can be picked from the base map.
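Stored SQL queries that are added to a menu behave much like macros: a database administrator registers a parameterized query once, and users invoke it by name. A minimal sketch of the idea, using SQLite in place of the commercial data base; the table, query names and `register`/`run` functions are invented, not Finder's actual interface.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE well (name TEXT, td_m REAL, basin TEXT);
    INSERT INTO well VALUES ('W1', 3200, 'North Sea'), ('W2', 1500, 'Gulf Coast');
""")

saved_queries = {}  # menu name -> parameterized SQL written once by the DBA

def register(name: str, sql: str):
    saved_queries[name] = sql

def run(name: str, *params):
    """A user runs a saved query by name, supplying only the parameters."""
    return db.execute(saved_queries[name], params).fetchall()

register("deep_wells_in_basin",
         "SELECT name FROM well WHERE td_m > ? AND basin = ? ORDER BY name")
result = run("deep_wells_in_basin", 2000, "North Sea")
```

The geoscientist never sees the SQL, only the menu entry and its parameters, which is the division of labor the article describes.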
11. For an accessible review of data bases and database management systems: Korth HF and Silberschatz A: Database System Concepts. New York, New York, USA: McGraw-Hill, Inc., 1986.
12. This language, developed by IBM in the 1970s, is for searching and managing data in a relational data base. It is the database programming language that is the most like English. It was formerly called "Structured Query Language," but the name change reflects its status as a de facto standard.
13. PPDM Association: "PPDM A Basis for Integration," paper SPE 24423, presented at the Seventh SPE Petroleum Computer Conference, Houston, Texas, USA, July 19-22, 1992.
An ORACLE form for a scout ticket. The 80 types of forms preprogrammed into the Finder system do more than display data. They include small programs that perform various functions, such as query a data base and check that the query is recognizable, that is, that the query is for something contained in the data base. The programs also check that the query is plausible, for instance, that a formation top isn't at 90,000 feet [27,400 m]. Forms may perform some calculation. The well form, for example, contains four commonly used algorithms for the automated calculation of measured and true vertical depth. Forms can be accessed by typing a query or clicking on a screen icon. A form showing all parameters available for a well, for example, can be accessed through the base map by clicking on a symbol for a well.
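The plausibility checks built into the forms amount to range validation on each field. A sketch of the idea; the limits and field names are illustrative assumptions, not the system's actual rules.

```python
# Hypothetical plausible ranges for form fields.
PLAUSIBLE_RANGES = {
    "formation_top_ft": (0.0, 50000.0),  # so a top at 90,000 ft is rejected
    "porosity": (0.0, 0.60),
}

def validate_field(field: str, value: float) -> bool:
    """Return True if the value lies inside the plausible range for the field."""
    low, high = PLAUSIBLE_RANGES[field]
    return low <= value <= high

# A real top from the scout ticket passes; an implausible one does not.
assert validate_field("formation_top_ft", 1052.0)
assert not validate_field("formation_top_ft", 90000.0)
```

Catching such errors at entry time is far cheaper than finding them later in a master data base.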
The GeoFrame reservoir characterization system. The GeoFrame system is a suite of modular applications that are tightly linked via a common data base and a package of software that handles intertask communication. This software automatically makes any change in the data base visible to each application. It also allows the user to communicate directly between applications. For example, a cursor in a cross-section view will highlight the well in the map view.
[Screen capture: a Finder scout-ticket form for the well HUSKY ET AL CECIL 16-20-84-8, showing operator, location, license and status dates, depths and elevations, and tabulated formation tops, logs, cores, completions, casings, production zones and drillstem tests.]
[Figure: GeoFrame modules linked by intertask communication and project data management: petrophysical interpretation, log editing and marker interpretation, geological interpretation, geophysical interpretation, 3D visualization and interpretation, engineering model building and mapping. Some modules are available by year end; others arrive in the next major release.]
Pair-wise versus data-bus linkage of applications. Pair-wise links: six applications require 15 links, and an additional application means six new links. If vendor A's program is modified, five links may need modification, and it is unclear who maintains each link: vendor A, vendor B or the customer. Geoshare data bus: six applications require only six links, and an additional application requires only one new link. If vendor A's program is modified, only A's half-links need modification, and each vendor maintains its own half-links.
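The arithmetic behind the comparison generalizes: pair-wise integration needs one link per pair of applications, while a bus needs one connection per application. A short sketch of the counts:

```python
def pairwise_links(n: int) -> int:
    """Every pair of n applications needs its own link: n*(n-1)/2."""
    return n * (n - 1) // 2

def bus_links(n: int) -> int:
    """With a data bus, each application needs only one connection."""
    return n

# Six applications: 15 pair-wise links but only six bus connections.
# A seventh application adds six new pair-wise links, but only one bus link.
```

The quadratic growth of pair-wise links, against linear growth for the bus, is why link maintenance "becomes unwieldy and expensive" as platforms multiply.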
• data model expansion. The data model now accommodates stratigraphic and production data and improves handling of seismic data.
• mapping and cross-section enhancements. As the data model expands, graphical capabilities expand to view new kinds of data. Enhancements have been made to cross-section viewing, and additions include 3D seismic overlay and bubble mapping. A bubble is a circle on a map that can graphically display any data in the data base. A circle over a well, for example, can be a pie chart representing recoveries from a drillstem test.
• enhanced data entry. All data loaders have the same look and feel. Loaders have been added that handle data in Log Interpretation Standard (LIS), generalized American Standard Code for Information Interchange (ASCII) and other formats.
Unlike the Finder system, which works mainly on the master or project level, the GeoFrame reservoir characterization system works partly on the project level but mainly on the application level. The GeoFrame system comprises control, interpretation and analysis software for a range of E&P disciplines. The system is based on the public domain Schlumberger data model, which is similar to the first version of Epicentre and will evolve to become compliant with the second version. Applications today cover single-well analysis for a range of topics: petrophysics, log editing, data management, well engineering and geologic interpretation (previous page, bottom). Additions expected by the end of the year include mapping with grids and contours, model building and 3D visualization of geologic data.

GeoFrame applications are tightly integrated with a single data base, meaning, most importantly, that there is only one instance of data. From the user's perspective, tight integration means an update made in one application can become automatically available in all applications. (Loose integration means data may reside in several places. The user must therefore manually ensure that each instance is changed during an update.) Project data management is performed within GeoFrame applications, but it may also be performed by the Finder system, which is linked today to the GeoFrame system via the Geoshare standard (see "Petrophysical Interpretation," page 13).
Razing the Tower of Babel—Toward Computing Standards

Moving data around the master, project and application levels is complicated by proprietary software. Proprietary systems often provide tight integration and therefore fast, efficient data handling on a single vendor's platform. But they impede sharing of data by applications from different vendors, and movement of data between applications and data bases. Nevertheless, many proprietary systems are entrenched for the near term and users need to prolong the usefulness of these investments by finding a way to move data between them.

A common solution is called pair-wise integration: the writing of a software link between two platforms that share data. This solution is satisfactory as long as platform software remains relatively unchanged and the number of linked platforms remains small. Unocal, for example, maintains five such links at its Energy Resources Division in Sugar Land, Texas (see "Case Studies," page 38). A revision to one of the platforms, however, often requires a revision in the linking software. With more software updates and more platforms, maintenance of linking programs becomes unwieldy and expensive.

A solution that sidesteps this problem is called a data bus, which allows connection of disparate systems without reformatting of data, or writing and maintaining many linking programs (above). A data bus that has
received widespread attention is the Geoshare standard, which was developed jointly by GeoQuest and Schlumberger and is now owned by an independent industry consortium of more than 50 organizations, including oil companies, software vendors and service companies. The Geoshare system consists of standards for data content and encoding, that is, the kind of information it can hold and the names used for that information. It also includes programs called half-links for moving data to and from the bus (right). For data exchange, which defines what data are called and the sequence in which they are stored, Geoshare uses an industry standard: API RP66 (American Petroleum Institute Recommended Practice 66). The Geoshare data model is evolving to become compatible with the Epicentre data model and its data exchange format, which is a "muscled" version of the Geoshare standard.

Half-links are programs that translate data into and out of the Geoshare exchange format. There are sending and receiving half-links. The sending half-link translates data from the application format into a format defined by the Geoshare data model, and the receiving half-link translates data from the Geoshare format into a format readable by the receiving application. Schlumberger has written half-links for all its products, and many half-links are now being developed by other vendors and users. As of July this year, 20 half-links were commercially available. This number may double by year end.

There are significant differences between sending and receiving half-links, and in the effort required to write them. The sending half-link usually has to just map the application data to the Geoshare data model. This is because the Geoshare data model is
How Geoshare linkage works. Loosely integrated applications (right) are linked with each other, with tightly linked applications (left), and with a project or master data base through sending and receiving half-links on the Geoshare data bus (data exchange standard and data model). GeoFrame applications are tightly integrated, like the system on the top left. Major Geoshare data model categories include seismic survey, wavelet, velocity data, fault trace set, lithostratigraphy code list, map, land net list, zone list, field, surface set and drilling facility.
What is Object-Oriented—Really?
The term “object-oriented” has become increas-
ingly misunderstood as its usage expands to
cover a broad range of computer capabilities. The
meaning of object-oriented changes with the
noun it precedes—object-oriented interface,
graphics and programming. What makes them all
object-oriented is that they deal with
“objects”—usually a cluster of data and code,
treated as a discrete entity. The lumping together
of data is called encapsulation.
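Encapsulation is easy to show in miniature: one object carries both its data and the code that operates on that data. The class and its fields below are invented for this example, not drawn from any product described in the article.

```python
class DrillBit:
    """An object: data (diameter, hours run) plus behavior (worn)."""

    def __init__(self, diameter_in: float, hours_run: float):
        self.diameter_in = diameter_in
        self.hours_run = hours_run

    def worn(self, limit_hours: float = 100.0) -> bool:
        # The behavior travels with the data it operates on.
        return self.hours_run > limit_hours

bit = DrillBit(diameter_in=8.5, hours_run=120.0)
```

A caller asks the object a question (`bit.worn()`) rather than reaching into its data, which is the discrete-entity treatment the sidebar describes.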
An object-oriented interface is the simplest
example. It is a user interface in which elements
of the system are represented by screen icons,
such as on the Macintosh desktop. Applications,
documents and disk drives are represented by
and manipulated through visual metaphors for
the thing itself. Files are contained in folders that
look like paper folders with tabs. Files to be
deleted are put in the trash, which is itself an
object—a trash can or dust bin. Object-oriented
displays do not necessarily mean there is object-
oriented programming behind them.
The object-oriented interface to data, at a
higher level, involves the concept of retrieving
information according to one or more classes of
things to which it belongs. For example, informa-
tion about a drill bit could be accessed by query-
ing “drill bit” or “well equipment” or “parts
inventory.” The user doesn’t need to know where
“drill bit” information is stored to retrieve it. This
is a breakthrough over relational database
queries, where one must know where data are
stored to access them.
When software developers talk to geoscientists
about the power of an “object-oriented program,”
they are often referring to object-oriented graph-
ics, which concerns the construction and manipu-
lation of graphic elements commonly used in map-
ping, such as points, lines, curves and surfaces.
Object-oriented graphics describes an image com-
ponent by a mathematical formula. This contrasts
with bit-mapped graphics, in which image compo-
nents are mapped to the screen dot by dot, not as
a single unit. In object-oriented graphics, every
graphic element can be defined and stored as a
separate object, defined by a series of end points.
Because a formula defines the limits of the object,
Suppose you want to send a seismic interpretation from a Charisma workstation to the Finder system. Both the Finder and Charisma systems recognize an entity called “well location.” In the Charisma data model, well location is defined by X and Y coordinates and by depth in one of several units, such as decimeters. The Finder data model also defines well location by X and Y coordinates, but it does not yet recognize decimeters. The Finder model understands depth as time (seconds or milliseconds), inches, feet or meters (above). The Geoshare data model can understand all units of depth, so the Charisma sending half-link has only to send decimeters to the Geoshare bus. The Finder receiving half-link, however, must recognize that depth can be quantified six ways: feet, inches, meters, decimeters, seconds and milliseconds. In this case, the Finder receiving half-link translates the decimeters into a unit the Finder system understands.
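The receiving-side unit translation amounts to a lookup. The sketch below is a hypothetical illustration, not Finder code; time units are rejected rather than converted, since turning seconds into a length would require a velocity model.

```python
# Toy receiving-side depth translation (illustrative, not the Finder half-link):
# any length unit in the Geoshare list is converted to meters; time-domain
# depths are flagged, because converting them needs a velocity model.

TO_METERS = {"meters": 1.0, "decimeters": 0.1, "feet": 0.3048, "inches": 0.0254}
TIME_UNITS = {"seconds", "milliseconds"}

def receive_depth(value, unit):
    if unit in TIME_UNITS:
        raise ValueError("time-domain depth requires a velocity model")
    if unit not in TO_METERS:
        raise ValueError(f"not a Geoshare depth unit: {unit}")
    return value * TO_METERS[unit]

# A Charisma depth sent in decimeters arrives in meters.
print(receive_depth(12340.0, "decimeters"))
```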
Writing a receiving half-link typically takes six to nine months, an order of magnitude longer than writing a sending half-link. The main difficulty is that a receiving half-link acts like a large funnel: it translates data from Geoshare’s broad data model into the application’s focused data model, so it must consider the many possible ways data can be represented in the Geoshare model. The ideal receiving half-link would look at every way a fact could be represented, and translate each possibility into the variant used by the receiving application. For example, many interpretation systems use a variant of a grid structure to represent surfaces. Such surfaces may represent seismic reflectors or the thickness of zones of interest. Geoshare provides three major representations for such surfaces: grids, contours and X, Y and Z coordinates. If the receiver looks only at incoming grids, the other two kinds of surface information may be lost. The policy taken by a receiver should be to look for any of these possibilities, and translate each into the internal representation used by the application.
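The funnel policy can be sketched as a dispatcher that accepts any of the three Geoshare surface representations and normalizes each to the one form a hypothetical application stores internally. The dictionary layouts here are invented for illustration, not Geoshare's actual encoding.

```python
# "Funnel" sketch: accept a surface as a grid, as contours, or as scattered
# x/y/z points, and normalize every case to a list of (x, y, z) triples.
# The structures are illustrative stand-ins for the Geoshare representations.

def to_points(surface):
    kind = surface["kind"]
    if kind == "points":               # already x, y, z triples
        return list(surface["data"])
    if kind == "contours":             # each contour line carries one z value
        return [(x, y, z) for z, line in surface["data"] for x, y in line]
    if kind == "grid":                 # regular grid: origin plus spacing
        x0, y0, dx, dy = surface["origin_spacing"]
        return [(x0 + j * dx, y0 + i * dy, z)
                for i, row in enumerate(surface["data"])
                for j, z in enumerate(row)]
    raise ValueError(f"unknown surface representation: {kind}")

grid = {"kind": "grid", "origin_spacing": (0.0, 0.0, 50.0, 50.0),
        "data": [[1200.0, 1210.0], [1195.0, 1205.0]]}
print(len(to_points(grid)))  # 4 points from a 2 x 2 grid
```

A receiver built this way loses none of the three kinds of incoming surface information, which is the policy the text recommends.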
July 1994
[Figure: The Problem and The Solution. The Charisma system describes a well location by x, y and depth in decimeters; the Geoshare model carries x, y and depth in decimeters, meters, feet, inches, seconds or milliseconds; the Finder system understands x, y, meters, feet, seconds and milliseconds. The sending half-link passes the Charisma decimeters to Geoshare, and the Finder receiving half-link converts them to meters.]
it can be manipulated—moved, reduced, enlarged,
rotated—without distortion or loss of resolution.
These are powerful capabilities in analyzing repre-
sentations of the earth and mapped interpretations.
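The contrast with bit maps can be shown directly: a line stored as end points is transformed by formula, so rotating it and rotating it back recovers the original to floating-point precision. This is a minimal illustration, not any mapping package's API.

```python
# Object-oriented graphics in miniature: a line is just two end points, and a
# rotation is applied by formula to those points alone. No pixels are
# resampled, so the shape survives repeated transformation undistorted.

import math

def rotate(points, angle_deg):
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [(x * c - y * s, x * s + y * c) for x, y in points]

line = [(0.0, 0.0), (10.0, 0.0)]
turned = rotate(line, 37.0)        # still just two end points
restored = rotate(turned, -37.0)   # undo the rotation
# restored matches the original to within floating-point rounding
print(all(abs(a - b) < 1e-9
          for p, q in zip(line, restored) for a, b in zip(p, q)))
```

A bit-mapped line rotated by 37 degrees and back would have been resampled twice and would no longer match its original pixels.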
“Object-oriented” to a programmer or software
developer tends to mean object-oriented program-
ming (OOP). It does not necessarily imply new func-
tionalities or a magic bullet. Its most significant
contribution is faster software development, since
large chunks of existing code can often be reused
with little modification. Conventional sets of
instructions that take eight weeks to develop may
take only six weeks when done in OOP. It is analo-
gous to modular construction in housing, in which
stairs, dormers and roofs are prefabricated, cut-
ting construction time in half.
The next wave in data bases themselves is a
shift toward object-oriented programming, in
which the user can define classes of related ele-
ments to be accessed as units. This saves having
to assemble the pieces from separate relational
tables each time, which can cut access time.
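The difference can be sketched with two toy stores; the tables and field names below are illustrative, not a real schema.

```python
# Relational vs. object-style access, in miniature: the relational store must
# assemble a "well" from separate tables on every access, while the object
# store keeps the related pieces together as one unit.

wells = {1: {"name": "A-1"}}                      # relational: two tables
logs = [{"well_id": 1, "curve": "sonic"},
        {"well_id": 1, "curve": "resistivity"}]

def relational_fetch(well_id):
    """Assemble the pieces with a join-like scan on each access."""
    well = dict(wells[well_id])
    well["logs"] = [l["curve"] for l in logs if l["well_id"] == well_id]
    return well

object_store = {1: {"name": "A-1", "logs": ["sonic", "resistivity"]}}

def object_fetch(well_id):
    """The whole unit is stored together; no assembly step."""
    return object_store[well_id]

print(relational_fetch(1) == object_fetch(1))  # True: same result, no join
```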
POSC’s Epicentre model uses essentially rela-
tional technology but adds three object-oriented
concepts:
• unique identifiers for real-world objects
• complex data types to reflect business usage
• class hierarchy for the drill bit example above.
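The three concepts can be illustrated together in plain Python rather than a database schema language; the classes and the unit-carrying `Measurement` type are hypothetical examples, not Epicentre definitions.

```python
# The three object-oriented concepts the list names, in miniature:
# a unique identifier per real-world object, a complex data type that keeps
# value and unit together, and a small class hierarchy for the drill bit.

import uuid
from dataclasses import dataclass, field

@dataclass
class Measurement:          # complex data type: value and unit travel together
    value: float
    unit: str

@dataclass
class WellEquipment:        # root of the class hierarchy
    uid: str = field(default_factory=lambda: uuid.uuid4().hex)

@dataclass
class DrillBit(WellEquipment):
    diameter: Measurement = field(default_factory=lambda: Measurement(8.5, "in"))

bit = DrillBit()
print(bit.uid != DrillBit().uid)      # True: each object gets its own identifier
print(isinstance(bit, WellEquipment)) # True: the bit belongs to the hierarchy
```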
Further reading:
Tryhall S, Greenlee JN and Martin D: “Object-Oriented Data Management,” paper SPE 24282, presented at the SPE European Petroleum Computer Conference, Stavanger, Norway, May 25-27, 1992.
Zdonik SB and Maier D (eds): “Fundamentals of Object-Oriented Databases,” in Readings in Object-Oriented Database Systems. San Mateo, California, USA: Morgan Kaufmann Publishers, Inc., 1990.
Nierstrasz O: “A Survey of Object-Oriented Concepts,” in Kim W and Lochovsky FH (eds): Object-Oriented Concepts, Databases, and Applications. New York, New York, USA: ACM Press, 1989.
Williams R and Cummings S: Jargon: An Informal Dictionary of Computer Terms. Berkeley, California, USA: Peachpit Press, 1993.
Aronstam PS, Middleton JP and Theriot JC: “Use of an Active Object-Oriented Process to Isolate Workstation Applications from Data Models and Underlying Databases,” paper SPE 24428, presented at the Seventh SPE Petroleum Computer Conference, Houston, Texas, USA, July 19-22, 1992.
Sending and receiving Geoshare half-links. Conversion of data (here, units of depth) to a form understandable by the receiving application is the job of the receiving half-link between Geoshare and the Finder system. A well location can be a point with reference to one of several features, such as the kelly bushing, a formation top or a seismic surface.
The Geoshare standard can link two programs if they can both talk about the same data, even if not in the same vocabulary, such as the Finder and Charisma systems talking about “well location.” But two programs have a harder time communicating if they don’t understand the same concept. By analogy, a Bantu of equatorial and southern Africa could not talk with a Lapp about snow, since the Bantu have no word for snow. In the E&P world, this problem of a missing concept happens about a third of the time when linking applications through the Geoshare standard. But there is a way around it. Look again at an example.
The Finder system and the IES Integrated Exploration System, an interpretation system, both have conventions for designating wells. For the IES system, a well has a number, which can be determined in various ways, and a nonunique name. For the Finder system, a well is known by a specific kind of number, its API number. The Geoshare data model accepts all conventions. When a formation top is picked in the IES system, the top is labeled by well name and number. If it were shipped that way to the Finder system through Geoshare, the Finder system would not recognize the well because it is looking for an API number. The solution is to add a slot in the IES data model for the API number. The IES system itself does not use this slot, but when it sends data to the Finder system, the IES sending half-link adds the API number to the message, enabling the Finder system to recognize the well.
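The workaround can be sketched as a sending half-link that fills the extra slot before the message crosses the Geoshare bus. Every name, number and lookup table here is invented for illustration.

```python
# Sketch of the extra-slot workaround (all identifiers hypothetical): the
# sending half-link looks up an API number the sender itself never uses and
# adds it to the message, so a receiver that keys wells on API number matches.

ies_to_api = {("SMITH-1", 7): "42-501-20130"}  # illustrative lookup table

def sending_half_link(well_name, well_number, payload):
    message = {"name": well_name, "number": well_number, **payload}
    api = ies_to_api.get((well_name, well_number))
    if api is not None:
        message["api_number"] = api    # slot unused by the sender itself
    return message

msg = sending_half_link("SMITH-1", 7, {"formation_top_m": 2950.0})
print(msg["api_number"])
```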
While the Geoshare standard is designed to function as a universal translator, the E&P community is working toward a common data model that will eventually eliminate the translator, or at least reduce the need for one. Recognition of the need for a common data model and related standards arose in the 1980s as technical and economic forces changed geoscience computing.14 The rise of distributed computing made interdisciplinary integration possible, and flat oil prices shrank the talent pool within oil companies, resulting in smaller, interdisciplinary work groups focused on E&P issues rather than software development.
When work on information standards accelerated in the late 1980s, much of the industry support went to a nonprofit consortium, the Houston-based Petrotechnical Open Software Corp. (POSC). An independent effort in Calgary, Alberta, Canada had been established by another nonprofit consortium, the Public Petroleum Data Model Association. Although the PPDM effort was on a smaller scale, in 1990 it was first to release a usable E&P database schema, which has since been widely adopted in the E&P community.15
At its inception, the PPDM was generally regarded outside Canada as a local, Calgary solution for migrating from a mainframe to a client/server architecture. The PPDM is a physical implementation, meaning it defines exactly the structure of tables in a relational data base. The effort at POSC is to develop a higher level logical data model, concentrating less on specifics of table structure and more on defining rules for standard interfacing of applications.16 By analogy, the PPDM is used as blueprints for a specific house, whereas POSC attempts to provide a set of general building codes for erecting any kind of building.
Since 1991, widespread use and maturation of the PPDM (the data model is now in version 3) has increased its acceptance. This has changed POSC’s strategy. Today, POSC is preparing two expressions of its Epicentre high-level logical data model, one that is technology-independent, implementable in relational or object-oriented databases (see “What is Object-Oriented—Really?” page 46), and another to address immediate usage in today’s relational database technology. The logical model is called the Epicentre hybrid because it adds some object-oriented concepts to a relational structure, although it
Oilfield Review
[Timeline: Milestones in POSC efforts, 1990 through 1995. POSC established as a consortium funded by 35 companies; request for technology (RFT) for a data model; request for comments (RFC) for data access; publication of Base Computer Standards (BCS) version 1.0; RFT for a data access protocol; published complete Epicentre “root model” snapshot; published a series of snapshots of Epicentre, Data Access & Exchange and Exchange Format; published Conformance Statement Template snapshot; RFC for the User Interface Style Guide; published complete series of final specifications version 1.0; published Epicentre, Data Access & Exchange, Exchange Format and User Interface Style Guide (Prentice-Hall); published BCS version 2.0 snapshot; obtained PPDM agreement to proceed with merger; started Internet information server and made project proposals and reviews more democratic; member organizations number 85; helped start more than 20 migrations to BCS; ShareFest, an exhibition of POSC-compliant products sharing data from oil companies; will publish BCS version 2.0; will complete the PPDM merger and publish Epicentre version 2.0; will publish Data Access & Exchange version 1.1. (A snapshot is an interim version, like an alpha or beta test in engineering.)]
Data Model Expressions of the Logical Model

Epicentre Logical Model and Data Access and Exchange Format: an information model based on data requirements and data flow characteristics of the E&P industry. Two expressions of the logical model:

Epicentre Hybrid
• Implemented with a relational or a partially or fully object-oriented database management system (DBMS)
• Accessed via a POSC-specified application programming interface
• User accesses a physical data base that is not visibly different from the logical model
• A single data model for applications that have not yet been able to effectively use a relational DBMS, such as reservoir simulators and geographic information systems
• Opens a market for vendors of object-oriented DBMS and an evolutionary path for relational DBMS

Epicentre/PPDM
• Implemented with a relational DBMS and operating system or a DBMS bulk storage mechanism for large data sets
• Accessed via a vendor-supported interface and a POSC-specified application programming interface for complex data types
• User accesses a physical data base that is equivalent to the logical model or visibly different from it
• Provides a set of highly similar relational data models that can be implemented with current database management system software
• Provides a market for vendor products that meet the specifications of the logical model
leans heavily to the relational side. The first version of the hybrid was released in July 1993. The relational solution, to be released this summer, is POSC’s merger of the hybrid and PPDM models, and is called informally the Epicentre/PPDM model. This model allows the few object-oriented features of the hybrid model to be used by relational software (above). The merged model is being tested by a task force comprising PPDM, POSC and several oil companies.
Where is POSC Going?
After two years of developing standards, last summer POSC released version 1.0 of its Epicentre data model. Since then, Epicentre has undergone review, validation, convergence with the PPDM, and use by developers in pilot studies and in commercial products. Release of Epicentre 2.0, which POSC considers an industrial-strength data model, is due this autumn (see the milestones timeline above). Today the Epicentre hybrid model as a whole resides in a handful of database and commercial application products. Vendors have written parts of the specifications into prototype software, which is increasing in number, ambition and stability.
POSC has learned hard lessons about building a data model. “There are a lot of very good local solutions,” says Bruno Karcher, director of operations for POSC, “that when screwed together make one big, bad general solution. What we have learned is how to take those local solutions, which are flavored by local needs and specific business disciplines, and find a middle ground. We know that the Epicentre logical model is not a one-shot magic bullet. It is guaranteed to evolve.”
A key issue POSC is grappling with is the definition of “POSC-compliance,” which today remains unspecified. From the user’s perspective, “POSC-compliant” equipment should have interoperability: seamless interaction between software from different vendors and data stores. The simplest defini-
14. For a later paper summarizing the industry needs: Schwager RE: “Petroleum Computing in the 1990’s: The Case for Industry Standards,” paper SPE 20357, presented at the Fifth SPE Petroleum Computer Conference, Denver, Colorado, USA, June 25-28, 1990.
15. A schema is a description of a data base used by the database management system. It provides information about the form and location of attributes, which are components of a record. For example, sonic, resistivity and dipmeter might be attributes of each record in a log data base.
tion of compliance means “applications can access data from POSC data stores,” but the E&P community is still a long way from consensus on how this can be achieved practically. POSC has started building a migration path with grades of compliance. In these early stages, compliance itself may mean commitment to move toward POSC specifications over a given period.
A consequence of having all data fit the POSC model is disappearance of the application data store as it exists today: a proprietary data store, accessible only by the host application. For this to happen means rethinking the three levels (master, project and application) and blurring or removing the boundaries between them. Ultimately, there may be a new kind of master data
16. Chalon P, Karcher B and Allen CN: “An Innovative Data Modeling Methodology Applied to Petroleum Engineering Problems,” paper SPE 26236, presented at the Eighth SPE Petroleum Computer Conference, New Orleans, Louisiana, USA, July 11-14, 1993.
Karcher B and Aydelotte SR: “A Standard Integrated Data Model for the Petroleum Industry,” paper SPE 24421, presented at the Seventh SPE Petroleum Computer Conference, Houston, Texas, USA, July 19-22, 1992.
Karcher B: “POSC: The 1993 Milestone,” Houston, Texas, USA: POSC Document TR-2 (September 1993).
Karcher B: “Effort to Develop Standards for E&P Software Advances,” American Oil & Gas Reporter 36, no. 7 (July 15, 1993): 42-46.
base that will allow the user to zoom in and out of the data, between what is now the master and application levels. There will be fewer sets of data, therefore less redundancy and less physical isolation. At the conceptual level, applications that manipulate data will no longer affect the data structure. Applications and data will be decoupled.
“The problems of multiple data versions and updating may disappear,” says Karcher, “but we will be looking forward to new problems, and as-yet unknown issues of data administration, organization and security. I think even when standards are implemented in most commercial products, POSC will continue to play a role in bringing together a body of experience to solve those new problems.” —JMK