Towards OWL-based Knowledge Representation in Petrology

A. Shkotin, V. Ryakhovsky, D. Kudryavtsev

GIS Department, Vernadsky State Geological Museum, Russian Academy of Sciences

www.sgm.ru

[email protected]

Upload: alex-shkotin

Post on 15-Jul-2015



2

Contents

Introduction

Fact formalization

Dictionary formalization

Formal definitions

Conclusions and further plans

Acknowledgments

References

3

Introduction

Petrology is the science that investigates rocks and the conditions of their formation. A large volume of petrological information has now accumulated, and it needs to be systematized, integrated and maintained in a consistent state.

These tasks can be solved through knowledge formalization.

4

Knowledge Formalization

Final goal: Create a formal theory for the key concepts of petrology and relationships among them.

Ontologies, that is, conceptual structures organized on the basis of mathematical logic, play a pivotal role in creating such a theory.

Definitions play a decisive role in a formal theory, as they specify exactly those concepts whose properties will be used and studied.

5

Formal Theory

Building a formal theory of a field of natural science, similar to a mathematical one, enables:

Revealing primary concepts

Giving definitions to other concepts

Stating axioms and theorems

6

Fact Formalization

7

The Approach

Databases are not knowledge.

They require substantial and thorough processing before knowledge can be obtained from them.

Converting a DB to the traditional form of knowledge, i.e. knowledge in a natural language, is the direct way of obtaining knowledge from data.

The natural language is restricted to a controlled natural language (CNL). A CNL is a universal means of presenting formal knowledge.

8

9

CNL Sentences. The Approach

Create templates of CNL sentences to express all the facts contained in the Proba DB.

Use local (‘internal’) proper names.

Connect the words of composite terms with the ‘_’ character.

Globally known proper names such as Iceland or Atlantic_Ocean may also appear in the text.

10

CNL Sentences. Example

PUB5633 is a publication.
PUB5633 title is "A CONTRIBUTION TO THE GEOLOGY OF THE K...".
SAM32994 is a sample.
SAM32994 is a rhyolite.
PLC32994 is a place.
PLC32994 is a part of Iceland.
SUB469812 is a substance.
WPC469812 is a weight_percent.
WPC469812 value is 73.95.

PUB5633 describes SAM32994.
SAM32994 gathering_place is PLC32994.
SAM32994 includes SUB469812.
SUB469812 is a WPC469812 component.
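The template approach can be illustrated with a minimal Python sketch. It is not the authors' converter: the function names, the argument layout and the prefix scheme (PUB, SAM, PLC followed by a database key) are assumptions inferred from the example sentences, and the real Proba DB schema is not shown in the slides.

```python
# Illustrative sketch only: the record layout and helper names are hypothetical.

def local_name(prefix, key):
    """Build a local ('internal') proper name such as SAM32994."""
    return f"{prefix}{key}"


def compound(term):
    """Join the words of a composite term with '_' (Atlantic Ocean -> Atlantic_Ocean)."""
    return "_".join(term.split())


def cnl_sentences(pub_key, title, sample_key, rock, place_key, region):
    """Fill sentence templates for one publication, one sample and its gathering place."""
    pub = local_name("PUB", pub_key)
    sam = local_name("SAM", sample_key)
    plc = local_name("PLC", place_key)
    return [
        f"{pub} is a publication.",
        f'{pub} title is "{title}".',
        f"{sam} is a sample.",
        f"{sam} is a {compound(rock)}.",
        f"{plc} is a place.",
        f"{plc} is a part of {compound(region)}.",
        f"{pub} describes {sam}.",
        f"{sam} gathering_place is {plc}.",
    ]


sentences = cnl_sentences(
    5633, "A CONTRIBUTION TO THE GEOLOGY OF THE K...", 32994, "rhyolite", 32994, "Iceland"
)
print("\n".join(sentences))
```

Each template yields one sentence per database fact, so the generated text stays regular enough for a CNL parser to process.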

11

OWL Ontology

All generated sentences are ACE* (Attempto Controlled English) statements.

The sentences are constructed so that the APE* (Attempto Parsing Engine) translator can translate them to OWL.

The DB is converted to 1,174 ontologies.

* Attempto Project. http://attempto.ifi.uzh.ch/site/

12

OWL Ontology #5633

Classes: place, publication, rhyolite, sample, substance, weight_percent.

Object properties: component, gathering_place, includes, mixture, part...

Individuals: Iceland, PLC32994..., PUB5633, SAM32994..., SUB469812..., WPC469812...

Data properties: authorial_number, chemical_formula, first_page, latitude, longitude, reference, title, value, year.

13

Dictionary Ontology and Definition Concentrator

14

Definition Dictionary

A definition dictionary is an important and specific type of knowledge. It contains the terms of a subject area and informal definitions of these terms.

Informal definitions are provided by experts, who usually belong to a particular scientific school.

Tasks:

Convert a definition dictionary into formal knowledge

Gather definitions given by various schools

15

Dictionary Entry Example

HARZBURGITE. An ultramafic plutonic rock composed essentially of olivine and orthopyroxene. Now defined modally in the ultramafic rock classification (Fig. 2.9, p.28). (Rosenbusch, 1887, p.269; Harzburg, Harz Mts, Lower Saxony, Germany; Tröger 732; Johannsen v.4, p.438; Tomkeieff p.247)

[IRCGT], p.88

16

From Dictionary Text to Ontology

Dictionary: “Dictionary of Terms of Igneous Rocks”. 1,567 entries, the overwhelming majority of them being rock names.

Owner: Inter-Departmental Petrographic Committee in the Geoscience Division of the Russian Academy of Sciences.

Text: http://www.igem.ru/site/petrokomitet/slovar.htm

Ontology: http://earth.jscc.ru/ontologies/dic.owl

17

Definition Concentrator

The goal is to collectively maintain definitions of scientific terms, including formal definitions.

The ontology of igneous rocks is hosted in WebProtégé under the name dic.

Some terms are complemented with definitions from other dictionaries.

Address at the Geology portal: http://earth.jscc.ru/webprotege/

18

19

Name Spaces

prefix pgc: <http://www.igem.ru/site/petrokomitet/slovar#>

prefix dic: <http://earth.jscc.ru/ontologies/dic.owl#>

prefix gwr: <http://wiki.web.ru/wiki#>

prefix pgcc: <http://www.igem.ru/site/petrokomitet/code#>
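As an illustration of how these prefixes are used, the short Python sketch below expands a prefixed name into a full IRI. The prefix table is copied from the slide; the `expand` helper and the local name `harzburgite` are assumptions made for the example and are not taken from the ontology itself.

```python
# Prefix table copied from the slide; everything else is an illustrative assumption.
PREFIXES = {
    "pgc": "http://www.igem.ru/site/petrokomitet/slovar#",
    "dic": "http://earth.jscc.ru/ontologies/dic.owl#",
    "gwr": "http://wiki.web.ru/wiki#",
    "pgcc": "http://www.igem.ru/site/petrokomitet/code#",
}


def expand(prefixed_name):
    """Turn 'prefix:local' into a full IRI using the table above."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {prefixed_name!r}")
    return PREFIXES[prefix] + local


# 'harzburgite' is a hypothetical local name used only for demonstration.
print(expand("dic:harzburgite"))
```

Keeping one namespace per source lets the same term carry definitions from several dictionaries without name clashes.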

20

Formal Term Meaning Definition

abessedite is

peridotite and mineral_mixture and contains_mineral only (olivine or hornblende or phlogopite)

The definition is written in the OWL Manchester syntax.
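To make the meaning of the `only` restriction concrete, here is a small Python sketch that checks the abessedite definition against a described rock. It is a closed-world simplification: an OWL reasoner works under the open-world assumption and would not conclude membership from a merely incomplete mineral list. The function name and the data layout are assumptions made for the example.

```python
# Closed-world illustration of:
#   peridotite and mineral_mixture and
#   contains_mineral only (olivine or hornblende or phlogopite)
ALLOWED_MINERALS = {"olivine", "hornblende", "phlogopite"}
REQUIRED_CLASSES = {"peridotite", "mineral_mixture"}


def satisfies_abessedite(classes, minerals):
    """classes: class names asserted for the rock; minerals: classes of its contained minerals."""
    has_classes = REQUIRED_CLASSES <= set(classes)
    # 'only' is a universal restriction: every contained mineral must be allowed.
    # It also holds vacuously when no minerals are listed at all.
    only_allowed = set(minerals) <= ALLOWED_MINERALS
    return has_classes and only_allowed


print(satisfies_abessedite(["peridotite", "mineral_mixture"], ["olivine", "phlogopite"]))
print(satisfies_abessedite(["peridotite", "mineral_mixture"], ["olivine", "quartz"]))
```

The first call prints `True`; the second prints `False`, because quartz is outside the allowed set.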

21

Formal Definitions

22

Primary Source

Le Maitre, R.W., ed. 2002. Igneous Rocks: A Classification and Glossary of Terms, 2nd edition, Cambridge. http://amigoreader.com/book/?b=29372

23

Building an Algorithm

The classification rules described in methodologies and recommendations have to be used to obtain precise definitions and to formalize them.

We start with a revision of all parts of the algorithm.

24

VPC Definitions

Modal content of pyroxenes

VPC_Px(x) = VPC_Opx(x)+VPC_Cpx(x)

Mineral groups for diagrams

VPC_OOC(x) = VPC_Ol(x)+VPC_Opx(x)+VPC_Cpx(x)

VPC_OPH(x) = VPC_Ol(x)+VPC_Px(x)+VPC_hornblende(x)

Modal content of mafic minerals

VPC_M(x) = 100 - (VPC_Q(x)+VPC_A(x)+VPC_P(x)+VPC_F(x))
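The derived quantities above can be computed directly from a modal analysis. The Python sketch below is one possible reading, assuming that VPC means volume percent content and that a rock is described by a dictionary of mineral name to volume percent, with missing minerals counted as 0. The dictionary layout and the sample values are hypothetical.

```python
def vpc(modal, mineral):
    """Volume percent of a mineral; absent minerals are assumed to contribute 0."""
    return modal.get(mineral, 0)


def vpc_px(modal):
    # VPC_Px = VPC_Opx + VPC_Cpx
    return vpc(modal, "Opx") + vpc(modal, "Cpx")


def vpc_ooc(modal):
    # VPC_OOC = VPC_Ol + VPC_Opx + VPC_Cpx
    return vpc(modal, "Ol") + vpc(modal, "Opx") + vpc(modal, "Cpx")


def vpc_oph(modal):
    # VPC_OPH = VPC_Ol + VPC_Px + VPC_hornblende
    return vpc(modal, "Ol") + vpc_px(modal) + vpc(modal, "hornblende")


def vpc_m(modal):
    # VPC_M = 100 - (VPC_Q + VPC_A + VPC_P + VPC_F)
    return 100 - (vpc(modal, "Q") + vpc(modal, "A") + vpc(modal, "P") + vpc(modal, "F"))


# Hypothetical modal analysis of an ultramafic rock (values in volume percent).
sample = {"Ol": 70, "Opx": 25, "Cpx": 3}
print(vpc_px(sample), vpc_ooc(sample), vpc_oph(sample), vpc_m(sample))
```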

25

Qualitative Characteristics

Predicates: pyroclastic, kimberlite, lamproite, lamprophyre, charnockite, plutonic, volcanic.

Definition: pyroclastic(x) = clastic(x) and (∀y clast(y) ⋀ part_of(y,x) → volcanic_eruption_result(y))

DL: Pyroclastic ≡ clastic ⊓ ∀(part_of ∘ id(clast))⁻.volcanic_eruption_result
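Read as a first-order condition, the pyroclastic definition says that the rock is clastic and that every clast that is part of it results from a volcanic eruption. The Python sketch below evaluates that reading over an explicit list of parts. The dictionary layout is a hypothetical stand-in for the part_of relation, and the check is closed-world, unlike OWL reasoning.

```python
def pyroclastic(rock):
    """rock: {'clastic': bool, 'parts': [{'clast': bool, 'volcanic_eruption_result': bool}, ...]}"""
    return rock["clastic"] and all(
        part["volcanic_eruption_result"]
        for part in rock["parts"]
        if part["clast"]  # the universal quantifier ranges over clasts only
    )


# Hypothetical examples.
tuff = {
    "clastic": True,
    "parts": [
        {"clast": True, "volcanic_eruption_result": True},
        {"clast": False, "volcanic_eruption_result": False},  # e.g. cement, not a clast
    ],
}
sandstone = {
    "clastic": True,
    "parts": [{"clast": True, "volcanic_eruption_result": False}],
}
print(pyroclastic(tuff), pyroclastic(sandstone))
```

Parts that are not clasts do not affect the result, which mirrors the role of id(clast) in the DL form.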

26

27

28

The Use of Reasoners

The described properties can be automatically verified by loading the definitions into a reasoner that works with linear inequalities.

Such reasoners do exist (e.g. Racer), and linear inequalities can be written using the OWL 2 extension [OWL2LE].

29

harzburgite

harzburgite(x) =
plutonic(x)
and not (pyroclastic(x) or kimberlite(x) or lamproite(x) or lamprophyre(x) or charnockite(x))
and VPC_carbonates(x) ≤ 50 and VPC_melilite(x) ≤ 10 and VPC_M(x) ≥ 90
and VPC_kalsilite(x) = 0 and VPC_leucite(x) = 0 and VPC_hornblende(x) = 0
and 0.4*VPC_OOC(x) ≤ VPC_Ol(x) ≤ 0.9*VPC_OOC(x)
and VPC_Cpx(x) < 0.05*VPC_OOC(x)
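The formula can be checked mechanically against a sample description. The Python sketch below transcribes it condition by condition. It is an illustration rather than the authors' implementation: the record layout (boolean qualitative flags plus a `vpc` dictionary of volume percentages, missing entries counted as 0 or False) and the sample values are assumptions.

```python
EXCLUDED = ("pyroclastic", "kimberlite", "lamproite", "lamprophyre", "charnockite")


def is_harzburgite(rock):
    """rock: {'plutonic': bool, ..., 'vpc': {mineral: volume percent}} (hypothetical layout)."""
    modal = rock["vpc"]

    def v(mineral):
        return modal.get(mineral, 0)

    ooc = v("Ol") + v("Opx") + v("Cpx")               # VPC_OOC
    mafic = 100 - (v("Q") + v("A") + v("P") + v("F"))  # VPC_M
    return bool(
        rock.get("plutonic", False)
        and not any(rock.get(flag, False) for flag in EXCLUDED)
        and v("carbonates") <= 50
        and v("melilite") <= 10
        and mafic >= 90
        and v("kalsilite") == 0
        and v("leucite") == 0
        and v("hornblende") == 0
        and 0.4 * ooc <= v("Ol") <= 0.9 * ooc
        and v("Cpx") < 0.05 * ooc
    )


# Hypothetical samples (volume percent).
candidate = {"plutonic": True, "vpc": {"Ol": 70, "Opx": 28, "Cpx": 2}}
olivine_rich = {"plutonic": True, "vpc": {"Ol": 95, "Opx": 5}}
print(is_harzburgite(candidate), is_harzburgite(olivine_rich))
```

The second sample fails because its olivine content exceeds 0.9*VPC_OOC, the upper bound in the formula.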

30

Conclusions and Further Plans

Conclusions:

A formula is possible

The construction of a formal theory started

Tools of formal knowledge maintenance tested

A definition concentrator prototype created

Further plans (ontology project in petrology):

Definition concentrator

Formalization of [IRCGT]

GIS formalization

CRL (Controlled Russian language)

31

Acknowledgments

We would like to thank

Dr. Stephen M. Richard from the Arizona Geological Survey, for comments on the report [otch10], a helpful discussion and a reference to [BGSRCS]

Pavel Klinov from the University of Manchester, for numerous invaluable comments

Dr. Kaarel Kaljurand from the Attempto group, for the idea of using proper names

32

References

[IRCGT] Le Maitre, R.W., ed. 2002. Igneous Rocks: A Classification and Glossary of Terms, 2nd edition, Cambridge.

[BGSRCS] Gillespie, M.R., and Styles, M.T. 1999. BGS Rock Classification Scheme, Volume 1, Classification of igneous rocks. British Geological Survey Research Report (2nd edition), RR 99–06.

[OWL2LE] OWL 2 Web Ontology Language. Data Range Extension: Linear Equations.