ontology tutorial: semantic technology for intelligence, defense and security

Dr. Barry Smith Director National Center for Ontological Research What is an Ontology and What is it Useful for? 1

Upload: barry-smith

Post on 18-May-2015





TRANSCRIPT

  • 1. Dr. Barry Smith, Director, National Center for Ontological Research: What is an Ontology and What is it Useful for?

2. Barry Smith: who am I?
  • Director: National Center for Ontological Research (Buffalo)
  • Founder: Ontology for the Intelligence Community (OIC, now STIDS) conference series
  • Ontology work for: NextGen (Next Generation) Air Transportation System; National Nuclear Security Administration, DoE; Joint-Forces Command Joint Warfighting Center; Army Net-Centric Data Strategy Center of Excellence
  • and for many national and international science and healthcare agencies

3. The problem: many, many silos
  • DoD spends more than $6B annually developing a portfolio of more than 2,000 business systems and Web services
  • these systems are poorly integrated
  • deliver redundant capabilities, make data hard to access, foster error and waste
  • prevent secondary uses of data
  • https://ditpr.dod.mil/ Based on FY11 Defense Information Technology Repository (DITPR) data

4. Some questions
  • How to find data?
  • How to understand data when you find it?
  • How to use data when you find it?
  • How to compare and integrate with other data?
  • How to avoid data silos?
  • How to make a battlefield situation rapidly understandable?
  • How to decide what to do next in a battlefield situation?

5. The problem of retrieval, integration and analysis of siloed data
  • is not confined to the DoD
  • affects every domain due to massive legacy of non-interoperable data models and data systems
  • and as new systems are created along the same lines, the situation is constantly getting worse.

6. One solution: Über-model
  • must be built en bloc beforehand
  • inflexible, unresponsive to warfighter needs
  • heavy-duty manual effort for both construction and ingestion, with loss and/or distortion of source data and data-semantics
  • might help with data retrieval and integration but offers limited analytic capability
  • has a limited lifespan because it rests on one point of view

7.
A better solution begins with the Web (net-centricity)
  • You build a site
  • Others discover the site and they link to it
  • The more they link, the more well known the page becomes (Google)
  • Your data becomes discoverable

8. The roots of Semantic Technology
  1. Each group creates a controlled vocabulary of the terms commonly used in its domain, and creates an ontology out of these terms using OWL syntax
  4. Binds this ontology to its data and makes these data available on the Web
  5. The ontologies are linked, e.g. through their use of some common terms
  6. These links create links among all the datasets, thereby creating a web of data

9. Where we stand today
  • increasing availability of semantically enhanced data and semantic software
  • increasing use of OWL in attempts to create useful integration of on-line data and information
  • Linked Open Data: the New Big Thing

10. as of September 2010

11. The problem: the more Semantic Technology is successful, the more it fails
  • The original idea was to break down silos via common controlled vocabularies for the tagging of data
  • The very success of the approach leads to the creation of ever new controlled vocabularies (semantic silos), as ever more ontologies are created in ad hoc ways
  • Every organization and sub-organization now wants to have its own ontology
  • The Semantic Web framework as currently conceived and governed by the W3C yields minimal standardization

12. Divided we fail

13. United we also fail

14. The problem of joint / coalition operations
  • Fire Support
  • Logistics
  • Air Operations
  • Intelligence
  • Civil-Military Operations
  • Targeting
  • Maneuver & Blue Force Tracking

15.
An alternative solution: Semantic Enhancement
A distributed incremental strategy of coordinated annotation:
  • data remain in their original state (are treated at arm's length)
  • tagged using interoperable ontologies created in tandem
  • allows flexible response to new needs, adjustable in real time
  • can be as complete as needed, lossless, long-lasting because flexible and responsive
  • big bang for buck: measurable benefit even from first small investments
  • strong tool support for data analysis
  • multiple successful precedents
The strategy works only to the degree that it rests on shared governance and training.

16. [Diagram: Business Enterprise Architecture (BEA). Source: Dennis E. Wisnosky, A Vision for DoD Solution Architectures; Jonathan Underly, EIW Program Manager. Recoverable labels: DoD EA; HR, Acq, Log, Fin, Real Prop and Warfighter Domain Vocabularies; Svc Member OUID (GFMDI) (EDIPI); E2E BP executes via BEA directly; BP models uniformly described; OMG Primitives Conformance class 2.0; Data described in RDF; Relationship described in OWL; Legend: DoD Authoritative Data Sources.]

17. What can semantic technology do for you?
  • software, hardware, business processes, target domains of interest change rapidly
  • but meanings of common words change only slowly
  • semantic technology allows these meanings to be encoded separately from data files and from application code: decoupling of semantics from data and applications
  • ontologies (controlled, logically structured vocabularies), which are used to enhance legacy and source content:
  • to make these contents retrievable even by those not involved in their creation
  • to support integration of data deriving from heterogeneous sources
  • to allow unanticipated secondary uses

18.
The capability for massing timely and accurate artillery fires by dispersed batteries upon single targets required:
  • real-time communications of a sort that could create a common operational picture that could take account of new developments in the field
  • thereby transforming dispersed batteries into a single system of interoperable modules
This was achieved (in Ft. Sill around 1939) through:
  • a new type of information support
  • a new type of governance and training
  • new artillery doctrine

19. The capability for massing timely and accurate intelligence fires will similarly require:
  • real-time pooling of information of a sort that can create a common operational picture able to be constantly updated in light of new developments in the field
  • thereby transforming dispersed data artifacts within the Cloud into a single system of interoperable modules
This will require in turn:
  • a new type of support (for semantic enhancement of data)
  • a new type of governance and training
  • new intelligence doctrine to include applied semantics

20. ICODES: Integrated Computerized Deployment System
http://digitalcommons.calpoly.edu/cadrc/
A decision-support system for rapid development of conveyance plans. Used by unit personnel to react rapidly and efficiently to changing transportation requirements.

21. ICODES
  • from 2 days to 10 minutes manual coding effort
  • more elaborate loading scenarios can be supported
  • different forces can share the same ships because their loading categories are built into the same ontology
  • high flexibility as cargoes, ships and loading technologies change

Performance Metrics
Tested Procedure                                         V 3.0 (1998)   V 5.0 (2001)   V 5.4 (2005)
Create 2-ship load-plan, 2,400 normal cargo items        20 min         8 min          1.5 min
Create 2-ship load-plan, 1,200 hazardous cargo items     25 min         11 min         2.5 min
Unload inventory of 2,400 items from 2 ships             10 min         5 min          1.0 min

22. Towards ontology coordination (Barry Smith)

23. On June 22, 1799, in Paris, everything changed

24. International System of Units

25.
  • How to find data?
  • How to find other people's data?
  • How to find your own data?
  • How to reason with data when you find it?
  • How to combine data from multiple sources?

26. How to solve the problem of making the data we find queryable and re-usable by others?
Part of the solution must involve: standardized terminologies and coding schemes, analogous to the SI System of Units

27. ontologies = standardized labels designed for use in annotations to make the data cognitively accessible to human beings and algorithmically accessible to computers

28. Uses of "ontology" in PubMed abstracts

29.

30. ontologies = high quality controlled structured vocabularies for the annotation (description, tagging) of data, images, emails, documents, ...

31. compare: legends for maps

32. compare: legends for maps
common legends allow (cross-border) integration

33.

34. types vs. instances

35. names of instances

36. names of types

37. The Gene Ontology
MouseEcotope, GlyProt, DiabetInGene, GluChem: sphingolipid transporter activity

38. The Gene Ontology
MouseEcotope, GlyProt, DiabetInGene, GluChem: Holliday junction helicase complex

39. The Gene Ontology
MouseEcotope, GlyProt, DiabetInGene, GluChem: sphingolipid transporter activity

40. Common legends
  • help human beings use and understand complex representations of reality
  • help human beings create useful complex representations of reality
  • help computers process complex representations of reality
  • help glue data together
But common legends serve these purposes only if the ontologies themselves are developed in a coordinated, non-redundant fashion

41. A good solution to this silo problem must be:
  • modular
  • incremental
  • independent of hardware and software
  • bottom-up
  • evidence-based
  • revisable
  • incorporate a strategy for motivating potential developers and users

42.
The Open Biomedical Ontologies (OBO) Foundry
[Table. Columns (relation to time): CONTINUANT, split into INDEPENDENT and DEPENDENT, and OCCURRENT. Rows (granularity):]
  • ORGAN AND ORGANISM: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO), Phenotypic Quality (PaTO) | Biological Process (GO)
  • CELL AND CELLULAR COMPONENT: Cell (CL), Cellular Component (FMA, GO) | Cellular Function (GO)
  • MOLECULE: Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO)

43. rationale of OBO Foundry coverage
[Same table, with the occurrent column split by granularity:]
  • ORGAN AND ORGANISM: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO), Phenotypic Quality (PaTO) | Organism-Level Process (GO)
  • CELL AND CELLULAR COMPONENT: Cell (CL), Cellular Component (FMA, GO) | Cellular Function (GO) | Cellular Process (GO)
  • MOLECULE: Molecule (ChEBI, SO, RNAO, PRO) | Molecular Function (GO) | Molecular Process (GO)

44. Population-level ontologies
[Same table as slide 42, with an added top row:]
  • COMPLEX OF ORGANISMS: Family, Community, Deme, Population | Population Phenotype | Population Process

45. Environment Ontology
[Same table as slide 42, with the added label: environments]

46. Creation of new ontology consortia, modeled on the OBO Foundry
  • NIF Standard: Neuroscience Information Framework
  • eagle-I Ontologies: used by VIVO and CTSAconnect for publications, patents, credentials, data and sample collections
  • IDO Consortium: Infectious Disease Ontology

47.

48.
Levels of coordination
  • What there is
  • XML: syntactic interoperability
  • RDF, OWL, CL: representational interoperability (URIs plus triples) and semantic interoperability (= exposed semantics): you use Person, I use P, he uses Persn
  • Creating genuine semantic interoperability, especially with more expressive languages such as OWL or Common Logic, is in practice impossible across broad heterogeneous communities

49. The problem
  • Interoperability is necessary but not sufficient. It allows automatic processing of content, but only if you can sustain alignments between multiple independent vocabularies simultaneously.
  • For example, you have class Human Being with subclass Person in one system and P in another. If you want to establish that Person = P, you can do this in such a way that it will be understandable to you and everyone else, and the system will work without any additional help.
  • However, if you then have to say that Persn = P, make similar assertions over and over again, and keep the alignments consistent over time, you will rapidly lose control.

50. The SE solution: Ontology (only) at the center
  • Establish common ontology content, which we and our collaborators (and our software) control
  • Keep this content consistent and non-redundant as it evolves
  • Seek semantic sharing only in the SE environment
  • so what SE brings is semantic interoperability plus constrained syntax
  • it brings a kind of substitute for semantic interoperability of source data models, through the use by annotators of ontologies from the single evolving SE suite

51. SE annotations applied to source data in the DSC Cloud
Dr Malyuta will discuss the DSC Cloud Dataspace in the next segment.
See also tomorrow's presentation: 11:30 Horizontal Integration of Warfighter Intelligence Data

52. Distributed Common Ground System Army (DCGS-A): Semantic Enhancement of the Dataspace on the Cloud
Dr. Tatiana Malyuta, New York City College of Technology of the City University of New York

53.
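The alignment burden described on slide 49 can be made concrete with a little counting. The following is a minimal Python sketch, not part of the original slides: it assumes that every pair of independent vocabularies needs its own maintained mapping, whereas under Semantic Enhancement each vocabulary is annotated once against a shared term. The name SE.Person and both helper functions are hypothetical.

```python
from itertools import combinations


def pairwise_alignments(vocabularies):
    """Mappings needed when every pair of vocabularies is aligned directly."""
    return list(combinations(sorted(vocabularies), 2))


def hub_alignments(vocabularies, hub_term):
    """Mappings needed when each vocabulary is annotated once against a hub."""
    return [(term, hub_term) for term in sorted(vocabularies)]


# The three local labels from the slide: "you use Person, I use P, he uses Persn".
local_terms = {"Person", "P", "Persn"}
print(pairwise_alignments(local_terms))
print(hub_alignments(local_terms, "SE.Person"))  # "SE.Person" is a hypothetical hub term

# Growth of the maintenance burden as more vocabularies join.
for n in (3, 10, 50):
    names = [f"V{i}" for i in range(n)]
    print(n, len(pairwise_alignments(names)), len(hub_alignments(names, "SE.Person")))
```

With three vocabularies the two strategies cost the same; the difference only shows as the number of independent vocabularies grows, which is the situation the slide warns about.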
Integrated Store of Intelligence Data
  • Lossless integration without heavy pre-processing
  • Ability to: incorporate multiple integration models / approaches / points of view of data and data-semantics; perform continuous semantic enrichment of the integrated store
  • Scalability

54. Solution Components
  • Cloud implementation: Cloudbase (Accumulo)
  • Data Representation and Integration Framework: comprehensive unified representation of data, data semantics, and metadata
This work was funded by US Army CERDEC Intelligence and Information Warfare Directorate (I2WD).

55. Dealing with Semantic Heterogeneity
  • Physical integration. A separate data store homogenizing semantics in a particular data-model works only for special cases, entails loss and distortion of data and semantics, and creates a new data silo.
  • Virtual integration. A projection onto a homogeneous data-model exposed to users is more flexible, but may have the problem of data availability (e.g. military, intelligence). Also, a particular homogeneous model has limited usage, does not expose all content, and does not support enrichment.

56. Pursuit of the Holy Grail of Intelligence Data Integration
In a highly dynamic semantic environment evolving in ad hoc ways:
  • how to have it all and have it available immediately and at any time? Traditional physical and virtual integration approaches fail to respond to these requirements
  • how to use these data resources efficiently (integrate, query, and analyze)?

57. Workable Solution
  • A physical store incorporating heterogeneous contents. The Data Representation and Integration Framework (DRIF) is based on a decomposed representation of structured data (RDF-style) and allows collection of data resources without loss and/or distortion, thereby achieving representational integration.
  • Light Weight Semantic Enhancement (SE) supports semantic integration and provides a decent utilization capability without adding storage and processing weight to the already storage- and processing-heavy Dataspace.

58.
DRIF Dataspace
  • Integration without heavy pre-processing (ad-hoc rapid integration):
  • of any data artifact regardless of the model (or absence of it) and modality
  • without loss and/or distortion of data and data-semantics
  • Continuous evolution and enrichment
  • Pay-as-you-go solution: while data and data-semantics are expected to be enriched and refined, they can be efficiently utilized immediately after entering the Dataspace through querying, navigation, and drilling
D. Salmen et al. Integration of Intelligence Data through Semantic Enhancement. http://stids.c4i.gmu.edu/STIDS2011/papers/STIDS2011_CR_T1_SalmenEtAl.pdf

59. Organization of the DRIF Dataspace
  • Registration
  • Ingestion
  • Extraction [Transformation] / Enrichment

60. Goals of Semantic Enhancement
  • Simple yet efficient harmonization strategy: takes place not by changing the data semantics to which it is applied, but rather by adding an extra semantic layer to it
  • Long-lasting solution that can be applied consistently and in cumulative fashion to new models entering the Dataspace
  • Strategy compliant with and complementing the DRIF: source data models are not changed
  • Be used efficiently, and in a unified fashion, in search, reasoning, and analytics
  • Provides views of the Dataspace of different levels of detail
  • Mapping to a particular Über-model or choosing a single comprehensive model for harmonization does not provide the benefits described

61. Ontology vs. Data Model
Each ontology provides a comprehensive synoptic view of a domain, as opposed to the flat and partial representation provided by a data model.
[Diagram: Single Ontology vs. Multiple Data models. Ontology labels: Person, PersonName, FirstName, LastName, Skill, ComputerSkill, NetworkSkill, ProgrammingSkill, with relations Is-a and Bearer-of. Data model labels: Person, PersonName, PersonSkill, Last Name, First Name, Skill, Person Name, Computer Skill, ProgrammingSkill, NetworkSkill.]

62.
Illustration
  • DRIF Dataspace with lots of data models
  • Incremental annotations of these data models through SE ontologies
  • Preserving the native content of data resources
  • Presenting the native content via the SE annotations
  • Benefits of the approach

63. Sources
  • Source database Db1, with tables Person and Skill, containing person data and data pertaining to skills of different kinds, respectively:
      Person:  PersonID   SkillID
               111        222
      Skill:   SkillID    Name    Description
               222        Java    Programming
  • Source database Db2, with the table Person, containing data about IT personnel and their skills:
      Person:  ID         SkillDescr
               333        SQL
  • Source database Db3, with the table ProgrSkill, containing data about programmers' skills:
      ProgrSkill:  EmplID   SkillName
                   444      Java

64. Representation in the Dataspace
Native representation of structured data:
  Value and Associated Label   Relation         Value and Associated Label
  111, Db1.PersonID            hasSkillID       222, Db1.SkillID
  222, Db1.SkillID             hasName          Java, Db1.Name
  222, Db1.SkillID             hasDescription   Programming, Db1.Description
  333, Db2.ID                  hasSkillDescr    SQL, Db2.SkillDescr
  444, Db3.EmplID              hasSkillName     Java, Db3.SkillName
Representation of data-models, SE and SE annotations as Concepts and ConceptAssociations (blue: SE annotations; red: SE hierarchies):
  Label                 Relation   SE Label
  Db1.Name              Is-a       SE.Skill
  Db2.SkillDescr        Is-a       SE.ComputerSkill
  Db3.SkillName         Is-a       SE.ProgrammingSkill
  Db1.PersonID          Is-a       SE.PersonID
  Db2.ID                Is-a       SE.PersonID
  Db3.EmplID            Is-a       SE.PersonID
  SE.ComputerSkill      Is-a       SE.Skill
  SE.ProgrammingSkill   Is-a       SE.ComputerSkill

65. Indexed Contents Based on the SE
Index entries based on the SE and native (blue) vocabularies:
  Index Entry      Associated Field-Value
  111, PersonID    Type: Person; Skill: Java; Db1.Description: Programming
  333, PersonID    Type: Person; ComputerSkill: SQL
  444, PersonID    Type: Person; ProgrammingSkill: Java

66.
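The toy example of slides 63 to 65 can be replayed in a few lines. The following is a minimal Python sketch under stated assumptions, not the DRIF implementation: the triples and annotations are copied from slide 64, while the function build_index, the tuple layout, and the rule that a field is named by its SE label when one exists (and by its native label otherwise) are hypothetical choices made for illustration.

```python
from collections import defaultdict

# Native triples from slide 64: (value, native label) --relation--> (value, native label).
TRIPLES = [
    (("111", "Db1.PersonID"), "hasSkillID", ("222", "Db1.SkillID")),
    (("222", "Db1.SkillID"), "hasName", ("Java", "Db1.Name")),
    (("222", "Db1.SkillID"), "hasDescription", ("Programming", "Db1.Description")),
    (("333", "Db2.ID"), "hasSkillDescr", ("SQL", "Db2.SkillDescr")),
    (("444", "Db3.EmplID"), "hasSkillName", ("Java", "Db3.SkillName")),
]

# SE annotations from slide 64: native label Is-a SE label.
ANNOTATIONS = {
    "Db1.Name": "SE.Skill",
    "Db2.SkillDescr": "SE.ComputerSkill",
    "Db3.SkillName": "SE.ProgrammingSkill",
    "Db1.PersonID": "SE.PersonID",
    "Db2.ID": "SE.PersonID",
    "Db3.EmplID": "SE.PersonID",
}


def build_index(triples, annotations):
    """Build one index entry per person, naming fields by SE label where annotated."""
    outgoing = defaultdict(list)
    for subject, _relation, obj in triples:
        outgoing[subject].append(obj)

    index = {}
    for subject in list(outgoing):
        person_id, native_label = subject
        if annotations.get(native_label) != "SE.PersonID":
            continue  # only person identifiers become index entries
        fields = {"Type": "Person"}
        pending = list(outgoing[subject])
        while pending:
            node = pending.pop()
            if node in outgoing:  # intermediate key such as Db1.SkillID: follow it
                pending.extend(outgoing[node])
                continue
            value, label = node
            field = annotations.get(label, label)  # fall back to the native label
            if field.startswith("SE."):
                field = field[len("SE."):]
            fields[field] = value
        index[person_id] = fields
    return index


for person_id, fields in sorted(build_index(TRIPLES, ANNOTATIONS).items()):
    print(person_id, fields)
```

The source data are never modified: the annotations live in a separate mapping, and un-annotated fields such as Db1.Description remain reachable under their native label, as in the index shown on slide 65.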
Benefits of DRIF + SE
  • Leverages syntactic integration provided by DRIF, semantic integration provided by the SE vocabulary and annotations of native sources, and rich semantics provided by ontologies in general
  • Entering Skill = Java (which will be re-written at run time as: Skill = Java OR ComputerSkill = Java OR ProgrammingSkill = Java OR NetworkSkill = Java) will return: persons 111 and 444
  • Entering ComputerSkill = Java OR ComputerSkill = SQL will return: persons 333 and 444
  • Entering ProgrammingSkill = Java will return: person 444
  • Entering Description = Programming will return: person 111
  • Allows querying, searching, and manipulating native representations
  • Light-weight non-intrusive approach that can be improved and refined without impacting the Dataspace

67. Index Contents without the SE
Index entries based on native vocabularies:
  Index Entry      Associated Field-Value
  111, PersonID    Type: Person; Name: Java; Description: Programming
  333, ID          Type: Person; SkillDescr: SQL
  444, EmplID      Type: Person; SkillName: Java

68. Problems
Even for our toy example we can see how much manual effort the analyst needs to apply in performing search without SE, and even then the information he will gain will be meager in comparison with what is made available through the Index with SE.
For example, if an analyst is familiar with the labels used in Db1 and is thus in a position to enter Name = Java, his query will still return only: person 111. Directly salient Db4 information will thus be missed.

69.
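The run-time query rewriting described on slide 66 amounts to expanding a field name into itself plus all of its descendants in the SE hierarchy. The following is a minimal Python sketch of that idea, not the actual DRIF query engine: the index entries are those of slide 65, the function names expand and search are hypothetical, and placing NetworkSkill under ComputerSkill is an assumption (slide 64 lists only the other two hierarchy links).

```python
# SE-based index entries from slide 65.
INDEX = {
    "111": {"Type": "Person", "Skill": "Java", "Db1.Description": "Programming"},
    "333": {"Type": "Person", "ComputerSkill": "SQL"},
    "444": {"Type": "Person", "ProgrammingSkill": "Java"},
}

# SE hierarchy (child -> parent). The NetworkSkill link is an assumption.
SUBCLASS_OF = {
    "ComputerSkill": "Skill",
    "ProgrammingSkill": "ComputerSkill",
    "NetworkSkill": "ComputerSkill",
}


def expand(field):
    """Return the field followed by all of its descendants in the SE hierarchy."""
    fields = [field]
    for child, parent in SUBCLASS_OF.items():
        if parent == field:
            fields.extend(expand(child))
    return fields


def search(field, value):
    """Rewrite 'field = value' as an OR over the expanded fields and run it."""
    fields = expand(field)
    return sorted(
        person_id
        for person_id, entry in INDEX.items()
        if any(entry.get(name) == value for name in fields)
    )


print(expand("Skill"))
print(search("Skill", "Java"))             # persons 111 and 444
print(search("ProgrammingSkill", "Java"))  # person 444
print(sorted(set(search("ComputerSkill", "Java")) | set(search("ComputerSkill", "SQL"))))
```

Because the expansion happens at query time, refining the hierarchy later changes what a query returns without touching the indexed data.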
Additional Notes on the SE process
  • Original data and data-semantics are included in the Dataspace without loss and/or distortion; thus there is no need to cover all semantics of the Dataspace: what is unlikely to be used in search or is not important for integration will still be available when needed
  • A complex ontology is not needed: a common and shared vocabulary is sufficient for virtual semantic integration and search/analytics
  • The approach is very flexible, and investments can be made in specific areas according to need (pay-as-you-go)
  • The approach is tunable: if the chosen annotations of a particular subset of a source data-model are too general for data analyses, the respective LLOs (low-level ontologies) can be further developed and source models re-annotated

70. Benefits of the Approach
  • Does not interfere with the source content
  • Enhancement enables this content to evolve in a cumulative fashion as it accommodates new kinds of data
  • Does not depend on the data resources and can be developed independently from them in an incremental and distributed fashion
  • Provides a more consistent, homogeneous, and well-articulated presentation of the content which originates in multiple internally inconsistent and heterogeneous systems
  • Makes management and exploitation of the content more cost-effective
  • The use of the selected ontologies brings integration with other government initiatives and brings the system closer to the federally mandated net-centric data strategy
  • Creates an integrated content that is effectively searchable and that provides content to which more powerful analytics can be applied

71. Data Models and SE
  PersonID   Name   Description
  111        Java   Programming
  222        SQL    Database
[Diagram labels: SQL, Java, C++; ProgrammingSkill, ComputerSkill, Skill; Education, TechnicalEducation.]

72. The Meaning of Enhancement
  • Amazing semantic enhancement/enrichment of data without any change to the data: on the string of an annotation, we put the whole knowledge system on top of a database field.
    For example, not only can analysts analyze the data about computer skills vertically along the Skill hierarchy, they can analyze it horizontally via relations between Skill and Education, and further
  • While data in the database does not change, its analysis can be richer and richer as our understanding of the reality changes
  • For this richness to be leveraged by different communities, persons, and applications it needs to be constructed in accordance with the principles of the SE

73. Towards Globalization and Sharing
  • Using the SE approach to create a Shared Semantic Resource for the Intelligence Community to enable interoperability across systems
  • Applying it directly to or projecting its contents on a particular integration solution

74. Building the Shared Semantic Resource
  • Methodology of distributed incremental development
  • Training
  • Governance
  • Common Architecture of Ontologies to support consistency, non-redundancy, modularity: Upper Level Ontology (BFO), Mid-Level Ontologies, Low Level Ontologies

75. Governance
  • Common governance
  • coordinating editors, one from each ontology, responsible for managing changes and ensuring use of common best practices
  • small high-level board to manage interoperability
  • How much can we embed governance into software?
  • How much can we embed governance into training? (analogy with military doctrine)

76. Governance principles
  1. All ontologies are expressed in a common shared syntax (initially OWL 2.0; perhaps later supplemented by CLIF). (Syntax for annotations needs to be fixed later; potentially RDF.)
  2. Each ontology possesses a unique identifier space (namespace) and each term has a unique ID ending with an alphanumeric string of the form GO:0000123456
  3. Each ontology has a unique responsible authority (a human being)
  4. If ontologies import segments from other ontologies then imported terms should preserve the original term ID (URI).
  5. Versioning: The ontology uses procedures for identifying distinct successive versions (via URIs).
  6.
Each ontology must be created through a process of downward population from existing higher-level ontologies

77. Governance principles
  7. Each ontology extends from BFO 2.0
  8. Each lower-level ontology is orthogonal to the other ontologies at the same level within the ontology hierarchy
  9. The ontologies include textual (human readable) and logical definitions for all terms.
  10. The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the Relation Ontology that is incorporated into BFO 2.0
  11. Each ontology is developed collaboratively, so that in areas of overlap between neighboring ontologies authors will settle on a division of terms.

78. Orthogonality
  • For each domain, ensure convergence upon a single ontology recommended for use by those who wish to become involved with the Foundry initiative
  • Thereby: avoid the need for mappings, which are too expensive, too fragile, and too difficult to keep up-to-date as mapped ontologies change
  • Orthogonality means:
  • everyone knows where to look to find out how to annotate each kind of data
  • everyone knows where to look to find content for application ontologies

79. Orthogonality = non-redundancy for reference ontology modules on any given level
  • application ontologies can overlap, but then only in those areas where common coverage is supplied by a reference ontology

80. Definitions (one example of OBO Foundry traffic rules)
  • all definitions should be of the genus-species form: A =def. a B which Cs, where B is the parent term of A in the ontology hierarchy

81. Because the ontologies in the Foundry are built as orthogonal modules which form an incrementally evolving network
  • scientists are motivated to commit to developing ontologies because they will need in their own work ontologies that fit into this network
  • users are motivated by the assurance that the ontologies they turn to are maintained by experts

82.
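The genus-species rule of slide 80 ("A =def. a B which Cs, where B is the parent term of A") lends itself to a mechanical check, which is one way governance could be embedded in software. The following is a minimal Python sketch under stated assumptions: the function name, the regular expression, and the sample definition texts are all hypothetical illustrations and are not taken from any published ontology; only the two is_a pairs come from the slides.

```python
import re

# Matches "a B which Cs" / "an B that Cs" and captures the genus B.
GENUS_SPECIES = re.compile(r"^an? (?P<genus>.+?) (?:which|that) .+")


def follows_genus_species_form(parent, definition):
    """True if the definition reads 'a <parent> which ...' for the given parent term."""
    match = GENUS_SPECIES.match(definition.strip())
    return bool(match) and match.group("genus") == parent


# (term, parent term, definition). Definition texts are invented for illustration.
TERMS = [
    ("cell death", "death", "a death which is undergone by a cell"),
    ("pneumococcal bacterium", "bacterium", "a bacterium which belongs to the pneumococcal species"),
    ("cell death", "death", "the dying of a cell"),                     # no genus stated
    ("cell death", "death", "a process which ends the life of a cell"), # genus is not the parent
]

for term, parent, definition in TERMS:
    verdict = "ok" if follows_genus_species_form(parent, definition) else "violates rule"
    print(f"{term} =def. {definition}: {verdict}")
```

A check of this kind only tests the surface form of a definition; whether the differentia ("which Cs") is correct remains a matter for the responsible editors.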
More benefits of orthogonality
  • helps those new to ontology to find what they need
  • to find models of good practice
  • ensures mutual consistency of ontologies (trivially)
  • and thereby ensures additivity of annotations

83. More benefits of orthogonality
  • No need to reinvent the wheel for each new domain
  • Can profit from storehouse of lessons learned
  • Can more easily reuse what is made by others
  • Can more easily reuse training
  • Can more easily inspect and criticize results of others' work
  • Leads to innovations (e.g. MIREOT, OntoFox) in strategies for combining ontologies

84. Basic Formal Ontology
[Diagram: Continuant, divided into Independent Continuant (cell component) and Dependent Continuant (molecular function); Occurrent (biological process).]

85. Extension Strategy + Modular Organization
  • top level: Basic Formal Ontology (BFO)
  • mid-level: Information Artifact Ontology (IAO); Ontology for Biomedical Investigations (OBI); Spatial Ontology (BSPO)
  • domain level: Anatomy Ontology (FMA*, CARO); Environment Ontology (EnvO); Infectious Disease Ontology (IDO*); Biological Process Ontology (GO*); Cell Ontology (CL); Cellular Component Ontology (FMA*, GO*); Phenotypic Quality Ontology (PaTO); Subcellular Anatomy Ontology (SAO); Sequence Ontology (SO*); Molecular Function (GO*); Protein Ontology (PRO*)

86. BFO:continuant
  • independent continuant: portion of material, object, fiat object part, object aggregate, object boundary, site
  • dependent continuant: generically dependent continuant (information artifact); specifically dependent continuant (quality; realizable entity: function, role, disposition)
  • spatial region: 0D-region, 1D-region, 2D-region, 3D-region

87. BFO:occurrent
  • processual entity: process, fiat process part, process aggregate, process boundary, processual context
  • spatiotemporal region: scattered spatiotemporal region; connected spatiotemporal region (spatiotemporal instant, spatiotemporal interval)
  • temporal region: scattered temporal region; connected temporal region (temporal instant, temporal interval)

88. More than 100 Ontology projects using BFO
http://www.ifomis.org/bfo/users

89.
Basic Formal Ontology
[Diagram: Continuant, divided into Independent Continuant (thing) and Dependent Continuant (quality); Occurrent (process, event). Upper layer: types; lower layer: instances.]

90. Blinding Flash of the Obvious
[Same diagram, with the annotation: quality depends on bearer]

91. Blinding Flash of the Obvious
[Same diagram, with the annotation: event depends on participant]

92. Occurrents depend on participants
  • instances: 15 May bombing; 5 April insurgency attack
  • occurrent types: bombing; attack
  • participant types: explosive device; terrorist group

93. Blinding Flash of the Obvious
[Same diagram, with the annotation: process is change in quality]

94. What is a datum?
[Diagram: Continuant, divided into Independent Continuant (laptop, book) and Dependent Continuant (quality); Occurrent (process).]
datum: a pattern in some medium with a certain kind of provenance

95. General lessons for ontology success
  • Strategy of low hanging fruit
  • Lessons learned and disseminated as common guidelines: developers are doing it the same way
  • Ontologies built by domain experts
  • Ontologies based on real thinking (not text mining)

96. Low Hanging Fruit
  • Start with simple assertions which you know to be universally true:
      hand part_of body
      cell death is_a death
      pneumococcal bacterium is_a bacterium
  • (Computers need to be led by the hand)
  • Use only the lowest node in the tree of which you are sure that it holds

97. Examples
  • How to cope with ontology change (role of versioning, authority structure to ensure evolution in tandem within the networked ontology structure): how to ensure that resources invested in ontology do not lose their value when the ontology changes
  • Versioning demands term-IDs which change whenever a term or definition changes

98.
Experience with BFO in building ontologies provides
  • a community of skilled ontology developers and users
  • associated logical tools
  • documentation for different types of users
  • a methodology for building conformant ontologies by starting with BFO and populating downwards

99. Cell Type Ontology

100. Cell Type Ontology

101. Conclusion
Ontologists have established best practices
  • for building ontologies
  • for linking ontologies
  • for evaluating ontologies
  • for applying ontologies
which have been thoroughly tested in use and which conform precisely to the hub-and-spokes strategy of the UCore and C2 efforts

102. Dr. Bill Mandrick, Senior Ontologist, Data Tactics: A Strategy for Military Ontology

103. Agenda
  • Introductory Remarks
  • Previous Information Revolution
  • Ontology & Military Symbology
  • Asserted Ontologies
  • Inferencing
  • Realizing the strategy

104. Introductory Remarks

105. Ontology Defined
Ontology is the science of representing, defining, and relating the kinds and structures of objects, properties, events, processes and relations in every area of reality.
An ontology is an exhaustive classification of entities in some sphere of being, which results in the formulation of robust and shareable descriptions of a given domain (e.g. Physics, Biology, Medicine, Intelligence, etc.).

106. Ontology Defined
"...partial semantic account of the intended conceptualization of a logical theory..."

107. Orders of Reality
  • 1st Order. Reality as it is. In the action in the upper image to the right, reality is what is, not what we think is happening, as we peer through the fog of war.
  • 2nd Order. Participant Perceptions. What we believe is happening as we peer through the fog of war. Examples: as a participant in the action shown in the upper image or as a member of an operations center in the lower image.
  • 3rd Order. Reality as we record it. In the lower image, the computer displays are 3rd order reality.
The gaps between the orders of reality introduce risk.
These gaps are not the only form of risk, but reducing these gaps contributes to reducing risk.

108. Conflation

109. 2 Examples of Conflation

110. Warfighters' Information Sharing Environment
  • Fire Support
  • Logistics
  • Air Operations
  • Intelligence
  • Civil-Military Operations
  • Targeting
  • Maneuver & Blue Force Tracking

111. Authoritative References (http://www.dtic.mil/doctrine/)
  • Merriam-Webster's Collegiate Dictionary
  • Joint Publication 1-02 DoD Dictionary of Military and Related Terms
  • Joint Publication 3-0 Joint Operations
  • Joint Publication 3-13 Joint Command and Control
  • Joint Publication 3-24 Counterinsurgency
  • Joint Publication 3-57 Civil-Military Operations
  • JP 3-10, Joint Security Operations in Theater
  • Joint Publication 3-16 Multinational Operations
  • Joint Publication 5-0 Joint Operations Planning
Warfighter Lexicon:
  • Controlled Vocabulary
  • Stable
  • Horizontally Integrated
  • Common Operational Picture

112. Ontology(ies) that enables interoperability among members of an Operations Center and other warfighters.

113. Doctrinal Publications: Consistent Terminology (Data Elements, Names and Definitions)
[Table. Columns: JP 3-0 Operations; JP 2-0 Intelligence; JP 6-0 Comm Support; JP 4-0 Logistics; JP 3-16 Multinational Operations; JP 3-33 JTF Headquarters; JP 1-02 DoD Dictionary. Rows, with marks as extracted: Civil-Military Operations; Area of Operations: XXX X; Area of Responsibility: XXXXX; C2 Systems: XXX X; Area of Interest: X XX. Key: X = word for word.]

114. Previous Information Revolution

115. Previous Information Revolution
  • 1800 Cartographic Revolution
  • Explosion of production, dissemination and use of cartography
  • Revolutionary and Napoleonic wars
  • Several individual armies in the extended terrain
  • New spatial order of warfare
  • Urgent need for new methods of spatial management*
*SOURCE: PAPER EMPIRES: MILITARY CARTOGRAPHY AND THE MANAGEMENT OF SPACE

116. Standardizing Geospatial Information
  • Triangulation
  • Military Grid Reference System
  • Latitude Longitude

117.
Interoperable Semantics (example: Anatomy & Physiology)
- Standardized Labels
- Anatomical Continuants
- Physiological Occurrents
- Teachable
- Inferencing
- Horizontally Integrated
- Sharing of Observations
- Accumulated Knowledge
118. Standardized Symbols
[Figure: circuit symbols labeled Ground, Resistor, Capacitor]
119. Ontology & Military Symbology
Elements of Military Ontology:
- Represent Entities and Events found in military domains
- Used to develop the Common Operational Picture
- Used to develop Situational Awareness
- Used to develop Situational Understanding
- Used for Operational Design
- Used to Task Organize Forces
- Used to Design/Create Information Networks
- Enhance the Military Decision Making Process
120. Military Symbology
Sample of Military Standard 2525 Military Symbology
121. Map Overlays
122. Task Organizing
- Ontological methods are used in the process of Task-Organizing
- A Task-Organization is the Output (Product) of Task Organizing
- A Task-Organization is a Plan or part of a Plan
- A Plan is an Information Content Entity
Task-Organizing: The act of designing an operating force, support staff, or logistic package of specific size and composition to meet a unique task or mission. Characteristics to examine when task-organizing the force include, but are not limited to: training, experience, equipage, sustainability, operating environment, enemy threat, and mobility. (JP 3-05)
123. Operational Design
Source: FM 3-0 Operations
- Military Ontologies help planners and operators see and understand the relations between Entities and Events in the area of operations.
- Military Ontologies are prerequisites of military innovations such as Airborne Operations, Combined Fires and Joint Operations.
- Military Ontologies are prerequisites for the creation of effective information systems.
Operational Design: The conception and construction of the framework that underpins a campaign or major operation plan and its subsequent execution. See also campaign; major operation. (JP 3-0)
124.
Asserted (Reference) Ontologies
- Generic Content
- Aggressive Reuse
- Multiple Different Types of Context
- Better Definitions
- Prerequisite for Inferencing
[Figure terms: Target List; Target Nomination List; Candidate Target List; High-Payoff Target List; Protected Target List; Intelligence Product; Geospatial Intelligence Product; Target Intelligence Product; Signals Intelligence Product; Human Intelligence Product]
125.
126. Grids & Coordinates
127. Geospatial Feature Descriptions
128. Designative Information Artifacts
129. Descriptive Information Artifacts
130. Geographic Features & Geospatial Regions
131. Geospatial Regions
*Combine Geographic/Geospatial content from other ontology (see next slide)
132. Artifacts
133. Facilities
134. Facility by Role
135. Vehicles
136. Weapons
137. Human Ontologies
138. Organizations
139. Processes
140. Inferencing
141. Relations
- Infantry Company is part_of a Battalion (Continuant to Continuant)
- Civil Affairs Team participates_in a Civil Reconnaissance (Continuant to Occurrent)
- Military Engagement is part_of a Battle Event (Occurrent to Occurrent)
- House is a Building (Universal to Universal)
- 3rd Platoon, Alpha Company participates_in Combat Operations (Instance to Universal)
- 3rd Platoon, Alpha Company is located_at Forward Operating Base Warhorse (Instance to Instance)
Relations are:
- How we make sense of the world
- Message In Plain English (MIPE)
- How Data becomes Information
142. Standardized Relations
143. 2 Data Model Labels
- Region.water.distanceBetweenLatrinesAndWaterSource
- Region.water.fecalOrOralTransmittedDiseases
How are these labels used?
- No way to standardize or horizontally integrate
- Trying to pack too much into each label
- Contain elements from several asserted ontologies
- Need to be Decomposed into elements
- Relating elements from different asserted ontologies
- Common events and objects in an Area of Operations
144.
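The relation examples on the Inferencing slides (part_of between two continuants or two occurrents, participates_in from a continuant to an occurrent) can be read as signatures that a tool could check. The following is a minimal sketch of such a check, not a method taken from the slides: the CATEGORY table, the SIGNATURES table and the Assertion class are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass

# Hypothetical category labels for terms that appear in the relation examples.
CATEGORY = {
    "Infantry Company": "continuant",
    "Battalion": "continuant",
    "Civil Affairs Team": "continuant",
    "Civil Reconnaissance": "occurrent",
    "Military Engagement": "occurrent",
    "Battle Event": "occurrent",
}

# Assumed signatures: the (subject, object) category pairs each relation accepts.
SIGNATURES = {
    "part_of": {("continuant", "continuant"), ("occurrent", "occurrent")},
    "participates_in": {("continuant", "occurrent")},
}


@dataclass(frozen=True)
class Assertion:
    subject: str
    relation: str
    obj: str


def is_well_formed(assertion: Assertion) -> bool:
    """Return True when the relation is used between permitted categories."""
    pair = (CATEGORY[assertion.subject], CATEGORY[assertion.obj])
    return pair in SIGNATURES[assertion.relation]


examples = [
    Assertion("Infantry Company", "part_of", "Battalion"),
    Assertion("Civil Affairs Team", "participates_in", "Civil Reconnaissance"),
    Assertion("Military Engagement", "part_of", "Battle Event"),
]
```

Under these assumptions every assertion in `examples` passes the check, while something like `Assertion("Battalion", "participates_in", "Infantry Company")` is rejected because both terms are continuants.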
Asserted Ontologies
Region.water.distanceBetweenLatrinesAndWaterSource
[Figure terms, grouped as extracted:
- Tribal Region; Area Of Operations; Geospatial Region; Village
- Water Source; Latrine; Well; Pond; Cesspool
- Act Of Measurement; Act Of Analysis; Act Of Observation; Act
- Measurement Result; Depth Measurement Result; Height Measurement Result; Distance Measurement Result
- Geographic Coordinates; Latitude Longitude Coordinates; Military Grid Reference System Coordinates; Universal Transverse Mercator Coordinates]
145. Region.water.fecalOrOralTransmittedDiseases
[Figure: nodes Well; Village; Assessment; Bacterial Pathogen Role; Particular Bacterial Pathogen Role; Collection of Bacterium; Cholera Epidemic; Cholera; Report; Date Time Group; 150029OCT2010. Relations: located in; has role; cause_of; instance_of; stamps; describes]
146. Relating Asserted Ontologies
Region.water.fecalOrOralTransmittedDiseases
[Figure terms, grouped as extracted:
- Virus; Protozoan; Microorganism; Bacterium
- Water Source; Latrine; Well; Pond; Cesspool
- Pathogen Role; Consumable Role; Medicinal Role; Role
- Disease; Hepatitis A; Shigellosis; Cholera
- Event; Contamination Event; Epidemic Event; Disease Transmission Event
Relations: has_role; has_location; part_of; cause_of]
147. Unpacking: Region.water.distanceBetweenLatrinesAndWaterSource
[Figure: nodes Latrine; Well; VT 334 569; Distance Measurement Result; Village Name; Khanabad; Village (twice); Geopolitical Entity; Spatial Region; Geographic Coordinates Set; 16 meters. Relations: located near; located in; has location; designates; is_a; instance_of; measurement_of]
148. Sample Ontology Terms/Labels
Contamination Event; Consumable Role; Disease Event; Epidemic Event; Geographic Coordinates Set; Geospatial Region; Measurement Result; Microorganism; Pathogen Role; Tribal Region; Village; Water Source
Application Neutral Labels (Preferred Labels)
Much better for:
- Horizontal Integration
- database query
- Inferencing
- Standardization
- Reuse (e.g. Disease Ontology already exists)
- Training
149. Conclusions
- Situational Understanding
- Shared Lexicon
- Horizontal Integration of Preferred Labels
- Need Training & Governance
150.
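To illustrate the unpacking of Region.water.distanceBetweenLatrinesAndWaterSource into elements drawn from several asserted ontologies, the sketch below stores the decomposed content as subject-relation-object triples and answers a query over them. It is only an illustration under stated assumptions: the instance identifiers (latrine_1, well_1, village_1, distance_1), the has_value relation, the pairing of Khanabad with the village, and the helper functions are hypothetical; the class names, the relation names instance_of, is_a, located_in, designates and measurement_of, and the 16 meters value come from the slides.

```python
# Triples loosely following the "Unpacking" slide. Instance identifiers and
# the has_value relation are hypothetical, introduced for this sketch.
TRIPLES = [
    ("latrine_1", "instance_of", "Latrine"),
    ("well_1", "instance_of", "Well"),
    ("Well", "is_a", "Water Source"),
    ("village_1", "instance_of", "Village"),
    ("Khanabad", "designates", "village_1"),
    ("latrine_1", "located_in", "village_1"),
    ("well_1", "located_in", "village_1"),
    ("distance_1", "instance_of", "Distance Measurement Result"),
    ("distance_1", "measurement_of", ("latrine_1", "well_1")),
    ("distance_1", "has_value", "16 meters"),
]


def objects(subject, relation):
    """All objects o such that (subject, relation, o) is asserted."""
    return [o for s, r, o in TRIPLES if s == subject and r == relation]


def subjects(relation, obj):
    """All subjects s such that (s, relation, obj) is asserted."""
    return [s for s, r, o in TRIPLES if r == relation and o == obj]


def latrine_to_water_distances(village_name):
    """Return (latrine, water source, value) tuples for one named village."""
    results = []
    for village in objects(village_name, "designates"):
        for measurement in subjects("instance_of", "Distance Measurement Result"):
            for first, second in objects(measurement, "measurement_of"):
                both_in_village = (
                    village in objects(first, "located_in")
                    and village in objects(second, "located_in")
                )
                if both_in_village:
                    for value in objects(measurement, "has_value"):
                        results.append((first, second, value))
    return results
```

Because each element is a separate term, the same triples could also answer other questions (for example, which water sources lie in a village) without defining a new packed label for each one.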
Barry Smith & Bill Mandrick: Realizing the Strategy: A Practical Introduction to Ontology Building
151. Agenda
- Standardized Processes
- Scoping the Domain
- Creating Initial Lexicon
- Initial Ontology
- Feedback and Iteration
- Publish and Share
152. Standardized Process
Method:
1. A particular procedure for accomplishing or approaching something, esp. a systematic or established one.
2. Orderliness of thought or behavior; systematic planning or action: combination of knowledge and method.
(Merriam-Webster's Collegiate Dictionary)
153. Sample of Repeatable Processes
The Repeatable Targeting Process
154. Repeatable Processes are developed through repeated observation, abstraction, and documentation
155. The Repeatable Process for the Military Decision Making Process (MDMP) as depicted in Doctrine
- This is a highly refined and documented process
- All Leaders, Planners, and Decision-Makers are well versed in the MDMP
156.
157. Guide to a Repeatable Process for Ontology Creation (v 0.1)
158. Horizontal Integration of Intelligence
159. Horizontal Integration of Information
Joint Publication 6-02, Under Development 2012
A Joint Publication for H.I. needs to be developed
160. Scoping the Domain
161. Scope the Domain
- 1.1 Subject Matter Expert (SME) Interaction
- 1.2 Identify Authoritative References
- 1.3 Survey Authoritative References
- 1.4 Define the Domain
- 1.5 Describe the Domain
- 1.6 Devise Metrics
162. Authoritative References (http://www.dtic.mil/doctrine/)
- Merriam-Webster's Collegiate Dictionary
- Joint Publication 1-02 DoD Dictionary of Military and Related Terms
- Joint Publication 3-0 Joint Operations
- Joint Publication 3-13 Joint Command and Control
- Joint Publication 3-24 Counterinsurgency
- Joint Publication 3-57 Civil-Military Operations
- JP 3-10, Joint Security Operations in Theater
- Joint Publication 3-16 Multinational Operations
- Joint Publication 5-0 Joint Operations Planning
163.
164.
Metrics: 20 Questions for C2 Related Domains
- What is the baseline definition/description for this domain?
- What are the primary activities involved in this domain?
- What are the subordinate activities in this domain?
- Who participates in these activities?
- What environment do these activities take place in?
- What are the intended outcomes of these activities?
- What are the intended products of these activities?
- What information is consumed in these activities?
- Who consumes this information?
- What information is produced by these activities?
- Where is this information found?
- Where is this information stored?
- What organizations are involved in this domain?
- How are these organizations related?
- What do these outputs contribute to?
- What is the relation between agents and organizations in this domain?
- What are the ultimate goals for the domain?
- What are the subordinate goals for the domain?
- What larger enterprise/objective does this domain contribute to?
- What happens if these activities fail to produce their intended outcomes?
165. Create Initial Lexicon
166. Building the Domain Lexicon
Joint Operation Planning Process: Planning activities associated with joint military operations by combatant commanders and their subordinate joint force commanders in response to contingencies and crises. Joint operation planning includes planning for the mobilization, deployment, employment, sustainment, redeployment, and demobilization of joint forces. (Joint Publication 3-0 Joint Operations)
Lexicon terms drawn from this definition: Planning Activity; Joint Military Operation; Combatant Commander; Subordinate Joint Force Commander; Contingency Event; Crisis Event; Mobilization Event; Deployment Event; Employment Event; Sustainment Event; Redeployment Event; Demobilization; Joint Force; Planning Process; Response
167.
Definitions
Always make Two-Part Definitions which include:
- Reference to Parent Class (Genus) &
- Differentia (Species)
Example:
Dog: An Animal [parent class] which is a member of the genus Canis, probably descended from the common wolf, that has been domesticated by man since prehistoric times; occurs in many breeds [differentia from all other animals] (Merriam-Webster's Collegiate Dictionary)
168. Definitions
Original Definition:
- Attack Geography: A description of the geography surrounding the IED incident, such as road segment, buildings, foliage, etc. Understanding the geography indicates enemy use of landscape to channel tactical response, slow friendly movement, and prevent pursuit of enemy forces.
Improved Definition(s):
- IED Attack Geography: A Geospatial Region where some IED Incident takes place.
- IED Attack Geography Description: A Description of the physical features of some Geospatial Region where an IED Incident takes place.
169. Example 2: Method of Emplacement
Original Definition:
- Method of Emplacement: A description of where the device was delivered, used, or employed.
Improved Definition(s):
- Method of IED Emplacement: A systematic procedure used in the positioning of an Improvised Explosive Device.
- Method of IED Emplacement Description: A description of the systematic procedure used in the positioning of an Improvised Explosive Device.
170. Example 3: Method of Employment
Original Definition:
- Method of Employment: A description of where the device was delivered, used, or employed.
Improved Definition(s):
- Method of IED Employment: A systematic procedure used in the delivery of an Improvised Explosive Device.
- Method of IED Employment Description: A description of the systematic procedure used in the delivery of an Improvised Explosive Device.
171.
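The two-part (genus plus differentia) rule from the Definitions slides can be captured in a small data structure so that each definition names its parent class explicitly. This is a sketch under stated assumptions: the Definition class, its render method and the check function are hypothetical helpers, not part of any tool named in the slides, and the three checks are simple heuristics chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:
    term: str
    genus: str        # parent class the term falls under
    differentia: str  # what sets the term apart within the parent class

    def render(self) -> str:
        """Format the definition as 'Term: A <genus> <differentia>.'"""
        article = "An" if self.genus[0].lower() in "aeiou" else "A"
        return f"{self.term}: {article} {self.genus} {self.differentia}."


def check(definition: Definition, known_terms: set) -> list:
    """Return a list of problems found by three illustrative heuristics."""
    problems = []
    if definition.genus not in known_terms:
        problems.append("parent class is not in the lexicon")
    if not definition.differentia.strip():
        problems.append("differentia is missing")
    if definition.term.lower() in definition.differentia.lower():
        problems.append("definition is circular")
    return problems


lexicon = {"Geospatial Region", "Description"}
attack_geography = Definition(
    term="IED Attack Geography",
    genus="Geospatial Region",
    differentia="where some IED Incident takes place",
)
```

Here `attack_geography.render()` reproduces the improved definition from the slide, and `check(attack_geography, lexicon)` reports no problems, whereas a definition with an unknown parent class and an empty differentia is flagged twice.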
Doctrinal Definitions
intelligence estimate: The appraisal, expressed in writing or orally, of available intelligence relating to a specific situation or condition with a view to determining the courses of action open to the enemy or adversary and the order of probability of their adoption. (JP 2-0)
172.
173. Create Initial Ontology
174. [Figure: upper-level categories with sample subclasses, as extracted]
Upper-level categories: Specifically Dependent Continuant; Agent; Generically Dependent Continuant; Independent Continuant (Object Aggregate/Object/Site); Process
Natural Process
- Biological Process
- Weather Process
- Geological Process
Capability
- Sustained Rate of Fire
- Lethal Capability
- Skill
Role
- Geospatial Role
-- Area of Interest
-- Area of Operations
- Personal Role
-- Key Leader Role
-- Insurgent Role
- Artifact Role
-- IED Component
- Target Role
- Equipment Role
- Cargo Role
Act
- Political Act
- Violent Act
- Planning Act
Quality: Physical Characteristic
- Eye Color
- Height
- Weight
Physical Artifact
- Vehicle
-- Tractor
- Weapon
-- Rifle
-- Improvised Explosive Device
System
- Weapon System
- C2 System
-- Targeting System
- Intelligence System
Site
- Geospatial Region
Organization
- Military Organization
- Political Organization
Organism
- Human Being
- Non-Human Animal
- Bacteria
Information Artifact
- Directive
-- Plan
-- Prescription
-- Guidance
- Description
-- Narrative
-- Comment
-- Remark
- Designation
-- Address
-- Grid Coordinate
-- Name
-- Code
- Measurement
-- Altitude
-- Height
-- Weight
-- Distance
175. Ontology Review
Revisions Process with SMEs; SME Feedback
Military Ontology is the task of establishing and representing the salient types of entities and relations in a given domain (Battlespace)
176. Publish and Share
177. Intelligence Ontology Suite
No.
Ontology Prefix   Ontology Full Name   List of Terms
1   AO      Agent Ontology
2   ARTO    Artifact Ontology
3   BFO     Basic Formal Ontology
4   EVO     Event Ontology
5   GEO     Geospatial Feature Ontology
6   IIAO    Intelligence Information Artifact Ontology
7   LOCO    Location Reference Ontology
8   TARGO   Target Ontology
Site navigation: Home; Introduction; PMESII-PT; ASCOPE; References; Links
Welcome to the I2WD Ontology Suite!
I2WD Ontology Suite: A web server aimed to facilitate ontology visualization, query, and development for the Intelligence Community. I2WD Ontology Suite provides a user-friendly web interface for displaying the details and hierarchy of a specific ontology term.
178. I2WD Ontology Suite
179. Practical Introduction to Ontology Building: Some Samples
Barry Smith
180. A simple battlefield ontology (from W. Ceusters)
New York State Center of Excellence in Bioinformatics & Life Sciences, R T U
[Figure: ontology diagram with types object; building; person; vehicle; tank; car; soldier; POW; weapon; mortar; submachine gun; corpse; spatial region. Relations: located-in; transforms-into]
181. Ontology used for annotating a situation
[Figure: two layers, Ontology and Situation; ontology diagram as on slide 180]
182. Referent Tracking (RT) used for representing a situation
[Figure: three layers, Ontology, Situational model and Situation; referents #1, #2, #3, #4, #10 and #5, #6, #7, #8 linked by five "uses" relations; ontology diagram as on slide 180]
183.
Referent Tracking preserves identity
- use the same weapon
- use the same type of weapon
[Figure: layers Ontology, Situational model and Situation; referents #2, #3, #4, #10 and #6, #7, #8 linked by four "uses" relations; ontology diagram as on slide 180]
184. Specific relations versus generic relations
faithful
[Figure: layers Ontology, Situational model and Situation; referents #1, #2, #3, #4, #10 and #5, #6, #7, #8 linked by five "uses" relations; ontology diagram as on slide 180]
185. Specific relations versus generic relations
NOT faithful
[Figure: layers Ontology, Situational model and Situation; a single "uses" relation drawn at the ontology level; ontology diagram as on slide 180]
186. Representation of times when relations hold
[Figure: referent #3, a soldier, is a private at t1, a sergeant at t2 and a sergeant-major at t3]
187. [Figure: layers Ontology, Situational model and Situation; referents #1, #2 and #5, #6 linked by two "uses" relations, each holding at t1; ontology diagram as on slide 180]
188.
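The time-indexed relations on the Referent Tracking slides (referent #3 being a private at t1, a sergeant at t2 and a sergeant-major at t3) can be sketched as assertions that each carry the time at which they hold. This is a minimal illustration, not the Referent Tracking system itself: the TimedAssertion class and holds_at function are hypothetical, times are simplified to integers, and the pairing of #1 with #5 and #2 with #6 under "uses" is an assumption read off the flattened figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedAssertion:
    subject: str   # referent identifier such as "#3"
    relation: str
    obj: str       # a type (e.g. "sergeant") or another referent (e.g. "#5")
    time: int      # simplified time index: 1 stands for t1, 2 for t2, ...


ASSERTIONS = [
    TimedAssertion("#3", "instance_of", "private", 1),
    TimedAssertion("#3", "instance_of", "sergeant", 2),
    TimedAssertion("#3", "instance_of", "sergeant-major", 3),
    # Assumed pairing of referents for the two "uses at t1" links.
    TimedAssertion("#1", "uses", "#5", 1),
    TimedAssertion("#2", "uses", "#6", 1),
]


def holds_at(subject: str, relation: str, time: int) -> list:
    """Objects related to the subject by the relation at the given time."""
    return [
        a.obj
        for a in ASSERTIONS
        if a.subject == subject and a.relation == relation and a.time == time
    ]
```

Keeping the referent identifier fixed while the asserted type changes over time is what lets the same individual be tracked across t1, t2 and t3 in this sketch.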
[Figure: layers Ontology, Situational model and Situation; referents #1, #2 and #5; a "uses" relation holding at t2, after the death of #1 at t2; ontology diagram as on slide 180]
189. RT deals with conflicting representations by keeping track of sources
[Figure: layers Ontology, Situational model and Situation; referents #1, #2 and #5, #6; "uses" at t1, "uses" at t1, "uses" at t2; corpse at t3; a source asserts at t2]
190. RT deals with conflicting representations by keeping track of sources
[Figure: as on slide 189, except that the source asserts at t4]
191. Advantages of Referent Tracking
- Preserves identity
- Allows asserting relationships amongst entities that are not generically true
- Appropriate representation of the time when relationships hold
- Deals with conflicting representations by keeping track of sources
- Mimics the structure of reality
192. Towards a Video Ontology
- Video as information artifact
- Process by which video is created
- Video content for tagging (man shooting rifle)
193. How a video ontology can help the process of intelligence analysis
i) The video file
ii) The content of (what is represented by) the video file
iii) The process of inserting text into the video that could later be queried
iv) The video's role in the intelligence/decision making process
VideoFile has_role IntelligenceProduct
194. Example
An FMV PO focuses the CSP on some suspicious activity. The PO zooms in, and identifies three tracked vehicles moving through a NAI. He immediately tags a 15-second clip with the text "3 x tracked vehicles pending ID". That clip is also automatically tagged with MGRS and DTG.
After reporting this EEI to the supported BCT, an imagery analyst immediately conducts a search for "tracked vehicles" within the time window reported. The motion imagery clip in question pops up, and the analyst adds/modifies the text to "3 x likely Ukrainian T-72 moving from north to south through NAI 9 at low speed; tank commanders had hatch open and did not appear to anticipate contact."
195. Terms
VideoFile; ActOfTagging; ActOfQuerying; TextString; GridCoordinatesDatum (MGRS); DateTimeGroup; EssentialElementOfInformationRole (EEI); ImageryAnalysis; ActOfSearching; ActOfTextModification; TemporalInterval (TimeWindow); TankCommanderRole; UkrainianTank72; ManeuverEvent; EnemyContactEvent; BrigadeCombatTeam (BCT); TrackedVehicle; VehicleCount; NamedAreaOfInterest (NAI)
196. Terms
VideoFile; ActOfTagging; ActOfQuerying; TextString; GridCoordinatesDatum (MGRS); DateTimeGroup; EssentialElementOfInformationRole (EEI); ImageryAnalysis; ActOfSearching; ActOfTextModification; TemporalInterval (TimeWindow)
197.
198. Terms
VideoFile; ActOfTagging; ActOfQuerying; TextString; GridCoordinatesDatum (MGRS); DateTimeGroup; EssentialElementOfInformationRole (EEI); ImageryAnalysis; ActOfSearching; ActOfTextModification; TemporalInterval (TimeWindow)
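The tagging and search steps in the video example can be sketched as follows. This is an illustration under stated assumptions rather than a description of any fielded system: the ClipTag class, the search functions, the file name, the MGRS placeholder and the timestamps are all hypothetical, and attaching a set of ontology terms to each clip is a design choice made for this sketch using the TrackedVehicle label from the Terms slide.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class ClipTag:
    video_file: str
    text: str          # free text entered by the operator or analyst
    mgrs: str          # grid reference, tagged automatically in the example
    start: datetime    # start of the tagged clip (stands in for the DTG)
    end: datetime
    terms: set = field(default_factory=set)  # ontology terms, e.g. "TrackedVehicle"


def overlaps(tag, window_start, window_end):
    """True when the clip overlaps the reported time window."""
    return tag.start <= window_end and tag.end >= window_start


def search_text(tags, phrase, window_start, window_end):
    """Free-text search restricted to a time window."""
    needle = phrase.lower()
    return [
        t for t in tags
        if needle in t.text.lower() and overlaps(t, window_start, window_end)
    ]


def search_term(tags, term, window_start, window_end):
    """Search by ontology term restricted to a time window."""
    return [t for t in tags if term in t.terms and overlaps(t, window_start, window_end)]


# Hypothetical clip: file name, grid placeholder and times are made up.
clip = ClipTag(
    video_file="clip_0001.mpg",
    text="3 x tracked vehicles pending ID",
    mgrs="<MGRS grid reference>",
    start=datetime(2012, 1, 1, 15, 0, 0),
    end=datetime(2012, 1, 1, 15, 0, 15),   # a 15-second clip
    terms={"TrackedVehicle"},
)
window = (datetime(2012, 1, 1, 14, 30), datetime(2012, 1, 1, 15, 30))
```

In this sketch, `search_text([clip], "tracked vehicles", *window)` finds the clip while it carries the operator's original text. Once the analyst replaces the text with the T-72 description, the same free-text search no longer matches, but `search_term([clip], "TrackedVehicle", *window)` still does, which is one way application-neutral labels can support later queries.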