Build DoD Vocabularies in the Cloud. 3rd Annual SOA & Semantic Technology Symposium: Interoperable Business Operations Through Shared Understanding. Dr. Brand Niemann, Director and Senior Data Scientist, Semantic Community. July 13th Competency Track - 11:55am-12:30pm. July 13-14, 2011, Waterford, Springfield, Virginia.


1

Build DoD Vocabularies in the Cloud

3rd Annual SOA & Semantic Technology Symposium: Interoperable Business Operations Through Shared Understanding

Dr. Brand Niemann, Director and Senior Data Scientist, Semantic Community
July 13th Competency Track - 11:55am-12:30pm

July 13-14, 2011
Waterford, Springfield, Virginia

2

Semantic Community

• So far in 2011, Semantic Community has built Knowledge-Centric Systems in the Cloud for:
  – Data Science and Journalism:
    • Data.gov and Federal Computer Week, Ongoing Since January 2011.
    • 1105 Government Information Group/FOSE Institute’s KM 2011 Conference, May 4, 2011, and Geospatial Summit, September 13, 2011.
    • AOL Government “Show Me The Data” Due to Launch July 11, 2011.
  – The Open Group’s TOGAF and UDEF:
    • The Open Group San Diego Conference, February 7, 2011.
    • The Open Group London Conference, May 11, 2011.
  – Semantic Interoperability:
    • Keynote at SEMIC.EU Annual Conference, May 18, 2011.
    • Conference Presentation at SemTech 2011, June 7, 2011.
    • Federal Data Architecture Subcommittee, June 9, 2011.
  – “Big Health Data”:
    • One of the Top Submissions for HealthyPeople.gov Challenge, March 14, 2011.
    • Finalist in the Health Data Initiative Forum, June 9, 2011.
  – DoD:
    • RFI for Data Analysis and Collaboration Tool to Support the DoD OIG, June 28, 2011.
    • 3rd Annual DoD SOA and Semantic Technology Symposium, July 13, 2011.

• This presentation will show examples from simple (e.g. Air Force One Source) to complex (DOD Office of the Inspector General) DoD Vocabularies.

3

Take-Home Message

• Competency: Creating Competency for Shared Understanding and Interoperable Business Operations.
  – This track focuses on the development of knowledge and skills for SOA & Semantic projects, the handling of organizational change management, and the governance needed for and associated with such projects and initiatives.
• Semantic Community Knowledge-Centric Systems:
  – We take the data (and metadata) directly to information modeling and mashup tools where we then can apply stronger semantic analytics tools. We keep the data (structured and unstructured) and metadata (ontology) together in the knowledgebase in cloud computing tools.
  – We use effective standards-based approaches for real-world case studies. This presentation could also be in the other two tracks!

4

Abstract

• Several DoD vocabularies have been harvested into the cloud computing tools used by the author to produce data science products. Those are Air Force OneSource and the DoD Common Vocabulary with two vocabularies, one for the HR community and one for UCORE-SL.
• The purpose of the Semantic Community’s data science products is to show when and where it is practical to insert semantic technologies in support of cross-domain process and analysis, and the value and ease of using other, more mature technologies for certain tasks. The practical boundaries we have found in supporting data fusion and analysis for information sharing, and the point in the process at which applying semantic technologies yields the most value, are discussed.

Note: Credit due to Robert Damashek for suggesting this topic to me.

5

Overview

• 1. Introductions
• 2. Background
• 3. Semantic Community Apps
• 4. DoD Common Vocabulary
• 5. Data Analysis and Collaboration Tool to Support the DoD OIG
• 6. Questions and Answers
• 7. Supplemental Slides
  – Recreating Other People’s App the Semantic Community Way!

6

2. Background

• My Experience with “Handling of organizational change management, and the governance needed for and associated with such projects and initiatives”:
  – I tried to change EPA from the inside (1980-1996).
  – I served a detail to the Department of Interior where I was able to start a new organization (1997-2001).
  – I tried to change the Federal Government in my Federal CIO Council (2002-2008) Roles.
  – I also tried to change EPA from the outside at the same time.
  – I am now enjoying being free to do what I think is best to support the Semantic Web/Linked Open Data and Semantic Technologies, but in an easier and simpler way!

7

2. Background

• Federal Semantic Interoperability Community of Practice (SICoP) 2003-2008:
  – Five Annual Conferences and Four Special Conferences.
• Federal SOA Community of Practice (SOA CoP) 2006-Present:
  – Eleven Semi-Annual Conferences. 12th: October 11th.
• Only Special Recognition for Outstanding Contributions to Both SICoP and SOA CoP:
  – Arun Majumdar, Cutter Consortium/VivoMind Intelligence for Operationalizing SOA-Lessons Learned (Take Home Message: Multi-Level Model-Driven Architecture & First Order Logic).
• Now from the pilots at these conferences come powerful new semantic analytics tools like VivoMind's Textrium and PrologIKS and Semantic Insights Research Assistant (SIRA) that can be used to mine content to produce data science products that support data journalism!

2. Background

Program | Champion | CoP Leader | Standards
eForms for eGov | Mark Forman, OMB | Rick Rogers, Fenestra Technologies | eGrants XML Schema and Web Services
Federal SOA CoP | Roy Maybury, DoD | Cory Casanave, Model Driven Solutions | Web Services and Open Group MDA and SoAML
Federal Semantic Interoperability CoP | David Wennergren, Navy CIO | Rick Morris, US Army, and Mills Davis, Project10X | W3C Semantic Web in Semantic Technologies
Cloud Computing Desktop for OGD & Data.gov/semantic | Vivek Kundra, Federal CIO | Brand Niemann, US EPA and Semantic Community | Web Oriented Architecture (MindTouch)
Gov 2.0 Platform for Data Science Products and 5 Stars of LOD | Aneesh Chopra, Federal CTO; Tim Berners-Lee, W3C Director | Brand Niemann, US EPA and Semantic Community | Open and Quality Data Visualizations (Spotfire)

8

My Experience with “development of knowledge and skills for SOA & Semantic projects”.

9

3. Semantic Community Apps

General | Web Site | Best Content - Centralized | Best Content - Distributed
US Federal Government (1) | Community Sandbox (2) | Annual Statistical Abstract (3) and EPA Report on the Environment (4) | FedStats.net (5)
TOGAF (6) | EA Principals, Inc. (7) | Training Materials (8) | Ecosystem of Frameworks (9)
SEMIC.EU (10) | Web Site (11) | EuroStats (12) and European Environment State and Outlook (13) | Global Data Catalog and Data Services (14)

Key: See next slide for Key.

Source: http://semanticommunity.info/Build_SEMIC.EU_in_the_Cloud

Some Best Practice Examples of Semantic Interoperability Interfaces*

*The term "interoperable interface" comes from the recent Report to the President and Congress "Designing a Digital Future: Federally Funded Research and Development in Networking and Information Technology", Executive Office of the President and the President's Council of Advisors on Science and Technology, December 2010 (see excerpts in the wiki).

10

4. DoD Common Vocabulary

• The mission of the Enterprise Information Web (EIW) project is to create an extensible analytical capability built on top of a federation of information systems across the Department of Defense and provide information visibility and access:
  – Archives: All wikis and vocabularies relevant to the HR EIW project.
  – Business Process Area: Semantic models for the HRM Domain.
  – CHRIS Reference Ontology: ?.
  – Retirements and Separations: DIMHRS Ontology.
  – HR Analytics: Queries the HR Domain Ontology.
  – HR Domain Ontology: Central Knowledgebase for Concepts and Terminology within the DoD HR Domain.
  – Knowledge Center: EIW Training Materials.
  – ODSE Sample Database: Multiple Vocabularies.
  – Ontology Repository: An important contribution in the overall goal of data integration across the HR domain.

https://www.commonvocabulary.army.mil/ui/groups/HR_EIW

Sample Content Included in Next Section

11

5. Data Analysis and Collaboration Tool to Support the DoD OIG

• The mission of the Department of Defense, Office of the Inspector General (DOD OIG) is to promote integrity, accountability, and improvement of Department of Defense personnel, programs, and operations to support the Department’s mission and serve the public interest. Each goal of the DOD OIG requires personnel to perform analysis using structured and unstructured data, both government and non-government sources, and in a wide variety of file formats. Personnel and data sources are spread throughout the globe, requiring teams to acquire data in a remote access storage system for use. Personnel access analysis tools remotely using laptops running Windows XP (SP3) with dual core processors, 3GB RAM, and 50GB memory.

• The DOD OIG has recognized a need to improve the efficiency and effectiveness of how data is ingested, shared and analyzed across the organization. As well as the need to explore advanced analysis capabilities to better assist personnel in identifying fraud, waste, and abuse in the Department.

http://semanticommunity.info/Build_DoD_Vocabularies_in_the_Cloud/Proposal_Demo#BACKGROUND

Note: Bolding is mine.

12

5. Data Analysis and Collaboration Tool to Support the DoD OIG

• Semantic Community Workflow:
  – 5.1 Information Architecture of Public Web Pages in Spreadsheets as Linked Open Data.
  – 5.2 Public Reports (Web and PDF) in Wiki as Linked Open Data.
  – 5.3 Desktop and Network Databases in Wiki and Spreadsheets in Linked Open Data Format.
  – 5.4 Spreadsheets in Spotfire as Linked Open Data.
  – 5.5 Spreadsheets in Semantic Insights Research Assistant for Semantic Search, Report Writing, and Ontology Development.
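
To make "spreadsheets as Linked Open Data" concrete, here is a minimal sketch of one way a spreadsheet export could be turned into RDF triples. It is not the workflow used in this presentation, which relies on MindTouch, Spotfire, and SIRA. The base URI, the column names, and the sample row are all hypothetical, and the output is plain N-Triples text built with the Python standard library only.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical base URI for illustration; not taken from the presentation.
BASE = "http://example.org/dodoig/"


def literal(value: str) -> str:
    """Escape a cell value as an N-Triples string literal."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def rows_to_ntriples(csv_text: str, key_column: str) -> list[str]:
    """Emit one subject per row and one triple per non-empty cell."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The key column names the row; it becomes the subject URI.
        subject = f"<{BASE}resource/{quote(row[key_column], safe='')}>"
        for column, cell in row.items():
            # Skip the key itself, empty cells, and overflow cells with no header.
            if column is None or column == key_column or not cell:
                continue
            predicate = f"<{BASE}property/{quote(column, safe='')}>"
            triples.append(f"{subject} {predicate} {literal(cell)} .")
    return triples


# Illustrative rows only; not data from the DoD OIG workbook.
SAMPLE = "Page,Title,URL\npress-room,Press Room,http://example.org/press\n"

if __name__ == "__main__":
    for line in rows_to_ntriples(SAMPLE, key_column="Page"):
        print(line)
```

Each column header becomes a property and each row becomes a resource, so the spreadsheet's information architecture carries over directly into the triples. Mapping the headers to a shared vocabulary instead of ad hoc property URIs would be the next step toward interoperability.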

13

5. Data Analysis and Collaboration Tool to Support the DoD OIG

http://semanticommunity.info/@api/deki/files/12769/=DoDOIG.xlsx

5.1 Information Architecture of Public Web Pages in Spreadsheets as Linked Open Data.

Tabs (12): Cover Page; Press Room; Publications 2011; DoD IG; Appendices A, F, & I; Report to Congress; Statistical Highlights; Table 3.1 & Figures 3.1 & 3.2

14

5. Data Analysis and Collaboration Tool to Support the DoD OIG

• MindTouch makes the world's most respected social knowledge base. They power purpose-built help 2.0 communities that connect companies with their customers. Millions use their software every day.

• Many of the world's most respected brands rely on MindTouch including NASA, SAIC, Booz Allen, Microsoft, Cisco, Washington Post, Viacom, the New York Times, AXA, Timberland and HCA.

• Innovative companies like RightScale, ExactTarget and Mozilla have standardized on MindTouch for their documentation strategy.

• The open source .NET Web Oriented Architecture Framework (WOAF) is redefining how enterprise software is built.

• MindTouch is a recognized expert in both open source and Enterprise 2.0 technologies.

• The MindTouch Productivity Tools bridge Microsoft Office and your desktop for all Windows applications. Have your users continue to work with the applications they're familiar with, instead of forcing them to learn a new tool with our document management solution. With the MindTouch Desktop Suite, you'll save time and money by not having to train users on a new system.

http://www.mindtouch.com/

15

5. Data Analysis and Collaboration Tool to Support the DoD OIG

http://semanticommunity.info/Build_DoD_Vocabularies_in_the_Cloud/2011_DOD_IG_Semiannual_Report_to_Congress

5.2 Public Reports (Web and PDF) in Wiki as Linked Open Data.

16

5. Data Analysis and Collaboration Tool to Support the DoD OIG

5.3 Desktop and Network Databases in Wiki and Spreadsheets in Linked Open Data Format.

http://www.mindtouch.com/add-ons/desktop_suite?product-refer=desktop-suite

17

5. Data Analysis and Collaboration Tool to Support the DoD OIG

PC Desktop Spotfire

5.4 Spreadsheets in Spotfire as Linked Open Data.

18

5. Data Analysis and Collaboration Tool to Support the DoD OIG

http://www.semanticinsights.com/company/SI%20Fact%20Sheet.pdf

SIRA can be used to find similarity between current and past events that are expressed or hinted at in text. SIRA can be used to find relationships of people, places, things and activities that may be expressed or hinted at in text.

19

6. Questions and Answers

• Sound Bite: Bring the data and the metadata back together and do the data science first to accomplish a business need and lay a solid foundation for integration and application of semantic technologies.

• Questions about the steps I followed?
• Questions about the results I produced?
• See Supplemental Slides for the Data Science Approach to Semantic Web/Technology Pilots.

20

7. Supplemental Slides

• 7.1 Semantic Technology Training: Building Knowledge-Centric Systems
  – KM 2011
  – SemTech 2011
• 7.2 W3C Government Linked Data Working Group
  – Clinical Quality Linked Data on Health.data.gov
  – Build Clinical Quality Linked Data on Health.data.gov in the Cloud
  – Hospital Compare Downloadable Database Example of “5 Star Government Data”
• 7.3 Library of Congress Project Recollection and Digital Preservation Initiative
• 7.4 Elsevier/Tetherless World Health and Life Sciences Hackathon (27-28 June 2011)
  – Build TWC in the Cloud
  – Build NCI CLASS in the Cloud
  – Build the NYC Data Mine Health in the Cloud
  – Build SciVerse Apps in the Cloud (IN PROCESS)
• 7.5 Be Informed (IN PROCESS)