Using Semantic Technologies in Wikis, Desktop Authoring, & Search to Improve Information Sharing and
Knowledge Management
Leveraging Information Asset Management
DAMA-NCR // Wilshire Symposium 2008
November 3-7, 2008
Brand Niemann, U.S. EPA
Mills Davis, Project10X
2
Brand Niemann
Mills [email protected]
Brand Niemann is a senior enterprise architect at the U.S. Environmental Protection Agency in Washington, DC, where he works on Web 2.0/3.0. With Mills Davis, he co-leads Semantic Communities dot Net, which provides Web 2.0 and 3.0 infrastructure to a number of communities of practice and hosts a series of workshops and webinars introducing semantic technologies to government in the context of specific problems.
Mills Davis is founder and managing director of Project10X, a Washington, DC-based research consultancy specializing in next-wave semantic technologies, solutions, and business models. The firm's clients include technology manufacturers, Global 2000 corporations, government agencies, and Web 3.0 start-ups.
3
Topics
• Web 2.0 Wikis
• Semantic content tools — desktop apps, databases, web pages, and lenses
• Semantic search — Microformats, SPARQL, natural language & question answering
• Current directions — Combining Web 2.0 wikis, semantic content tools, and semantic search
• Final thoughts
Web 2.0 Wikis
Collaboration, Information Sharing, and Knowledge Management
5
Web 2.0 Wikis
• Web 2.0 wikis allow you to:
– Provide a more user-friendly interface than earlier wikis (like word processing).
– Provide a more powerful platform for business applications (like mashups).
– Support a form of Service-Oriented Architecture called Web-Oriented SOA (a substyle of SOA based on the architecture of the World Wide Web).
– Add structure to content that prepares it for use with semantic technologies (XML, RSS feeds, and RDF-Dapper.net).
– Provide an API through which content and functionality can be accessed (see next slide).
6
Web 2.0 Wikis
See http://wiki.mindtouch.com/MindTouch_Deki/Features/Architecture
7
Web 2.0 Wikis:
Mindtouch Deki Wiki Features
• Content Creation:
– An editing experience similar to what you would expect from modern word processor applications.
• Content Management:
– Hierarchical page organization: Organize content in an intuitive hierarchical manner.
• Search:
– Advanced search: Users can view all results or only specific subsets of the result set.
• Attachments:
– Users can attach any file or image to any page.
• Versioning and Reversion:
– Page versioning: Every page retains a complete history of changes.
8
Web 2.0 Wikis:
Mindtouch Deki Wiki Features
• Access Control:
– Restrict page editing, Restrict page viewing, Restrict hierarchies
• Alerts and Notifications:
– Watch list Feeds: Every user can create a list of pages to watch.
• Application Administration:
– Site administration: Quickly and easily manage multiple users and users' status
• Miscellaneous:
– Adherence to standards: All content is stored in XML.
9
Web 2.0 Wikis:
Mindtouch Deki Wiki Tutorial
Overview (14 steps):
1: Decide on Name
2: Register Name
3: Login
4: Set Preferences
5: Control Panel
6: Design Home Page
7: Create Subtopics
8: Repurpose Web Content Into Wiki
9: Attach Files
10: Insert Images and Links
11: Create Wiki Log (Blog)
12: Set Security
13: Monitor Users
14: Revise/Reorganize
11
Web 2.0 Wiki Tutorial:
4 Set preferences
12
Web 2.0 Wiki Tutorial:
5 Control panel
http://wilshireconferences.wik.is/Admin:
13
Web 2.0 Wiki Tutorial:
6 Design home page
http://wilshireconferences.wik.is/
14
Web 2.0 Wiki Tutorial:
6 Design home page (cont.)
http://wilshireconferences.wik.is/
15
Web 2.0 Wiki Tutorial:
7 Create subtopics
http://wilshireconferences.wik.is/2008_DAMA-NCR%2f%2fWilshire_Symposium_November_3-7%2c_2008/Anne_Marie_Smith
16
Web 2.0 Wiki Tutorial:
9 Attach files
http://wilshireconferences.wik.is/Enterprise_Information_Management_Conference_June_18-20%2c_2008
17
Web 2.0 Wiki Tutorial:
10 Insert images and links
http://wilshireconferences.wik.is/Enterprise_Information_Management_Conference_June_18-20%2c_2008
18
Web 2.0 Wiki Tutorial:
12 Set security
http://wilshireconferences.wik.is/
19
Web 2.0 Wiki Tutorial:
12 Set security (cont.)
• Public: everybody can view and edit.
• Semi-Public: everybody can view, but only selected users can edit.
• Private: only selected users can view and edit this page.
• Note: Deki Wiki has one of the more advanced permission systems available. Administrators can make wikis public or private, anonymous or not. User groups are also supported. Users can set permissions on entire hierarchies to create private or non-editable workspaces, or on single pages.
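The three access levels above can be sketched as a small permission check. This is an illustrative model only, not Deki Wiki's actual API; the names `Level`, `Page`, `can_view`, and `can_edit` are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Level(Enum):
    PUBLIC = "public"            # everybody can view and edit
    SEMI_PUBLIC = "semi-public"  # everybody can view, selected users can edit
    PRIVATE = "private"          # only selected users can view and edit


@dataclass
class Page:
    title: str
    level: Level = Level.PUBLIC
    selected_users: set = field(default_factory=set)


def can_view(page: Page, user: str) -> bool:
    # Only private pages restrict viewing.
    if page.level is Level.PRIVATE:
        return user in page.selected_users
    return True


def can_edit(page: Page, user: str) -> bool:
    # Only public pages are editable by everybody.
    if page.level is Level.PUBLIC:
        return True
    return user in page.selected_users


page = Page("Speaker notes", Level.SEMI_PUBLIC, {"editor1"})
print(can_view(page, "guest"), can_edit(page, "guest"), can_edit(page, "editor1"))
```

Applying the same check to every page under a parent would approximate the hierarchy-wide permissions mentioned in the note.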
20
Web 2.0 Wiki Tutorial:
13 Monitor users
http://wilshireconferences.wik.is/Admin:Users
21
Web 2.0 Wiki Tutorial:
Search Results
http://wilshireconferences.wik.is/index.php?title=2008_DAMA-NCR%2F%2FWilshire_Symposium_November_3-7%2C_2008/David_Loshin&highlight=%22master+data+management%22
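The search-result URL above packs the page title and the highlighted phrase into percent-encoded query parameters. A minimal sketch using only the Python standard library decodes them; reading `title` as the page path and `highlight` as the search phrase is an interpretation of the URL, not documented wiki behavior.

```python
from urllib.parse import parse_qs, urlsplit

url = (
    "http://wilshireconferences.wik.is/index.php"
    "?title=2008_DAMA-NCR%2F%2FWilshire_Symposium_November_3-7%2C_2008/David_Loshin"
    "&highlight=%22master+data+management%22"
)

# parse_qs decodes %XX escapes and turns "+" into spaces.
params = parse_qs(urlsplit(url).query)
page_title = params["title"][0]
highlight = params["highlight"][0]

print(page_title)
print(highlight)
```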
Semantic Content Tools
Desktop Applications, Content & Data Mashups,
Web Pages & Lenses
23
Semantic Web Publishing
• Data access: SPARQL
• Information organization: OWL, RDFS, SKOS
• Information format: RDF
• Identification: URIs
• Serialization: XML
Making information available without knowing all of the eventual use, reuse, collaboration, or reproduction and presentation of results.
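The layering above can be illustrated with a tiny in-memory model: URIs identify things, triples hold statements about them, and a wildcard pattern match stands in for data access. This is a sketch in plain Python, not a real RDF library, and every `http://example.org/` URI in it is made up.

```python
# Each statement is a (subject, predicate, object) triple; URIs identify things.
EX = "http://example.org/"  # hypothetical namespace

triples = {
    (EX + "talk1", EX + "speaker", EX + "BrandNiemann"),
    (EX + "talk1", EX + "topic", "Semantic wikis"),
    (EX + "talk2", EX + "speaker", EX + "MillsDavis"),
}


def match(pattern, store):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if all(p is None or p == v for p, v in zip(pattern, t))
    )


speakers = match((None, EX + "speaker", None), triples)
for subject, _, obj in speakers:
    print(subject, "has speaker", obj)
```

Because the publisher only states facts, a consumer can later ask questions the publisher never anticipated, for example `match((EX + "talk1", None, None), triples)`.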
24
Semantic Content Tools
Semantic content tools allow you to:
• Work across applications and file formats
• Mash-ups — combine spreadsheets, databases, other information formats
• Work across environments — desktop & webtop
• Present information in different views (using lenses)
25
Semantic Content Tools:
Yahoo! Search Monkey
• Search returns the structure of information (from microformats, RDFa, etc.) as well as content, enabling interesting formatting of results, data services, and applications. Semantics enable more.
(Figure label: Before Monkey)
Source: Neil Crosby, Yahoo!
26
Semantic Content Tools:
Cambridge Semantics Features
• Using RDF/OWL with desktop and webtop tools to:
– Combine spreadsheets
– Combine databases
– Combine spreadsheets and databases
– Combine documents, presentations, and spreadsheets (e.g., MS Office), and database tables
– Work with linked information and data interchangeably on the desktop or the web
• Presenting and editing resulting mash-ups in different ways using lenses.
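The idea of combining a spreadsheet with a database can be sketched with the standard library alone: both sources are turned into (subject, property, value) statements and then merged by subject. This is not how Anzo or any RDF/OWL tool is implemented; the CSV content, the `prices` table, and all values are made up for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical spreadsheet export (CSV) with fuel economy per model.
sheet = io.StringIO("model,mpg\nCivic,30\nTahoe,16\n")

# Hypothetical database table with prices per model.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE prices (model TEXT, price INTEGER)")
db.executemany("INSERT INTO prices VALUES (?, ?)",
               [("Civic", 18000), ("Tahoe", 40000)])

# Convert both sources to (subject, property, value) statements.
triples = []
for row in csv.DictReader(sheet):
    triples.append((row["model"], "mpg", int(row["mpg"])))
for model, price in db.execute("SELECT model, price FROM prices"):
    triples.append((model, "price", price))

# Merge: statements about the same subject come together regardless of source.
merged = {}
for subject, prop, value in triples:
    merged.setdefault(subject, {})[prop] = value

print(merged)
```

A lens, in this picture, would be any function that renders `merged` in a particular way, such as a table or a chart.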
27
Semantic Content Tools:
Cambridge Semantics Demo
• 2008 Semantic Technology Conference Agenda in a Spreadsheet
• Three end points for RDF:
– Web Page
– RDF Cloud Data Cloud
– Mobile Device
http://www.cambridgesemantics.com/anzodemo/conference.html (6 minutes)
28
Semantic Content Tools:
Cambridge Semantics Demo
• EPA Fuel Data (1995-2008 first 100) in a Spreadsheet
• Anzo for Excel Process:
– Add Semantics to Columns
– Faceted Visualization of Data (Table and Graph)
– Each View is a Lens
http://www.cambridgesemantics.com/epafuel/ (3 1/2 minutes)
Semantic Search
Microformats, SPARQL, Natural Language, and
Question Answering
30
Semantic Search
Semantic search allows you to:
• Search information structure and retrieve semantic metadata as well as pages, documents, data, and references.
• Search the web as if it were a database — you can think of SPARQL as “SQL for adults.”
• Discover entities, events, patterns, information of interest, meanings, and knowledge from natural language documents, unstructured web content, pictures and graphics, and data.
• Search concepts and relationships (not just key words) across different types of information and file formats, using pattern reasoning, deep linguistics, and symbolic reasoning based on knowledge models.
• Extract intelligence (i.e., relevant information in context of mission, task, interests, and use) and answer questions.
31
Semantic Search:
Query Languages
• SQL: query language for relational databases. Great for finding data in tabular representations, but can get complex when many tables are involved in a given query.
• XQuery, XPointer, and XPath: query languages for XML data sources. Great for finding data in tree representations, but can get complex when many relationships have to be traversed.
• SPARQL: query language for RDF graphs. A good pattern-matching paradigm, especially when relationships have to be used to answer a query. SPARQL can be translated to SQL/relational algebra, and possibly to XQuery.
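To make the contrast concrete, the same question ("which cars get more than 25 mpg?") is asked below of a table, a tree, and a graph. This is a minimal sketch with made-up data; the graph part is a plain Python triple match that only imitates the idea of a SPARQL graph pattern, and ElementTree supports only a limited XPath subset, so the numeric test is done in Python.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Tabular: SQL over a relational table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE cars (model TEXT, mpg INTEGER)")
db.executemany("INSERT INTO cars VALUES (?, ?)", [("Civic", 30), ("Tahoe", 16)])
sql_result = [m for (m,) in db.execute("SELECT model FROM cars WHERE mpg > 25")]

# Tree: an XPath-style path over XML.
doc = ET.fromstring(
    "<cars><car mpg='30'><model>Civic</model></car>"
    "<car mpg='16'><model>Tahoe</model></car></cars>"
)
xml_result = [
    car.findtext("model")
    for car in doc.findall("./car")
    if int(car.get("mpg")) > 25
]

# Graph: match (subject, predicate, object) triples.
triples = [("Civic", "mpg", 30), ("Tahoe", "mpg", 16)]
graph_result = [s for s, p, o in triples if p == "mpg" and o > 25]

print(sql_result, xml_result, graph_result)
```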
32
Semantic Search:
SPARQL
• SPARQL is the query language of the Semantic Web. It lets us:
• Pull information from structured and semi-structured data.
• Explore data by discovering unknown relationships.
• Query and search an integrated view of disparate data sources
• Glue separate software applications together by transforming data from one vocabulary to another
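The last point, transforming data from one vocabulary to another, can be sketched as a predicate rewrite over triples. In SPARQL itself this kind of transformation is usually written as a CONSTRUCT query; the plain Python below only illustrates the idea. The `hr:` terms and the sample data are made up, while `foaf:name` and `foaf:mbox` are borrowed from the FOAF vocabulary.

```python
# Hypothetical mapping from one application's predicates to another's.
MAPPING = {
    "hr:fullName": "foaf:name",
    "hr:emailAddress": "foaf:mbox",
}


def translate(triples, mapping):
    """Rewrite predicates found in the mapping; leave all others unchanged."""
    return [(s, mapping.get(p, p), o) for s, p, o in triples]


source = [
    ("emp:42", "hr:fullName", "Pat Example"),
    ("emp:42", "hr:emailAddress", "mailto:pat@example.org"),
    ("emp:42", "hr:badge", "A-17"),
]
target = translate(source, MAPPING)
for triple in target:
    print(triple)
```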
33
SPARQL Example
What automobiles get more than 25 miles per gallon, fit within my department’s budget, and can be purchased at a dealer located within 10 miles of one of my employees?
Web dashboard SPARQL query:
SELECT ?automobile
WHERE {
  ?automobile a ex:Car ;
              epa:mpg ?mpg ;
              ex:dealer ?dealer .
  ?employee a ex:Employee ;
            geo:loc ?loc .
  ?dealer geo:loc ?dealerloc .
  FILTER(?mpg > 25 && geo:dist(?loc, ?dealerloc) <= 10)
}
(Diagram: Web dashboard, SPARQL Query Engine, and data sources: Employee Directory; ERP / Budget System; Web with Dealer 1, Dealer 2, and Dealer 3; EPA Fuel Efficiency Spreadsheet)
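What the query's pattern and FILTER compute can be sketched in plain Python over made-up data. Two assumptions are worth flagging: `geo:dist` is not a standard SPARQL function, so great-circle (haversine) distance in miles is assumed here, and the budget condition from the question is omitted because the query as shown does not encode it.

```python
from math import asin, cos, radians, sin, sqrt


def miles_between(a, b):
    """Great-circle (haversine) distance in miles between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 3958.8 * asin(sqrt(h))


# Made-up data standing in for the spreadsheet, dealer sites, and employee directory.
cars = [
    {"model": "Compact", "mpg": 31, "dealer": "Dealer 1"},
    {"model": "Truck", "mpg": 17, "dealer": "Dealer 1"},
    {"model": "Hybrid", "mpg": 45, "dealer": "Dealer 3"},
]
dealer_loc = {"Dealer 1": (38.90, -77.03), "Dealer 3": (40.71, -74.00)}
employee_loc = {"emp1": (38.89, -77.05)}

# Same shape as the query: join cars to dealers and employees, then apply the FILTER.
matches = sorted({
    car["model"]
    for car in cars
    for loc in employee_loc.values()
    if car["mpg"] > 25 and miles_between(loc, dealer_loc[car["dealer"]]) <= 10
})
print(matches)
```

A SPARQL engine would run the equivalent join across the separate sources without the data first being copied into one place.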
34
Semantic Search:
Real World Query of Organizational Data
• Semantic technology can connect large sets of data (for example, Oracle databases and dispersed Excel spreadsheets) fully automatically and then generate an ontology that is queryable through a GUI.
• Semantic solutions can require zero programming to implement, an essential prerequisite for mainstream adoption.
• The example which follows allows users to query across 300+ EPA spreadsheets and other data sets that are linked to each other using semantic web standards.
• The implication is that the vast amount of data held in spreadsheets throughout all industries and organizations can now be managed by mainstream users as a source of critical new information using distributed query integration across multiple spreadsheets.
Source: Brian Donnelly & Brand Niemann, Presentation submission to the 2009 Semantic Technology Conference, June 14-18, 2009.
35
Semantic Search:
Real World Query of Organizational Data
http://www.InSilicoDiscovery.com (more specific link in process)
Value of semantic browsing: With Google, you receive thousands of "links" which you must traverse one at a time. Here we are browsing through sets of related "links" by drilling sideways through the "arrows".
36
Semantic Search:
Natural Language
• Search technologies look for relevant information based on some criteria.
• Full-text search is fast, but delivers poor relevance in the absence of an exact keyword match.
• Statistical search uses frequencies of keywords, which depends on the quality of the training set, and algorithm(s) used to determine relevance.
• Natural language search uses linguistic analysis, rules, and reference knowledge to better disambiguate word senses and meanings of texts.
• Semantic search enhances search by understanding the meaning of concepts, relationships between concepts, and context of the query.
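The difference between exact keyword matching and frequency-based ranking can be seen on three toy documents. This is a deliberately crude sketch: real statistical search relies on trained models and weighting schemes, and the documents and function names here are made up.

```python
docs = {
    "d1": "master data management improves data quality",
    "d2": "managing master records and reference data",
    "d3": "wiki pages for conference proceedings",
}


def exact_match(query, docs):
    """Full-text style: keep only documents containing the exact phrase."""
    return [d for d, text in docs.items() if query in text]


def keyword_score(query, docs):
    """Frequency style: rank documents by how often the query words occur."""
    words = query.split()
    scores = {d: sum(text.split().count(w) for w in words) for d, text in docs.items()}
    return sorted((d for d in scores if scores[d] > 0), key=lambda d: -scores[d])


print(exact_match("master data management", docs))
print(keyword_score("master data management", docs))
```

The exact match misses `d2`, which is about the same subject in different words, while the frequency ranking finds it but has no notion that "managing" and "management" are related. Closing that gap is what the linguistic and semantic approaches aim at.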
37
Semantic Search:
Natural Language
Approach | Definition | Example
Morphological Analysis | Understand word forms | dog, dogs, and dog-catcher are closely related
Parsing (Grammatical Analysis) | Understand the parts of speech | "There are 40 rows in the table" uses rows as a noun, vs. "She rows 5 times a week" uses rows as a verb
Sentence Analysis | Understand how words relate to other words | "Jeffrey Skilling, represented by Attorney Daniel Petrocelli, is married to Rebecca Carter": Rebecca is married to Jeffrey, not Daniel
Semantic Analysis (Disambiguation) | Understand the context of key words | "I used beef broth for my soup stock" uses stock in the context of food, vs. "The company keeps lots of stock on hand" uses stock in the context of inventory
Source: Expert Systems
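The disambiguation example with "stock" can be approximated by counting which sense's cue words appear in the sentence. This is only a toy illustration of the idea; the cue-word lists are made up, and production systems rely on linguistic analysis and reference knowledge rather than simple word overlap.

```python
# Hypothetical cue words for two senses of "stock".
SENSES = {
    "food": {"beef", "broth", "soup", "simmer"},
    "inventory": {"company", "hand", "warehouse", "keeps"},
}


def disambiguate(sentence, senses=SENSES):
    """Pick the sense whose cue words overlap most with the sentence."""
    words = set(sentence.lower().replace(".", "").split())
    return max(senses, key=lambda s: len(words & senses[s]))


print(disambiguate("I used beef broth for my soup stock"))
print(disambiguate("The company keeps lots of stock on hand"))
```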
38
Natural Language Semantic Technology for KM and Competitive Intelligence
Objectives
• Maximize value of intellectual capital.
• Maximum interoperability (document management systems, PLM, shared drives) and security.
• Increase effectiveness of strategic marketing and monitoring of competitors' activities.
Benefits
• Complete organization of the company's information assets.
• Increased knowledge-sharing and cooperation among all users.
• Tangible support for strategic business processes like R&D and Marketing.
Semantic searching and indexing of information from the company intranet, classification, and effective content management enable knowledge sharing and cooperation among all users.
Solution
• Semantic searching and indexing.
• Entity extraction.
• Classification of textual documents.
• Analysis and correlation of data.
• Monitoring of external data flows to distill information of strategic interest.
Opportunity
• Lack of interoperability (document management systems, PLM, shared drives) and security.
• Need to leverage real-time information from external sources to support corporate strategies.
39
Semantic Search:
Question Answering
40
Semantic Search:
Question Answering
Demo: http://217.26.90.202/
Current Directions
Combining Web 2.0 Wikis, Semantic Content Tools, and
Semantic Search
42
Current Directions:
Combining wikis, semantic content tools, and semantic search
• MediaWiki (Wikipedia): read/write
• Semantic MediaWiki (DBpedia): linked data
• SMW+ (Semantic MediaWiki+): desktop import, ontology management, semantic search, semantic apps
• Future semantic wikis: natural language, transemantics, machine learning, multi-agent apps
43
MediaWiki — Wikipedia
44
Semantic MediaWiki — DBpedia
http://semantic-mediawiki.org/wiki/Semantic_MediaWiki
http://dbpedia.org/About
45
Semantic MediaWiki+
• Semantic Toolbar
• Semantic Annotation
• Semantic Forms
• Ontology Browser
Final Thoughts
47
What is Semantic Exchange?
A collaborative industry education initiative about all things Web 3.0 and semantic web.
48
Semantic Exchange
• Our mission is to help public and private sector organizations learn about and seize opportunities presented by the next stage of internet evolution. Semantic Exchange activities include:
• Research — co-publish ground-breaking research on semantic technology, applications, and markets.
• Education — present monthly webinars, briefings, publications, and media articles as well as exhibitions and educational programs at industry conferences and events.
• Smart Innovation Laboratory — public and private sector organizations can gain access to expertise, research, and technologies, and get help to conduct pilot tests to prove out the benefits of semantic solutions.
• SemanticExchange.com — open collaborative portal for industry news, research, and education. It’s part semantic community wiki, part internet magazine, part technology showcase for new capabilities, and part knowledge outfitter where anyone can gain access to both commercial and open source tools, widgets, building blocks, and solution blueprints.
For more information, contact Mills Davis, [email protected]
49
What is Semantic Community?
A collaborative government education initiative about all things Web 3.0 and semantic web.
• The US EPA and the Federal Communities of Practice need information assets on metadata, data governance, master data management, and enterprise information management like the individual Wilshire Conferences provide.
• This web-services platform pilot with a Wiki interface helps the Wilshire Conferences transform their information asset management processes to provide integrated content across multiple conferences that supports Web 3.0 and Semantic Search.
• Each conference's proceedings have been carefully repurposed from the Web to this Web 2.0 Wiki in order to better organize, find, reuse, and search these valuable information assets.
• This sets the stage for semantifying all of the Wiki content (Wiki text, PowerPoint and PDF slides, Excel databases, etc.) using tools from participants in the Semantic Exchange.
For more information, contact Brand Niemann, [email protected]
50
Semantic Community
• Our mission is to enable "semantic interoperability" and "semantic data integration" focused on the government sector through:
– Web 2.0/3.0 Community Infrastructure Sandbox 2008 at Semanticommunity.net - January 7, 2008
– Special Conferences – February 5, 2008
– Participation in Semantic Technology and Semantic Web Conferences – May 18-22, 2008, and October 25-28, 2009
– Workshop/Webinar Series – August 8, September 19, and October 16, 2008, and more to come
– Semantics for SOA and Cloud Computing – April 30-May and September 29-30, 2008, and more to come
– “Semantification of Web 2.0/3.0” – This Webcast
For more information, contact Brand Niemann, [email protected]
51
…
Thank you!