driving global revenue through multilingual knowledge systems · 10-07-2015 driving global revenue...
TRANSCRIPT
@wetzelmichael @coreonapp
Michael Wetzel, Coreon GmbH
10 July 2015, MLKRep Workshop Vienna
Driving Global Revenue through Multilingual Knowledge Systems (MKS)
10-07-2015 Driving Global Revenue through Multilingual Knowledge Systems @wetzelmichael
Agenda
What is an MKS?
Three Business Cases
Challenges and Outlook
LISE Project Research from 2011-2012: Resources Increase, Quality Decreases
"Oh, our classification was built in English. No, I don't know whether there is one in French."
"Searching on our intranet is a pain. I've just searched for documents containing LCD screen: nothing found! ... Should have known that they are all tagged with monitor."
"Manual revision of the keyword lists is not possible any more" (AUP, AT)
Duplicates, inconsistencies, gaps and content coverage problems after merging two resources (Imaging company, UK)
"Difficulty to ensure consistency, in terms of quality, coverage, completeness" (EU Representative)
Divide et Impera!
Today: Unstructured haystack of concepts
Better: Ordered scheme, a taxonomy
Introduce control
Turn information into knowledge
Taxonomy, thesaurus, ontology tools: Good, but not for language needs
Though, very helpful
• Classifications, nomenclatures
• Tagging in CMS
• Semantic Search
• Standards: SKOS, OWL
• Conferences: semantics, KMWorld, SemTechBiz, Wissensmanagementtage
Not for Language
• For trained experts only (!)
• Lexically organised
• Weak in managing synonyms
• Weak in multilingualism
• Not for describing terminology data
Before MKS: Two Parallel Approaches to Inventorise and Leverage Knowledge
Termbases, Glossaries
• Control language
• Focus on translation
• Lack knowledge modelling
• Only searching, no exploring
Taxonomies, Thesauri
• For knowledge structuring
• Lack multilingualism
• Lack language control
Huge, unaddressed potential for cross-lingual data analysis, enterprise search, e-discovery and to facilitate interoperability
[Figure: a termbase excerpt and a taxonomy excerpt, connected by the labels "would boost with data" and "would add structure"]
Termbase excerpt:
… …
mirror base / Spiegelfuß
wing mirror / Außenspiegel
… …
Taxonomy excerpt:
mirror
wing mirror
left wing mirror
…
Unify through a Multidimensional Repository for Knowledge and Language
1: Taxonomy
output devices
  visual output devices
    screen
  audio output devices
    headphones
2: Synonymy
• screen • monitor
3: Multilingualism
• écran
• Bildschirm • Monitor • Display
4: Control
• accepted • rejected
5: Meaning
• visual output device
• Optisches Ausgabegerät
One System, One View, All Languages: Concepts, Relations, Terms
Immediate broader / narrower neighborhood
Concept metadata
Terms and synonyms
Extensive term descriptors
Location in map
Alphabetic, multilingual list
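The building blocks named above (concepts, relations, terms) can be sketched as a small data model. This is only an illustration, not the schema of any actual MKS product: the class names `Term` and `Concept`, the field names, the numeric concept IDs, and the choice of which term is marked rejected are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    text: str
    lang: str                   # language code, e.g. "en", "de", "fr"
    status: str = "accepted"    # usage control: "accepted" or "rejected"


@dataclass
class Concept:
    concept_id: int
    terms: list = field(default_factory=list)    # synonyms in all languages
    broader: list = field(default_factory=list)  # IDs of broader concepts

    def terms_in(self, lang, only_accepted=False):
        """Return the term strings of this concept for one language."""
        return [t.text for t in self.terms
                if t.lang == lang
                and (not only_accepted or t.status == "accepted")]


# Taxonomy: output devices > visual output devices > screen
output_devices = Concept(1, [Term("output devices", "en")])
visual = Concept(2, [Term("visual output devices", "en")], broader=[1])
screen = Concept(3, [
    Term("screen", "en"),
    Term("monitor", "en", status="rejected"),  # assumed status, for illustration
    Term("écran", "fr"),
    Term("Bildschirm", "de"),
    Term("Monitor", "de"),
    Term("Display", "de"),
], broader=[2])

print(screen.terms_in("de"))                      # ['Bildschirm', 'Monitor', 'Display']
print(screen.terms_in("en", only_accepted=True))  # ['screen']
```

Keeping synonyms, languages and usage status on the term, and the hierarchy on the concept, is what lets one record serve both terminology and taxonomy needs.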
Agenda
What is an MKS?
• Fusion of terminology with taxonomy / thesaurus models
• Captures language and knowledge
Three Business Cases
Challenges and Outlook
The Business Case for: Terminology and Data Maintainers
Avoid a loss of your investment: apply systematic terminology work
Dangers of Collecting Terms Ad Hoc
Do you feel uncomfortable adding more and more data without a pause?
Where are the duplicates? Did we translate some records twice?
Can we unify two larger terminology resources?
Visualise and make the above navigable. Avoid noise and redundancies. Achieve clarity and consistency.
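The duplicate question above can be approximated with a simple normalisation pass. The sketch below is a minimal illustration under stated assumptions: the record layout `(id, language, term)`, the function names, and the rule "ignore case, spaces and hyphens" are all invented for this example, and real termbases would need richer linguistic matching.

```python
from collections import defaultdict


def normalise(term):
    """Lower-case the term and drop everything except letters and digits."""
    return "".join(ch for ch in term.casefold() if ch.isalnum())


def find_duplicates(records):
    """Group record IDs whose terms collide after normalisation (per language)."""
    groups = defaultdict(list)
    for record_id, lang, term in records:
        groups[(lang, normalise(term))].append(record_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Hypothetical termbase records: (id, language, term)
records = [
    (1, "en", "wing mirror"),
    (2, "en", "Wing-Mirror"),
    (3, "en", "wingmirror"),
    (4, "de", "Außenspiegel"),
    (5, "en", "mirror base"),
]

print(find_duplicates(records))  # {('en', 'wingmirror'): [1, 2, 3]}
```

A purely string-based check like this finds spelling variants only; duplicates that differ in wording but share a meaning become visible once records are placed in a concept map.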
(Re-)Gain Control through Structure
Yesterday: List of terms
Schule / school
Oberschule / secondary school
Realschule / intermediate school
Gymnasium / grammar school
Grundschule / elementary school
Goal: Visualised Multilingual Concept Map
[Figure: concept map with the nodes Schule / school, Oberschule / secondary school, Gymnasium / grammar school, Realschule / intermediate school, Grundschule / elementary school]
One Example: Quality through Consistency across Records
Lexical approach: Inconsistencies hidden
Concept map: Inconsistencies transparent
Benefits: Large Terminology Collections under Control, thus Valuable
Trust
Protect Investment
Quality
Control
Efficiency
Systematic Approach
The Business Case for: Cross-border Interoperability
Harmonizing Organisations and their Systems
Semantic Interoperability
"We are connecting systems"
Vocabularies only for metadata?
This is rather syntactic interoperability
Problem: Cross-Border Interoperability
Not a translation but a semantic problem:
Is Austria's Matura the same as Germany's Abitur? And its synonyms?
What if a qualification like Realschulabschluß doesn't exist in other countries?
"Agree that we disagree"
Creating a Unified Resource across Languages
UK SE DE
Multiple terminologies are merged into a unified resource.
Linguistic similarity search: maps units that share the same meaning.
Concept Mapping Example: Trademark Classifications
Remote controls for diapositive projectors
Remote controls for projectors
Controls for projectors
Remote controls for slide projectors
Uses advanced linguistic search that applies strict metadata rules to ensure entries are matched by meaning.
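The slides do not describe how the linguistic search works internally. As a stand-in, the sketch below uses a much simpler technique, token-set (Jaccard) similarity after synonym normalisation, to show why the four entries above relate differently to each other. The synonym table, the stop-word list and the function names are assumptions made for this example.

```python
SYNONYMS = {"diapositive": "slide"}   # assumed, hand-made synonym mapping
STOPWORDS = {"for"}                   # assumed stop-word list


def tokens(phrase):
    """Lower-case, drop stop words, and map synonyms to one preferred form."""
    words = phrase.casefold().split()
    return {SYNONYMS.get(w, w) for w in words if w not in STOPWORDS}


def jaccard(a, b):
    """Share of tokens the two phrases have in common (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb)


a = "Remote controls for diapositive projectors"
b = "Remote controls for slide projectors"
c = "Remote controls for projectors"
d = "Controls for projectors"

print(jaccard(a, b))  # 1.0  -> same meaning
print(jaccard(a, c))  # 0.75 -> close, but c is less specific
print(jaccard(a, d))  # 0.5  -> d is less specific still
```

With a strict threshold of 1.0 only the first pair would be merged into one concept; the lower scores hint at broader concepts rather than synonyms, which is the kind of distinction a taxonomy can record.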
A Little Protocol: Processing such Large Language Resources
Start
• 110,000 English terms
• Other languages ranging from 8,000 to 60,000 terms
Clean
• Reduced English to 93,000 terms
Language coverage
• 93,000 terms in all EU languages too expensive without automation
• Therefore: Mining data for translations
• DB grew from 400,000 to 2.1m terms
Conceptual clustering
• Non-essential entries removed: DB shrank to 1.5m terms
• Semi-automatic creation of 50,000 meaning clusters
Taxonomy
• Built for the top layers and concepts linked
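The figures in this protocol can be put into proportion with a few lines of arithmetic. The variable names are chosen for this example; the average cluster size assumes that all 1.5m remaining terms ended up in one of the 50,000 clusters, which the slide does not state.

```python
english_start, english_clean = 110_000, 93_000
db_before, db_mined, db_clustered = 400_000, 2_100_000, 1_500_000
clusters = 50_000

removed_share = (english_start - english_clean) / english_start
growth_factor = db_mined / db_before
pruned_share = (db_mined - db_clustered) / db_mined
avg_terms_per_cluster = db_clustered / clusters  # assumes every term is clustered

print(f"English terms removed by cleaning: {removed_share:.1%}")    # 15.5%
print(f"DB growth through mining: x{growth_factor}")                # x5.25
print(f"Entries removed before clustering: {pruned_share:.1%}")     # 28.6%
print(f"Average terms per meaning cluster: {avg_terms_per_cluster:.0f}")  # 30
```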
Benefits: Cross-border Interoperability in Trademark Registrations
Organizational Benefits
• Seamless and borderless intellectual property registration, opposition, and legal processing, in all bodies of the EU and the member states
Commercial Benefits
• My trademark, as it is described in my native language, can be applied for globally without changes and is subject to the same legislation all over the EU
The Business Case for: Digital Single Market
Cross-lingual semantic search is a prerequisite for the European DSM
The Vision of a Digital Single Market
"Consumers need to be able to buy the best products at the best prices, wherever they are in Europe."
Vice-President Ansip, Dec 2014
Accelerating growth through a connected Europe: Speech at GSMA Mobile 360 conference in Brussels
http://europa.eu/rapid/press-release_SPEECH-14-2420_en.htm
…and the Reality
http://europa.eu/rapid/attachment/IP-15-4653/en/Digital_Single_Market_Factsheet_20150325.pdf
Broken already by a Simple Search
<search string>
"Rasenmäher" (German for "lawn mower")
Language Independent Semantic Search: Start in any Language
Heisswasserkocher (German, "kettle")
bouilloires (French, "kettles")
hervidores (Spanish, "kettles")
Language Independent Semantic Search: Decompose the Query String
Morphological analysis
Heiss|wasser|kocher
bouilloire|s
hervidor|es
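The decomposition shown above can be imitated with a toy analyser. This is a rough sketch, not a real morphological analyser: it splits compounds by longest-prefix lookup in a tiny hand-made lexicon and strips a plural suffix by rule. The lexicon, the suffix table and the function names are assumptions made for this example.

```python
LEXICON = {"heiss", "wasser", "kocher"}              # assumed toy lexicon
PLURAL_SUFFIXES = {"fr": ["s"], "es": ["es", "s"]}   # assumed toy suffix rules


def split_compound(word, lexicon=LEXICON):
    """Segment a word into lexicon entries, or return None if it cannot be done."""
    word = word.casefold()
    if not word:
        return []
    for end in range(len(word), 0, -1):      # try the longest prefix first
        head = word[:end]
        if head in lexicon:
            rest = split_compound(word[end:], lexicon)
            if rest is not None:
                return [head] + rest
    return None


def strip_plural(word, lang):
    """Return (stem, suffix); the suffix is empty if no rule applies."""
    for suffix in PLURAL_SUFFIXES.get(lang, []):
        if word.endswith(suffix) and len(word) > len(suffix):
            return word[:-len(suffix)], suffix
    return word, ""


print("|".join(split_compound("Heisswasserkocher")))  # heiss|wasser|kocher
print("|".join(strip_plural("bouilloires", "fr")))    # bouilloire|s
print("|".join(strip_plural("hervidores", "es")))     # hervidor|es
```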
Language Independent Semantic Search: Iterate via "Normal Language" Synonyms
Morphological analysis
Expand paradigm through standard synonyms
heiss, heiß
Wasser, H2O
Boiler, Kocher, Erhitzer
bouilloire
hervidor
Language Independent Semantic Search: Now Locate a Concept in the MKS
Morphological analysis
Expand paradigm through standard synonyms
heiss, heiß
Wasser, H2O
Boiler, Kocher, Erhitzer
bouilloire
hervidor
Locate concept in MKS
Heißwasserkocher
bouilloire
hervidor
Language Independent Semantic Search: Realize your Domain Specific Synonyms
Morphological analysis
Expand paradigm through standard synonyms
Locate concept in MKS
Realize weighted domain synonyms
Heißwasserkocher, HWK
bouilloire, théière
hervidor
Synonyms are weighted, i.e. they carry attributes that help disambiguate and improve ranking
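How weighted synonyms might influence ranking can be sketched as follows. The slide does not say which attributes or weights are used, so the numeric weights, the scoring rule (best-matching synonym wins) and the product titles are all invented for this illustration.

```python
# Concept -> {synonym: weight}; every weight here is an assumption.
DOMAIN_SYNONYMS = {
    "Heißwasserkocher": {"Heißwasserkocher": 1.0, "HWK": 0.6},
}


def score(title, concept):
    """Score a title by the highest-weighted synonym it contains (0.0 if none)."""
    title_cf = title.casefold()
    return max(
        (weight for synonym, weight in DOMAIN_SYNONYMS[concept].items()
         if synonym.casefold() in title_cf),
        default=0.0,
    )


def rank(titles, concept):
    """Return matching titles, best score first."""
    scored = [(score(t, concept), t) for t in titles]
    return [t for s, t in sorted(scored, key=lambda pair: -pair[0]) if s > 0]


# Hypothetical product titles
titles = ["HWK Edelstahl", "Toaster", "Heißwasserkocher 1,7 l"]
print(rank(titles, "Heißwasserkocher"))
# ['Heißwasserkocher 1,7 l', 'HWK Edelstahl']
```

The abbreviation still finds the item, but the lower weight keeps it below titles that use the preferred term.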
Language Independent Semantic Search: Consider Related Concepts
Morphological analysis
Expand paradigm through standard synonyms
Locate concept in MKS
Realize weighted domain synonyms
Semantic expansion through concept map
Samowar
"Caykolik Samowar"
Language Independent Semantic Search: Now Match the Item in your Warehouse DB
Caykolik Samowar
Successful identification of item in target database
Caykolik 1,5 l Samowar
Language Independent Semantic Search: Summary
From Heisswasserkocher to Caykolik 1,5 l Samowar
Start in any language
Heisswasserkocher
bouilloires
hervidores
Morphological analysis
Expand paradigm through standard synonyms
Locate concept in MKS
Realize weighted domain synonyms
Semantic expansion through concept map
Caykolik 1,5 l Samowar
Successful identification of item in target database
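The whole chain summarised above can be strung together in a compact sketch. Everything here is toy data assumed for illustration: lookup tables stand in for real morphological analysis and for the MKS itself, the concept IDs `kettle` and `samovar` are invented labels, weights are left out for brevity, and the second warehouse entry is made up to show a non-match.

```python
from itertools import product

# Toy stand-ins for the real components (all assumed).
DECOMPOSE = {
    "heisswasserkocher": ["heiss", "wasser", "kocher"],
    "bouilloires": ["bouilloire"],
    "hervidores": ["hervidor"],
}
STANDARD_SYNONYMS = {"heiss": ["heiß"], "wasser": ["h2o"],
                     "kocher": ["boiler", "erhitzer"]}
CONCEPTS = {  # concept ID -> terms per language, including domain synonyms
    "kettle": {"de": ["Heißwasserkocher", "HWK"],
               "fr": ["bouilloire", "théière"],
               "es": ["hervidor"]},
    "samovar": {"de": ["Samowar"]},
}
RELATED = {"kettle": ["samovar"]}   # semantic expansion via the concept map
WAREHOUSE = ["Caykolik 1,5 l Samowar", "Rasenmäher 40 cm"]


def decompose(query):
    """Step 1: morphological analysis (here: a table lookup)."""
    return DECOMPOSE.get(query.lower(), [query.lower()])


def expand(parts):
    """Step 2: combine every part with its standard synonyms."""
    variants = [[p] + STANDARD_SYNONYMS.get(p, []) for p in parts]
    return {"".join(combo) for combo in product(*variants)}


def locate_concept(candidates):
    """Step 3: find the concept that owns one of the candidate strings."""
    for concept_id, by_lang in CONCEPTS.items():
        for terms in by_lang.values():
            if any(t.lower() in candidates for t in terms):
                return concept_id
    return None


def search(query):
    """Steps 4-6: collect synonyms of the concept and its neighbours, then match items."""
    concept_id = locate_concept(expand(decompose(query)))
    if concept_id is None:
        return []
    concept_ids = [concept_id] + RELATED.get(concept_id, [])
    needles = [t.lower() for cid in concept_ids
               for terms in CONCEPTS[cid].values() for t in terms]
    return [item for item in WAREHOUSE
            if any(n in item.lower() for n in needles)]


for query in ("Heisswasserkocher", "bouilloires", "hervidores"):
    print(query, "->", search(query))   # each finds ['Caykolik 1,5 l Samowar']
```

All three queries arrive at the same concept, so the same warehouse item is found regardless of the language the search started in.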
MKS: Indispensable for Processing Multilingual Data
Insights & Sense & Sentiment
Language Detection
NLP/Tokenization
ML
Text Analytics
Search
MT Provenance
Multilingual Knowledge System
Benefits: Enable the Digital Single Market
Find items
with any word
in any spelling
in any language
Find even related relevant items through concept map ("Did you mean …?")
Have more first-time purchasers
Increase international business
Increase e-sales
Happy audience
Agenda
What is an MKS?
• Fusion of terminology with taxonomy
• Captures language and knowledge
Three Business Cases
• Systematic terminology
• Cross-border Interoperability
• Enable European Digital Single Market
Challenges and Outlook
Mountains still to Climb
MKS is not yet an established software category
Often embedded in “larger” solutions
Need to raise awareness of the problem
How to measure and monetize the problem?
Overlapping standards: OWL, SKOS, TBX, LEMON …
Multilingual Knowledge System Plays Key Role in many Business Processes
Enterprise Search
Social Media Analysis
Document Classification
Interoperability
Staff Training
Globalisation
Three Statements to Remember
A fusion of terminology with taxonomy / thesaurus, to capture language with knowledge in a holistic way
Drives global revenue by improving both the top line and the bottom line
Huge potential in various business processes, industries and segments