@wetzelmichael @coreonapp Michael Wetzel, Coreon GmbH 10 July 2015, MLKRep Workshop Vienna Driving Global Revenue through Multilingual Knowledge Systems (MKS)


Page 1: Driving Global Revenue through Multilingual Knowledge Systems · 10-07-2015 Driving Global Revenue through Multilingual Knowledge Systems @wetzelmichael LISE Project Research from

@wetzelmichael @coreonapp

Michael Wetzel, Coreon GmbH

10 July 2015, MLKRep Workshop Vienna

Driving Global Revenue through Multilingual Knowledge Systems (MKS)

Page 2:

Agenda

What is an MKS?

Three Business Cases

Challenges and Outlook

Page 3:

LISE Project Research from 2011-2012: Resources Increase, Quality Decreases

"Oh, our classification had been built in English. No, I don't know whether there is one in French."

"Searching on our intranet is a pain. I've just searched for documents containing LCD screen: nothing found! ... Should have known that they are all tagged with monitor."

"Manual revision of the keyword lists is not possible any more" (AUP, AT)

Duplicates, inconsistencies, gaps and content coverage problems after merging two resources (Imaging company, UK)

"Difficulty to ensure consistency, in terms of quality, coverage, completeness" (EU Representative)

Page 4:

Divide et Impera!

Today: Unstructured haystack of concepts

Better: Ordered scheme, a taxonomy

Introduce control

Turn information into knowledge

Page 5:

Taxonomy, thesaurus, ontology tools: Good, but not for language needs

Though, very helpful

• Classifications, nomenclatures
• Tagging in CMS
• Semantic Search
• Standards: SKOS, OWL
• Conferences: semantics, KMWorld, SemTechBiz, Wissensmanagementtage

Not for Language

• For trained experts only (!)
• Lexically organised
• Weak in managing synonyms
• Weak in multilingualism
• Not for describing terminology data

Page 6:

Before MKS: Two Parallel Approaches to Inventorise and Leverage Knowledge

Termbases, Glossaries

• Control language
• Focus on translation
• Lack knowledge modelling
• Only searching, no exploring

Taxonomies, Thesauri

• For knowledge structuring
• Lack multilingualism
• Lack language control

Huge, unaddressed potential for cross-lingual data analysis, enterprise search, e-discovery and to facilitate interoperability

[Figure: a termbase excerpt (mirror base / Spiegelfuß, wing mirror / Außenspiegel) and a taxonomy excerpt (mirror, wing mirror, left wing mirror), connected by the labels "would boost with data" and "would add structure".]

Page 7:

Unify through a Multidimensional Repository for Knowledge and Language

1: Taxonomy:

output devices
  visual output devices
    screen
  audio output devices
    headphones

Page 8:

Unify through a Multidimensional Repository for Knowledge and Language

1: Taxonomy

2: Synonymy: • screen • monitor

3: Multilingualism: • écran • Bildschirm • Monitor • Display

4: Control: accepted / rejected

Page 9:

Unify through a Multidimensional Repository for Knowledge and Language

1: Taxonomy

2: Synonymy: • screen • monitor

3: Multilingualism: • écran • Bildschirm • Monitor • Display

4: Control: accepted / rejected

5: Meaning: • visual output device • Optisches Ausgabegerät
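
The five dimensions named on this slide can be pictured as fields of one concept record. The following is only a minimal sketch under assumptions: the class names `Term` and `Concept`, the field names, the definition text and the accepted/rejected status of each term are illustrative and are not taken from the slides or from any product.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    text: str
    lang: str                  # language code such as "en", "de", "fr"
    status: str = "accepted"   # control dimension: "accepted" or "rejected"


@dataclass
class Concept:
    concept_id: str
    definition: str = ""                          # meaning
    broader: list = field(default_factory=list)   # taxonomy: parent concepts
    terms: list = field(default_factory=list)     # synonymy and multilingualism

    def terms_in(self, lang, only_accepted=False):
        """Return the term strings of one language, optionally accepted ones only."""
        return [
            t.text
            for t in self.terms
            if t.lang == lang and (not only_accepted or t.status == "accepted")
        ]


# Example record; statuses and definition are assumptions for illustration.
screen = Concept(
    concept_id="screen",
    definition="visual output device",
    broader=["visual output devices"],
    terms=[
        Term("screen", "en"),
        Term("monitor", "en"),
        Term("écran", "fr"),
        Term("Bildschirm", "de"),
        Term("Monitor", "de"),
        Term("Display", "de", status="rejected"),
    ],
)

print(screen.terms_in("de", only_accepted=True))
```

Keeping all terms of all languages on the concept, rather than in per-language lists, is what lets one record answer both "what are the synonyms?" and "what is this called in German?".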

Page 10:

One System, One View, All Languages: Concepts, Relations, Terms

Immediate broader / narrower neighborhood

Concept metadata

Terms and synonyms

Extensive term descriptors

Location in map

Alphabetic, multilingual list

Page 11:

Agenda

What is an MKS?

• Fusion of terminology with taxonomy / thesaurus models

• Captures language and knowledge

Three Business Cases

Challenges and Outlook

Page 12:

The Business Case for: Terminology and Data Maintainers

Avoid a loss of your investment: apply systematic terminology work

Page 13:

Dangers when Ad-hoc Collecting Terms

Do you feel uncomfortable adding more and more data without a pause?

Where are the duplicates? Did we translate some records twice?

Can we unify two larger terminology resources?

Visualise the data and make it navigable. Avoid noise and redundancies. Achieve clarity and consistency.
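
One way to approach the question about duplicates is to group records by a normalised form of the term. This is a minimal sketch, assuming records are simple `(id, language, term)` tuples; the record ids and the normalisation rules are hypothetical, and the sample terms are borrowed from the mirror example earlier in the deck.

```python
import unicodedata
from collections import defaultdict


def normalise(term):
    """Case-fold, strip accents, and collapse hyphens and whitespace."""
    decomposed = unicodedata.normalize("NFKD", term.casefold())
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.replace("-", " ").split())


def find_duplicates(records):
    """Group record ids whose (language, normalised term) key coincides."""
    groups = defaultdict(list)
    for record_id, lang, term in records:
        groups[(lang, normalise(term))].append(record_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Hypothetical records for illustration.
records = [
    (1, "en", "wing mirror"),
    (2, "en", "Wing-Mirror"),
    (3, "de", "Außenspiegel"),
    (4, "en", "mirror base"),
]

print(find_duplicates(records))
```

A purely lexical check like this only finds spelling variants; records that name the same concept with different words need the concept map described on the following slides.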

Page 14:

(Re-)Gain Control through Structure

Yesterday: List of terms

… …
Schule / school
Oberschule / secondary school
Realschule / intermediate school
Gymnasium / grammar school
Grundschule / elementary school
… …

Goal: Visualised Multilingual Concept Map

[Figure: concept map linking Schule / school, Oberschule / secondary school, Gymnasium / grammar school, Realschule / intermediate school and Grundschule / elementary school.]
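
The step from a flat term list to a concept map can be sketched by giving each entry a link to its broader concept. The broader links below are an assumption made for illustration (the transcript does not preserve the edges of the map), and the function names are hypothetical.

```python
from collections import defaultdict

# (German term, English term, broader concept or None for the root).
# The broader links are assumed, not taken from the slide.
entries = [
    ("Schule", "school", None),
    ("Oberschule", "secondary school", "Schule"),
    ("Realschule", "intermediate school", "Oberschule"),
    ("Gymnasium", "grammar school", "Oberschule"),
    ("Grundschule", "elementary school", "Schule"),
]


def build_map(rows):
    """Index narrower concepts by their broader concept."""
    children = defaultdict(list)
    labels = {}
    for de, en, broader in rows:
        labels[de] = en
        children[broader].append(de)
    return children, labels


def render(children, labels, node=None, depth=0):
    """Return the concept map as indented lines, depth first."""
    lines = []
    for child in children.get(node, []):
        lines.append("  " * depth + f"{child} / {labels[child]}")
        lines.extend(render(children, labels, child, depth + 1))
    return lines


children, labels = build_map(entries)
print("\n".join(render(children, labels)))
```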

Page 15:

One Example: Quality through Consistency across Records

Lexical approach: Inconsistencies hidden

Concept map: Inconsistencies transparent

Page 16:

Benefits: Large Terminology Collections under Control, thus Valuable

Trust

Protect Investment

Quality

Control

Efficiency

Systematic Approach

Page 17:

The Business Case for: Cross-border Interoperability

Harmonizing Organisations and their Systems

Page 18:

Semantic Interoperability

“We are connecting systems”

Vocabularies only for metadata?

This is rather syntactic interoperability

Page 19:

Problem: Cross-Border Interoperability

Not a translation but a semantic problem:

Is Austria's Matura the same as Germany's Abitur? And its synonyms?

What if a qualification like Realschulabschluß doesn't exist in other countries?

"Agree that we disagree"

Page 20:

Creating a Unified Resource across Languages

UK SE DE

Multiple terminologies are merged into a unified resource.

Linguistic similarity search: maps units that share the same meaning.

Page 21:

Concept Mapping Example: Trademark Classifications

Remote controls for diapositive projectors

Remote controls for projectors

Controls for projectors

Remote controls for slide projectors

Uses advanced linguistic search that applies strict metadata rules to ensure that meanings match.
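
The slide does not say how the linguistic search works. As a rough stand-in, the sketch below compares the four phrases by their sets of normalised content words. Everything here is an assumption: the synonym table, the stop-word list, and the heuristic that a phrase with fewer qualifiers is the more general one.

```python
# Hypothetical synonym and stop-word tables; a real system would draw these
# from the knowledge base and its metadata rules.
SYNONYMS = {"diapositive": "slide"}
STOP_WORDS = {"for", "the", "of"}


def signature(phrase):
    """Reduce a phrase to a set of normalised content words."""
    tokens = phrase.lower().split()
    return frozenset(SYNONYMS.get(t, t) for t in tokens if t not in STOP_WORDS)


def relate(a, b):
    """Classify phrase a against b as 'same', 'broader', 'narrower' or 'unrelated'."""
    sig_a, sig_b = signature(a), signature(b)
    if sig_a == sig_b:
        return "same"
    if sig_a < sig_b:
        return "broader"    # a carries fewer qualifiers than b
    if sig_a > sig_b:
        return "narrower"
    return "unrelated"


print(relate("Remote controls for diapositive projectors",
             "Remote controls for slide projectors"))
print(relate("Controls for projectors", "Remote controls for projectors"))
```

Under these assumptions the first pair falls into one meaning cluster, while "Controls for projectors" ends up as a broader unit rather than an equivalent.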

Page 22:

A Little Protocol: Processing such Large Language Resources

Start

• 110,000 English terms
• Other languages ranging from 8,000 to 60,000 terms

Clean

• Reduced English to 93,000 terms

Language coverage

• 93,000 terms in all EU languages too expensive without automation
• Therefore: Mining data for translations
• DB grew from 400,000 to 2.1m terms

Conceptual clustering

• Non-essential entries removed: DB shrank to 1.5m terms
• Semi-automatic creation of 50,000 meaning clusters

Taxonomy

• Built for the top layers and concepts linked
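
The figures in this protocol can be put in proportion with a few lines of arithmetic. The variable names are illustrative, and the average cluster size assumes that all 1.5m remaining terms were assigned to one of the 50,000 clusters, which the slide does not state.

```python
english_start, english_clean = 110_000, 93_000
db_before, db_mined, db_clustered = 400_000, 2_100_000, 1_500_000
clusters = 50_000

cleaning_cut = 1 - english_clean / english_start   # share of English terms removed
mining_growth = db_mined / db_before                # growth factor from data mining
clustering_cut = 1 - db_clustered / db_mined        # share removed as non-essential
terms_per_cluster = db_clustered / clusters         # average, under the stated assumption

print(f"Cleaning removed {cleaning_cut:.1%} of the English terms")
print(f"Mining grew the database by a factor of {mining_growth:.2f}")
print(f"Clustering removed {clustering_cut:.1%} of the mined entries")
print(f"About {terms_per_cluster:.0f} terms per meaning cluster on average")
```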

Page 23:

Benefits: Cross-border Interoperability in Trademark Registrations

Organizational Benefits

• Seamless and borderless intellectual property registration, opposition, and legal processing, in all bodies of the EU and the member states

Commercial Benefits

• My trademark, as it is described in my native language, can be applied for globally without changes and is subject to the same legislation all over the EU

Page 24:

The Business Case for: Digital Single Market

Cross-lingual semantic search is a prerequisite for the European DSM

Page 25:

The Vision of a Digital Single Market

“Consumers need to be able to buy the best products at the best prices, wherever they are in Europe.”

Vice-President Ansip, Dec 2014

Accelerating growth through a connected Europe: Speech at GSMA Mobile 360 conference in Brussels

http://europa.eu/rapid/press-release_SPEECH-14-2420_en.htm

Page 26:

…and the Reality

http://europa.eu/rapid/attachment/IP-15-4653/en/Digital_Single_Market_Factsheet_20150325.pdf


Page 27:

Broken already by a Simple Search

Search string: “Rasenmäher” (German: lawn mower)

Page 28:

Language Independent Semantic Search: Start in any Language

Start in any language

Heisswasserkocher

bouilloires

hervidores

Page 29:

Language Independent Semantic Search: Decompose the Query String

Morphological analysis

Heiss|wasser|kocher

bouilloire|s

hervidor|es
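
The decomposition shown here can be approximated with a lexicon-driven compound splitter and a plural-suffix stripper. This is only a toy sketch: the lexicon, the suffix table and both function names are hypothetical, and real morphological analysis is considerably more involved.

```python
# Tiny hypothetical lexicon and suffix tables, for illustration only.
LEXICON = {"heiss", "wasser", "kocher"}
PLURAL_SUFFIXES = {"fr": ["s"], "es": ["es", "s"]}


def split_compound(word, lexicon=LEXICON):
    """Split a compound into lexicon entries, trying longer heads first."""
    word = word.lower()
    if not word:
        return []
    for end in range(len(word), 0, -1):
        head = word[:end]
        if head in lexicon:
            rest = split_compound(word[end:], lexicon)
            if rest is not None:
                return [head] + rest
    return None  # no complete segmentation found


def strip_plural(word, lang):
    """Separate a known plural suffix from its stem, if one fits."""
    for suffix in PLURAL_SUFFIXES.get(lang, []):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return [word[: -len(suffix)], suffix]
    return [word]


print("|".join(split_compound("Heisswasserkocher")))
print("|".join(strip_plural("bouilloires", "fr")))
print("|".join(strip_plural("hervidores", "es")))
```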

Page 30:

Language Independent Semantic Search: Iterate via “Normal Language” Synonyms

Morphological analysis

Expand paradigm through standard synonyms

heiss, heiß

Wasser, H2O

Boiler, Kocher, Erhitzer

bouilloire

hervidor
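
Expanding the query parts through general-language synonyms amounts to recombining every part with its alternatives. The sketch below assumes a small hand-made synonym table seeded with the examples on this slide; the table and the function name `expand` are hypothetical.

```python
from itertools import product

# Hypothetical general-language synonym table, seeded with the slide's examples.
STANDARD_SYNONYMS = {
    "heiss": ["heiss", "heiß"],
    "wasser": ["wasser", "h2o"],
    "kocher": ["kocher", "boiler", "erhitzer"],
}


def expand(parts, table=STANDARD_SYNONYMS):
    """Return every recombination of the parts with their synonyms."""
    options = [table.get(part, [part]) for part in parts]
    return ["".join(combo) for combo in product(*options)]


variants = expand(["heiss", "wasser", "kocher"])
print(len(variants), variants[:3])
```

Most of the generated strings are not real words; they only serve as candidate keys for the concept lookup in the next step.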

Page 31:

Language Independent Semantic Search: Now Locate a Concept in the MKS

Morphological analysis

Expand paradigm through standard synonyms

Locate concept in MKS

heiss, heiß

Wasser, H2O

Boiler, Kocher, Erhitzer

bouilloire

hervidor

Heißwasserkocher

bouilloire

hervidor
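
Locating the concept can be sketched as a lookup of the candidate strings in a term index. The index contents and the concept id `C-kettle` are hypothetical. Python's `str.casefold` is used as the key function because it also folds "ß" to "ss", so both spellings of the German term reach the same entry.

```python
def key(term):
    """Normalise a term for lookup; casefold also maps 'ß' to 'ss'."""
    return term.casefold()


# Hypothetical term index: normalised term -> concept id.
TERM_INDEX = {
    key(term): concept_id
    for term, concept_id in [
        ("Heißwasserkocher", "C-kettle"),
        ("bouilloire", "C-kettle"),
        ("hervidor", "C-kettle"),
    ]
}


def locate_concept(candidates, index=TERM_INDEX):
    """Return the first concept id that any candidate maps to, else None."""
    for candidate in candidates:
        concept_id = index.get(key(candidate))
        if concept_id is not None:
            return concept_id
    return None


print(locate_concept(["heissh2oboiler", "Heisswasserkocher"]))
```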

Page 32:

Language Independent Semantic Search: Realize your Domain Specific Synonyms

Morphological analysis

Expand paradigm through standard synonyms

Locate concept in MKS

Realize weighted domain synonyms

Heißwasserkocher, HWK

bouilloire, théière

hervidor

Synonyms are weighted, i.e. they carry attributes that help to disambiguate and to improve ranking
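
Weighted synonyms can be pictured as terms carrying a numeric attribute that ranking can use. The weights below are invented for illustration, since the slide only states that such attributes exist; the concept id and the function name are likewise hypothetical.

```python
# Hypothetical weights; the slide does not give concrete values.
DOMAIN_SYNONYMS = {
    "C-kettle": [
        ("Heißwasserkocher", "de", 1.0),
        ("HWK", "de", 0.6),
        ("bouilloire", "fr", 1.0),
        ("théière", "fr", 0.4),
        ("hervidor", "es", 1.0),
    ],
}


def ranked_synonyms(concept_id, lang, min_weight=0.0, table=DOMAIN_SYNONYMS):
    """Return (term, weight) pairs of one language, heaviest first."""
    hits = [
        (term, weight)
        for term, term_lang, weight in table.get(concept_id, [])
        if term_lang == lang and weight >= min_weight
    ]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


print(ranked_synonyms("C-kettle", "fr"))
print(ranked_synonyms("C-kettle", "fr", min_weight=0.5))
```

A threshold such as `min_weight` is one simple way to keep loosely matching synonyms out of the query while still using them to rank results.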

Page 33:

Language Independent Semantic Search: Consider Related Concepts

Morphological analysis

Expand paradigm through standard synonyms

Locate concept in MKS

Realize weighted domain synonyms

Semantic expansion through concept map

Samowar

“Caykolik Samowar”
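
Semantic expansion can be sketched as walking from the located concept to its neighbours in the concept map. The map below, its concept ids and the choice to include siblings (concepts sharing the same broader concept) are all assumptions for illustration.

```python
# Hypothetical concept map: concept id -> broader concept and related concepts.
CONCEPT_MAP = {
    "C-kettle": {"broader": "C-water-heating", "related": ["C-samovar"]},
    "C-samovar": {"broader": "C-water-heating", "related": ["C-kettle"]},
    "C-water-heating": {"broader": None, "related": []},
}


def expand_concept(concept_id, concept_map=CONCEPT_MAP):
    """Return the concept, then its related concepts, then its siblings."""
    node = concept_map[concept_id]
    result = [concept_id, *node["related"]]
    parent = node["broader"]
    if parent is not None:
        for other, data in concept_map.items():
            if data["broader"] == parent and other not in result:
                result.append(other)
    return result


print(expand_concept("C-kettle"))
```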

Page 34:

Language Independent Semantic Search: Now Match the Item in your Warehouse DB

Caykolik Samowar

Successful identification of item in target database

Caykolik 1,5 l Samowar
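
The final match against the product database can be sketched as a word-level containment test between the expanded search terms and the item titles. The catalogue entries, SKUs and the matching rule are hypothetical; only the title "Caykolik 1,5 l Samowar" comes from the slide.

```python
# Hypothetical product catalogue.
CATALOGUE = [
    {"sku": "A-100", "title": "Caykolik 1,5 l Samowar"},
    {"sku": "B-200", "title": "Rasenmäher 40 cm"},
]


def find_items(search_terms, catalogue=CATALOGUE):
    """Return items whose title contains every word of at least one search term."""
    matches = []
    for item in catalogue:
        title_words = set(item["title"].casefold().split())
        for term in search_terms:
            if set(term.casefold().split()) <= title_words:
                matches.append(item)
                break
    return matches


# The original query term finds nothing, the expanded term does.
print(find_items(["Heisswasserkocher"]))
print(find_items(["Heisswasserkocher", "Caykolik Samowar"]))
```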

Page 35:

Language Independent Semantic Search - Summary

From Heisswasserkocher to Caykolik 1,5 l Samowar

Start in any language

Heisswasserkocher

bouilloires

hervidores

Morphological analysis

Expand paradigm through standard synonyms

Locate concept in MKS

Realize weighted domain synonyms

Semantic expansion through concept map

Caykolik 1,5 l Samowar

Successful identification of item in target database

Page 36:

MKS – Indispensable for Processing Multilingual Data

Insights & Sense & Sentiment

Language Detection

NLP/Tokenization

ML

Text Analytics

Search

MT Provenance

Multilingual Knowledge System

Page 37:

Benefits: Enable the Digital Single Market

Find items with any word, in any spelling, in any language

Find even related relevant items through concept map (“Did you mean …?”)

Have more first-time purchasers

Increase international business

Increase e-sales

Happy audience

Page 38:

Agenda

What is an MKS?

• Fusion of terminology with taxonomy
• Captures language and knowledge

Three Business Cases

• Systematic terminology
• Cross-border Interoperability
• Enable European Digital Single Market

Challenges and Outlook

Page 39:

Mountains still to Climb

MKS is not yet an established software category

Often embedded in “larger” solutions

Need to raise awareness of the problem

How to measure and monetize the problem?

Overlapping standards: OWL, SKOS, TBX, LEMON …

Page 40:

Multilingual Knowledge System Plays Key Role in many Business Processes

Enterprise Search

Social Media Analysis

Document Classification

Interoperability

Staff Training

Globalisation

Page 41:

Three Statements to Remember

A fusion of terminology with taxonomy / thesaurus, to capture language with knowledge in a holistic way

Drives global revenue by improving both top lines and bottom lines

Huge potential in various business processes, industries and segments

Page 42:

Michael Wetzel

m [email protected]

s mlwetzel

t @wetzelmichael

l Berlin-Mitte

Thank You