IAs, Language and Lego -- an Introduction to Semantic Analysis

IAs, Language and Lego™ – an Introduction to Semantic Analysis. Matthew Hodgson, Regional-lead, Web and Information Management, SMS Management & Technology, Canberra Australia. 12 April 2008

Post on 19-Sep-2014

Category: Technology


DESCRIPTION

This presentation will introduce Semantic Analysis – a way in which content can be analysed and classified through its linguistic basis, rather than through its overt meaning. It will achieve this by using Lego as a metaphor for language and demonstrating that by examining the building blocks of language a deeper understanding of content can be gained.

TRANSCRIPT

Page 1: IAs, Language and Lego -- an introduction to Semantic Analysis

IAs, Language and Lego™ – an Introduction to Semantic Analysis

Matthew Hodgson
Regional-lead, Web and Information Management, Canberra Australia

12 April 2008

Page 2: IAs, Language and Lego -- an introduction to Semantic Analysis

Page 3: IAs, Language and Lego -- an introduction to Semantic Analysis

Page 4: IAs, Language and Lego -- an introduction to Semantic Analysis

IA Tools for understanding content

Page 5: IAs, Language and Lego -- an introduction to Semantic Analysis

Content analysis…

Page 6: IAs, Language and Lego -- an introduction to Semantic Analysis

Information: we all think differently …

We all:
• Think about information in different ways
• Write about information in different ways

Page 7: IAs, Language and Lego -- an introduction to Semantic Analysis

… we all even write differently …

Page 8: IAs, Language and Lego -- an introduction to Semantic Analysis

Jeffrey Veen on analysing content

“a mind-numbingly detailed odyssey through your web site...

…this process…is a relatively straightforward process of clicking through your web site and recording what you find.”

Source: http://www.adaptivepath.com/ideas/essays/archives/000040.php

Page 9: IAs, Language and Lego -- an introduction to Semantic Analysis

When analysing content …

Page 10: IAs, Language and Lego -- an introduction to Semantic Analysis

An extract of medical restrictions text

Page 11: IAs, Language and Lego -- an introduction to Semantic Analysis

What is this content?!

Medical restrictions text
• Free-text built in Word and hand-crafted (*grrr*)
• Unclassified
• Varied consistency within and between texts
• Highly complex sentence structures in pseudo-legalese
• Style reflects the author rather than the meaning in the communication

Content needed for re-use
• Content output was needed for reuse by others
• Multiple audiences
• Multiple purposes for re-use

Codification
• Codification by 3rd parties (after authoring) takes too long
• Need to reduce timeframes!

Page 12: IAs, Language and Lego -- an introduction to Semantic Analysis

The task... analyse and codify

[Diagram: a block of text broken down into labelled concepts, Concept 1 to Concept 5]

Page 13: IAs, Language and Lego -- an introduction to Semantic Analysis

What tools would be appropriate?

?

Page 14: IAs, Language and Lego -- an introduction to Semantic Analysis

Linguistics… a whole discipline devoted to the study of language…

[Word cloud: preposition, verb, adjective, noun, determiner, subject, object, conjunction, semantics, sentence structure]

All language has structure

Page 15: IAs, Language and Lego -- an introduction to Semantic Analysis

Language is like Lego™

Building blocks
• Subject (S)
• Verb (V)
• Object (O)

Order of blocks
• Differs depending on the language

Page 16: IAs, Language and Lego -- an introduction to Semantic Analysis

Language is like Lego™

• SVO languages: English, French, Chinese, Bulgarian, Swahili
• SOV: Japanese, Turkish, Korean
• VSO: Classical Arabic, Celtic and Hawaiian
• VOS: Fijian, Yoda’s amusing phrases
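The idea that the same three bricks can be snapped together in a different order can be sketched in a few lines of Python. This is only an illustration: the `arrange` helper and the sample bricks are hypothetical, the language lists are copied from the slide, and the output is English words reordered, not a translation.

```python
# Word orders and example languages as listed on the slide.
WORD_ORDERS = {
    "SVO": ["English", "French", "Chinese", "Bulgarian", "Swahili"],
    "SOV": ["Japanese", "Turkish", "Korean"],
    "VSO": ["Classical Arabic", "Celtic", "Hawaiian"],
    "VOS": ["Fijian"],
}


def arrange(bricks: dict[str, str], order: str) -> list[str]:
    """Return the S, V and O bricks in the sequence named by an order code."""
    if sorted(order) != ["O", "S", "V"]:
        raise ValueError(f"not a word-order code: {order!r}")
    return [bricks[role] for role in order]


# Hypothetical sample bricks (not from the presentation).
bricks = {"S": "the child", "V": "builds", "O": "a tower"}

for order in WORD_ORDERS:
    print(order, arrange(bricks, order))
```

The bricks stay the same in every case; only the order code changes, which is the point of the Lego metaphor.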

Page 17: IAs, Language and Lego -- an introduction to Semantic Analysis

Lego bricks: subjects, verbs and objects

• Subject: Those Lego bricks
• Verb: are
• Object: [some] red Lego bricks

Sometimes, though, the SVO structure is hidden: “The Lego is red” or “Those Lego bricks are [some] red Lego bricks”?

Uncovering the hidden structure helps to differentiate between the subject and the object and identify the who and what.

Page 18: IAs, Language and Lego -- an introduction to Semantic Analysis

Lego trees…

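[Sentence tree for “Those Lego bricks are [some] red Lego bricks”]

Sentence (Root)
• NounPhrase (SUBJECT): Those [Det] Lego [Adj] bricks [Noun]
• VerbPhrase (VERB): are [Verb]
• NounPhrase (OBJECT): [some] [Det] red [Adj] Lego [Adj] bricks [Noun]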
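A sentence tree like this one can be held in a small data structure. The sketch below is one possible representation, not the tool used in the presentation: the `Node` class, the `leaf` and `phrase` helpers, and the exact tag placement are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                      # e.g. "NounPhrase", "Det", "Noun"
    word: str = ""                  # filled in on leaf nodes only
    role: str = ""                  # "SUBJECT", "VERB" or "OBJECT" on phrases
    children: list["Node"] = field(default_factory=list)

    def words(self) -> list[str]:
        """Collect the leaf words under this node, left to right."""
        if not self.children:
            return [self.word]
        return [w for child in self.children for w in child.words()]


def leaf(label: str, word: str) -> Node:
    return Node(label=label, word=word)


tree = Node("Sentence", children=[
    Node("NounPhrase", role="SUBJECT", children=[
        leaf("Det", "Those"), leaf("Adj", "Lego"), leaf("Noun", "bricks")]),
    Node("VerbPhrase", role="VERB", children=[leaf("Verb", "are")]),
    Node("NounPhrase", role="OBJECT", children=[
        leaf("Det", "[some]"), leaf("Adj", "red"),
        leaf("Adj", "Lego"), leaf("Noun", "bricks")]),
])


def phrase(sentence: Node, role: str) -> str:
    """Return the text of the top-level phrase that plays the given role."""
    for child in sentence.children:
        if child.role == role:
            return " ".join(child.words())
    raise KeyError(role)


print(phrase(tree, "SUBJECT"))
print(phrase(tree, "OBJECT"))
```

Once the tree exists, asking for the subject or the object is a lookup by role, which is what makes the "who" and the "what" easy to pull out of a sentence.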

Page 19: IAs, Language and Lego -- an introduction to Semantic Analysis

Semantic analysis

Medical restrictions wording:

• Restricted benefit: Gastro-oesophageal reflux disease; Scleroderma oesophagus
• Authority required: Peptic ulcer

Page 20: IAs, Language and Lego -- an introduction to Semantic Analysis

Semantic analysis (cont.)

• Actual sentence: Peptic ulcer
• Implied sentence: The prescription of medicine is restricted to the initial treatment of patients with peptic ulcer
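The step from the actual wording to the implied sentence can be sketched as a template fill. The template text is taken from the slide; the function name `implied_sentence` and the `stage` parameter are hypothetical, and real restriction texts would need more than one template.

```python
# Template copied from the implied sentence on the slide; {stage} and
# {condition} are the slots this sketch assumes can vary.
TEMPLATE = (
    "The prescription of medicine is restricted to "
    "the {stage} treatment of patients with {condition}"
)


def implied_sentence(actual: str, stage: str = "initial") -> str:
    """Expand a terse restriction such as 'Peptic ulcer' into a full sentence."""
    condition = actual.strip()
    if not condition:
        raise ValueError("the actual sentence is empty")
    # Lower-case the first letter so the condition reads naturally mid-sentence.
    condition = condition[0].lower() + condition[1:]
    return TEMPLATE.format(stage=stage, condition=condition)


print(implied_sentence("Peptic ulcer"))
```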

Page 21: IAs, Language and Lego -- an introduction to Semantic Analysis

Semantic structure of ‘peptic ulcer’

[Sentence tree for the implied sentence, with word-level tags DET, N, P, AUX, V and ADJ]

Root
• SUBJECT (NounPhrase with PreposPhrase): the prescription of medicine
• VERB (VerbPhrase): is restricted to
• OBJECT (NounPhrase with PreposPhrase): the initial treatment of patients with peptic ulcer

Page 22: IAs, Language and Lego -- an introduction to Semantic Analysis

Semantic model for restrictions text

[Diagram: the implied sentence “the prescription of medicine is restricted to the initial/continuing treatment of patients …” is labelled Subject, Verb and Object, and its parts are mapped to WHO TREATED? + WHAT CONDITION? = ACTION REQUIRED. Key: ADJ, NOUN, PREP, VERB.]

Groups of descriptors labelled on the diagram:
• Initial or continuing
• Patient descriptors (population/group)
• PBS subsidised
• Existing treatment descriptors of population
• Limitation of prescribing to a specific specialist group
• Condition being treated
• Measures/descriptors of condition severity (ADJ)
• Practical aspects

Example fragments placed on the diagram:
• initial; continuing; 70 year old; mother; pregnant; female; male; other ADJ
• previously PBS-subsidised; not previously; receiving PBS-subsidised bDMARD treatment
• receiving cytotoxic chemotherapy; receiving radiotherapy; receiving a 5HT3 antagonist; receiving treatment 2 years; having surgery
• not responding to analgesics; not responding to other anti-epileptic drugs; unable take solid form of topiramate
• treated by clinical immunologist; treated by dermatologist
• with nausea and vomiting; with peptic ulcer; with malignant tumor; with scleroderma oesophagus; with chronic pain; with seizures; with hormone dependent metastatic breast cancer; advanced psoriasis
• partial; incomplete resolution of; no indication of (ADJ/PP)
• record details of doctor; record date; sign form; complete Authority action sheet; include whole body area diagrams; treat for period of time; provide previous history; prescribe number repeats; contact Medicare; obtain Authority number

Page 23: IAs, Language and Lego -- an introduction to Semantic Analysis

Semantics describing “Who Treated”

[Diagram: vocabulary for the “Who Treated” part of a restriction. Items in square brackets are slots to be filled; “?” marks items left open on the slide.]
• Patient Group: Age; Sex (Male, Female, All); Ethnicity [ETHNICITY]; Vocation (Veteran); Entitlement [?] (Veteran?); Pregnant; Breastfeeding; [ADJECTIVE]; [LIST]
• [NUMBER]: Over, Under, Exactly, Between, At least; Hours, Days, Weeks, Months, Years
• At a dose of [mg ...etc]; Hourly, Daily, Weekly, Fortnightly, Monthly, Yearly
• Documented history; PBS subsidised; PBS non-subsidised
• [CLINICIAN]: Requiring special expertise in [EXPERTISE]; Requiring no special expertise; Treated by; Diagnosis confirmed by
• [SEVERITY] [CONDITION]; [DISORDER]; Symptoms?; Clinical findings; Disease progression; Disease regression; Evidence of; As evidenced by; As measured by?; [MEASURED AS]?
• Treatments; Trials; Treatment with; Treatment of; Treatment for; Initial; Continuing; Maintenance; Initiation; Stabilisation; Effective; Ineffective; Inappropriate
• In conjunction with; Not in conjunction with; Co-administered with; Following; Preceding; Within timeframe of; Over a period of
• Received; Has not received; Responding; Not responding; Qualified for; Failed to qualify for; Indicated; Not indicated; Has had; Has not had; Can have; Can not have; Can not receive
• [DRUG]; [TREATMENT]; [THERAPY]; [PROCEDURE]; Diet; Exercise; Surgery [TYPE]
• That meet a specific definition/criteria as set out in [LIST of references]; General schedule of Lipid-lowering Drugs; [DEFINED BY]
• and; in; =
• Annotation repeated on the diagram: “Starts new prepositional-phrase in the same text-block”
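A small part of the “Who Treated” vocabulary can be written down as fixed value lists that compose into a phrase. The sketch below is an assumption-laden illustration: the class names, the field choices, and the output wording ("patients aged …") are hypothetical, and only the vocabulary values (Over, Under, Exactly, At least; Hours to Years; Male, Female, All; Pregnant) come from the slide.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Comparator(Enum):
    OVER = "over"
    UNDER = "under"
    EXACTLY = "exactly"
    AT_LEAST = "at least"


class Unit(Enum):
    HOURS = "hours"
    DAYS = "days"
    WEEKS = "weeks"
    MONTHS = "months"
    YEARS = "years"


class Sex(Enum):
    MALE = "male"
    FEMALE = "female"
    ALL = "all"


@dataclass(frozen=True)
class AgeLimit:
    comparator: Comparator
    number: int
    unit: Unit = Unit.YEARS

    def phrase(self) -> str:
        return f"aged {self.comparator.value} {self.number} {self.unit.value}"


@dataclass(frozen=True)
class WhoTreated:
    sex: Sex = Sex.ALL
    age: Optional[AgeLimit] = None
    pregnant: bool = False

    def phrase(self) -> str:
        """Build the patient-group phrase from the chosen vocabulary values."""
        parts = []
        if self.pregnant:
            parts.append("pregnant")
        if self.sex is not Sex.ALL:
            parts.append(self.sex.value)
        parts.append("patients")
        if self.age is not None:
            parts.append(self.age.phrase())
        return " ".join(parts)


who = WhoTreated(sex=Sex.FEMALE, age=AgeLimit(Comparator.OVER, 70))
print(who.phrase())
```

Because every value comes from a closed list, an author cannot introduce a new spelling or synonym for an existing concept, which is the property the vocabulary diagram is aiming for.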

Page 24: IAs, Language and Lego -- an introduction to Semantic Analysis

Semantics describing “Authority Action”

[Diagram: vocabulary for the “Authority Action” part of a restriction. Items in square brackets are slots to be filled.]
• (allow) Maximum; (allow) Minimum
• Therapy; Supply [AMOUNT]; Repeats [AMOUNT]; Treatment
• Initial; Subsequent; Ongoing; Remaining
• [TIME]: days, weeks, months; Within timeframe of [TIME]; Where approval [TIMEFRAME]
• In writing; By telephone; Electronically
• To [AUTHORITY]: Medicare ...etc...
• To complete; Followed by
• Annotation repeated on the diagram: “Starts new prepositional-phrase in the same text-block”

Page 25: IAs, Language and Lego -- an introduction to Semantic Analysis

High-level semantic overview

[Diagram: WHO TREATED, WHAT CONDITION, HOW AUTHORISED and Notes and Cautions, with parts joined by “+” signs and an “=”]
• Parts labelled: Foreword; Patient Group; Definitions; Condition; Authority Action
• Elements listed: Age limitations; Clinical initiation or continuation criteria; Prescribing clinicians; Prescribing advice; Condition; Contact information; Grandfathering clauses; Patient groups; Prior treatments; Severity

Page 26: IAs, Language and Lego -- an introduction to Semantic Analysis

Yes, it can be codified!

Medical restrictions:
• Did have structure
• Did have underlying logic
• Were based on repeatable business processes
• Could be codified

Could we make a ‘system’ to reinforce the structure at the point of authoring?
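One way to picture such a system is a record that refuses free text where a codified value is expected. The sketch below is not the prototype shown in the demo; the `Restriction` class, its three fields, and the rendered wording are assumptions, with the allowed values borrowed from terms that appear earlier in the deck.

```python
from dataclasses import dataclass

# Allowed values borrowed from the slides; a real system would hold far more.
LEVELS = {"Restricted benefit", "Authority required"}
STAGES = {"initial", "continuing", "maintenance"}


@dataclass
class Restriction:
    level: str       # e.g. "Authority required"
    stage: str       # e.g. "initial"
    condition: str   # e.g. "peptic ulcer"

    def problems(self) -> list[str]:
        """List what stops this restriction from being codified."""
        issues = []
        if self.level not in LEVELS:
            issues.append(f"unknown level: {self.level!r}")
        if self.stage not in STAGES:
            issues.append(f"unknown stage: {self.stage!r}")
        if not self.condition.strip():
            issues.append("condition is empty")
        return issues

    def render(self) -> str:
        """Produce the wording only once every field is valid."""
        issues = self.problems()
        if issues:
            raise ValueError("; ".join(issues))
        return (f"{self.level}: {self.stage} treatment of patients "
                f"with {self.condition.strip()}")


ok = Restriction("Authority required", "initial", "peptic ulcer")
print(ok.render())

draft = Restriction("Authority required", "first-time", "")
print(draft.problems())
```

Checking at the point of authoring means the author sees the problem list immediately, instead of a third party discovering it during later codification.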

Page 27: IAs, Language and Lego -- an introduction to Semantic Analysis

Demo

Putting it together in a system:
• Supporting building of content restrictions in a codified way
• Prototyping with Axure

Page 28: IAs, Language and Lego -- an introduction to Semantic Analysis

Page 29: IAs, Language and Lego -- an introduction to Semantic Analysis

Page 30: IAs, Language and Lego -- an introduction to Semantic Analysis

Page 31: IAs, Language and Lego -- an introduction to Semantic Analysis

Page 32: IAs, Language and Lego -- an introduction to Semantic Analysis

Page 33: IAs, Language and Lego -- an introduction to Semantic Analysis

Page 34: IAs, Language and Lego -- an introduction to Semantic Analysis

Page 35: IAs, Language and Lego -- an introduction to Semantic Analysis

Page 36: IAs, Language and Lego -- an introduction to Semantic Analysis

The semantic analysis advantage

[Two approaches compared side by side; their labels were images on the slide]

Identifies:
• Themes in content

vs

Identifies:
• Themes in content
• Work processes
• Folk taxonomies used
• ‘Things’ written about

Page 37: IAs, Language and Lego -- an introduction to Semantic Analysis

What else could you use it for?

When you need to understand:
• Business processes that create content

When you want to disassemble content for:
• FAQs
• A-Z indexes
• Help files

Page 38: IAs, Language and Lego -- an introduction to Semantic Analysis

How can I add this to my toolbox??!

Theory is important
• An understanding of semantics - sentence trees and grammar
• Text books by authors like Fromkin and Rodman can help through the tricky bits

Need good tools
• Connexor: http://www.connexor.eu/technology/machinese/demo/
• Big sheets of paper (and an electronic whiteboard)
• Visio (not PowerPoint!)

Page 39: IAs, Language and Lego -- an introduction to Semantic Analysis

Demo

Connexor: http://www.connexor.eu/technology/machinese/demo/

Page 40: IAs, Language and Lego -- an introduction to Semantic Analysis

Connexor

Page 41: IAs, Language and Lego -- an introduction to Semantic Analysis

Connexor – machine tagger

Page 42: IAs, Language and Lego -- an introduction to Semantic Analysis

Connexor – machine syntax

Page 43: IAs, Language and Lego -- an introduction to Semantic Analysis

Why should I care about this?
• Google uses semantic analysis to index content
• Translation software uses semantic analysis to identify ‘components’ for translation

Good sentence structure equals:
• Accurate indexing
• Higher rank relevance of content
• Happy people (they find what they’re looking for)

Page 44: IAs, Language and Lego -- an introduction to Semantic Analysis

Why should I care about this?

Page 45: IAs, Language and Lego -- an introduction to Semantic Analysis

‘Calais’ by Reuters

Page 46: IAs, Language and Lego -- an introduction to Semantic Analysis

Summing up

Content is still king!

But how can you tell if your content:
• Is of good quality?
• Matches your website’s categories?
• Accurately reflects your metadata?
• Can be found by people?

Semantic analysis can:
• Make your content audits more objective
• Inform processes to improve the quality of the content
• Inform processes to improve search engine indexing
• Inform metadata creation
• Inform choice of taxonomy

Page 47: IAs, Language and Lego -- an introduction to Semantic Analysis

Take-home message

Semantic analysis can help IAs:
• Infer: How people think about, and structure, their information
• Describe: Business processes that produce content
• Identify: Where content quality is poor so it can be improved; Critical components of the sentence for codification
• Design: Taxonomies and describe folk taxonomies
• Build: Systems to help bring some structure to content authoring

Page 48: IAs, Language and Lego -- an introduction to Semantic Analysis

Fin

Page 49: IAs, Language and Lego -- an introduction to Semantic Analysis

IAs, Language and Lego™

an Introduction to Semantic Analysis

Page 50: IAs, Language and Lego -- an introduction to Semantic Analysis

by

Matthew Hodgson

Regional-lead, Web and Information Management

SMS Management & Technology Canberra Australia

Page 51: IAs, Language and Lego -- an introduction to Semantic Analysis

by

Matthew Hodgson

Email [email protected]

magia3e.wordpress.com

Slideshare www.slideshare.net/magia3e

Twitter magia3e