TermSet metadata tagging presentation - Taxonomy Bootcamp London 2016

TAGGING DOCUMENTS MADE EASY, USING MACHINE LEARNING Brendan Clarke [email protected] www.termSet.com

Upload: brendan-clarke

Post on 15-Apr-2017

TRANSCRIPT

Page 1: TermSet metadata tagging presentation - taxonomy bootcamp london 2016

TAGGING DOCUMENTS MADE EASY, USING MACHINE LEARNING

Brendan Clarke [email protected] www.termSet.com

Page 2: TermSet metadata tagging presentation - taxonomy bootcamp london 2016

BRENDAN CLARKE

• A Microsoft ECM expert

• Co-Founded TermSet three years ago

• Got the scars from real-world IA projects

Page 3: TermSet metadata tagging presentation - taxonomy bootcamp london 2016

AGENDA

• Creating Taxonomies; 7

• NLP; 3

• Demo; 10

• Tagging; 10

• Demo; 10

Page 4: TermSet metadata tagging presentation - taxonomy bootcamp london 2016

PART ONE – APPROACHES FOR BUILDING TAXONOMIES

Page 5: TermSet metadata tagging presentation - taxonomy bootcamp london 2016

TOP DOWN - APPROACH

• Defines top-level containers and works downwards

• Usually broad (3-10 wide) and shallow (3-4 deep)

• Simple, high level classification (functional)
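The "broad and shallow" rule of thumb above can be checked mechanically. A minimal sketch, assuming the taxonomy is held as nested dictionaries and that "wide" means the number of top-level containers; the sample terms and the function names are illustrative and not taken from the talk.

```python
def depth(node):
    """Count the levels of terms below this node (an empty dict has depth 0)."""
    if not node:
        return 0
    return 1 + max(depth(child) for child in node.values())


def shape(taxonomy):
    """Return (width, depth): top-level container count and number of levels."""
    return len(taxonomy), depth(taxonomy)


def is_broad_and_shallow(taxonomy, width_range=(3, 10), depth_range=(3, 4)):
    """Check the taxonomy against the 3-10 wide, 3-4 deep rule of thumb."""
    width, levels = shape(taxonomy)
    return (width_range[0] <= width <= width_range[1]
            and depth_range[0] <= levels <= depth_range[1])


# Illustrative taxonomy: each key is a term, each value holds its child terms.
taxonomy = {
    "Finance": {"Invoices": {}, "Budgets": {}},
    "HR": {"Policies": {"Leave": {}}},
    "Legal": {},
}

print(shape(taxonomy))
print(is_broad_and_shallow(taxonomy))
```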

Page 6: TermSet metadata tagging presentation - taxonomy bootcamp london 2016

TOP DOWN – TERMS

• Manually defined or replicated from existing structures

• Imported from other systems

• Industry standards / purchased taxonomies

Page 7: TermSet metadata tagging presentation - taxonomy bootcamp london 2016

TOP DOWN – SUMMARY

• People / Committee Driven approach

• Some guesswork of what terms should be

• Simple, high level classification (functional) – Way better than folders!

Page 8: TermSet metadata tagging presentation - taxonomy bootcamp london 2016

BOTTOM UP - APPROACH

• Terms driven by the words and phrases within your content

• More complex taxonomies

• Detailed, accurate terms that are subject or facet level

Page 9: TermSet metadata tagging presentation - taxonomy bootcamp london 2016

BOTTOM UP - TERMS

• Manual analysis of the documents

• Statistical analysis of terms and phrases

• Natural Language processing
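As a rough illustration of the "statistical analysis of terms and phrases" option, the sketch below counts one- and two-word phrases and keeps those that occur in several documents. It is a simplified stand-in, not TermSet's method: the stop-word list, the thresholds and the function name are all assumptions made for the example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "is", "on", "with"}


def candidate_terms(documents, max_words=2, min_docs=2):
    """Return (phrase, document count) pairs for phrases found in >= min_docs documents."""
    doc_counts = Counter()
    for text in documents:
        words = re.findall(r"[a-z]+", text.lower())
        phrases = set()  # count each phrase once per document
        for n in range(1, max_words + 1):
            for i in range(len(words) - n + 1):
                gram = words[i:i + n]
                # Skip phrases that start or end with a stop word.
                if gram[0] in STOP_WORDS or gram[-1] in STOP_WORDS:
                    continue
                phrases.add(" ".join(gram))
        doc_counts.update(phrases)
    return [(p, c) for p, c in doc_counts.most_common() if c >= min_docs]


documents = [
    "The supplier contract for office cleaning",
    "Supplier contract renewal terms",
    "Holiday policy for staff",
]
print(candidate_terms(documents))
```

Phrases that survive the filter are only candidates; someone still has to decide which of them belong in the taxonomy.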

Page 10: TermSet metadata tagging presentation - taxonomy bootcamp london 2016

BOTTOM UP - SUMMARY

• Technology driven approach (or a very tough people process)

• Produces detailed taxonomies that reflect the actual content

• Extra granularity of tagging

Page 11: TermSet metadata tagging presentation - taxonomy bootcamp london 2016

AND THE WINNER IS…

• Combining top down and bottom up is the best approach

• Top down classifies the type of documents

• Bottom up classifies the subject of the document

• New technology allows bottom up to be realistic

Page 12: TermSet metadata tagging presentation - taxonomy bootcamp london 2016

TermSet adds accurate consistent metadata without placing any burden on end users or your IT team.

Builds taxonomies (bottom up) using NLP

Applies tags

Metadata as a service™

Page 13: TermSet metadata tagging presentation - taxonomy bootcamp london 2016

WHAT EXACTLY IS NLP?

Page 14: TermSet metadata tagging presentation - taxonomy bootcamp london 2016

DEMO – CREATING TERMS FROM YOUR DOCUMENTS USING NLP

Page 15: TermSet metadata tagging presentation - taxonomy bootcamp london 2016

PART TWO – APPLYING YOUR TAGS

Page 16: TermSet metadata tagging presentation - taxonomy bootcamp london 2016

MANUAL TAGGING

• Adoption problem

• Asbestos problem / GIGO

• Challenging to do retrospectively (migration tools can help)

Page 17: TermSet metadata tagging presentation - taxonomy bootcamp london 2016

MANUAL TAGGING

• Infer as many terms as possible from: Document types, Location, Function

• Mandate as few tags as possible

• Stay shallow or flat with hierarchies
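Inferring terms from document type, location and function could look something like the sketch below. The lookup tables, the sample path and the function name are hypothetical; a real site would define its own mappings.

```python
from pathlib import PurePosixPath

# Hypothetical lookup tables, for illustration only.
TYPE_BY_EXTENSION = {".docx": "Document", ".xlsx": "Spreadsheet", ".pptx": "Presentation"}
FUNCTION_BY_FOLDER = {"hr": "Human Resources", "finance": "Finance", "legal": "Legal"}


def infer_tags(path):
    """Derive tags from a file's extension and folder path, with no user input."""
    p = PurePosixPath(path)
    tags = {}
    # Document type: inferred from the file extension.
    doc_type = TYPE_BY_EXTENSION.get(p.suffix.lower())
    if doc_type:
        tags["Document type"] = doc_type
    # Function: inferred from the first recognised folder name in the path.
    for folder in p.parts[:-1]:
        if folder.lower() in FUNCTION_BY_FOLDER:
            tags["Function"] = FUNCTION_BY_FOLDER[folder.lower()]
            break
    # Location: the containing folder, when there is one.
    if len(p.parts) > 1:
        tags["Location"] = str(p.parent)
    return tags


print(infer_tags("/sites/HR/Policies/Leave.docx"))
```

Anything inferred this way is one less tag the user has to be asked for.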

Page 18: TermSet metadata tagging presentation - taxonomy bootcamp london 2016

MACHINE TAGGING

• Simple machine tagging can use search to match taxonomy terms to the content of documents

• More advanced taggers allow rules or weights to be assigned to each tag (tags not context aware)

• New technologies (NLP) provide a new approach to creating taxonomies
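The first two bullets can be sketched in a few lines: match taxonomy terms against the document text, weight each match, and keep tags whose score clears a threshold. This is a generic illustration, not any particular product's tagger; the rule table, weights, threshold and function name are assumptions.

```python
import re


def tag_document(text, weighted_terms, threshold=1.0):
    """Return tags whose weighted term matches reach the threshold, best first."""
    lowered = text.lower()
    scores = {}
    for tag, terms in weighted_terms.items():
        score = 0.0
        for term, weight in terms.items():
            # Whole-word match of the term anywhere in the text.
            pattern = r"\b" + re.escape(term.lower()) + r"\b"
            score += weight * len(re.findall(pattern, lowered))
        if score >= threshold:
            scores[tag] = score
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical rules: each tag lists matching terms and their weights.
rules = {
    "Contracts": {"contract": 1.0, "supplier": 0.5},
    "HR": {"holiday": 1.0, "leave": 0.5},
}
text = ("This supplier contract covers cleaning. "
        "The contract runs for two years. Staff may leave early.")

print(tag_document(text, rules))
print(tag_document(text, rules, threshold=0.5))
```

Lowering the threshold lets the "HR" tag through on the strength of the word "leave" alone, which is the "not context aware" weakness the slide mentions.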

Page 19: TermSet metadata tagging presentation - taxonomy bootcamp london 2016

TERMSET TAGGING

• TermSet recommends the right taxonomies for each library (context aware tagging)

• TermSet automates building the underlying IA in SharePoint

• Extra cool NLP tags can be added (Summaries, Sentiment and Language)

• Monitors for new documents and terms arriving into your world

Page 20: TermSet metadata tagging presentation - taxonomy bootcamp london 2016

DEMO – TAGGING DOCUMENTS

Page 21: TermSet metadata tagging presentation - taxonomy bootcamp london 2016

WRAP UP

• TermSet automates a bottom up approach to create and use taxonomies for SharePoint

• Visit www.termset.com or e-mail [email protected] for a free licence

• If you need assistance with top down taxonomies, or you use a different DMS, e-mail me to join the beta program for www.taxononica.com