Integration Issues

IMT 589: Applied and Structural Metadata

February 4, 2006


Solving the Integration Problem

Need to find a way to bring multiple metadata schemas together

Examples from readings this week show several approaches:

Tannenbaum has examples of standalone and distributed repositories, metadata interexchange, and enterprise portals

Hunter discusses mapping elements through a shared term thesaurus

Bedford talks about metadata integration at the World Bank

After looking at some lessons from Bedford, we’ll discuss in more depth the MSWeb example in Rosenfeld and Morville’s article
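To make the thesaurus approach concrete, here is a minimal sketch of mapping elements from two schemas onto shared terms. The thesaurus entries, element names, and sample records are hypothetical and are not taken from Hunter or the other readings.

```python
# Hypothetical shared-term thesaurus: shared term -> element names that mean the same thing.
THESAURUS = {
    "creator": {"creator", "author", "agent"},
    "subject": {"subject", "keyword", "topic"},
    "title": {"title", "headline"},
}


def build_lookup(thesaurus):
    """Invert the thesaurus so each variant element name points to its shared term."""
    lookup = {}
    for shared_term, variants in thesaurus.items():
        for variant in variants:
            lookup[variant.lower()] = shared_term
    return lookup


def to_shared_terms(record, lookup):
    """Rename a record's elements to shared terms; unmapped elements keep their name."""
    merged = {}
    for element, value in record.items():
        shared = lookup.get(element.lower(), element)
        merged.setdefault(shared, []).append(value)
    return merged


lookup = build_lookup(THESAURUS)

# Two records described with different (hypothetical) schemas.
news_record = {"Headline": "Quarterly update", "Author": "A. Writer"}
archive_record = {"Title": "Quarterly update", "Agent": "A. Writer", "Format": "PDF"}

print(to_shared_terms(news_record, lookup))
print(to_shared_terms(archive_record, lookup))
```

Once both records use the shared terms, a single query on `creator` reaches content from either schema. Elements with no thesaurus entry (here `Format`) pass through unchanged rather than being dropped.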


Enterprise Architecture Basics

Design your Enterprise Architecture to support your goals

Enterprise implies integration and context

High-level reference model must take into account the following:
- Functional Architecture
- Technical Architecture
- Content Architecture
- Presentation Architecture

From Bedford, Denise. Presentation to American Society of Indexers Annual Conference, Arlington, Virginia, May 15, 2004


What are the Goals of the World Bank Enterprise Architecture?

Facilitate integration and repurposing of content
- Provide broad search and retrieval capabilities
- Increase reuse and decrease redundancy across content providers

Increase the value and quality of content
- Build intelligent relationships among disparate content sources using concepts and metadata
- Define, enforce, and monitor processes/procedures on content collections to ensure quality

Consistent information security and disclosure enforcement
- Bank records must be consistent in order to facilitate disclosure policy compliance and information sharing for partners

Simplify and complete the content life-cycle
- Reduce the number of user-facing content entry points by using already existent business processes
- Manage content end-to-end from initial inception to final disposition

From Bedford, Denise. Presentation to American Society of Indexers Annual Conference, Arlington, Virginia, May 15, 2004


The ECA Taxonomy View

[Figure not preserved. Surviving labels: Thesaurus, Topics, Language]

From Bedford, Denise. Presentation to American Society of Indexers Annual Conference, Arlington, Virginia, May 15, 2004


Bank Metadata: Purpose & Taxonomies

Elements grouped by purpose:

Identification/Distinction: Agent, Title, Date, Format, Publisher, Language, Version, Series & Series #, Content Type

Search & Browse: Country, Region, Abstract/Summary, Keywords, Subject-Sector-Theme-Topic, Business Function, Aggregation Level, Relation

Use Management: Authorized By, Rights Management, Access Rights, Location, Use History, Disclosure Status, Disclosure Review Date

Compliant Document Management: Record Identifier, Disposal Status, Disposal Review Date, Management History, Retention Schedule/Mandate, Preservation History

Taxonomy types: Flat Taxonomy, Hierarchical Taxonomy, Network Taxonomy, Faceted Taxonomy

From Bedford, Denise. Presentation to American Society of Indexers Annual Conference, Arlington, Virginia, May 15, 2004


Metadata Capture Methods

Capture methods:
- Human Capture
- Inherit from Structured Content
- Programmatic Capture
- Inherit from System Context
- Extrapolate from Business Rules

Elements grouped by purpose:

Identification/Distinction: Agent, Title, Date, Format, Publisher, Language, Version, Series & Series #, Content Type

Search & Browse: Country, Region, Abstract/Summary, Keywords, Subject-Sector-Theme-Topic, Business Function, Aggregation Level, Relation

Use Management: Authorized By, Rights Management, Access Rights, Location, Use History

Compliant Document Management: Record Identifier, Disposal Status, Disposal Review Date, Management History, Retention Schedule/Mandate, Preservation History

From Bedford, Denise. Presentation to American Society of Indexers Annual Conference, Arlington, Virginia, May 15, 2004
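One way to picture the two Bedford slides together is a schema in which every element carries both a purpose and a capture method. The sketch below is illustrative only: the purposes follow the column grouping as read from the slide text, while the pairing of each element with a capture method is an assumption, since the slide's own assignments did not survive extraction.

```python
from dataclasses import dataclass
from enum import Enum


class Purpose(Enum):
    IDENTIFICATION = "Identification/Distinction"
    SEARCH_BROWSE = "Search & Browse"
    USE_MANAGEMENT = "Use Management"
    COMPLIANT_DOC_MGMT = "Compliant Document Management"


class Capture(Enum):
    HUMAN = "Human Capture"
    STRUCTURED_CONTENT = "Inherit from Structured Content"
    PROGRAMMATIC = "Programmatic Capture"
    SYSTEM_CONTEXT = "Inherit from System Context"
    BUSINESS_RULES = "Extrapolate from Business Rules"


@dataclass(frozen=True)
class Element:
    name: str
    purpose: Purpose
    capture: Capture  # hypothetical assignment, not taken from the slide


SCHEMA = [
    Element("Title", Purpose.IDENTIFICATION, Capture.STRUCTURED_CONTENT),
    Element("Keywords", Purpose.SEARCH_BROWSE, Capture.PROGRAMMATIC),
    Element("Abstract/Summary", Purpose.SEARCH_BROWSE, Capture.HUMAN),
    Element("Access Rights", Purpose.USE_MANAGEMENT, Capture.BUSINESS_RULES),
    Element("Record Identifier", Purpose.COMPLIANT_DOC_MGMT, Capture.SYSTEM_CONTEXT),
]


def by_purpose(schema):
    """Group element names under the purpose they serve."""
    groups = {}
    for element in schema:
        groups.setdefault(element.purpose.value, []).append(element.name)
    return groups


def needs_human_capture(schema):
    """List the elements that someone has to fill in by hand."""
    return [e.name for e in schema if e.capture is Capture.HUMAN]


print(by_purpose(SCHEMA))
print(needs_human_capture(SCHEMA))
```

Listing the human-captured elements separately shows how much manual effort a schema demands, which is the practical reason for tracking capture method per element.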


Search As a Service

A project to formalize and productize MSWeb offerings in the enterprise search arena

Included search, metadata support, search metrics, and optional UI

Also included search and vocabulary management tools, formal documentation and processes for customer support and change control

The first step toward an object-based portal


Metadata in Search

[Diagram of the search pipeline. Components: User, Other Users, User Interface, Query Preprocessing, Searching, Index(es), Result Set Manipulation, Result Refining, Indexer, Indexing, Data Stores, Data Analysis, Independent Metadata, Index Metadata, User Metadata]

Annotations in the diagram:
- Data stores: file system, HTTP, messaging stores, document store, databases, directory stores
- Indexing: adaptive crawling, word breaking, word stemming, NLP
- Independent metadata: database schemas, thesauri
- Query preprocessing: string manipulation, synonym sets and thesauri, stemming, word breaking
- Result set manipulation: deduping, concatenation, ranking
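The query preprocessing and result set manipulation steps named in the diagram can be sketched in a few lines. This is a toy illustration, not the MSWeb implementation: the synonym sets, the suffix-stripping stemmer, the scoring rule, and the sample indexes are all assumptions made for the example.

```python
import re

# Hypothetical synonym sets; a real system would load these from a thesaurus.
SYNONYMS = {"hr": {"human", "resources"}, "vacation": {"leave"}}


def word_break(text):
    """Split a query or document into lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def stem(word):
    """Very naive suffix stripping, standing in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(query):
    """Word-break the query, expand synonyms, and stem every term."""
    terms = set()
    for token in word_break(query):
        terms.add(stem(token))
        terms.update(stem(s) for s in SYNONYMS.get(token, ()))
    return terms


def search(index, terms):
    """Score each document by the number of query terms it contains."""
    hits = []
    for url, text in index:
        score = len(terms & {stem(t) for t in word_break(text)})
        if score:
            hits.append((url, score))
    return hits


def manipulate(*result_sets):
    """Concatenate result sets, dedupe by URL (keeping the best score), then rank."""
    best = {}
    for results in result_sets:
        for url, score in results:
            best[url] = max(score, best.get(url, 0))
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


index_a = [("/hr/leave", "Leave policies for employees"), ("/it/help", "Help desk")]
index_b = [("/hr/leave", "Vacation and leave"), ("/hr/forms", "Human resources forms")]

terms = preprocess("HR vacation")
ranked = manipulate(search(index_a, terms), search(index_b, terms))
print(ranked)
```

The query "HR vacation" never mentions "leave" or "human resources", yet both pages are found because the synonym sets expand the query before it reaches the indexes. The page that appears in both indexes is returned once.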


Preliminary Architecture

[Figure 1: preliminary architecture diagram showing a core vocabulary repository, non-core repositories, a Metadata Tag Set Registry (Core Tag Set and Tag Sets 1 to 5), and four interfaces (Embedded Search, Explicit Search, Browsing, Tagging). Example tags in the tag sets include Title, Author, Keyword, Category, Subject, Org., Geography, Audience, Location, and Product. Numbered markers 1 to 12 correspond to the interactions listed below.]

Callouts in the figure:
- Registry can be used to segment results through matches to predetermined tags and vocabularies
- Results can be segmented through pulldown menus exposing tag sets
- Intermediate disambiguation of search terms
- Vocabularies can be exposed based on user-defined metadata, allowing customized views of content
- Registry can be used to expose tags (and vocabularies through them) for authoring tools

There are three layers of possible interaction between vocabularies, metadata tag sets and the search/tagging interface. Examples of these interactions are shown in Figure 1, and described briefly below. Using a common data model for vocabulary construction allows the most powerful interactions between the layers; however, any vocabulary registering its metadata tag set can be exposed to a user at the time of search or browse.

Vocabularies using this repository can:

1) Reuse terms in other vocabularies.
2) Reuse terms with structure in other vocabularies.
3) Map entire vocabularies to metadata tags.
4) Map parts of vocabularies to metadata tags.
5) Expose their terms and synsets to non-core users.

Vocabularies not using this repository can:

6) Register metadata tags in the metadata registry.
7) Use terms and synsets from core vocabularies.

All vocabularies associated with metadata tags can:

8) Be linked to tags from other tag sets through registry mapping.

All registered metadata tag sets can:

9) Expose metadata tags and associated vocabularies for publishing tools.
10) Segment results in searches through matches to predetermined tags and vocabularies at time of query.
11) Expose metadata tags before search as an advanced user interface, or during search as an intermediate query refinement assist.
12) Expose vocabularies and tags in whole or in part as browsing structures for content (as in Yahoo!).
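A few of the interactions above (registering tag sets, linking tags across tag sets, and segmenting results) can be sketched as a small registry. The class, method names, and sample records below are hypothetical; only the tag and term labels echo Figure 1.

```python
from collections import defaultdict


class TagSetRegistry:
    """Toy metadata tag set registry (all names here are illustrative)."""

    def __init__(self):
        self.tag_sets = {}   # tag set name -> {tag name -> vocabulary terms, or None}
        self.mappings = {}   # (tag set name, tag name) -> core tag name

    def register(self, tag_set, tags):
        """Interaction 6: register a tag set and the vocabularies behind its tags."""
        self.tag_sets[tag_set] = dict(tags)

    def map_tag(self, tag_set, tag, core_tag):
        """Interaction 8: link a tag to a core tag through registry mapping."""
        if tag not in self.tag_sets.get(tag_set, {}):
            raise KeyError(f"{tag!r} is not registered in {tag_set!r}")
        self.mappings[(tag_set, tag)] = core_tag

    def core_tag(self, tag_set, tag):
        """Resolve a tag to its core tag, or return it unchanged if unmapped."""
        return self.mappings.get((tag_set, tag), tag)

    def segment(self, results, core_tag):
        """Interaction 10: group results by the value they carry for a core tag."""
        groups = defaultdict(list)
        for result in results:
            for tag, value in result["tags"].items():
                if self.core_tag(result["tag_set"], tag) == core_tag:
                    groups[value].append(result["title"])
        return dict(groups)


registry = TagSetRegistry()
registry.register("Tag Set 1", {"Org.": {"org 1.1", "org 1.1.1"}, "Title": None})
registry.register("Tag Set 2", {"Group": {"org 1.1"}, "Headline": None})
registry.map_tag("Tag Set 1", "Org.", "Organization")
registry.map_tag("Tag Set 2", "Group", "Organization")

results = [
    {"title": "Budget memo", "tag_set": "Tag Set 1", "tags": {"Org.": "org 1.1"}},
    {"title": "Team page", "tag_set": "Tag Set 2", "tags": {"Group": "org 1.1"}},
    {"title": "Style guide", "tag_set": "Tag Set 1", "tags": {"Org.": "org 1.1.1"}},
]
print(registry.segment(results, "Organization"))
```

Because both `Org.` and `Group` map to the same core tag, results tagged under different tag sets land in the same segment. That is the benefit the registry offers even to vocabularies that do not share the common data model.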


Products Included

Search
- Query and results
- Best Bets
- Catalogs

Metadata
- Management Tools
- Metadata Registry
- Unified Catalog Service

Vocabularies
- Core vocabularies
- Other vocabularies
- Categories
- I Need Tos

Showcase
- Support
- Documentation
- Demos, screenshots
- Code


Quick Return

In order to show customers and users what could be done, the MSWeb team focused on creating a service that used editorial selection, tagging and the core vocabularies to create a small set of highly relevant content for MSWeb and sub-portals

This was leveraged in:
- Navigation through categories
- Search results (Best Bets)
- Customizable versions of both for sub-portals


What Did This Involve?

Fewer than 100 categories

Fewer than 1000 tagged surrogate records for high-demand search content

Those 1000 records are used in categories, Best Bets, and other areas on the site

Takes only a few hours a week to maintain and update the database

Results enabled users to directly navigate and search for high-use content
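The reuse described above, where one small set of tagged surrogate records feeds both category navigation and Best Bets, can be sketched as follows. The record fields, synonym table, URLs, and sample records are hypothetical and do not reflect the actual MSWeb database design.

```python
from dataclasses import dataclass, field


@dataclass
class SurrogateRecord:
    """A hand-tagged stand-in for a piece of high-demand content."""
    title: str
    url: str
    category: str
    terms: set = field(default_factory=set)  # vocabulary terms applied by an editor


# Hypothetical entry terms mapped to preferred vocabulary terms.
SYNONYMS = {"401k": "retirement plan", "pto": "vacation"}

RECORDS = [
    SurrogateRecord("Vacation policy", "http://example.invalid/vacation", "Benefits", {"vacation"}),
    SurrogateRecord("Retirement plan enrollment", "http://example.invalid/retirement", "Benefits", {"retirement plan"}),
    SurrogateRecord("Expense reports", "http://example.invalid/expenses", "Finance", {"expenses"}),
]


def normalize(query):
    """Lowercase, collapse whitespace, and swap an entry term for its preferred term."""
    cleaned = " ".join(query.lower().split())
    return SYNONYMS.get(cleaned, cleaned)


def best_bets(query, records=RECORDS):
    """Return the editorially tagged records that match the query term."""
    term = normalize(query)
    return [r for r in records if term in r.terms]


def browse(category, records=RECORDS):
    """Return titles for a category page, drawn from the same records."""
    return [r.title for r in records if r.category == category]


print([r.title for r in best_bets("PTO")])
print(browse("Benefits"))
```

Both functions read the same list, so an editor who tags one record improves search and navigation at once. That shared source is what keeps the weekly maintenance cost low.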


Common Elements


How It Works

From Peter Morville: http://semanticstudios.com/events/iacm1102.ppt


Moving to the Enterprise

Once MSWeb had shown what was possible, the next step was to deploy to the sub-portals. To make this happen, they:

- Turned each user’s query (whether from an end user or a sub-portal search page) into an XML request that returns an XML response of search results
- Leveraged taxonomy work to provide a deep resource for search and browse
- Enabled rapid customization by embedding parameters into the XML schema
- Provided consultation and assistance with category building and tagging skills

Result: everyone was on the same platform and using the core vocabularies


How It Works

[Diagram. Labels: Input query (XML), Search DLL, Vocabulary and Schema Database, Modified string, Site Server indexes, Search results (XML)]


Query String Parsing

[Figure not preserved. Surviving labels: Query: XML in; SQL2000]


MSWeb Search


MSW All-Intranet Schema

Query type & schema (from query parser)

Results properties definition

Collection definition

Sort parameters


Category Schema

Results properties definition

Collection definition (Category set)

Hierarchy display parameters

Sort parameters


What This Does

Puts all processing on the server side: the client (the sub-portal server) needs only a few lines of code to pass and receive XML streams

Client (sub-portal server) site is insulated from code changes

Simple parameter changes allow customization of collections, query type, indexed properties, etc.
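As a rough illustration of customization through parameters alone, the sketch below derives a sub-portal configuration from a shared default. The parameter names and values are assumptions loosely modeled on the schema sections listed earlier (query type, collection definition, results properties, sort parameters); they are not the actual MSWeb settings.

```python
# Hypothetical shared defaults for an all-intranet search schema.
DEFAULT_SCHEMA = {
    "query_type": "freetext",
    "collections": ["all-intranet"],
    "result_properties": ["title", "url", "description"],
    "sort": "rank",
}


def customize(base, **overrides):
    """Return a copy of the base schema with a sub-portal's parameter changes applied."""
    unknown = set(overrides) - set(base)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    return {**base, **overrides}


# A sub-portal narrows the collection and changes the sort order, nothing else.
hr_portal = customize(DEFAULT_SCHEMA, collections=["hr"], sort="date")
print(hr_portal)
```

The sub-portal supplies two values and inherits the rest. The shared defaults are left untouched, so a later change to them reaches every sub-portal that has not overridden that parameter.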


The Old Way


The New Way


NTServer


ITGWeb


WordTest


PGPortal


Measures for Success

For MSWeb, the goal was to increase users’ ability to navigate easily and find information more quickly

For sub-portals, the goal was to have the ability to leverage MSWeb’s resources locally in a sustainable way

Here are some results


Results for End-Users

Key measure                                                            Q4 99    Q1 00    Q2 00
Total number of registered sites                                       834      858      808
Average # Best Bets (BB) returned with 20 top search strings           3.6      2.75     4.35
Modal # BB with top 20                                                 1        5        1
Median # BB with top 20                                                2.5      3        3
Percentage of all top search strings that return Best Bets             69%      85%      98%
Percentage of 50 top search strings that return BBs                    82%      84%      98%
Percentage of 20 top search strings that return BBs                    90%      80%      100%
Number of all top search strings returning 10 or more Best Bets        18       12       5
Number of top 50 search strings returning 10 or more BB                6        10       5
Number of top 20 search strings returning 10 or more BB                3        6        4
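Measures like these can be derived from a list of Best Bet counts, one per top search string. The counts below are made up for illustration (they are not the MSWeb data), and whether the original averages included strings that returned no Best Bets is not stated, so this sketch simply includes them.

```python
from statistics import mean, median, mode

# Hypothetical number of Best Bets returned for each of ten top search strings.
best_bets_per_query = [1, 0, 3, 1, 12, 5, 2, 1, 0, 10]


def summarize(counts, threshold=10):
    """Compute the kinds of measures reported in the table above."""
    returning = [c for c in counts if c > 0]
    return {
        "average": mean(counts),
        "mode": mode(counts),
        "median": median(counts),
        "pct_returning": round(100 * len(returning) / len(counts)),
        "at_or_over_threshold": sum(1 for c in counts if c >= threshold),
    }


summary = summarize(best_bets_per_query)
print(summary)
```

In this made-up data the average (3.5) sits well above the median (1.5) because two strings return ten or more Best Bets. Reporting average, mode, and median together makes that kind of skew visible.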


User Satisfaction

Usability testing provided the following before and after numbers:
- A 62% reduction in the number of clicks
- An average of 16 seconds saved per task
- An 11% increase in task success rate

High employee satisfaction with the site:
- 42% VSAT in year 2000 field survey
- Only 4% DSAT on same survey


Portals Using MSWeb Services

In the first three months of the offering, 9 sub-portals implemented search on their sites

2 of those created site-specific categories for their navigation

All leveraged the MSWeb Best Bets results in their custom search

No increase in staff at MSWeb

Equivalent to a cost savings of 45 person years in avoided work


What Worked?

Providing a clear example of taxonomy value in MSWeb search and navigation

Building a taxonomy management tool that used an extensible data model

Empowering portal owners to do their own management of site navigation, and separating it from the core shared taxonomy

Divorcing presentation from delivery through use of XML and XSL

Leveraging the taxonomy through tools to support all of the above