let search power your intranet!

Post on 19-Mar-2017

Copyright © President & Fellows of Harvard College

Let Search Power Your Intranet!

Ravi Mynampaty

About Ravi

A hustler making a living by pretending to know more about Enterprise Search than he actually does...

“I can live on a good compliment two weeks with nothing else to eat...”

@RaviMynampaty

Why the heck should I listen to Ravi?

Agenda

CASH

Architecture

Demo

Search Index

Content UI

Umbrella Policy

What's this talk about?

Many metaphors...

Data lakes

Warehouses

Silos

Cleanse

Assemble

Supplement

Harmonize

Content

Search Index

CMS Records

Web pages

RDBMS

Cleanse, Assemble, Supplement, Harmonize

Learning Management Systems

Service Management Systems

CRM Systems etc.

Why?

Options

What? VF ⇒ PF

Replace variant forms (VF) with Preferred form (PF)

E.g., All variant forms of “Apple...” ⇒ “Apple Inc.”
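The VF ⇒ PF replacement above can be sketched as a lookup on a normalized key. This is a minimal sketch, not the method used at HBS: the variant table, `normalize_key`, and `to_preferred_form` are hypothetical names and data introduced for illustration.

```python
import re

# Hypothetical variant-form table; the entries are illustrative only.
PREFERRED_FORMS = {
    "apple": "Apple Inc.",
    "apple inc": "Apple Inc.",
    "apple computer": "Apple Inc.",
    "apple computer inc": "Apple Inc.",
}


def normalize_key(value: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace to build a lookup key."""
    key = re.sub(r"[^\w\s]", "", value.lower())
    return re.sub(r"\s+", " ", key).strip()


def to_preferred_form(value: str) -> str:
    """Return the preferred form for a known variant, else the value unchanged."""
    return PREFERRED_FORMS.get(normalize_key(value), value)


if __name__ == "__main__":
    for raw in ["Apple, Inc.", "APPLE  COMPUTER", "Some Other Co."]:
        print(raw, "=>", to_preferred_form(raw))
```

Unknown values pass through untouched, so the table can grow over time without the cleansing step ever dropping data.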

Where? Source, UI, Search Index

Metadata

PDF

PDF

Metadata

Search Index

Single Record
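The Assemble step (metadata plus PDF full text becoming a single record) can be sketched as a join on a shared identifier. This is a minimal sketch under stated assumptions: the `id` and `fulltext` field names and the sample values are hypothetical, and the text extraction itself is taken as already done.

```python
def assemble(metadata_rows, fulltext_by_id):
    """Merge each metadata row with the extracted text that shares its id."""
    records = []
    for row in metadata_rows:
        record = dict(row)  # copy so the source row is not mutated
        # A row with no matching document still yields a record, with empty text.
        record["fulltext"] = fulltext_by_id.get(row["id"], "")
        records.append(record)
    return records


if __name__ == "__main__":
    metadata = [{"id": "doc-1", "title": "Sample paper"}]  # hypothetical CMS row
    pdf_text = {"doc-1": "Extracted body text of the PDF."}  # hypothetical extract
    print(assemble(metadata, pdf_text))
```

Indexing the joined record means a query can match on either the metadata or the document body and still return one result.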

Why?

Document

Analytics (Popularity)

Search Index

Single Record

- Postal code ⇒ City, State
- Implicit metadata
- Link depth

Supplemented Records

Id <Record 1>

URL www.hbs.edu/faculty/

Popularity 950218

LinkDepth 1

Id <Record 2>

URL www.hbs.edu/mba/academic-experience/blog/post/hbs-global-initiative-research-centers

Popularity 2493

LinkDepth 6
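A supplement step along these lines might look like the sketch below. It is an illustration under assumptions: the `PostalCode`, `City`, and `State` field names, the lookup table, and the `supplement` function are hypothetical, and the popularity figures are taken as already exported from an analytics tool.

```python
# Hypothetical lookup table; a real one would come from a postal reference dataset.
POSTAL_LOOKUP = {"02163": ("Boston", "MA")}


def supplement(record, popularity_by_url, postal_lookup=POSTAL_LOOKUP):
    """Return a copy of the record enriched with City/State and Popularity."""
    enriched = dict(record)
    city_state = postal_lookup.get(record.get("PostalCode"))
    if city_state is not None:
        enriched["City"], enriched["State"] = city_state
    # Pages with no analytics data default to a popularity of 0.
    enriched["Popularity"] = popularity_by_url.get(record.get("URL"), 0)
    return enriched


if __name__ == "__main__":
    analytics = {"www.hbs.edu/faculty/": 950218}
    page = {"URL": "www.hbs.edu/faculty/", "PostalCode": "02163"}
    print(supplement(page, analytics))
```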

For the sake of Relevance!

http://www.hbs.edu/search.aspx?q=finance&.....&bboost=sum(product(sub(10,LinkDepth),0.1),max(log(Popularity),1))&...
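To see what the boost expression does, it can be evaluated by hand for the two supplemented records. The sketch below assumes the functions follow Solr function-query semantics, where `log` is base 10; whether the `bboost` parameter is passed to Solr unchanged is not stated on the slide. The Python function name is hypothetical.

```python
import math


def bboost(link_depth: int, popularity: int) -> float:
    """Evaluate sum(product(sub(10,LinkDepth),0.1), max(log(Popularity),1)).

    Assumes log is base 10 and popularity is positive.
    """
    depth_term = (10 - link_depth) * 0.1  # shallower pages score higher
    popularity_term = max(math.log10(popularity), 1)  # floor of 1 for rarely viewed pages
    return depth_term + popularity_term


if __name__ == "__main__":
    print("Record 1:", round(bboost(1, 950218), 3))
    print("Record 2:", round(bboost(6, 2493), 3))
```

Under these assumptions Record 1 (LinkDepth 1, Popularity 950218) scores roughly 6.9 and Record 2 (LinkDepth 6, Popularity 2493) roughly 3.8. The logarithm keeps very popular pages from swamping everything else, while the depth term favors pages close to the home page.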

Standardized Field Names

Before:

Record 1

Field: URL

Record 2

Field: Link

Record 3

Field: Webaddress

After:

Record 1

Field: HBSLink

Record 2

Field: HBSLink

Record 3

Field: HBSLink
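The field-name standardization can be sketched as a rename map applied to every record before indexing. The source field names (`URL`, `Link`, `Webaddress`) and the target `HBSLink` come from the slide; the `FIELD_ALIASES` table, the `harmonize` function, and the sample values are hypothetical.

```python
# Source-specific field names mapped to the one standardized name.
FIELD_ALIASES = {
    "URL": "HBSLink",
    "Link": "HBSLink",
    "Webaddress": "HBSLink",
}


def harmonize(record):
    """Return a copy of the record with aliased field names replaced."""
    return {FIELD_ALIASES.get(name, name): value for name, value in record.items()}


if __name__ == "__main__":
    records = [
        {"URL": "www.hbs.edu/faculty/"},
        {"Link": "www.example.org/a"},  # hypothetical value
        {"Webaddress": "www.example.org/b"},  # hypothetical value
    ]
    for record in records:
        print(harmonize(record))
```

Once every source writes to the same field, one query and one relevance rule can cover all of the content.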

OTC: One True Collection

Why?

Users

Federated Search

One search box to rule them all

Multiple search tool federation

Harmony at the SERP level

What we thought was the Holy Grail

One True Collection (OTC)

One search box to rule them all

Single search index

Harmony at the result level

What the users wanted

Content

Search Index

Your Bank!
● Fast lookup
● Web services for all our content

CMS Records

Web pages

RDBMS

Learning Management Systems

Service Management Systems

CRM Systems

Intranet Websites etc.

Cleanse
● Normalize data
● Remove special chars
● …

Assemble (joins)
● PDF full-text
● Person record
● ...

Supplement
● Analytics (popularity)
● Postal code ⇒ City, State
● Implicit metadata
● LinkDepth
● ...

Harmonize
● Standard fields
● One True Collection
● ...

Start making withdrawals!!

Enterprise Search Architecture

[Architecture diagram: HBS Search Service. Labels recoverable from the slide: CASH; Solr; Solr XML; Web Services APIs; CMS; Web; Websites; Legacy 1; Legacy 2; IT Assets; Assets; Collections; Oracle DB; Java DB Import; Loader; Web Loader; Web Connector; XML Connector; Informatica; Crawl Pages; Intranet, Apps, Portals.]

Some Architectural Considerations

Search index design

Hardware, Scalability

Query optimization

High Availability (HA), Disaster Recovery (DR)

Analytics, Ongoing relevance tuning

Security

Some Security Considerations

Logged In/Out

Repository level, Document level, Field level

Group-based, Role-based, Individual-based

Index-time vs. Query-time
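One way these considerations can combine is to store a group-based access list on each record at index time and trim results against the user's groups at query time. The sketch below is only an illustration of that arrangement, not the approach used at HBS: the `allowed_groups` field, the `public` group for logged-out users, and the function name are all hypothetical.

```python
def visible_results(results, user_groups):
    """Keep only documents whose access list shares a group with the user."""
    groups = set(user_groups)
    return [doc for doc in results if groups & set(doc.get("allowed_groups", []))]


if __name__ == "__main__":
    # Access lists are attached to each record when it is indexed.
    hits = [
        {"id": "1", "allowed_groups": ["public"]},
        {"id": "2", "allowed_groups": ["staff"]},
        {"id": "3", "allowed_groups": ["faculty", "staff"]},
    ]
    print(visible_results(hits, ["public"]))  # logged-out user
    print(visible_results(hits, ["public", "staff"]))  # logged-in staff member
```

A document with no access list is hidden from everyone in this sketch, which errs on the side of not leaking content.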

Demo!

Thank you! Questions?

searchguy@hbs.edu
@RaviMynampaty
linkedin.com/in/mynampaty
facebook.com/ravi.mynampaty