Gregory Grefenstette, Exalead

Exalead S.A. © 2009

Search-Based Applications: the Maturation of Search

Maturation of Search

www.exalead.com/search: 8 billion URLs, 2 billion images, 200 million videos, Wikipedia, cloud tags; also Labs.exalead.com

Two ways to find information

DATABASES VS SEARCH ENGINES

Recent Past

DATABASES

• Structured Data
• Transactions
• Precise
• All tuples
• SQL
• Slow

SEARCH ENGINES

• Text
• Similarity
• Ranking
• Intuitive
• Fast
• Partial

More Recent

DATABASES

• Structured Data
• Transactions
• Precise
• All tuples
• SQL
• Slow
• Top-K
• Column store
• Map Reduce
• Data Cube

SEARCH ENGINES

• Text
• Similarity
• Ranking
• Intuitive
• Fast
• Partial
• Connectors
• Facets
• Map Reduce
• Tables

NOW

DATABASES | SEARCH BASED APPLICATIONS | SEARCH ENGINES

Search based Application

An application that uses a search engine component, but whose final purpose is not to find a document but to deliver a domain-oriented process result

– Examples:
• Custom response management
• Logistic tracking and tracing
• Contextual Advertising
• Database reporting after offloading

Current situation

Databases are the backbone of search in information systems

[Diagram: Database, Data Warehouse, Data Mart; Front-office users, BI reports, Business processes]

Search-enabled application

Optimized solution for information access

[Diagram: Database, Data Warehouse, Search Engine; Front-office users, BI reports, Business processes]

Drawbacks of Using Database Search as a Component

[Figure panels: Search Based Architecture; Standard Architecture]

How does a Search Based Application work?

• Business items are concrete objects directly understandable by end users
  – Product, Customer, Purchase order, Technical support call
• Each business item becomes a document
• The straightforward and simple format of the document index allows performance and ease of use
• The search engine can offer a rich and powerful query language that allows queries as complex and advanced as SQL despite the flat data model
• The search engine must support
  – typed fields, intra-field scope search, categories/facets

Database converted to Business Items

Stored as structured documents

Product_ID | Product_Name      | Manufacturer_Names
123        | control switch    | ACME Inc ; The Control Switch Company ; Karl GmbH
124        | red warning light | …

Database into structured documents

Scope Search

Product_ID | Product_Name
123        | control switch
124        | red warning light

Product_ID | Manufacturer_ID
123        | 345
123        | 8574
123        | 4483

Manufacturer_ID | Manufacturer_NAME
345             | ACME Inc.
8574            | The Control Switch Company
4483            | Karl GmbH

Product_ID | Product_Name      | Manufacturer_Names
123        | control switch    | ACME Inc ; The Control Switch Company ; Karl GmbH
124        | red warning light | …

… but the manufacturer names can still be searched as individual records with scope search ("ACME GmbH" does not match the document here)
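The flattening and the scope search described above can be sketched in a few lines of Python. This is only an illustration, not Exalead's implementation: the function names (`build_documents`, `scope_match`, `field_match`) and the whitespace tokenizer are assumptions made for the example. The rows are copied from the tables above, and because the slide elides the manufacturers of product 124, none are invented here.

```python
from collections import defaultdict

# Rows copied from the three normalized tables shown on the slide.
products = {123: "control switch", 124: "red warning light"}
product_manufacturer = [(123, 345), (123, 8574), (123, 4483)]
manufacturers = {
    345: "ACME Inc.",
    8574: "The Control Switch Company",
    4483: "Karl GmbH",
}


def build_documents():
    """Flatten the join into one document per product (the business item)."""
    names = defaultdict(list)
    for product_id, manufacturer_id in product_manufacturer:
        names[product_id].append(manufacturers[manufacturer_id])
    # Product 124 gets an empty list: the slide does not show its manufacturers.
    return [
        {"Product_ID": pid, "Product_Name": pname, "Manufacturer_Names": names[pid]}
        for pid, pname in products.items()
    ]


def tokens(text):
    """Hypothetical tokenizer: lowercase words with edge punctuation stripped."""
    return {word.strip(".;,").lower() for word in text.split()}


def scope_match(doc, field, query):
    """True only if ALL query words occur inside the SAME value of the field."""
    wanted = tokens(query)
    return any(wanted <= tokens(value) for value in doc[field])


def field_match(doc, field, query):
    """Naive whole-field match, for contrast: words may come from different values."""
    wanted = tokens(query)
    everything = set().union(*(tokens(value) for value in doc[field]))
    return wanted <= everything


if __name__ == "__main__":
    doc = build_documents()[0]
    print(scope_match(doc, "Manufacturer_Names", "ACME GmbH"))
    print(field_match(doc, "Manufacturer_Names", "ACME GmbH"))
```

The contrast between the two functions is the point: a plain match over the concatenated field would accept "ACME GmbH" because both words appear somewhere in it, whereas the scoped match treats each manufacturer name as its own record inside the document.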

Hierarchical categories

Product_ID | Color | Brand | Fragile | Nb of wheels | Wheel type
123        | Red   | ACME  | Y       | 3            | 2

Product_ID | Country
123        | France
123        | UK
123        | Germany

Product_ID | Attributes
123        | Color/Red ; Brand/ACME ; Fragile/Y ; Nb_wheels/3 ; Wheel_type/2 ; Country/France ; Country/UK ; Country/Germany
124        | …

Multiple kinds of attributes can be mixed in the same category field. The hierarchical tree structure of the categories preserves the differences between attribute types.

Multi-valued attributes can also be represented by categories. A single category field can be used to store hundreds or thousands of attribute columns.
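As a rough illustration of this encoding, the sketch below turns the attribute columns and the multi-valued country rows of product 123 into "Type/Value" paths held in a single category field, then recovers one attribute type from the mix. The helper names `to_category_paths` and `values_of` are hypothetical, and a real engine would store the paths in a category tree rather than a Python list.

```python
def to_category_paths(single_valued, multi_valued):
    """Encode attribute columns as 'Type/Value' paths in one category field."""
    paths = [f"{name}/{value}" for name, value in single_valued.items()]
    for name, values in multi_valued.items():
        # A multi-valued attribute simply contributes several paths.
        paths.extend(f"{name}/{value}" for value in values)
    return paths


def values_of(paths, attribute):
    """Recover the values of one attribute type from the mixed category field."""
    prefix = attribute + "/"
    return [path[len(prefix):] for path in paths if path.startswith(prefix)]


# Values taken from the tables for product 123.
row = {"Color": "Red", "Brand": "ACME", "Fragile": "Y", "Nb_wheels": 3, "Wheel_type": 2}
countries = {"Country": ["France", "UK", "Germany"]}

attributes = to_category_paths(row, countries)

if __name__ == "__main__":
    print(" ; ".join(attributes))
    print(values_of(attributes, "Country"))
```

Because the attribute type is the first level of each path, adding a new attribute column requires no schema change: it is just one more prefix in the same field.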

Multi-dimensional facets

• Search results facets provide aggregate values computed on the fly with the search results list
  – One single search query can return the equivalent of dozens of “GROUP BY” SQL clauses
  – Numerical values associated with facets (count, score, …) can be used to perform complex computations on the results list
• Search performance is not affected by the size of the category tree
  – Thousands of attribute types can be represented by categories
  – Facets are dynamically selected by the search results: the displayed attributes are always consistent with the search query (e.g. “color” and “engine type” when searching for a car, “screen size” and “CPU speed” when searching for a laptop)
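A minimal sketch of the facet idea, assuming documents carry "Type/Value" category paths as in the hierarchical categories slide: one pass over the hits yields a count per value for every attribute type present, which is what a separate "GROUP BY" per attribute would return in SQL. The function `search_with_facets` and the three-item `catalog` are invented for illustration and are not data from the presentation.

```python
from collections import Counter, defaultdict


def search_with_facets(documents, predicate):
    """Return matching documents plus per-attribute value counts, in one pass."""
    hits = []
    facets = defaultdict(Counter)
    for doc in documents:
        if not predicate(doc):
            continue
        hits.append(doc)
        for path in doc["Attributes"]:
            attribute, _, value = path.partition("/")
            facets[attribute][value] += 1
    return hits, facets


# Hypothetical catalog, made up for this example.
catalog = [
    {"Name": "city car", "Attributes": ["Type/car", "Color/Red", "Engine_type/petrol"]},
    {"Name": "family car", "Attributes": ["Type/car", "Color/Blue", "Engine_type/diesel"]},
    {"Name": "small laptop", "Attributes": ["Type/laptop", "Screen_size/13", "CPU_speed/2.4GHz"]},
]


def name_contains(word):
    """Build a predicate that matches documents whose name contains the word."""
    return lambda doc: word in doc["Name"].split()


if __name__ == "__main__":
    hits, facets = search_with_facets(catalog, name_contains("car"))
    for attribute, counts in facets.items():
        print(attribute, dict(counts))
```

Only attribute types that occur in the hits appear in the result, so a query for cars surfaces color and engine type while a query for laptops surfaces screen size and CPU speed, without any per-query configuration.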

CASE STUDY

LOGISTICS TRACK & TRACE

Gefco overview

• A subsidiary of French car maker PSA (Peugeot, Citroën)
  – Now does most of its business outside of PSA
• Logistics operator
  – Carries cars from factories to dealers (road, rail)
  – Carries freight (parcels; originally spare parts)
  – Supply chain and logistic platform design
• 3.5B€, 10 000 employees, 100 countries

The original pain

• Classical multi-criteria search over Oracle, 2 million rows
• Poor performance despite 2 years of optimization
  – Response times measured in minutes
  – Users were asked to run only simple queries, preferably at given hours

From forms to a search box

New application with operational reporting

French Post Office

Partner

• Tracing of incidents
• Real-time system
• Used as an internal audit tool for the mail
• Suggestion of addresses for customers
• Search in file numbers, addresses, names, etc.

Case Study: RightMove

Rightmove: Reduce Costs and Improve Performance through Database

Advantages of Search Based Applications

Conclusions

• Search engines mature
  – Structured data, high volume, high speed
• Search based Applications offer
  – Usage: Search interface familiar to user
  – Performance: Search engine geared to search, eases load on database platform
  – Agility: Original database design untouched, reconfiguring output lightweight
