Getting the Most Out of Type-Ahead/Autocomplete - LavaCon 2015 Proposal by Brian Eisenberg


Upload: jack-molisani

Post on 15-Jul-2015


Copyright © 2013 Earley & Associates, Inc. All Rights Reserved.

Essential site search features and functionality and how they can be used to deliver an improved search experience

Advances in Search & Findability


Outline

• Introduction

• What’s this webinar about and why should I care?
    Leveraging taxonomies for search
    Search issues on LOC.GOV

• Overview of essential search features and functionality
    Setting up search
    Search analytics
    Faceted search
    Auto complete
    Redirects
    Auto Correct
    Sort
    Compare

• Q&A


Getting the most out of type-ahead/autocomplete

In this lecture, attendees will learn about the latest and greatest in type-ahead and autocomplete technology and functionality.

Brian Eisenberg, Associate Search & Taxonomy Consultant – Earley & Associates

5 years’ experience leading search & taxonomy

Experience with Endeca, ATG, Omniture

Introduction


Why is this important?

• Identify some search issues we see on special library sites

• Show the features and tools we use on popular search engine platforms to address relevancy

• Talk about how search benefits from well-designed taxonomies

• Hopefully you’ll get some ideas on ways these features can be leveraged in your organizations


• Integrating a taxonomy for search can improve the results and experience in several ways:

Auto-completion using taxonomy entities

Refinement of results using the full taxonomy (faceted search/browse)

Synonym expansion of content based on taxonomy

Ability to expand results or begin navigation of the taxonomy

Leveraging Taxonomies for Search

Pre-search filtering in auto complete based on taxonomy

Post-search filters (faceted search)


• Conducting a misspelled search query on loc.gov

User is prevented from seeing the thousands of great results available at LOC because there is not a simple spellcheck feature in place.

Opportunity is lost to teach the searcher the correct spelling.

Search on LOC.GOV


Search on LOC.GOV

• Selecting the top result from the LOC SERP shows that even that result was not relevant: a ‘false positive’ that would have been eliminated by using some of the core relevancy ranking features we’ll be discussing.


• Query was automatically corrected based on a probability algorithm, and the desired results were delivered

• Similar tools are available on leading open source and commercial search engines

Autocorrect on Google


A brief overview of setting up search and measuring quality

Setting up Site Search


Most search engines come out of the box with relevancy scoring based on a popular model, such as TF/IDF, which extends a simple Boolean model.
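As a rough illustration of the TF/IDF idea, the sketch below scores documents by summing term frequency times a smoothed inverse document frequency for each query term. It is a textbook variant written for this transcript, not the scoring formula of any particular search engine; the function name and whitespace tokenization are assumptions.

```python
import math
from collections import Counter


def tf_idf_scores(query, docs):
    """Score each document against the query with a basic TF-IDF sum."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if tf[term]:
                # Rare terms get a higher weight; the +1.0 is a simple smoothing choice.
                idf = math.log(n_docs / df[term]) + 1.0
                score += tf[term] * idf
        scores.append(score)
    return scores
```

A document that mentions a query term more often scores higher, and a document with no matching terms scores zero.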

Search is sometimes set up by IT alone, which often leads to poor results in terms of relevancy and UX.

Relevant search results come from understanding the business and user needs and the content available, and then customizing the search experience to support those goals.

Test, review, iterate.

A few search ranking factors to highlight:

Date of publication
    Boost documents that have been published more recently

Number of matching terms (Min Match)
    Can define the number and/or percent of terms from a multi-term query that must match a document for it to be considered relevant (e.g. if the query has 1 or 2 terms, all must match; if 3 terms, then 2 of 3 must match; if 4 or more, then 75% or more must match)

Field weight boosts
    Can give preference to matches in the title or header of a page, which are more indicative of the ‘aboutness’ of the document, over matches in the body

Document Type boosts
    Can give preference to certain types of content (e.g. buying guides over products over photos)

Term proximity
    Determine how far apart terms may appear for the document to be considered relevant (e.g. Franklin Roosevelt)

Developing the search algorithm
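The Min Match example on this slide can be sketched in a few lines. This is a minimal illustration, not any engine's implementation: the function names are hypothetical, matching is exact on lowercased whitespace tokens, and the 75% threshold is rounded up so that "75% or more" always holds.

```python
import math


def min_should_match(num_query_terms):
    """Return how many query terms must match, per the slide's example rule."""
    if num_query_terms <= 2:
        return num_query_terms  # 1 or 2 terms: all must match
    if num_query_terms == 3:
        return 2  # 3 terms: 2 of 3 must match
    # 4 or more terms: at least 75% must match (rounded up).
    return math.ceil(0.75 * num_query_terms)


def passes_min_match(query, document):
    """Check whether a document matches enough distinct query terms."""
    query_terms = set(query.lower().split())
    doc_terms = set(document.lower().split())
    matched = len(query_terms & doc_terms)
    return matched >= min_should_match(len(query_terms))
```

For example, the query "blue snare drum" accepts a document containing only "snare" and "drum", while the two-term query "snare drum" rejects a document that contains only "drum".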


• Once documents are indexed we can begin to customize the search algorithm by defining the fields that are searched and the relative importance of the fields via boosts.

• When a search is run the algorithm scores documents and results are returned based on score.

<field name="merchantName" type="text" indexed="true" stored="true" omitNorms="false" boost="20.0"/>
<field name="displayName" type="text" indexed="true" stored="true" omitNorms="false" boost="10.0"/>
<field name="merchantMetaKeywords" type="text" indexed="true" stored="true" omitNorms="false" boost="0.5"/>
<field name="protectedKeywords" type="text" indexed="true" stored="true"/>
<field name="keywordPrefixes" type="text" indexed="true" stored="true"/>
<field name="merchantAdditionalKeywords" type="text" indexed="true" stored="true"/>
<field name="merchantSearchKeywords" type="text" indexed="true" stored="true" omitNorms="false" boost="5.0"/>

Solr algorithm example

Field searched Field boost
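To make the effect of field boosts concrete, here is a small sketch that weights each term hit by the boost of the field it occurs in, using the boost values from the example above. The scoring (hit count times boost) is a deliberate simplification, not Solr's actual formula; the default boost of 1.0 for unboosted fields and the sample merchants are assumptions.

```python
# Boost values copied from the field definitions above.
FIELD_BOOSTS = {
    "merchantName": 20.0,
    "displayName": 10.0,
    "merchantMetaKeywords": 0.5,
    "merchantSearchKeywords": 5.0,
}
DEFAULT_BOOST = 1.0  # assumption: fields without an explicit boost count once


def boosted_score(query, doc):
    """Sum query-term hits per field, weighted by that field's boost."""
    terms = set(query.lower().split())
    score = 0.0
    for field, text in doc.items():
        hits = sum(1 for token in text.lower().split() if token in terms)
        score += hits * FIELD_BOOSTS.get(field, DEFAULT_BOOST)
    return score


def rank(query, docs):
    """Return documents ordered from highest to lowest boosted score."""
    return sorted(docs, key=lambda d: boosted_score(query, d), reverse=True)
```

With these weights, a merchant whose name contains the query term outranks one that only mentions it in its meta keywords.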


Easy-to-use interfaces to review and edit relevancy factors and control search features.

• Solr Relevancy Workbench

• Endeca Workbench

Search manager UIs for relevancy tuning


Search analytics

• There are great analytics tools available; use any of them to gain insight.

• Google Analytics is free, easy to install, and provides robust, actionable data:

• How often are users searching and what are they searching for?

• What searches are leading to 0 results?

• What are users doing following their search? Exiting or clicking through?
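The three questions above can be answered from any search log that records the query, the result count, and whether the user clicked a result. The sketch below assumes a hypothetical list-of-dicts log format with the keys `query`, `result_count`, and `clicked`; it is not the export format of Google Analytics or any other tool.

```python
from collections import Counter


def summarize_search_log(events):
    """Return (query counts, zero-result query counts, exit rate) for a log."""
    # How often are users searching, and for what?
    top_queries = Counter(e["query"].strip().lower() for e in events)
    # Which searches lead to 0 results?
    zero_results = Counter(
        e["query"].strip().lower() for e in events if e["result_count"] == 0
    )
    # What share of searches ended without a click-through?
    exits = sum(1 for e in events if not e["clicked"])
    exit_rate = exits / len(events) if events else 0.0
    return top_queries, zero_results, exit_rate
```

Calling `top_queries.most_common(10)` on the first return value gives a simple top-searches report.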


Search analytics – null results

• Below is a set of queries that led to zero (null) results, pulled from search analytics for an online coupon site.

• Searches were then categorized by whether a synonym was needed for thesaurus expansion, there was a content gap, or other.


Features and functionality we use most often to improve the relevancy of search results delivered.

Key Search Features


• http://www.musiciansfriend.com/search?sB=r&Ntt=tambourine

• A site search usually returns thousands of results, and one of the best ways a user can sort through them is faceted search, a.k.a. refinement types.

• These filters are usually present in the left rail on search results pages. Notice the “subcategories” and “narrow by” options in the left rail; they are all ways to refine the search results:

Faceted search - a.k.a. Search Refinements:
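A minimal sketch of how refinements work: count the values of each facet field across the current result set (the numbers shown next to each filter), then narrow the set when the user picks a value. The field names and sample records are hypothetical, and a real engine computes these counts inside the index rather than in application code.

```python
from collections import Counter


def facet_counts(results, facet_fields):
    """Count how many results carry each value of each facet field."""
    counts = {field: Counter() for field in facet_fields}
    for item in results:
        for field in facet_fields:
            value = item.get(field)
            if value is not None:
                counts[field][value] += 1
    return counts


def refine(results, field, value):
    """Keep only the results whose facet field equals the chosen value."""
    return [item for item in results if item.get(field) == value]
```

Refinements can be chained: applying `refine` for a brand and then for a category narrows the results by both.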


• Faceted search can also be applied to static pages, such as this category page. A deeper level of detail can be applied to refinement types that are specific to the category. Notice shell material and snare size at the bottom of the left rail:

• Global refinements:

• Local refinements:

Static page refinements: www.musiciansfriend.com/snare-drums


Auto complete - LOC

• Autocomplete is available on the redesigned LOC site, but a number of the suggestions are confusing and it doesn’t appear to have been optimized before rollout.

• The goal of auto complete is not only to help users avoid misspellings and get results more quickly but also to guide them to better queries.



• There are different approaches to type-ahead/auto complete. Here, results are sorted by matching brands, products, and top searches (taken from internal search logs):

Auto complete - Guitar Center
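One way to picture this grouped approach is a prefix match run separately over each source, with top searches ranked by how often they appear in the search logs. This is only a sketch under assumptions: the function name, the simple `startswith` matching, and the sample data are hypothetical, and production type-ahead usually relies on the search engine's own suggester.

```python
def autocomplete(prefix, brands, products, top_searches, limit=3):
    """Return prefix matches grouped as brands, products, then top searches.

    top_searches maps a past query to its count in the internal search logs.
    """
    prefix = prefix.strip().lower()
    if not prefix:
        return {}
    # Most frequent searches first; ties broken alphabetically.
    ranked_searches = sorted(top_searches, key=lambda q: (-top_searches[q], q))
    sources = [
        ("brands", sorted(brands)),
        ("products", sorted(products)),
        ("top searches", ranked_searches),
    ]
    suggestions = {}
    for group, candidates in sources:
        matches = [c for c in candidates if c.lower().startswith(prefix)]
        if matches:
            suggestions[group] = matches[:limit]
    return suggestions
```

Groups with no matches are omitted, so the drop-down only shows sections that have something to offer.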


Type-ahead list that includes top searches, results for each of the top searches with thumbnails, matching categories, brands, buying guides, and installations:

Auto complete – Home Depot


Auto complete - LinkedIn

Results clustered by type with images and logos. People in your network are listed first.


• Redirect to any landing page: creates a controlled, custom experience and can be applied to any keyword. For example, “guitar” on Musician’s Friend does not complete a search but rather redirects to the category page (doubles the conversion rate):

Auto redirects to landing pages: www.musiciansfriend.com/guitars
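A keyword redirect can be thought of as a lookup that runs before the search itself. The sketch below assumes a hypothetical rule table and returns relative paths; the `Ntt` parameter simply mirrors the search URLs shown in these slides and is used for illustration only.

```python
from urllib.parse import quote_plus

# Hypothetical redirect rules: normalized keyword -> landing page path.
REDIRECTS = {
    "guitar": "/guitars",
    "guitars": "/guitars",
}


def resolve_search(query):
    """Return a landing page for redirected keywords, else a search URL."""
    key = " ".join(query.lower().split())  # normalize case and whitespace
    if key in REDIRECTS:
        return REDIRECTS[key]
    return "/search?Ntt=" + quote_plus(key)
```

Only exact keyword matches redirect here; anything else falls through to a normal search.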


• http://www.musiciansfriend.com/search?sB=r&Ntt=chmes

Auto correct misspellings and approximate matches:

“Chmes” becomes “chimes” and provides the same search results automatically.
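A simple approximation of this behavior is to replace each unknown query term with its closest match from a vocabulary of indexed terms. The sketch uses Python's standard `difflib.get_close_matches`; the vocabulary and the 0.8 similarity cutoff are assumptions, and real engines typically use their own spelling dictionaries and probability models.

```python
import difflib


def autocorrect(query, vocabulary, cutoff=0.8):
    """Replace each unknown term with its closest vocabulary term, if any."""
    corrected = []
    for term in query.lower().split():
        if term in vocabulary:
            corrected.append(term)  # already a known term
            continue
        close = difflib.get_close_matches(term, vocabulary, n=1, cutoff=cutoff)
        # Keep the original term when nothing is similar enough.
        corrected.append(close[0] if close else term)
    return " ".join(corrected)
```

Terms with no sufficiently similar vocabulary entry are left untouched, which avoids "correcting" legitimate but unindexed words.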


• http://www.musiciansfriend.com/search?sB=r&Ntt=cord

Thesaurus entries, e.g. “cord” equals “cable”:

Querying “cord” or “cable” provides the same search results. These entries are entered manually, whereas auto-correction is automatic.


Endeca workbench thesaurus entry: one way

Notice here that “oysters”, “lobster”, and “shellfish” are entered as a subset of “seafood”. Only a search for “seafood” is expanded to search for all of the terms.
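The difference between two-way entries (such as “cord” and “cable”) and one-way entries can be sketched as follows. The data structures and function name are hypothetical, and the one-way direction shown (a search for “seafood” expands to the narrower terms, but not the reverse) is this sketch's reading of the slide rather than a statement of Endeca's implementation.

```python
# Two-way entries: every term in a group expands to the whole group.
TWO_WAY = [{"cord", "cable"}]

# One-way entries: only the key expands, to itself plus the listed terms.
ONE_WAY = {"seafood": {"oysters", "lobster", "shellfish"}}


def expand_term(term):
    """Return the set of terms actually searched for a single query term."""
    term = term.lower()
    expanded = {term}
    for group in TWO_WAY:
        if term in group:
            expanded |= group
    expanded |= ONE_WAY.get(term, set())
    return expanded
```

Here `expand_term("cord")` and `expand_term("cable")` return the same set, while `expand_term("lobster")` returns only “lobster”.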


• Users can sort search results by any specified metadata, so they control the order in which results appear

Search results page sorts, AKA “SERP sorts”
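In code, a SERP sort amounts to mapping each sort option to a metadata field and a direction. The option names, fields, and sample records below are hypothetical; dates are assumed to be ISO-formatted strings so that they sort correctly as text.

```python
# Hypothetical sort options: label -> (metadata field, descending?).
SORT_OPTIONS = {
    "price_low_to_high": ("price", False),
    "price_high_to_low": ("price", True),
    "top_rated": ("rating", True),
    "newest": ("released", True),
}


def sort_results(results, option):
    """Return a new list of results ordered by the chosen sort option."""
    field, descending = SORT_OPTIONS[option]
    return sorted(results, key=lambda item: item[field], reverse=descending)
```

Adding a new sort is then a matter of exposing another metadata field in the option table.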


• Start with search for “lawn” on Home Depot and choose to “compare” two items:

Compare functionality


• Show search results side by side to compare specs. These attributes are any defined metadata fields that can be global or unique to the category:

Compare functionality


Scroll down for a deeper comparison:

Compare item records side by side. Any specified metadata can be displayed here.
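A side-by-side comparison is essentially a table with one row per metadata field and one column per item. The sketch below assumes hypothetical item records stored as dictionaries and fills in "-" where an item lacks an attribute, which covers attributes that are unique to one category.

```python
def compare(items, name_field="name"):
    """Build comparison rows: a header, then one row per metadata attribute."""
    attributes = []
    for item in items:
        for key in item:
            # Collect each attribute once, in first-seen order.
            if key != name_field and key not in attributes:
                attributes.append(key)
    header = ["attribute"] + [item[name_field] for item in items]
    rows = [
        [attr] + [item.get(attr, "-") for item in items] for attr in attributes
    ]
    return [header] + rows
```

The returned rows can be rendered directly as an HTML table or printed as text.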


• Products, articles, media, reviews all in one search. Search and web platforms are able to create their own indices and display results from all sources in all formats.

• Notice buying guides and product guides mixed in with products, categories, and brands:

Aggregate content


• Choose your store to see custom results:

Personalization and contextualization


• Searches another site for what you just searched:

New development: Search ad


Conclusion

Questions?