BOSS: Open Hack Day, Bangalore

Posted on 22-Nov-2014

DESCRIPTION

An introduction to BOSS API

TRANSCRIPT

Open Hack Day 2009 - Bangalore

Chris Heilmann, Saurabh Sahni

Build your Own Search Service

http://www.slideshare.net/saurabhsahni/

- 2 -

Outline

•  Search engines using BOSS
•  About BOSS API
–  What?
–  Why?
–  Features
•  How to use it
–  BOSS API
–  Code example
–  BOSS Mashup framework

- 3 -

Search engines using BOSS

- 4 -

hakia: http://hakia.com/

- 7 -

Cluuz: http://cluuz.com

- 10 -

Keyword finder - http://keywordfinder.org/

- 11 -

askBOSS: http://ask-boss.appspot.com/

- 16 -

About BOSS API

- 17 -

What?

•  Open Yahoo’s core search features via web services to let 3rd parties revolutionize Search

http://developer.yahoo.com/search/boss

- 19 -

Usage

Opening the search technology stack

50B pages * 20ms page download = 31 years

[Diagram: the search stack, with stages labelled Crawl, Extract, Spam <-> Gold, Analyze, Index, Rank Assist, Web Map and Retrieve, plus a Web API layer and "Your App here"]
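The crawl estimate on this slide can be checked with a few lines of arithmetic. This is a minimal sketch that assumes pages are downloaded strictly one after another and that a year has 365 days; the function name is made up for illustration.

```python
# Figures from the slide: 50B pages, 20 ms per page download.
PAGES = 50_000_000_000
MS_PER_PAGE = 20
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a 365-day year


def sequential_crawl_years(pages=PAGES, ms_per_page=MS_PER_PAGE):
    """Years needed to download every page one after another."""
    total_seconds = pages * ms_per_page / 1000
    return total_seconds / SECONDS_PER_YEAR


years = sequential_crawl_years()
print(f"{years:.1f} years")  # roughly 31.7, which the slide gives as 31
```

One billion seconds of download time is the barrier that an open web API is meant to remove for a third-party developer.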

- 20 -

Why?

•  Removes entry barriers
•  Asset to innovate
–  Develop new relevance models
–  Change presentation style
•  Search anywhere
–  Improve vertical quality w/ Web comprehensiveness

- 21 -

BOSS API features

•  No branding or attribution
•  Ability to change presentation style
•  Ability to re-order results and blend in additional content
•  Access to multiple verticals (web search, image, news)
•  Keyword suggestions, spell checks
•  Semantic data, in-links, abstracts
•  Ability to monetize

- 22 -

How to use it?

- 23 -

Get Started

•  Register for an application id http://developer.yahoo.com/wsregapp/

•  Documentation http://developer.yahoo.com/search/boss/boss_guide/

•  Code samples: JavaScript, PHP and Python http://www.saurabhsahni.com/boss-examples.zip

- 24 -

BOSS API

Searching Slumdog Millionaire

(Source: http://en.wikipedia.org/wiki/File:Slumdog_Millionaire_poster.jpg)

- 25 -

BOSS API

•  Search for slumdog millionaire:
–  http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&format=xml
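A request URL like the one above could be assembled in Python roughly as follows. This is only a sketch: the function name `web_search_url` is made up for illustration, `xyz` stands in for a real application id, and the code builds the URL string without sending any request.

```python
from urllib.parse import quote_plus, urlencode

# Host and path prefix as shown on the slide.
BOSS_BASE = "http://boss.yahooapis.com/ysearch"


def web_search_url(query, appid, fmt="xml"):
    """Build a BOSS web-search URL (hypothetical helper; no request is sent)."""
    # The query goes into the path, with spaces encoded as "+".
    path = f"{BOSS_BASE}/web/v1/{quote_plus(query)}"
    params = urlencode({"appid": appid, "format": fmt})
    return f"{path}?{params}"


print(web_search_url("slumdog millionaire", "xyz"))
```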

- 26 -

BOSS API: XML response

http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&format=xml
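Once an XML response has been fetched, it can be turned into plain Python dictionaries with the standard library. The sketch below runs against a made-up sample: the element names (`ysearchresponse`, `resultset_web`, `result`, `title`, `url`, `abstract`) and the example values are assumptions for illustration, and a real response may be structured differently (for instance, it may use an XML namespace).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample response; element names and values are assumptions.
SAMPLE_XML = """<ysearchresponse responsecode="200">
  <resultset_web count="2" start="0">
    <result>
      <title>Example result one</title>
      <url>http://example.com/one</url>
      <abstract>First example abstract.</abstract>
    </result>
    <result>
      <title>Example result two</title>
      <url>http://example.com/two</url>
      <abstract>Second example abstract.</abstract>
    </result>
  </resultset_web>
</ysearchresponse>"""


def parse_results(xml_text):
    """Return one dict per <result> element, keyed by child tag name."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: (child.text or "") for child in result}
        for result in root.iter("result")
    ]


for row in parse_results(SAMPLE_XML):
    print(row["title"], "-", row["url"])
```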

- 27 -

Site Restrict Search

•  Search for slumdog millionaire on selected movie sites
–  Add param sites=indiatimes.com,movies.yahoo.com,imdb.com
–  http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&sites=indiatimes.com%2Cmovies.yahoo.com&format=xml
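The `sites` parameter is a comma-separated list, and the comma appears percent-encoded as `%2C` in the example URL. A minimal sketch of producing that encoding, with `site_restricted_url` as a made-up helper name and no request being sent:

```python
from urllib.parse import quote_plus, urlencode


def site_restricted_url(query, appid, sites, fmt="xml"):
    """Build a site-restricted web-search URL (hypothetical helper)."""
    # Join the domains with commas; urlencode turns each "," into "%2C".
    params = urlencode({"appid": appid, "sites": ",".join(sites), "format": fmt})
    return f"http://boss.yahooapis.com/ysearch/web/v1/{quote_plus(query)}?{params}"


url = site_restricted_url(
    "slumdog millionaire", "xyz", ["indiatimes.com", "movies.yahoo.com"]
)
print(url)
```

Passing a third domain such as `imdb.com` in the list simply extends the encoded `sites` value.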

- 28 -

http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&sites=indiatimes.com%2Cmovies.yahoo.com&format=xml

- 29 -

Search images

•  http://boss.yahooapis.com/ysearch/images/v1/slumdog+millionaire?dimensions=large

- 30 -

http://boss.yahooapis.com/ysearch/images/v1/slumdog+millionaire

- 31 -

Search News

•  http://boss.yahooapis.com/ysearch/news/v1/slumdog+millionaire?age=15d

- 32 -

http://boss.yahooapis.com/ysearch/news/v1/slumdog+millionaire?age=15d

- 33 -

Movie Search Code Example

- 36 -

http://www.saurabhsahni.com/boss-examples.zip

- 37 -

More with BOSS API

- 38 -

Related keywords

•  Add parameter view=keyterms
–  http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&view=keyterms&format=xml

- 39 -

http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&view=keyterms&format=xml

- 40 -

Semantic Data

•  Access structured data acquired through SearchMonkey

- 41 -

Semantic Data

view=searchmonkey_feed
view=searchmonkey_rdf

http://developer.yahoo.com/search/boss/stuctureddata.html

- 42 -

http://boss.yahooapis.com/ysearch/web/v1/slumdog+millionaire?appid=xyz&view=searchmonkey_feed&format=xml

- 43 -

Long abstracts

•  Add parameter abstract=long
–  Get up to 300 characters instead of 130

- 44 -

Spell Check

http://boss.yahooapis.com/ysearch/spelling/v1/milionare?format=xml

Response

- 45 -

BOSS Search API REST Interface

•  {query}: term to look for (url-encoded)
•  {vert} := {web, news, images, spelling}
•  @ required
–  appid
•  @ optional
–  start, count, lang, region, format, callback, sites, view

http://boss.yahooapis.com/ysearch/{vert}/v1/{query}
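The REST pattern above lends itself to one small URL builder that covers every vertical. The sketch below is an illustration rather than an official client: `boss_url` is a made-up name, optional parameters are passed through without validation (other slides also use vertical-specific ones such as `dimensions` and `age`), and no request is sent.

```python
from urllib.parse import quote_plus, urlencode

# {vert} values listed on the slide.
VERTICALS = {"web", "news", "images", "spelling"}


def boss_url(vert, query, appid, **params):
    """Fill in http://boss.yahooapis.com/ysearch/{vert}/v1/{query}."""
    if vert not in VERTICALS:
        raise ValueError(f"unknown vertical: {vert!r}")
    if not appid:
        raise ValueError("appid is required")
    # appid first, then any optional parameters in the order given.
    query_string = urlencode({"appid": appid, **params})
    return (
        f"http://boss.yahooapis.com/ysearch/{vert}/v1/"
        f"{quote_plus(query)}?{query_string}"
    )


print(boss_url("spelling", "milionare", "xyz", format="xml"))
print(boss_url("web", "slumdog millionaire", "xyz", view="keyterms", format="xml"))
```

Keeping the vertical as an argument means spell check, news, images and web search all go through the same code path.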

- 46 -

Site Explorer

•  Get page inlinks
–  http://boss.yahooapis.com/ysearch/se_inlink/v1/{URL}?appid={APPID}
•  Page data: collection of subpages in a domain
–  http://boss.yahooapis.com/ysearch/se_pagedata/v1/{URL}?appid={APPID}
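Because the target {URL} sits inside the request path, its own ":" and "/" characters need some form of escaping. The sketch below assumes the target is percent-encoded in full; the slide does not specify the encoding, so treat that as an assumption. `site_explorer_url` is a made-up helper name and no request is sent.

```python
from urllib.parse import quote, urlencode

# Service names as they appear in the slide's URLs.
SERVICES = {"se_inlink", "se_pagedata"}


def site_explorer_url(service, target_url, appid):
    """Build a Site Explorer URL (hypothetical helper)."""
    if service not in SERVICES:
        raise ValueError(f"unknown service: {service!r}")
    # Assumption: encode every reserved character, including ":" and "/".
    encoded_target = quote(target_url, safe="")
    params = urlencode({"appid": appid})
    return f"http://boss.yahooapis.com/ysearch/{service}/v1/{encoded_target}?{params}"


print(site_explorer_url("se_inlink", "http://www.yahoo.com", "xyz"))
```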

- 47 -

BOSS Mashup Framework

•  Python (v2.5+) library

•  BOSS Search SDK plus …

•  SQL for remixing arbitrary XML/JSON sources

http://developer.yahoo.com/search/boss/mashup.html

- 48 -

BMF + Google App Engine

•  Enhanced version of BMF for the GAE platform

•  http://zooie.wordpress.com/2008/08/04/yahoo-boss-google-app-engine-integrated/

•  Enables quick deployment of BOSS applications online

- 49 -

More BOSS Implementations

•  http://mashable.com/boss/
•  http://delicious.com/tag/bossmashup
•  Add yours by tagging it with “bossmashup” on Del.icio.us!

- 50 -

One more thing…

- 51 -

BOSS Custom

Usage

50B pages * 20ms page download = 31 years

[Diagram: the search stack, with stages labelled Crawl, Extract, Spam <-> Gold, Analyze, Index, Retrieve, Rank Assist and Web Map, plus a Web API layer and "Your App here"]

- 52 -

Questions?

Thank You

More: http://developer.yahoo.com/search/boss/

Slides: http://www.slideshare.net/saurabhsahni/

- 53 -

Appendix

- 54 -

http://www.yahoo.com

Search UI Templates are Included in the BOSS Mashup Framework

BOSS Mashup Framework simplifies aggregating and presenting multiple data sources

- 55 -

BMF Features

•  select, group, sort, union, joins, udfs, where
•  Text normalization and duplicate removal
•  Auto-transformation of resource-oriented API results into tables w/o parsing
•  All-in-memory storage and retrieval operations
•  Ability to join lists of tables via an arbitrary predicate function (map-like)
•  Search UI template framework
•  Single search function provides total access to BOSS REST API
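To give a feel for two of the features listed above, text normalization with duplicate removal and joining tables via a predicate function, here is a plain-Python sketch. It does not use the BOSS Mashup Framework or reproduce its API; the function names and the sample rows are invented for illustration.

```python
import re


def normalize(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", " ", text.lower()).split())


def dedupe(rows, key="title"):
    """Keep the first row for each normalized value of `key`."""
    seen, unique = set(), []
    for row in rows:
        fingerprint = normalize(row[key])
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(row)
    return unique


def join(left, right, predicate):
    """Pair each left row with each right row for which predicate is true."""
    return [(l, r) for l in left for r in right if predicate(l, r)]


# Invented sample rows standing in for two in-memory result tables.
web = [{"title": "Slumdog Millionaire!"}, {"title": "slumdog   millionaire"}]
news = [{"title": "Slumdog Millionaire"}, {"title": "Other story"}]

web_unique = dedupe(web)
pairs = join(web_unique, news,
             lambda w, n: normalize(w["title"]) == normalize(n["title"]))
print(len(web_unique), len(pairs))
```

Everything stays in memory as lists of dictionaries, which mirrors the "all-in-memory" point on the slide only in spirit.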

- 56 -

BOSS in Academic Research

•  The biggest dataset available on the web
•  Very useful for Web-mining research experiments
–  Natural language processing
–  Semantic extraction
–  Related keywords
–  Similarity detection
–  Clustering algorithms
–  Spelling corrections
