real-time web search: the road ahead

Post on 16-Apr-2017


Real-Time Web Search: The Road Ahead

Sep. 2009

Jonghun Park, jonghun@snu.ac.kr, Seoul National Univ.

going real-time

what is real-time search?

small delay between data creation & indexing

microblog search

status search

search against current hot queries

search in a small time window
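Two of the definitions above, a small delay between data creation and indexing, and search in a small time window, can be made concrete with a minimal sketch. The `Post` type, its field names, and the substring match are assumptions made for illustration; they do not come from the slides.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Post:
    text: str
    created_at: datetime  # when the author published the item
    indexed_at: datetime  # when the item became searchable (hypothetical field)


def indexing_delay(post: Post) -> timedelta:
    """Gap between data creation and indexing."""
    return post.indexed_at - post.created_at


def search_window(posts: List[Post], query: str, now: datetime,
                  window: timedelta) -> List[Post]:
    """Posts containing `query` that were created within `window` before `now`,
    newest first. Matching is a plain case-insensitive substring test."""
    q = query.lower()
    hits = [
        p for p in posts
        if q in p.text.lower() and timedelta(0) <= now - p.created_at <= window
    ]
    return sorted(hits, key=lambda p: p.created_at, reverse=True)
```

Under this reading, a search engine is "more real-time" when `indexing_delay` is small and queries are answered from a short `window`.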

controversial examples

1 year old photo just uploaded

first report on Michael Jackson’s death

“At work...wish I was home..love my family”

“IRS” query on Apr. 14th

real-time monitoring

real-time search, redefined

Retrieving Information with Time Value at Right Time

sports game scores

stock prices

celebrity updates

new products

deals

home pages

wikipedia pages

recipes

old news articles

regulations

regulations

opportunity

“what’s going on right now?”

$92B market

underlying technologies

pull

- RSS
- Atom
- ping
- SUP
- API
- information extraction
- crawling

push

- XMPP
- PubSubHubbub
- tornado
- comet
- Six Apart Update Stream
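The difference between the pull and push technologies listed above can be illustrated with a small in-memory sketch. It does not implement RSS, Atom, XMPP, or PubSubHubbub; `Source` and `Poller` are hypothetical classes that only show who initiates the transfer.

```python
from typing import Callable, List


class Source:
    """Hypothetical in-memory stream source that supports both access styles."""

    def __init__(self) -> None:
        self.items: List[str] = []
        self.subscribers: List[Callable[[str], None]] = []

    def fetch_since(self, position: int) -> List[str]:
        # pull: the consumer asks for everything after the position it last saw
        return self.items[position:]

    def subscribe(self, callback: Callable[[str], None]) -> None:
        # push: the consumer registers once and is called on every new item
        self.subscribers.append(callback)

    def publish(self, item: str) -> None:
        self.items.append(item)
        for notify in self.subscribers:
            notify(item)


class Poller:
    """Pull-style consumer: learns about new items only when poll() runs."""

    def __init__(self, source: Source) -> None:
        self.source = source
        self.position = 0
        self.seen: List[str] = []

    def poll(self) -> List[str]:
        new_items = self.source.fetch_since(self.position)
        self.position += len(new_items)
        self.seen.extend(new_items)
        return new_items
```

With pull, freshness is bounded by the polling interval; with push, the subscriber sees an item as soon as `publish` is called, at the cost of keeping a subscription alive.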

real-time search landscape

not so informative

lots of dittos and spam

danger of drowning

coverage of microblogs

- twitter: 23M+ monthly unique visitors (compete.com)
- only 5% of twitter users account for 75% of all activity
- one quarter of all tweets are generated by bots
- twitter users are biased in terms of demographics

lack of quality

- quality assessment through identifying the most popular, most authoritative, most linked-to, or most re-tweeted items
- this kind of filtering requires further processing, which decreases the freshness of information
- balancing the tension among recency, relevance, and quality is not an easy problem
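One simple way to picture the tension among recency, relevance, and quality is a weighted score in which recency decays with age. This is only a sketch: the exponential half-life, the weights, and the assumption that relevance and quality are already normalized to the range 0 to 1 are illustrative choices, not something the slides specify.

```python
def recency(age_seconds: float, half_life_seconds: float = 3600.0) -> float:
    """Exponential decay: 1.0 for a brand-new item, 0.5 after one half-life."""
    return 0.5 ** (max(age_seconds, 0.0) / half_life_seconds)


def combined_score(relevance: float, quality: float, age_seconds: float,
                   w_recency: float = 0.4, w_relevance: float = 0.4,
                   w_quality: float = 0.2,
                   half_life_seconds: float = 3600.0) -> float:
    """Weighted sum of recency, relevance, and quality (all assumed in [0, 1])."""
    return (w_recency * recency(age_seconds, half_life_seconds)
            + w_relevance * relevance
            + w_quality * quality)
```

With these default weights a brand-new, low-quality item can still outrank an hour-old, high-quality one; raising `w_quality` reverses that, which is the trade-off the slide describes.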

what’s desired

- measuring the quality of streaming sources instead of posts
- broader coverage: microblogs, blogs, public media, social media, and various casts
- beyond the simple buzz monitoring tool: topic focused, informative search results
- faster information discovery
- balancing users' need to see results in real time with the necessity to discover information from spam-free, quality sources
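Measuring quality at the level of sources rather than individual posts could look like the sketch below. The originality ratio used here (the share of a source's posts whose text has not been seen before) is a stand-in heuristic chosen because the slides mention dittos and spam; it is not presented as the feedmil method, and the function and parameter names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def source_quality(posts: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Score each source by the share of its posts that carry first-seen text.

    `posts` is a sequence of (source_id, text) pairs in arrival order.
    Text is normalized by lowercasing and collapsing whitespace, so exact
    repeats ("dittos") from any source count against the source repeating them.
    """
    seen_texts = set()
    totals = defaultdict(int)
    originals = defaultdict(int)
    for source_id, text in posts:
        key = " ".join(text.lower().split())
        totals[source_id] += 1
        if key not in seen_texts:
            seen_texts.add(key)
            originals[source_id] += 1
    return {source: originals[source] / totals[source] for source in totals}
```

Because the score belongs to the source, it can be computed ahead of time and applied to a new post the moment it arrives, without per-post processing that would delay indexing.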

feedmil.com

feedmil approach

popularity

key characteristics

well known

surprising

feedmil.jp

what’s next

- from pull web to push web
- real-location
- real-event
- information filtering
- personalization
- intelligent stream discovery
- more breakthroughs in stream publishing & consumption

Thank You

coming soon in oct. 2009
