advanced topics in search theory 1 - introduction
Post on 21-Dec-2015
In Today’s Class
Course procedures
What is economic search?
Characteristics of economic search
Classical models in Search Theory:
– One-Sided
– Two-Sided
– Mediated Search
Reservation-value based search
Goal
Get familiar with the concept of “economic search”
Learn and master the main principles of economic search:
– One-sided
– Two-sided
Course Procedures
Course web-site can be found here: http://www.cs.biu.ac.il/~sarned/Courses/search/
Teacher: David Sarne ([email protected])
Office hours: Thu 15:00-16:00 (building 216, room 2)
Course exercises – 20%
Course final exam – 80%
Course Plan
Week Topic Readings
1 Introduction to Search Theory
2 Pandora’s Problem
3 One-Sided Search – principles and optimal strategy
4 One sided search with unknown distribution
5 Concurrent search
6 Cooperative Search
7 The secretary Problem
8 Market throughput in one-sided search
9 Two-Sided Search with no search costs
10 Two-Sided Search with search costs multi-type
11 Two-Sided Search with search costs with one and two types
12 Throughput in two-sided search
13 Two-sided search with mediators
Disclaimer…
Search in AI deals with finding nodes that have certain properties in a graph (e.g., finding an optimal path from the initial node to a goal node, if one exists):
– Branch and bound
– A*
– Hill climbing
– …
This is not what we are interested in (at least in this course)
We deal with economic search
Have you searched for something lately?
Can you give examples of what you’ve searched for?
Searching What?
Everything!
– Searching for a partner
– Searching for a job
– Searching for a product
– Searching for a parking space
– Searching for a java class (reuse)
– Searching for a thesis advisor
– …
The goal here is to optimize the process rather than ending up with the optimal search object
How about the “secretary problem”?
(also known as the marriage problem, the sultan's dowry problem, the fussy suitor problem)
There is a single secretarial position to fill.
There are n applicants for the position, and the value of n is known.
The applicants can be ranked from best to worst with no ties.
The applicants are interviewed sequentially in a random order, with each order being equally likely.
After each interview, the applicant is accepted or rejected.
The decision to accept or reject an applicant can be based only on the relative ranks of the applicants interviewed so far.
Rejected applicants cannot be recalled.
The object is to select the best applicant. The payoff is 1 for the best applicant and zero otherwise.
Example - Marriage Market
legacy domain (search “pioneers”)
[Figure: density f(x) plotted against lifetime utility]
Statistics Reminder
Given a continuous random variable X, we denote:
– The probability density function, pdf, as f(x)
  (sometimes also called the probability distribution function; the probability mass function is its counterpart for discrete random variables)
– The cumulative distribution function, cdf, as F(x)
The pdf and cdf give a complete description of the probability distribution of a random variable
PDF
The pdf of X is a function f(x) such that for two numbers a and b with a ≤ b:
$$P(a \le X \le b) = \int_{a}^{b} f(x)\,dx$$
That is, the probability that X takes on a value in the interval [a, b] is the area under the density function from a to b.
CDF
The cdf is a function F(x), defined for a number x by:
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$$
That is, for a given value x, F(x) is the probability that the observed value of X will be at most x.
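Example: Uniform Distribution
[Figure: uniform density f(x) = 0.01 on the interval from 200 to 300]
$$F(x) = \begin{cases} 0 & x < 200 \\ 0.01 \cdot (x - 200) & 200 \le x \le 300 \\ 1 & x > 300 \end{cases}$$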
Sampling from the distribution
Draw a random value from a uniform distribution on [0, 1]
Take the value for which the CDF equals the value drawn
[Figure: piecewise-constant density f(t) over bins x1 … x5, with bin heights f1 … f4 and bin probabilities P1 … P4]
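The two sampling steps can be sketched in a few lines. This is a minimal illustration, assuming the uniform price distribution on [200, 300] from the earlier example; the function names `uniform_inverse_cdf` and `sample` are hypothetical and not part of the course material.

```python
import random


def uniform_inverse_cdf(u, low=200.0, high=300.0):
    """Inverse of F(x) = (x - low) / (high - low) on [low, high]."""
    return low + u * (high - low)


def sample(inverse_cdf, n, rng):
    """Draw n values: take u uniformly from [0, 1), then return the x with F(x) = u."""
    return [inverse_cdf(rng.random()) for _ in range(n)]


rng = random.Random(0)  # fixed seed, only so the sketch is reproducible
prices = sample(uniform_inverse_cdf, 10_000, rng)
print(min(prices), max(prices), sum(prices) / len(prices))
```

For any other distribution, only the inverse-CDF function changes; when the inverse has no closed form, it can be found numerically (for example by bisection on F).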
Fitting a Distribution
Visualize the observed data (decide how to divide the data into bins)
Come up with possible theoretical distributions
Test goodness-of-fit and p-values based on the empirical distribution function (EDF):
– Kolmogorov-Smirnov
– Chi-Square
– Anderson-Darling
These tests are measures of discrepancy between the empirical distribution function and the cumulative distribution function based on a specified distribution
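As one concrete instance of such a discrepancy measure, the Kolmogorov-Smirnov statistic is the largest vertical gap between the EDF and the candidate CDF. The sketch below computes it in plain Python, assuming a uniform candidate on [200, 300]; `uniform_cdf` and `ks_statistic` are illustrative names, and the sample data are made up.

```python
def uniform_cdf(x, low=200.0, high=300.0):
    """CDF of the assumed candidate distribution, uniform on [low, high]."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def ks_statistic(data, cdf):
    """Largest gap between the empirical distribution function and cdf."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The EDF jumps from i/n to (i+1)/n at x, so check both sides of the jump.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


observed = [210, 230, 250, 270, 290]  # made-up prices, for illustration only
print(ks_statistic(observed, uniform_cdf))
```

A small statistic means the candidate distribution tracks the data closely. Turning the statistic into a p-value requires the Kolmogorov distribution, which statistical packages provide.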
Comparison Shopping Agents (CSAs)
Shopbots and Comparison Shopping
– Automatically query multiple vendors for price information
– Growing market, growing interest
[Figure: comparison-shopping agents]
Comparison Shopping Agents (CSAs)
Offline - central DB of prices (updated daily):
[Diagram: requests → UI → query → central DB, which receives timely updates from each vendor]
Real-time querying upon receiving a request:
[Diagram: requests → UI → a separate query sent to each vendor]
Real-Time Querying (CSAs)
• Ever-increasing frequency of price updates
• Dynamic pricing theories (based on competitors’ prices) [Greenwald and Kephart, 1999]
• “Hit and run” sales strategies (short term price promotions at unpredictable intervals) [Baye et al, 2004]
Assumption: Future CSAs will use real-time (costly) querying
Exercise
Select 5 different products (preferably electronics, computers, etc.)
Collect prices for these products over the internet and build their empirical distribution (at least 50 prices for each)
Fit to a known distribution or describe the empirical distribution obtained
Calculate the optimal search rule
Send all the data with your file
Example - Marriage Market
legacy domain (search “pioneers”)
[Figure: density f(x) plotted against lifetime utility; caption: Should I try to do better?]
Can we do better?
Yes we can!
However, it has a cost
Thus a search strategy is needed
Strategy: (opportunities, time, cost)->(terminate, resume)
Search Characteristics
A distribution of plausible opportunities
The searcher is interested in exploiting one opportunity
Unknown value of specific opportunities
Search costs
Searching What?
Application | Cost | Opportunity
Marriage market | Time / money / loneliness | Better partner
Job market | Time / money / confidence | Better job
Product | Time / money | Better price / performance
Parking | Time | Closer parking space
Looking for a thesis advisor | Working with him a little | More interesting thesis
…
Anyone searched for an apartment in her life? What made you take the one you are living in?
Anyone sold an apartment in her life? What made you accept the “winning” bid?
The key concept – don’t attempt to find the best opportunity, instead find the best policy
The search strategy
After each draw, the searcher has a choice:
– Keep what he has
– Draw another opportunity from the distribution F(), at a cost c
Notice: the net profit is a random variable whose value depends both on the actual draws and on his decisions to accept or reject particular opportunities
The Goal
Maximize the expected value of the net profit
The optimal strategy
Let V* be the expected profit if following the optimal strategy
Clearly the searcher should never accept an opportunity with a value less than V*
If he rejects the opportunity, he is in the same situation as a searcher who is starting anew: expected profit V*
Therefore:
$$V^* = -c + \int_{y} \max[y, V^*]\, f(y)\,dy$$
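The equation for V* can be solved numerically once a distribution is fixed. The sketch below is only an illustration under stated assumptions: utilities are uniform on [0, 1] (not a distribution used in the slides), the search cost is c = 0.02, and the names `expected_max_uniform01` and `solve_v_star` are hypothetical. For this distribution E[max(y, v)] = (1 + v²)/2, and the exact root is 1 − √(2c).

```python
def expected_max_uniform01(v):
    """E[max(y, v)] for y uniform on [0, 1] and 0 <= v <= 1."""
    # Integral of v over [0, v] plus integral of y over [v, 1].
    return (1.0 + v * v) / 2.0


def solve_v_star(c, tol=1e-12):
    """Bisection on h(v) = -c + E[max(y, v)] - v, which is decreasing in v.

    Assumes 0 < c < 0.5 so that h(0) > 0 and h(1) < 0.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if -c + expected_max_uniform01(mid) - mid > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


v_star = solve_v_star(c=0.02)
print(v_star)
```

With these assumed numbers the solver converges to about 0.8, matching 1 − √0.04.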
Example - Marriage Market
[Figure: density f(x) plotted against lifetime utility, with the reservation value x marked; caption: Should I try to do better?]
In a simple infinite-horizon model, the reservation value doesn’t depend on history
What is a reservation value?
It’s a threshold for decision making!
Example: “Krovim Krovim”
The reservation property of the optimal search rule is a consequence of the stationarity of the search problem (a searcher discarding an opportunity is in exactly the same position as before starting the search)
Example - Marriage Market
[Figure: density f(x) plotted against lifetime utility, with the reservation value x marked and the two regions labeled “Terminate Search” and “Resume Search - sample one more”; caption: Should I try to do better?]
In a simple infinite-horizon model, the reservation value doesn’t depend on history
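The optimal Reservation Value
[Figure: density f(x) plotted against lifetime utility, with the reservation value x marked]
$$V(x) = -c + \int_{y=x}^{\infty} y f(y)\,dy + F(x)\,V(x)$$
$$V(x) = \frac{-c + \int_{y=x}^{\infty} y f(y)\,dy}{1 - F(x)}$$
where:
– f(x), F(x): distribution of utilities in the environment (p.d.f. / c.d.f.)
– c: search cost
– V(x): expected utility when using reservation value x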
The Reservation Value Concept
$$V(x) = \frac{-c + \int_{y=x}^{\infty} y f(y)\,dy}{1 - F(x)}$$
(f(x), F(x): distribution of utilities in the environment; c: search cost; V(x): expected utility when using reservation value x)
What is the x that maximizes V(x)?
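The Reservation Value Concept
$$V(x) = \frac{-c + \int_{y=x}^{\infty} y f(y)\,dy}{1 - F(x)}$$
$$\frac{dV(x)}{dx} = \frac{-x f(x)\,(1 - F(x)) + f(x)\left(-c + \int_{y=x}^{\infty} y f(y)\,dy\right)}{(1 - F(x))^2} = 0$$
Substituting $-c + \int_{y=x}^{\infty} y f(y)\,dy = V(x)\,(1 - F(x))$:
$$\frac{dV(x)}{dx} = \frac{-x f(x)\,(1 - F(x)) + f(x)\,V(x)\,(1 - F(x))}{(1 - F(x))^2} = 0$$
$$x = V(x)$$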
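The condition x = V(x) can be checked numerically by maximizing V(x) over a grid. The following sketch assumes, purely for illustration, utilities uniform on [0, 1] and a search cost of c = 0.02; `expected_utility` is a hypothetical helper implementing V(x) for that case.

```python
def expected_utility(x, c):
    """V(x) = (-c + integral of y over [x, 1]) / (1 - x) for y uniform on [0, 1]."""
    return (-c + (1.0 - x * x) / 2.0) / (1.0 - x)


c = 0.02  # assumed search cost
grid = [i / 1000.0 for i in range(1000)]  # candidate reservation values in [0, 1)
x_best = max(grid, key=lambda x: expected_utility(x, c))
print(x_best, expected_utility(x_best, c))
```

Under these assumptions the grid maximizer and the value it attains are both about 0.8, which is what x = V(x) predicts.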
Example - Marriage Market
[Figure: density f(x) plotted against lifetime utility, with the reservation value x marked and the two regions labeled “Terminate Search” and “Resume Search - sample one more”; caption: Should I try to do better?]
Accepting only partners “better” than the optimal reservation value yields an expected overall utility equal to the utility of the “lowest” partner I’m willing to accept
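Some more interesting interpretations
$$V(x^*) = -c + \int_{y=x^*}^{\infty} y f(y)\,dy + F(x^*)\,V(x^*)$$
$$V(x^*) = x^*$$
$$x^* = -c + \int_{y=x^*}^{\infty} y f(y)\,dy + x^* F(x^*)$$
$$x^* = -c + \int_{y=x^*}^{\infty} y f(y)\,dy + x^* \int_{y=-\infty}^{x^*} f(y)\,dy$$
$$x^* = -c + x^* + \int_{y=x^*}^{\infty} (y - x^*)\, f(y)\,dy$$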
Some more interesting interpretations (2)
$$\underbrace{x^*}_{\text{stop searching and keep } x^*} = \underbrace{-c + x^* + \int_{y=x^*}^{\infty} (y - x^*)\, f(y)\,dy}_{\text{search exactly one more time}}$$
Myopic rule
Important property of the optimal search rule – myopic:
– The searcher will never decide to accept an opportunity he has rejected beforehand
Searcher cares only about whether or not he wants the opportunity now
Therefore, we don’t care for the recall option
Also notice that…
$$\frac{dx^*}{dc} < 0$$
and:
$$V(x) = \frac{-c + \int_{y=x}^{\infty} y f(y)\,dy}{1 - F(x)}$$
A Bernoulli trial is an experiment whose outcome is random and can be either of two possible outcomes, "success" and "failure".
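The sign of dx*/dc can be checked from the condition $c = \int_{y=x^*}^{\infty} (y - x^*) f(y)\,dy$ obtained in the interpretations above. A short sketch, assuming f is continuous so that Leibniz's rule applies:

$$\frac{dc}{dx^*} = -(x^* - x^*)\, f(x^*) - \int_{y=x^*}^{\infty} f(y)\,dy = -(1 - F(x^*)) < 0 \quad\Rightarrow\quad \frac{dx^*}{dc} = -\frac{1}{1 - F(x^*)} < 0$$

One reading of the Bernoulli-trial remark (an assumption, since the slide does not spell it out): under a reservation-value rule each draw is a Bernoulli trial that succeeds with probability $1 - F(x^*)$, so the number of draws N until acceptance is geometric with

$$E[N] = \frac{1}{1 - F(x^*)}$$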
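Calculating the optimal RV
$$\frac{dV(x)}{dx} = \frac{-x f(x)\,(1 - F(x)) + f(x)\left(-c + \int_{y=x}^{\infty} y f(y)\,dy\right)}{(1 - F(x))^2} = 0$$
$$x f(x)\,(1 - F(x)) = f(x)\left(-c + \int_{y=x}^{\infty} y f(y)\,dy\right)$$
$$x\,(1 - F(x)) = -c + \int_{y=x}^{\infty} y f(y)\,dy$$
Notice that:
$$\int_{y=x}^{\infty} y f(y)\,dy = \Big[\, y F(y) \Big]_{y=x}^{\infty} - \int_{y=x}^{\infty} F(y)\,dy$$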
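Calculating the optimal RV
$$\int_{y=x}^{\infty} y f(y)\,dy = \Big[\, y F(y) \Big]_{y=x}^{\infty} - \int_{y=x}^{\infty} F(y)\,dy$$
Therefore:
$$x\,(1 - F(x)) = -c + \Big[\, y F(y) \Big]_{y=x}^{\infty} - \int_{y=x}^{\infty} F(y)\,dy$$
$$c = \lim_{y \to \infty} y F(y) - x - \int_{y=x}^{\infty} F(y)\,dy$$
(since F(y) → 1 at the top of the support, the boundary term minus x equals $\int_{y=x}^{\infty} 1\,dy$)
$$c = \int_{y=x}^{\infty} (1 - F(y))\,dy$$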
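The final condition c = ∫ (1 − F(y)) dy can be solved for x numerically for any CDF. The sketch below is one way to do it, assuming (for illustration only) exponentially distributed utilities with rate 1 and c = 0.1, for which the exact answer is ln 10. The helper names, the integration cutoff `upper`, and the step count are all arbitrary choices made here.

```python
import math


def exp_cdf(y, rate=1.0):
    """CDF of the assumed exponential utility distribution."""
    return 1.0 - math.exp(-rate * y) if y > 0 else 0.0


def tail_integral(cdf, x, upper, steps=20_000):
    """Trapezoid approximation of the integral of 1 - F(y) from x to upper."""
    h = (upper - x) / steps
    total = 0.5 * ((1.0 - cdf(x)) + (1.0 - cdf(upper)))
    for i in range(1, steps):
        total += 1.0 - cdf(x + i * h)
    return total * h


def reservation_value(cdf, c, lo, hi, upper, tol=1e-7):
    """Bisection: the tail integral decreases in x, so find where it equals c."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if tail_integral(cdf, mid, upper) > c:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


x_star = reservation_value(exp_cdf, c=0.1, lo=0.0, hi=10.0, upper=40.0)
print(x_star, math.log(10))
```

The cutoff `upper` stands in for the infinite upper limit; it only needs to be large enough that 1 − F is negligible beyond it.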
CS economic search domains
CSAs
Job scheduling
Searching for free space in disks
Searching for media in P2P
Classical tradeoff – time it takes to process vs. time it takes to find a strong processor
The Scheduling Problem
[Diagram: a scheduling process contacts Processor 1, Processor 2, …, Processor N through a proxy; querying processor i costs c_i (c1, c2, …, cN) and returns a price quote q drawn from f(q)]
WorkFlow
Receive a job
Contact proxy to learn about available processors
Query processors by using the proxy
– Each query delays you by c_i seconds
– Each query will return the temporary load on the server (this value will not change as long as the current job is not scheduled)
Keep on querying until you are ready to schedule your job
The Goal is…
To schedule the job in a way that minimizes the EXPECTED overall delay
– Overall delay = all delays due to queries + the time the job waits in the queue of the selected processor
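The workflow and the overall-delay definition can be made concrete with a small sketch. It assumes a simple threshold (reservation-value) rule with a threshold supplied by the caller; the slides do not say how that threshold is chosen here, and the function `schedule` and its parameters are hypothetical.

```python
def schedule(loads, query_delays, threshold):
    """Query processors in order and stop at the first load <= threshold.

    loads[i]        -- queue wait reported by processor i (fixed until we schedule)
    query_delays[i] -- delay c_i paid for querying processor i
    Returns (index of the chosen processor, overall delay).
    """
    if not loads:
        raise ValueError("at least one processor is required")
    spent = 0.0
    best = None  # index of the lowest load seen so far
    for i, (load, c_i) in enumerate(zip(loads, query_delays)):
        spent += c_i
        if best is None or load < loads[best]:
            best = i
        if load <= threshold:
            break
    # Reported loads stay valid while we search, so if no load met the
    # threshold we can still fall back to the best processor queried.
    return best, spent + loads[best]


print(schedule([9.0, 7.0, 2.0, 1.0], [0.5, 0.5, 0.5, 0.5], threshold=3.0))
```

In the example call the third processor is the first one under the threshold, so three query delays are paid and the fourth processor is never queried, even though its load is lower.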
Problem 1
You are about to purchase an iPod touch over the internet
You estimate the price distribution of the product over the different sellers to be uniform between 200-300 dollars
You can search by yourself, by visiting different web-sites – the cost of time for obtaining a price quote is $1
How will you search? What will be your expected cost? What’s the mean of the number of merchants you’ll visit?
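Solution
[Figure: uniform density f(x) = 0.01 on the interval from 200 to 300]
• Sequential search:
$$\underbrace{c}_{\text{cost of search}} = \underbrace{\int_{y=0}^{x} F(y)\,dy}_{\text{marginal benefit}}$$
$$F(x) = \begin{cases} 0 & x < 200 \\ 0.01 \cdot (x - 200) & 200 \le x \le 300 \\ 1 & x > 300 \end{cases}$$
$$V(x) = c + \int_{y=200}^{x} y f(y)\,dy + (1 - F(x))\,V(x)$$
$$F(x)\,V(x) = 1 + \int_{y=200}^{x} 0.01\,y\,dy$$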
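Find the minimum cost
With F(x) = 0.01(x − 200) = 0.01x − 2 for 200 ≤ x ≤ 300:
$$F(x)\,V(x) = 1 + \int_{y=200}^{x} 0.01\,y\,dy$$
$$V(x) = \frac{1 + 0.005\,y^2 \Big|_{y=200}^{x}}{0.01x - 2}$$
$$V(x) = \frac{1 + 0.005x^2 - 200}{0.01x - 2}$$
$$V'(x) = \frac{0.01x\,(0.01x - 2) - 0.01\,(0.005x^2 - 199)}{(0.01x - 2)^2} = 0$$
$$x\,(0.01x - 2) = 0.005x^2 - 199$$
$$x = 185.86,\ 214.14$$
Only x = 214.14 lies in the price range [200, 300], so it is the reservation price.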
Verification
V(x)=x?
$$V(214.14) = \frac{1 + 0.005 \cdot 214.14^2 - 200}{0.01 \cdot 214.14 - 2} = 214.14$$
Mean number of merchants visited:
$$N = \frac{1}{0.01x - 2} = \frac{1}{0.1414} = 7.07$$
Mean payment to merchant: 214.14 - 7.07 = 207.07 (notice it’s less than the expected minimum price from sampling 7 merchants)
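The numbers in Problem 1 can be reproduced with a short script. This is a sketch under the problem's stated assumptions (prices uniform on 200-300, $1 per quote). The simulation at the end is an extra sanity check added here and is not part of the slides; all variable and function names are illustrative.

```python
import math
import random

LOW, HIGH, COST = 200.0, 300.0, 1.0
DENSITY = 1.0 / (HIGH - LOW)  # f(x) = 0.01

# c = integral of F(y) from LOW to x = 0.5 * DENSITY * (x - LOW)^2, solved for x.
x_star = LOW + math.sqrt(2.0 * COST / DENSITY)
accept_prob = DENSITY * (x_star - LOW)  # F(x*): chance a quote is accepted
mean_visits = 1.0 / accept_prob         # mean of a geometric number of quotes
mean_payment = (LOW + x_star) / 2.0     # mean of a uniform price on [LOW, x*]
expected_total = mean_payment + COST * mean_visits


def simulate(trials, rng):
    """Average total cost (quotes plus price paid) under the reservation price x*."""
    total = 0.0
    for _ in range(trials):
        while True:
            total += COST
            price = rng.uniform(LOW, HIGH)
            if price <= x_star:
                total += price
                break
    return total / trials


print(round(x_star, 2), round(mean_visits, 2), round(mean_payment, 2), round(expected_total, 2))
print(simulate(100_000, random.Random(0)))
```

The closed-form part gives a reservation price of about 214.14, about 7.07 merchants visited on average, and a mean payment of about 207.07. Their sum equals the reservation price, which is the V(x) = x check, and the simulated average should land close to the same value.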