TDDD10 AI Programming: Multiagent Decision Making
Cyrille Berger

Labs
New map: Kobe2013-stations
ChangeSet contains all the properties, not just new ones
In the AbstractAgent class:
    protected void processSense(KASense sense) {
        // merge the received changes into the world model, then think
        model.merge(sense.getChangeSet());
        Collection<Command> heard = sense.getHearing();
        think(sense.getTime(), sense.getChangeSet(), heard);
    }
You can override it:
    @Override
    protected void processSense(KASense sense) {
        // send update to other agent
        // using world model before merge
        super.processSense(sense);
    }

Lectures
1 AI Programming: Introduction
2 Introduction to RoboRescue
3 Agents and Agents Architecture
4 Multi-Agent and Communication
5 Multi-Agent Decision Making
6 Cooperation And Coordination 1
7 Cooperation And Coordination 2
8 Machine Learning
9 Automated Planning
10 Putting It All Together

Lecture goals
Multi-agent decision in a competitive environment
Learn about the concept of utility, rational agents, voting and auctioning

Lecture content
Self-Interested Agents
Social Choice
Auctions
  Single Dimension Auctions
  Combinatorial Auctions

Self-Interested Agents

Utilities and Preferences
Assume we have just two agents: Ag = {i, j}
Agents are assumed to be self-interested: they have preferences over how the environment is
Assume Ω = {ω₁, ω₂, …} is the set of "outcomes" that agents have preferences over
We capture preferences by utility functions:
  uᵢ : Ω → ℝ
  uⱼ : Ω → ℝ
Utility functions lead to preference orderings over outcomes:
  ω ⪰ᵢ ω′ means uᵢ(ω) ≥ uᵢ(ω′)
  ω ≻ᵢ ω′ means uᵢ(ω) > uᵢ(ω′)

What is utility?
Utility is not money, but similar

Self-Interested Agents
If agents represent individuals or organizations then we cannot make the benevolence assumption.
Agents will be assumed to act to further their own interests, possibly at the expense of others.
Potential for conflict.
May complicate the design task enormously.

Multiagent Encounters (1/2)
We need a model of the environment in which these agents will act…
  agents simultaneously choose an action to perform, and as a result of the actions they select, an outcome in Ω will result
  the actual outcome depends on the combination of actions
  assume each agent has just two possible actions that it can perform, C ("cooperate") and D ("defect")
Environment behavior is given by the state transformer function:
  τ : Acᵢ × Acⱼ → Ω

Multiagent Encounters (2/2)
Examples of a state transformer function
This environment is sensitive to the actions of both agents:
  τ(D,D) = ω₁  τ(D,C) = ω₂  τ(C,D) = ω₃  τ(C,C) = ω₄
Neither agent has any influence in this environment:
  τ(D,D) = ω₁  τ(D,C) = ω₁  τ(C,D) = ω₁  τ(C,C) = ω₁
This environment is controlled by j:
  τ(D,D) = ω₁  τ(D,C) = ω₂  τ(C,D) = ω₁  τ(C,C) = ω₂

Coordination game
Suppose we have the case where both agents can influence the outcome, and they have utility functions as follows:
  uᵢ(ω₁) = 2  uᵢ(ω₂) = 1  uᵢ(ω₃) = 3  uᵢ(ω₄) = 4
  uⱼ(ω₁) = 2  uⱼ(ω₂) = 3  uⱼ(ω₃) = 1  uⱼ(ω₄) = 4
This environment is sensitive to the actions of both agents:
  τ(D,D) = ω₁  τ(D,C) = ω₂  τ(C,D) = ω₃  τ(C,C) = ω₄
With a bit of abuse of notation:
  uᵢ(D,D) = 2  uᵢ(D,C) = 1  uᵢ(C,D) = 3  uᵢ(C,C) = 4
  uⱼ(D,D) = 2  uⱼ(D,C) = 3  uⱼ(C,D) = 1  uⱼ(C,C) = 4
Then agent i's preferences, read off from uᵢ, are:
  C,C ≻ᵢ C,D ≻ᵢ D,D ≻ᵢ D,C
"C" is the rational choice for i: whatever j does, i gets more by playing C than by playing D.

Payoff Matrices
We can characterize the previous scenario in a payoff matrix, where agent i is the column player and agent j is the row player.
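The payoff matrix figure is not part of this transcript. As a minimal sketch, the matrix can be rebuilt from the coordination-game utilities given earlier. The function name payoff_matrix and the (uᵢ, uⱼ) cell layout are illustrative choices, not taken from the slides.

```python
# Utilities from the coordination game, indexed by (action of i, action of j).
u_i = {("D", "D"): 2, ("D", "C"): 1, ("C", "D"): 3, ("C", "C"): 4}
u_j = {("D", "D"): 2, ("D", "C"): 3, ("C", "D"): 1, ("C", "C"): 4}

def payoff_matrix(u_i, u_j, actions=("D", "C")):
    """Rows are j's actions, columns are i's actions; each cell is (u_i, u_j)."""
    return [[(u_i[(ai, aj)], u_j[(ai, aj)]) for ai in actions] for aj in actions]

matrix = payoff_matrix(u_i, u_j)
print("            i plays D   i plays C")
for aj, row in zip(("D", "C"), matrix):
    print(f"j plays {aj}:  {row[0]}      {row[1]}")
```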

Disgrace of Gijón (World Cup 1982)
One game left: Germany - Austria
  uᵢ(≥3-0) = 2    uⱼ(≥3-0) = -1
  uᵢ(2-0) = uᵢ(1-0) = 2    uⱼ(2-0) = uⱼ(1-0) = 1
  uᵢ(a-a) = -1    uⱼ(a-a) = 2
  uᵢ(0-a) = -1    uⱼ(0-a) = 2    (a > 1)
Final score: Germany 1-0 Austria

The Prisoner's Dilemma
Two men are collectively charged with a crime and held in separate cells, with no way of meeting or communicating.
They are told that:
  if one confesses and the other does not, the confessor will be freed, and the other will be jailed for three years
  if both confess, then each will be jailed for two years
Both prisoners know that if neither confesses, then they will each be jailed for one year

The Prisoner's Dilemma
Payoff matrix for the prisoner's dilemma:
  Top left: If both defect, then both get the punishment for mutual defection
  Top right: If i cooperates and j defects, i gets the sucker's payoff of 1, while j gets 4
  Bottom left: If j cooperates and i defects, j gets the sucker's payoff of 1, while i gets 4
  Bottom right: Reward for mutual cooperation

Solution Concepts
How will a rational agent behave in any given scenario?
Answered in solution concepts:
  dominant strategy;
  Nash equilibrium strategy;
  Pareto optimal strategies;
  strategies that maximize social welfare.

Dominant Strategies (1/2)
Given any particular strategy (either C or D) of agent i, there will be a number of possible outcomes
We say s₁ dominates s₂ if every outcome possible by i playing s₁ is preferred over every outcome possible by i playing s₂
A rational agent will never play a dominated strategy
So in deciding what to do, we can delete dominated strategies
Unfortunately, there is not always a unique undominated strategy

Dominant Strategies (2/2)
Examples (payoff matrix figures): the coordination game and the Prisoner's Dilemma
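As a hedged sketch, dominance can be checked mechanically on the two example games. The check below compares strategies against each opponent action separately, which is a common formalization and slightly weaker than the wording on the previous slide. The dictionaries COORDINATION and PD encode the payoffs stated in this lecture as (uᵢ, uⱼ) per (action of i, action of j); the function name is illustrative.

```python
ACTIONS = ("C", "D")

# (action of i, action of j) -> (u_i, u_j), taken from the slides.
COORDINATION = {("D", "D"): (2, 2), ("D", "C"): (1, 3),
                ("C", "D"): (3, 1), ("C", "C"): (4, 4)}
PD = {("D", "D"): (2, 2), ("D", "C"): (4, 1),
      ("C", "D"): (1, 4), ("C", "C"): (3, 3)}

def dominant_strategy(game, player):
    """Return the action that is strictly better against every opponent action, or None.

    player is 0 for agent i and 1 for agent j.
    """
    def payoff(own, other):
        profile = (own, other) if player == 0 else (other, own)
        return game[profile][player]

    for s in ACTIONS:
        rivals = [t for t in ACTIONS if t != s]
        if all(payoff(s, o) > payoff(t, o) for t in rivals for o in ACTIONS):
            return s
    return None

print(dominant_strategy(COORDINATION, 0), dominant_strategy(PD, 0))
```

In the coordination game both agents have C as dominant strategy, while in the Prisoner's Dilemma both have D.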

(Pure Strategy) Nash Equilibrium (1/2)
In general, we will say that two strategies s₁ and s₂ are in Nash equilibrium if:
  under the assumption that agent i plays s₁, agent j can do no better than play s₂; and
  under the assumption that agent j plays s₂, agent i can do no better than play s₁.
Neither agent has any incentive to deviate from a Nash equilibrium
Unfortunately:
  Not every interaction scenario has a Nash equilibrium
  Some interaction scenarios have more than one Nash equilibrium

(Pure Strategy) Nash Equilibrium (2/2)
Examples (payoff matrix figures): the coordination game and the Prisoner's Dilemma
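A minimal sketch of the Nash condition for two-player games, applied to the two example games. Payoffs are the ones stated in this lecture, stored as (uᵢ, uⱼ) per (action of i, action of j); the function name pure_nash is an illustrative choice.

```python
COORDINATION = {("D", "D"): (2, 2), ("D", "C"): (1, 3),
                ("C", "D"): (3, 1), ("C", "C"): (4, 4)}
PD = {("D", "D"): (2, 2), ("D", "C"): (4, 1),
      ("C", "D"): (1, 4), ("C", "C"): (3, 3)}

def pure_nash(game, actions=("C", "D")):
    """List all action pairs where neither agent gains by deviating alone."""
    equilibria = []
    for ai in actions:
        for aj in actions:
            ui, uj = game[(ai, aj)]
            # i cannot do better by switching while j keeps aj ...
            i_ok = all(game[(b, aj)][0] <= ui for b in actions)
            # ... and j cannot do better by switching while i keeps ai.
            j_ok = all(game[(ai, b)][1] <= uj for b in actions)
            if i_ok and j_ok:
                equilibria.append((ai, aj))
    return equilibria

print(pure_nash(COORDINATION), pure_nash(PD))
```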

Pareto Optimality (1/2)
An outcome is said to be Pareto optimal (or Pareto efficient) if there is no other outcome that makes one agent better off without making another agent worse off.
If an outcome is Pareto optimal, then at least one agent will be reluctant to move away from it (because this agent will be worse off).
If an outcome ω is not Pareto optimal, then there is another outcome ω′ that makes everyone as happy, if not happier, than ω.
"Reasonable" agents would agree to move to ω′ in this case. (Even if I don't directly benefit from ω′, you can benefit without me suffering.)

Pareto Optimality (2/2)
Examples (payoff matrix figures): the coordination game and the Prisoner's Dilemma
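The Pareto test can be sketched as a pairwise comparison of outcomes. The payoffs are those stated in this lecture, stored as (uᵢ, uⱼ) per (action of i, action of j); the function names are illustrative.

```python
COORDINATION = {("D", "D"): (2, 2), ("D", "C"): (1, 3),
                ("C", "D"): (3, 1), ("C", "C"): (4, 4)}
PD = {("D", "D"): (2, 2), ("D", "C"): (4, 1),
      ("C", "D"): (1, 4), ("C", "C"): (3, 3)}

def pareto_optimal(game):
    """Return the outcomes that no other outcome improves upon."""
    def improves(q, p):
        # q is at least as good as p for everyone and strictly better for someone.
        return (all(a >= b for a, b in zip(game[q], game[p]))
                and any(a > b for a, b in zip(game[q], game[p])))

    return sorted(p for p in game if not any(improves(q, p) for q in game))

print(pareto_optimal(COORDINATION))
print(pareto_optimal(PD))
```

In the Prisoner's Dilemma only (D,D) fails the test, because (C,C) makes both agents better off.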

Social Welfare (1/2)
The social welfare of an outcome ω is the sum of the utilities that each agent gets from ω:
  sw(ω) = Σ_{i ∈ Ag} uᵢ(ω)
Think of it as the "total amount of utility in the system".
As a solution concept, it may be appropriate when the whole system (all agents) has a single owner (then the overall benefit of the system is important, not that of individuals).

Social Welfare (2/2)
Examples (payoff matrix figures): the coordination game and the Prisoner's Dilemma
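A short sketch of the social welfare computation on the two example games, using the payoffs stated in this lecture as (uᵢ, uⱼ) per (action of i, action of j). The function names are illustrative.

```python
COORDINATION = {("D", "D"): (2, 2), ("D", "C"): (1, 3),
                ("C", "D"): (3, 1), ("C", "C"): (4, 4)}
PD = {("D", "D"): (2, 2), ("D", "C"): (4, 1),
      ("C", "D"): (1, 4), ("C", "C"): (3, 3)}

def social_welfare(game, outcome):
    """Sum of the utilities that all agents get from the outcome."""
    return sum(game[outcome])

def welfare_maximizers(game):
    """All outcomes with the highest social welfare."""
    best = max(social_welfare(game, o) for o in game)
    return sorted(o for o in game if social_welfare(game, o) == best)

print(welfare_maximizers(COORDINATION), welfare_maximizers(PD))
```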

The Prisoner's Dilemma
Solution concepts
  D is a dominant strategy.
  (D,D) is the only Nash equilibrium.
  All outcomes except (D,D) are Pareto optimal.
  (C,C) maximizes social welfare.
The individual rational action is defect
This guarantees a payoff of no worse than 2, whereas cooperating only guarantees a payoff of 1.
So defection is the best response to all possible strategies: both agents defect, and get payoff = 2
But intuition says this is not the best outcome: surely they should both cooperate and each get a payoff of 3!

The Prisoner's Dilemma
This apparent paradox is the fundamental problem of multi-agent interactions.
It appears to imply that cooperation will not occur in societies of self-interested agents.
Real world examples:
  nuclear arms reduction ("why don't I keep mine...")
  free rider systems: public transport; television licenses.
Can we recover cooperation?

The Iterated Prisoner's Dilemma
One answer: play the game more than once
If you know you will be meeting your opponent again, then the incentive to defect appears to evaporate
Cooperation is the rational choice in the infinitely repeated prisoner's dilemma

Backwards Induction
But…, suppose you both know that you will play the game exactly n times
On round n-1, you have an incentive to defect, to gain that extra bit of payoff…
But this makes round n-2 the last "real" round, and so you have an incentive to defect there, too.
This is the backwards induction problem.
When playing the prisoner's dilemma with a fixed, finite, pre-determined, commonly known number of rounds, defection is the best strategy

Axelrod's Tournament
Suppose you play iterated prisoner's dilemma against a range of opponents…
What strategy should you choose, so as to maximize your overall payoff?
Axelrod (1984) investigated this problem, with a computer tournament for programs playing the prisoner's dilemma

Strategies in Axelrod's Tournament
RANDOM
ALLD: "Always defect", the hawk strategy
TIT-FOR-TAT:
  On round u = 0, cooperate
  On round u > 0, do what your opponent did on round u-1
TESTER: On the 1st round, defect. If the opponent retaliated, then play TIT-FOR-TAT. Otherwise intersperse cooperation and defection.
JOSS: As TIT-FOR-TAT, except periodically defect
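A minimal sketch of an iterated prisoner's dilemma match between two of these strategies, assuming the payoffs used earlier in this lecture (3 for mutual cooperation, 2 for mutual defection, 4 and 1 for a defector and a sucker). The function names and the history-based strategy interface are illustrative, not Axelrod's actual tournament code.

```python
# (my action, opponent action) -> (my payoff, opponent payoff)
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (1, 4),
          ("D", "C"): (4, 1), ("D", "D"): (2, 2)}

def tit_for_tat(own_history, opp_history):
    """Cooperate on the first round, then copy the opponent's previous move."""
    return "C" if not opp_history else opp_history[-1]

def all_d(own_history, opp_history):
    """Always defect."""
    return "D"

def play(strategy_1, strategy_2, rounds):
    """Play the given number of rounds and return both total payoffs."""
    h1, h2, total_1, total_2 = [], [], 0, 0
    for _ in range(rounds):
        a1, a2 = strategy_1(h1, h2), strategy_2(h2, h1)
        p1, p2 = PAYOFF[(a1, a2)]
        total_1, total_2 = total_1 + p1, total_2 + p2
        h1.append(a1)
        h2.append(a2)
    return total_1, total_2

print(play(tit_for_tat, all_d, 5))
print(play(tit_for_tat, tit_for_tat, 5))
```

TIT-FOR-TAT loses the single match against ALLD, but two TIT-FOR-TAT players earn more together than any pairing involving ALLD, which hints at why it can do well over a whole tournament.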

Axelrod's Tournament results
TIT-FOR-TAT won the first tournament
A second tournament was called
TIT-FOR-TAT won the second tournament as well

Recipes for Success in Axelrod's Tournament
Axelrod suggests the following rules for succeeding in his tournament:
  Don't be envious: Don't play as if it were zero sum!
  Be nice: Start by cooperating, and reciprocate cooperation
  Retaliate appropriately: Always punish defection immediately, but use "measured" force, don't overdo it
  Don't hold grudges: Always reciprocate cooperation immediately

Competitive and Zero-Sum Interactions
Where preferences of agents are diametrically opposed we have strictly competitive scenarios
Zero-sum encounters are those where utilities sum to zero:
  uᵢ(ω) + uⱼ(ω) = 0 for all ω ∊ Ω
Zero sum implies strictly competitive
Zero sum encounters in real life are very rare, but people tend to act in many scenarios as if they were zero sum

Matching Pennies
Players i and j simultaneously choose the face of a coin, either "heads" or "tails".
If they show the same face, then i wins, while if they show different faces, then j wins.

Mixed Strategies for Matching Pennies
No pair of strategies forms a pure strategy Nash Equilibrium: whatever pair of strategies is chosen, somebody will wish they had done something else.
The solution is to allow mixed strategies:
  play "heads" with probability 0.5
  play "tails" with probability 0.5.
This is a Nash Equilibrium strategy.
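The equilibrium claim can be checked numerically with a small sketch. It assumes a payoff of +1 for a win and -1 for a loss, which the slide does not state. The function name expected_utility_i is illustrative.

```python
def expected_utility_i(p, q):
    """Expected utility of i when i plays heads with probability p and j with probability q.

    Assumed payoffs: i gets +1 when the faces match and -1 otherwise.
    """
    total = 0.0
    for face_i, prob_i in (("H", p), ("T", 1 - p)):
        for face_j, prob_j in (("H", q), ("T", 1 - q)):
            total += prob_i * prob_j * (1 if face_i == face_j else -1)
    return total

# Against q = 0.5, every choice of p gives i the same expected utility (zero),
# so i has no incentive to deviate from p = 0.5.
for p in (0.0, 0.3, 0.5, 1.0):
    print(p, expected_utility_i(p, 0.5))

# Against a biased opponent (q = 0.6), i would profit from always playing heads.
print(expected_utility_i(1.0, 0.6))
```

Since j's utility is the negative of i's under these assumed payoffs, the same argument applies to j.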

Mixed Strategies
A mixed strategy has the form
  play α₁ with probability p₁
  play α₂ with probability p₂
  ...
  play αₖ with probability pₖ
such that p₁ + p₂ + … + pₖ = 1.
Nash proved that every finite game has a Nash equilibrium in mixed strategies.

Social Choice

Social Choice
Social choice theory is concerned with group decision making.
Classic example of social choice theory: voting.
Formally, the issue is combining preferences to derive a social outcome.

Components of a Social Choice Model
Assume a set Ag = {1, …, n} of voters. These are the entities who express preferences.
Voters make group decisions with respect to a set Ω = {ω₁, ω₂, …} of outcomes. Think of these as the candidates.
If |Ω| = 2, we have a pairwise election.

Preferences
Each voter has preferences over Ω: an ordering over the set of possible outcomes Ω.
Example. Suppose:
  Ω = {gin, rum, brandy, whisky}
then we might have agent i with preference order:
  ωᵢ = (brandy, rum, gin, whisky)
meaning:
  brandy ≻ᵢ rum ≻ᵢ gin ≻ᵢ whisky

Preference Aggregation
The fundamental problem of social choice theory: given a collection of preference orders, one for each voter, how do we combine these to derive a group decision that reflects as closely as possible the preferences of voters?
Variants of preference aggregation:
  social welfare functions;
  social choice functions.

Social Welfare Functions
Let Π(Ω) be the set of preference orderings over Ω.
A social welfare function takes the voter preferences and produces a social preference order:
  f : Π(Ω) × … × Π(Ω) → Π(Ω)
We define ≻* as the outcome of a social welfare function
  whisky ≻* gin ≻* brandy ≻* rum
  S ≻* M ≻* SD ≻* MP ≻* C ≻* V ≻* FP ≻* KD ≻* FI ≻* PP

Social Choice Functions
Sometimes, we just want to select one of the possible candidates, rather than a social order.
This gives social choice functions:
  f : Π(Ω) × … × Π(Ω) → Ω
Example: presidential election.

Voting Procedures: Plurality
Social choice function: selects a single outcome.
Each voter submits preferences.
Each candidate gets one point for every preference order that ranks them first.
The winner is the one with the largest number of points.
Example: Political elections in UK, France, USA...
If we have only two candidates, then plurality is a simple majority election.

Anomalies with Plurality
Suppose |Ag| = 100 and Ω = {ω₁, ω₂, ω₃} with:
  40% of voters voting for ω₁
  30% of voters voting for ω₂
  30% of voters voting for ω₃
With plurality, ω₁ gets selected even though a clear majority (60%) prefer another candidate!
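A minimal sketch of plurality voting that reproduces this anomaly. The slide only gives first choices, so the full preference orders below are hypothetical, as are the candidate labels "w1" to "w3" and the function name.

```python
from collections import Counter

def plurality_winner(ballots):
    """Each ballot is a preference order; a candidate scores one point per first place."""
    scores = Counter(ballot[0] for ballot in ballots)
    winner, _ = scores.most_common(1)[0]
    return winner, scores

# Hypothetical full rankings consistent with the 40/30/30 split of first choices.
ballots = ([("w1", "w2", "w3")] * 40
           + [("w2", "w3", "w1")] * 30
           + [("w3", "w2", "w1")] * 30)

winner, scores = plurality_winner(ballots)
not_first = sum(1 for ballot in ballots if ballot[0] != winner)
print(winner, dict(scores), not_first)
```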

Strategic Manipulation by Tactical Voting
Suppose your preferences are
  ω₁ ≻ ω₂ ≻ ω₃
while you believe 49% of voters have preferences
  ω₂ ≻ ω₁ ≻ ω₃
and you believe 49% have preferences
  ω₃ ≻ ω₂ ≻ ω₁
You may do better voting for ω₂, even though this is not your true preference profile.
This is tactical voting: an example of strategic manipulation of the vote.
Especially a problem in two-round elections

Condorcet's Paradox
Suppose Ag = {1, 2, 3} and Ω = {ω₁, ω₂, ω₃} with:
  ω₁ ≻₁ ω₂ ≻₁ ω₃
  ω₂ ≻₂ ω₃ ≻₂ ω₁
  ω₃ ≻₃ ω₁ ≻₃ ω₂
For every possible candidate, there is another candidate that is preferred by a majority of voters!
This is Condorcet's paradox: there are situations in which, no matter which outcome we choose, a majority of voters will be unhappy with the outcome chosen.
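The paradox can be verified with a short sketch of pairwise majority comparison on the three preference orders above. The labels "w1" to "w3" stand for ω₁ to ω₃; the function name is illustrative.

```python
candidates = ("w1", "w2", "w3")
ballots = [
    ("w1", "w2", "w3"),  # voter 1
    ("w2", "w3", "w1"),  # voter 2
    ("w3", "w1", "w2"),  # voter 3
]

def majority_prefers(ballots, a, b):
    """True if more than half of the voters rank a above b."""
    wins = sum(1 for ballot in ballots if ballot.index(a) < ballot.index(b))
    return wins > len(ballots) / 2

# For each candidate, list the candidates that a majority prefers to it.
beaten_by = {
    c: [d for d in candidates if d != c and majority_prefers(ballots, d, c)]
    for c in candidates
}
print(beaten_by)
```

Every candidate is beaten by some other candidate, so the majority preferences form a cycle.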

Applications of social choice theory
Main application is for human choice and decision making
Results aggregation: aggregate the output of several search engines

Auctions

Application of auctions
With the rise of the Internet, auctions have become popular in many e-commerce applications (e.g. eBay)
Auctions are an efficient tool for reaching agreements in a society of self-interested agents
  For example, bandwidth allocation on a network, sponsored links
Auctions can be used for efficient resource allocation within decentralized computational systems
Frequently utilized for solving multi-agent and multi-robot coordination problems
  For example, team-based exploration of unknown terrain

What is an Auction?
An auction takes place between an agent known as the auctioneer and a collection of agents known as the bidders
The goal of the auction is for the auctioneer to allocate all goods to the bidders
The auctioneer desires to maximize the price and the bidders desire to minimize the price

Limit Price
Each trader has a value or limit price that they place on the good.
  A buyer who exchanges more than their limit price for a good makes a loss.
  A seller who exchanges a good for less than their limit price makes a loss.
Limit prices clearly have an effect on the behavior of traders.
There are several models, embodying different assumptions about the nature of the good.

Limit Price
Private value
  The good has a value to me that is independent of what it is worth to you.
  The textbook gives the example of John Lennon's last dollar bill.
Common value
  The good has the same value to all of us, but we have differing estimates of what it is.
  Winner's curse
Correlated value
  Our values are related.
  The more you are prepared to pay, the more I should be prepared to pay.

Winner's curse
Termed in the 1950s:
  Oil companies bid for drilling rights in the Gulf of Mexico
  The problem was the bidding process, given the uncertainties in estimating the potential value of an offshore oil field
  "Competitive bidding in high risk situations", by Capen, Clapp and Campbell, Journal of Petroleum Technology, 1971
For example
  An oil field had an actual intrinsic value of $10 million
  Oil companies might guess its value to be anywhere from $5 million to $20 million
  The company who wrongly estimated at $20 million and placed a bid at that level would win the auction, and later find that it was not worth that much
In many cases the winner is the person who has overestimated the most ⇒ "The Winner's curse"
Bid shading: offer a bid that is a certain amount below the valuation

Auction Characteristics
Auction procedure
  One shot: Only one bidding round
  Ascending: Auctioneer begins at minimum price, bidders increase their bids
  Descending: Auctioneer begins at a price over the value of the good and lowers the price at each round
  Continuous: Internet
Auctions may be
  Standard Auction: One seller and multiple buyers
  Reverse Auction: One buyer and multiple sellers
  Double Auction: Multiple sellers and multiple buyers
Combinatorial Auctions
  Buyers and sellers may have combinatorial valuations for bundles of goods

Single versus Multi-dimensional
Single dimensional auctions
  The only content of an offer is the price and quantity of some specific type of good.
  "I'll bid $200 for those 2 chairs"
Multi dimensional auctions
  Offers can relate to many different aspects of many different goods.
  "I'm prepared to pay $200 for those two red chairs, but $300 if you can deliver them tomorrow."
  Frequency ranges for cell phones

Single Dimension Auctions

English Auction
An example of first-price open-cry ascending auctions
Protocol:
  Auctioneer starts by offering the good at a low price
  Auctioneer offers higher prices until no agent is willing to pay the proposed level
  The good is allocated to the agent that made the highest bid
Properties
  Generates competition between bidders (generates revenue for the seller when bidders are uncertain of their valuation)
  Dominant strategy: Bid slightly more than the current bid, withdraw if the bid reaches your personal valuation of the good
  Winner's curse (for common value goods)

Dutch Auction
Dutch auctions are examples of first-price open-cry descending auctions
Protocol:
  Auctioneer starts by offering the good at an artificially high value
  Auctioneer lowers the offer price until some agent makes a bid equal to the current offer price
  The good is then allocated to the agent that made the offer
Properties
  Items are sold rapidly (can sell many lots within a single day)
  Intuitive strategy: wait for a little bit after your true valuation has been called and hope no one else gets in there before you (no general dominant strategy)
  Winner's curse also possible

First-Price Sealed-Bid Auctions
First-price sealed-bid auctions are one-shot auctions:
Protocol:
  Within a single round bidders submit a sealed bid for the good
  The good is allocated to the agent that made the highest bid
  The winner pays the price of the highest bid
  Often used in commercial auctions, e.g., public building contracts etc.
Problem: the difference between the highest and second highest bid is "wasted money" (the winner could have offered less)
Intuitive strategy: bid a little bit less than your true valuation (no general dominant strategy)
The more bidders there are, the smaller the deviation should be!

Vickrey Auctions
Proposed by William Vickrey in 1961 (Nobel Prize in Economic Sciences in 1996)
Vickrey auctions are examples of second-price sealed-bid one-shot auctions
Protocol:
  within a single round bidders submit a sealed bid for the good
  the good is allocated to the agent that made the highest bid
  the winner pays the price of the second highest bid
Dominant strategy: bid your true valuation
  if you bid more, you risk paying too much
  if you bid less, you lower your chances of winning while still having to pay the same price in case you win
Antisocial behavior: bid more than your true valuation to make opponents suffer (not "rational")
For private value auctions, strategically equivalent to the English auction mechanism
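A hedged sketch of the Vickrey protocol together with a brute-force check that truthful bidding is never worse than any other bid in two small examples. The bidder names, the valuations, and the tie-breaking rule (the bidder listed first wins a tie) are assumptions made for illustration.

```python
def vickrey(bids):
    """bids: {bidder: amount}. The highest bidder wins and pays the second highest bid."""
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price

def utility(valuation, my_bid, other_bids):
    """Utility of bidder "me": valuation minus price when winning, otherwise zero."""
    bids = dict(other_bids, me=my_bid)
    winner, price = vickrey(bids)
    return valuation - price if winner == "me" else 0

print(vickrey({"a": 10, "b": 7, "c": 4}))

# With valuation 8, compare the truthful bid with every integer bid from 0 to 14.
for others in ({"x": 6, "y": 5}, {"x": 6, "y": 9}):
    truthful = utility(8, 8, others)
    best_alternative = max(utility(8, b, others) for b in range(15))
    print(others, truthful, best_alternative)
```

Overbidding only changes the result when it wins at a price above the valuation, and underbidding only changes it by losing an auction that would have been profitable.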

Generalized first price auctions
Used by Yahoo for "sponsored links" auctions
Introduced in 1997 for selling Internet advertising by Yahoo/Overture (before there were only "banner ads")
Advertisers submit a bid reporting their willingness to pay on a per-click basis for a particular keyword
  Cost-Per-Click (CPC) bid
Advertisers were billed for each "click" on sponsored links leading to their page
The links were arranged in descending order of bids, making the highest bids the most prominent
Auctions take place during each search
However, the auction mechanism turned out to be unstable!
  Bidders revised their bids as often as possible

Generalized second price auctions
Introduced by Google for pricing sponsored links (AdWords Select)
Observation: Bidders generally do not want to pay much more than the rank below them
Therefore: 2nd price auction
Further modifications:
  Advertisers bid for keywords and keyword combinations
  Rank: CPC_BID × quality score
  Price: with respect to lower ranks
  http://www.chipkin.com/google-adwords-actual-cpc-calculation/
After seeing Google's success, Yahoo also switched to second price auctions in 2002
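A rough sketch of ranking and pricing in a generalized second price auction. The slide only states that the price depends on the lower ranks; the rule used here (each advertiser pays the smallest per-click price that would still keep its rank, that is, the next rank score divided by its own quality score) is a commonly described variant and should be read as an assumption. The advertiser data, the reserve price, and the function name gsp are hypothetical.

```python
def gsp(ads, reserve=0.0):
    """ads: {name: (cpc_bid, quality_score)}.

    Returns [(name, price_per_click)] in rank order, ranking by
    cpc_bid * quality_score (assumed pricing rule, see the text above).
    """
    ranked = sorted(ads, key=lambda name: ads[name][0] * ads[name][1], reverse=True)
    result = []
    for position, name in enumerate(ranked):
        quality = ads[name][1]
        if position + 1 < len(ranked):
            below = ranked[position + 1]
            # Smallest bid that keeps this advertiser above the next rank.
            price = ads[below][0] * ads[below][1] / quality
        else:
            price = reserve  # lowest rank pays the (assumed) reserve price
        result.append((name, price))
    return result

# Hypothetical advertisers: (CPC bid, quality score).
ads = {"A": (4.0, 1.0), "B": (3.0, 2.0), "C": (2.0, 1.0)}
print(gsp(ads))
```

In this example B ranks first despite bidding less than A, because of its higher quality score, and it pays less per click than it bid.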

Combinatorial Auctions

Combinatorial Auctions
In a combinatorial auction, the auctioneer puts several goods on sale and the other agents submit bids for entire bundles of goods
Given a set of bids, the winner determination problem is the problem of deciding which of the bids to accept
  The solution must be feasible (no good may be allocated to more than one agent)
  Ideally, it should also be optimal (in the sense of maximizing revenue for the auctioneer)
A challenging algorithmic problem

Complements and Substitutes
The value an agent assigns to a bundle of goods may depend on the combination
Complements: The value assigned to a set is greater than the sum of the values assigned to its elements
  Example: "a pair of shoes" (a left shoe and a right shoe)
Substitutes: The value assigned to a set is lower than the sum of the values assigned to its elements
  Example: a ticket to the theatre and another one to a football match for the same night
In such cases an auction mechanism allocating one item at a time is problematic, since the best bidding strategy in one auction may depend on the outcome of other auctions

Protocol
One auctioneer, several bidders, and many items to be sold
Each bidder submits a number of package bids specifying the valuation (price) the bidder is prepared to pay for a particular bundle
The auctioneer announces a number of winning bids
The winning bids determine which bidder obtains which item, and how much each bidder has to pay
No item may be allocated to more than one bidder
Examples of package bids:
  Agent 1: ({a,b}, 5), ({b,c}, 7), ({c,d}, 6)
  Agent 2: ({a,d}, 7), ({a,c,d}, 8)
  Agent 3: ({b}, 5), ({a,b,c,d}, 12)
Generally, there are 2ⁿ − 1 non-empty bundles for n items. How do we compute the optimal solution?

Optimal Winner Determination Algorithm
An auctioneer has a set of items M = {1, 2, …, m} to sell
There are N = {1, 2, …, n} buyers placing bids
Buyers submit a set of package bids B = {B₁, B₂, …, Bₙ}
A package bid is a tuple Bᵢ = [S, vᵢ(S)], where S ⊆ M is a set of items (bundle) and vᵢ(S) > 0 is buyer i's true valuation
x_{S,i} ∈ {0, 1} is a decision variable for assigning bundle S to buyer i
The winner determination problem (WDP) is to label the bids as winning or losing (by deciding each x_{S,i}) so as to maximize the total accepted bid price
This is NP-hard (the decision version is NP-complete)!
Can be solved with an integer program solver, or heuristic search
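To make the WDP concrete, here is a brute-force sketch on the package bids from the protocol example. It enumerates every subset of bids, which is only workable for very small instances. It assumes that a bidder may win more than one of its bids, which the slides do not specify; the function names are illustrative.

```python
from itertools import combinations

# Package bids from the protocol example: (bidder, bundle, price).
BIDS = [
    (1, frozenset("ab"), 5), (1, frozenset("bc"), 7), (1, frozenset("cd"), 6),
    (2, frozenset("ad"), 7), (2, frozenset("acd"), 8),
    (3, frozenset("b"), 5), (3, frozenset("abcd"), 12),
]

def feasible(selection):
    """No good may be allocated more than once."""
    allocated = set()
    for _, bundle, _ in selection:
        if allocated & bundle:
            return False
        allocated |= bundle
    return True

def solve_wdp(bids):
    """Exhaustively search all subsets of bids for the feasible one with maximum revenue."""
    best_revenue, best_selection = 0, ()
    for size in range(1, len(bids) + 1):
        for selection in combinations(bids, size):
            revenue = sum(price for _, _, price in selection)
            if revenue > best_revenue and feasible(selection):
                best_revenue, best_selection = revenue, selection
    return best_revenue, best_selection

revenue, winners = solve_wdp(BIDS)
print(revenue, [(bidder, sorted(bundle)) for bidder, bundle, _ in winners])
```

Accepting ({b,c}, 7) from agent 1 and ({a,d}, 7) from agent 2 beats the single bid of 12 on the whole set.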

Solving WDPs by Heuristic Search
Two ways of representing the state
Branch-on-items:
  A state is a set of items for which an allocation decision has already been made
  Branching is carried out by adding a further item
Branch-on-bids:
  A state is a set of bids for which an acceptance decision has already been made
  Branching is carried out by adding a further bid

Branch-on-Items
Branching is based on the question: "What bid should this item be assigned to?"
Each path in the search tree consists of a sequence of disjoint bids
  Bids that do not share items with each other
  A path ends when no bid can be added to it
The value of each node is the sum of the prices of the bids accepted on the path

Problem with branch-on-items
What if the auctioneer's revenue can increase by keeping items?
Example:
  There is no bid for 1, a $5 bid for 2, and a $3 bid for {1, 2}
  Thus, it is better to keep 1 and sell 2 than to sell both
The auctioneer's possibility of keeping items can be implemented by placing dummy bids of price zero on those items that received no 1-item bids (Sandholm 2002)

Example of branch-on-items
Bids: {1,2}, {2,3}, {3}, {1,3}
We add dummy bids: {1}, {2}

Branch-on-bids
Branching is based on the question: "Should this bid be accepted or rejected?"
  Binary tree
When branching on a bid, the children in the search tree are the world where that bid is accepted (IN), and the world where that bid is rejected (OUT)
No dummy bids are needed
First a bid graph is constructed that represents all constraints between the bids
Then, bids are accepted/rejected until all bids have been handled
  On accept: remove all constrained bids from the graph
  On reject: remove the bid itself from the graph

Branch-on-bids - Example
Bids: {1,2}, {2,3}, {3}, {1,3}
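A minimal sketch of branch-on-bids as a depth-first search over the four example bids. The bid prices are not given in this transcript, so the prices below are hypothetical. Instead of an explicit bid graph, conflicts are detected by checking whether two bundles share an item, and no heuristic pruning is applied.

```python
def branch_on_bids(bids):
    """Depth-first IN/OUT branching; returns (best revenue, accepted bids)."""
    best = (0, [])

    def search(remaining, accepted, revenue):
        nonlocal best
        if not remaining:
            if revenue > best[0]:
                best = (revenue, accepted)
            return
        (items, price), rest = remaining[0], remaining[1:]
        # IN: accept the bid and drop every remaining bid that shares an item with it.
        compatible = [bid for bid in rest if not (bid[0] & items)]
        search(compatible, accepted + [(items, price)], revenue + price)
        # OUT: reject the bid and only remove the bid itself.
        search(rest, accepted, revenue)

    search(list(bids), [], 0)
    return best

# Bundles from the example; the prices are hypothetical.
bids = [
    (frozenset({1, 2}), 5),
    (frozenset({2, 3}), 4),
    (frozenset({3}), 2),
    (frozenset({1, 3}), 6),
]
revenue, accepted = branch_on_bids(bids)
print(revenue, [sorted(items) for items, _ in accepted])
```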

Heuristic Function
For any node N in the search tree, let g(N) be the revenue generated by the bids that were accepted on the path up to N
The heuristic function h(N) estimates for every node N how much additional revenue can be expected on going from N
An upper bound on h(N) is given by the sum, over the set of unallocated items A, of each item's maximum contribution
Tighter bounds can be obtained by solving the linear program relaxation of the remaining items (Sandholm 2006)
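The formula for the upper bound is missing from this transcript. A commonly used version, assumed here, takes as the maximum contribution of an item the largest per-item share price/|bundle| among the bids that contain it. The sketch below uses hypothetical prices for the example bundles, and the function name upper_bound is illustrative.

```python
def upper_bound(unallocated, bids):
    """Sum over unallocated items of the best per-item share among bids containing the item.

    Assumed contribution of item i: max over bids (S, v) with i in S of v / |S|.
    Items without any bid contribute zero.
    """
    total = 0.0
    for item in unallocated:
        shares = [price / len(bundle) for bundle, price in bids if item in bundle]
        total += max(shares, default=0.0)
    return total

# Bundles from the branch-on-bids example; the prices are hypothetical.
bids = [
    (frozenset({1, 2}), 5),
    (frozenset({2, 3}), 4),
    (frozenset({3}), 2),
    (frozenset({1, 3}), 6),
]
print(upper_bound({1, 2, 3}, bids))
print(upper_bound({3}, bids))
```

With these prices the bound at the root is 8.5, which is above the best achievable revenue of 7 (accepting {1,2} and {3}), as an upper bound should be.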

Auctions for Multi-Robot Exploration
Consider a team of mobile robots that has to visit a number of given targets (locations) in initially partially unknown terrain
Examples of such tasks are cleaning missions, space exploration, surveillance, and search and rescue
Continuous re-allocation of targets to robots is necessary
  For example, robots might discover that they are separated by a blockage from their target
To allocate and re-allocate the targets among themselves, the robots can use auctions where they sell and buy targets
The team objective can be to minimize the sum of all path costs; hence, bidding prices are estimated travel costs
The path cost of a robot is the sum of the edge costs along its path, from its current location to the last target that it visits

Multi-Robot Exploration
Figure: Three robots exploring Mars. The robots' task is to gather data around the four craters, e.g. to visit the highlighted target sites. Source: N. Kalra

General Exploration
Each robot always follows a minimum cost path that visits all its allocated targets
Whenever a robot gains more information about the terrain, it shares this information with the other robots
If the remaining path of at least one robot is blocked, then all robots put their unvisited targets up for auction
The auction(s) close after a predetermined amount of time
  Constraints: each robot wins at most one bundle and each target is contained in exactly one bundle
After each auction, robots have gained new targets or exchanged targets with other robots
Then, the cycle repeats

Single-Round Combinatorial Auction
Protocol:
  Every robot bids on all possible bundles of targets
  The valuation is the estimated smallest path cost needed to visit all targets in the bundle (TSP)
  A central auctioneer determines and informs the winning robots within one round
Optimal team performance:
  Combinatorial auctions take all positive and negative synergies between targets into account
  Minimization of the total path costs
Drawbacks:
  Robots cannot bid on all possible bundles of targets because the number of possible bundles is exponential in the number of targets
  Calculating the cost of each bundle requires computing the smallest path cost for visiting a set of targets (Traveling Salesman Problem)
  Winner determination is NP-hard

Parallel Single-Item Auctions
Protocol:
  Every robot bids on each target in parallel until all targets are assigned
  The valuation is the smallest path cost from the robot's current position to the target
Similar to Target Clustering
Advantage: Simple to implement, and computation and communication efficient
Disadvantage: The team performance can be highly suboptimal since it does not take any synergies between the targets into account
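A small sketch of parallel single-item auctions under strong simplifying assumptions: robots and targets are points on a line, and the path cost is the absolute distance. The robot names, positions, and targets are hypothetical.

```python
def parallel_auctions(robots, targets):
    """robots: {name: position}. Each target goes to the robot closest to it."""
    allocation = {name: [] for name in robots}
    for target in targets:
        # Every robot bids its path cost from its current position to this target.
        bids = {name: abs(target - position) for name, position in robots.items()}
        allocation[min(bids, key=bids.get)].append(target)
    return allocation

robots = {"r1": 0, "r2": 5}
allocation = parallel_auctions(robots, [2, 3])

# Team cost of the auction result: each robot drives to its single target.
auction_cost = abs(2 - robots["r1"]) + abs(3 - robots["r2"])
# Alternative that uses the synergy between the targets: r1 visits 2, then 3.
synergy_cost = abs(2 - robots["r1"]) + abs(3 - 2)
print(allocation, auction_cost, synergy_cost)
```

The targets are split between the robots at a total cost of 4, although a single robot could visit both for a cost of 3; this is the kind of synergy that independent bids cannot capture.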

Sequential Single-Item Auctions
Protocol:
  Targets are auctioned in the sequence T1, T2, T3, T4, …
  A robot's valuation is the increase in its smallest path cost that results from winning the auctioned target
  The robot with the overall smallest bid is allocated the corresponding target
  Finally, each robot calculates the minimum-cost path for visiting all of its targets and moves along this path
Advantages:
  Hill climbing search: some synergies between targets are taken into account (but not all of them)
  Simple to implement, and computation and communication efficient
  Since robots can determine the winners by listening to the bids (and identifying the smallest bid), the method can be executed in a decentralized way
Disadvantages:
  The order of the targets changes the result
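The protocol above can be sketched as follows, assuming robots and targets are points on a line, path cost is the total distance travelled, and the smallest path cost is found by brute force over visiting orders (only feasible for a handful of targets). Robot names, positions, and targets are hypothetical.

```python
from itertools import permutations

def path_cost(start, targets):
    """Smallest cost of a path from start that visits all targets (brute force)."""
    if not targets:
        return 0
    best = None
    for order in permutations(targets):
        cost, position = 0, start
        for target in order:
            cost += abs(target - position)
            position = target
        best = cost if best is None else min(best, cost)
    return best

def sequential_auction(robots, targets):
    """robots: {name: position}; targets are auctioned one by one in the given order."""
    allocation = {name: [] for name in robots}
    for target in targets:
        # Each bid is the increase in the robot's smallest path cost if it wins the target.
        bids = {
            name: path_cost(position, allocation[name] + [target])
            - path_cost(position, allocation[name])
            for name, position in robots.items()
        }
        allocation[min(bids, key=bids.get)].append(target)
    return allocation

robots = {"r1": 0, "r2": 10}
allocation = sequential_auction(robots, [1, 2, 9])
team_cost = sum(path_cost(robots[name], allocation[name]) for name in robots)
print(allocation, team_cost)
```

Because each bid is a marginal cost given the targets a robot already holds, r1 wins target 2 cheaply after winning target 1, which is the partial use of synergies mentioned above.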

Summary
Utilities and competitive environments
Voting mechanism
We discussed English, Dutch, First-Price Sealed-Bid, and Vickrey auctions
Generalized second price auctions have shown good properties in practice; however, "truth telling" is not a dominant strategy
Combinatorial auctions are a mechanism to allocate a number of goods to a number of agents